CHWP B.19 Kibbee, "Baret's Alvearies"

3. Database Production and Distribution

All this brings us to the problems of representing these features of the dictionary, distributing them, and making it possible to add layers of description after the original database has been produced. The minimum description needed reproduces the physical nature of the text, and for this a number of systems are adequate. Our ultimate goal in approaching this dictionary is to make it available over the network, with a marked-up base text linked to additional mark-up and commentaries from any scholars who wish to share their contributions.
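
To make the idea of a base text with later layers of description concrete, here is a minimal sketch in Python. It is an illustration only, not the system we are building: the class names, the layer names, and the sample entry string are all invented for the example (the string is a placeholder, not a transcription from Baret), and character offsets stand in for whatever linking mechanism a real SGML-based system would use.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """One piece of description attached to a span of the base text."""
    start: int      # offset of the first character covered
    end: int        # offset just past the last character covered
    label: str      # e.g. "headword", "equivalent", "comment" (hypothetical labels)
    author: str     # who contributed this layer of description
    note: str = ""  # optional free-text commentary


@dataclass
class LayeredText:
    """A fixed base text plus any number of independently added layers."""
    base: str
    layers: dict = field(default_factory=dict)

    def add(self, layer: str, ann: Annotation) -> None:
        # Reject spans that do not fall inside the base text.
        if not (0 <= ann.start < ann.end <= len(self.base)):
            raise ValueError("annotation span falls outside the base text")
        self.layers.setdefault(layer, []).append(ann)

    def text_of(self, ann: Annotation) -> str:
        # The annotated words are always read back from the base text.
        return self.base[ann.start:ann.end]

    def covering(self, offset: int) -> list:
        # Every (layer name, annotation) pair whose span includes the offset.
        return [(name, a)
                for name, anns in self.layers.items()
                for a in anns
                if a.start <= offset < a.end]


# Invented placeholder entry, used only to exercise the classes.
doc = LayeredText("Abandon: abandonner")

# First layer: structural mark-up produced with the original database.
english = Annotation(0, 7, "headword", "editor")
french = Annotation(9, 19, "equivalent", "editor")
doc.add("structure", english)
doc.add("structure", french)

# Later layer: commentary contributed by another scholar.
remark = Annotation(9, 19, "comment", "reader", "hypothetical remark")
doc.add("commentary", remark)

print(doc.text_of(english), "|", doc.text_of(french))
print([name for name, _ in doc.covering(10)])
```

The point of the design is that the base text is never altered: each contributor's description lives in its own layer and refers back to the text, so layers can be added, compared, or ignored without disturbing one another.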

When my research assistants and I approached this text in the mid-1980s, we were aiming at a specific product that would appear in print: a kind of concordance of all French and English lexical materials from the Middle Ages and Renaissance. Each English word would be matched with the French equivalents proposed, listed in chronological order. At that point the database management options were limited, and we had never heard of SGML. As a result we culled the French and English equivalents from individual entries, to the extent that this was possible given the complexities of multiple entries of the kind described above. These we placed into an inadequate database program, and ultimately even the project we wanted to do proved impossible. Worse yet, anyone wanting to find some other kind of information from these dictionary sources (including, ultimately, ourselves) was simply out of luck.

Now much better full-text database managers and search programs are available, but interaction with SGML, often promised, is still difficult to come by. Creating SGML documents is still quite a cumbersome affair, and the means of taking full advantage of documents so prepared are not built into most applications. Such efforts are underway, however. In conjunction with the National Center for Supercomputing Applications (NCSA) at the University of Illinois, we are pursuing projects that entail the integration of SGML-tagged documents in full international network distribution, with layers of tagging and reasonably powerful search mechanisms. One team is proposing the development of tools to be used within the WAIS (Wide Area Information Servers) system developed by Thinking Machines Corporation. Another is trying to extend the capabilities of Mosaic, a program created at the NCSA for networked distribution of documents and hypertext linking within such documents.

