Quality control of bibliographic data is essential to ensuring that ETDs (electronic theses and dissertations) are both discoverable and accurately described. This poster session outlines semi-automated methods for achieving accurate transcription of bibliographic data in theses and dissertations for both digital repository and catalog records. In the earlier workflow, digital repository staff manually entered bibliographic data, and cataloging staff then copied that data from the digital repository website into a MARC template, along with additional information from the PDF. In the new workflow, manual transcription of the bibliographic information found in the PDF is replaced with automated extraction of the PDF data. The extracted PDF data and the ProQuest metadata serve as common source data for XSLT (eXtensible Stylesheet Language Transformations) programs that generate metadata for both the digital repository (Bepress) and the catalog (MARC21XML). Both transformations include the same modular XSLT programs, and common XML reference tables provide an index of shared data values. The advantage of the new method is that it combines the quality control of transcription taken directly from the PDF with the time efficiency of automated repurposing of ProQuest metadata. Because both outputs use the same data sources, transformations, and reference tables, the metadata is accurate and consistent across the digital repository and the catalog. Because staff time is greatly reduced, the ETDs are also made available to patrons more quickly.
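The idea of one common source feeding two outputs through a shared reference table can be sketched briefly. The sketch below uses Python's standard library rather than XSLT, purely so that it is runnable as shown; it is not the authors' stylesheets. The record contents, the department codes, the repository element names, and the choice of MARC fields (100, 245, 710) are all assumptions made for illustration, and the MARC output is simplified (no namespace, leader, or control fields).

```python
import xml.etree.ElementTree as ET

# Hypothetical shared reference table (the abstract's "common XML reference
# tables" play this role); codes and names are invented for illustration.
DEPARTMENTS = {
    "CHEM": "Department of Chemistry",
    "HIST": "Department of History",
}

# Hypothetical common source record, standing in for the extracted PDF data
# combined with the ProQuest metadata.
SOURCE = {
    "title": "An Example Thesis Title",
    "author": "Doe, Jane",
    "dept_code": "CHEM",
}


def department_name(code):
    """Shared lookup used by both transformations."""
    return DEPARTMENTS[code]


def to_repository_xml(record):
    """Build a simplified repository record (element names are made up)."""
    doc = ET.Element("document")
    ET.SubElement(doc, "title").text = record["title"]
    ET.SubElement(doc, "author").text = record["author"]
    ET.SubElement(doc, "department").text = department_name(record["dept_code"])
    return doc


def add_datafield(parent, tag, code, value):
    """Append a MARC-style datafield holding a single subfield."""
    field = ET.SubElement(parent, "datafield", {"tag": tag, "ind1": " ", "ind2": " "})
    ET.SubElement(field, "subfield", {"code": code}).text = value
    return field


def to_marc_xml(record):
    """Build a simplified MARC-style record from the same source data."""
    rec = ET.Element("record")
    add_datafield(rec, "100", "a", record["author"])
    add_datafield(rec, "245", "a", record["title"])
    add_datafield(rec, "710", "b", department_name(record["dept_code"]))
    return rec


if __name__ == "__main__":
    for tree in (to_repository_xml(SOURCE), to_marc_xml(SOURCE)):
        print(ET.tostring(tree, encoding="unicode"))
```

Because both functions call the same `department_name` lookup on the same source record, a correction to the reference table or to the source propagates to both outputs, which is the consistency property the abstract describes.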