Content provider partner institutions

Content provider partners share the following project activities:

  • Participate in developing the selection rationale that is the foundation for the tools suite (lead: Arizona State Library and other state library partners)

  • Provide feedback to guide the development of requirements for the tools suite

  • Test and provide feedback on the tools suite

  • Test and evaluate how content is ingested and represented in repositories

  • Donate batch-loaded and Web-harvested content to the repository testbed for use in repository evaluation and long-term preservation research activities

State Library partners

Arizona State Library, Archives and Records
Phoenix, Arizona

Connecticut State Library
Hartford, Connecticut

Illinois State Library
Springfield, Illinois

State Library of North Carolina
Raleigh, North Carolina

Wisconsin Department of Public Instruction
Division for Libraries, Technology and Community Learning

Partners providing batch collections for the digital repository testbed

Michigan State University Library
Digital Audio Files from the Vincent Voice Library


The Voice Library was established at the MSU Libraries in 1962 under the direction of sound archivist G. Robert Vincent, and later named the Vincent Voice Library in his honor. The collection he established has grown into one of the nation's largest voice archives, with recordings of more than 50,000 speakers from all walks of life. Speeches, lectures, performances, interviews, and broadcasts are all represented among its holdings.

Digitization of major portions of the Vincent Voice Library is being accomplished through a National Science Foundation Digital Library Initiative grant, the "National Gallery of the Spoken Word." Highlights of the digital collection include:

  • U.S. presidents of the 20th century, from an 1896 campaign speech by William McKinley to interviews with George W. Bush.

  • Icons of popular culture, including Jack Benny, Abbott & Costello, and Katharine Hepburn.

  • Examples of early sound recording, such as Big Ben striking the evening hours in 1890.

  • Leaders of the civil rights movement, from W.E.B. Du Bois to Rosa Parks, from Booker T. Washington to Malcolm X.

  • Distinguished jurists, including Oliver Wendell Holmes, Thurgood Marshall, and Sandra Day O'Connor.

  • Extensive coverage of the major events of the 20th century.

Tufts University
Images, tools and TEI documents from the Perseus Project


The Perseus Digital Library is a heterogeneous collection of SGML/XML texts and digital images pertaining to the Archaic and Classical Greek world, late Republican and early Imperial Rome, the English Renaissance, and 19th-century London. The texts are integrated with morphological analysis tools, student and advanced lexica, and sophisticated searching tools that allow users to find all of the inflected instantiations of a particular lexical form. The current corpus of Greek texts contains approximately four million words by thirty-three different authors. Most of the texts were written in the fifth and fourth centuries B.C.E., with some written as late as the second century C.E. The corpus of Latin texts contains approximately 1.5 million words, mostly written by authors from the Republican and early Imperial periods. Collections of English-language literature from the Renaissance and the 19th century were added in the fall of 2000, increasing the collection by 10 million words and 10,000 images, and bringing significant new mapping and modeling facilities online, including a London Atlas, with maps of London from 1780 to the present, and 3D VRML models of multiple sites in Egypt, England, Greece, and Italy. The entire Perseus Digital Library contains more than 60,000 images, thousands of maps, and roughly 160 million words in five languages.

University of Illinois
WILL Broadcasting Service of the University of Illinois

Sound and video recordings from WILL, a local public radio and television station

WILL-TV's digital archives include hundreds of locally produced television programs: documentaries, public affairs programs, election interviews, debates, town hall meetings, music and arts programs, and educational and children's programs. Most are currently archived in DVC Pro format; some are in DV format.

WILL Radio's digital archives include locally produced public affairs programs, talk shows, documentaries, news features, agricultural programs, cultural specials, musical performances, election specials, candidate interviews and debates, and science features. These are currently in the form of several thousand .wav audio files stored on CDs; hundreds of Digital Audio Tape (DAT) recordings; an ENCO digital audio broadcast system with 60 gigabytes of audio material in MPEG2 format; a RealServer archive containing 85 gigabytes of RealMedia (.rm) files; and additional local audio material and work product stored on MiniDisc, Jaz disks, and hard drives.

University of Illinois
Division of Management Information

University administrative data from legacy databases from the 1960s

This unit provides reports for public consumption and for administrators, college partners, individual faculty members, and student theses. It holds 35 years of data on a mainframe, and a Mainframe Data Transition (MDT) project is moving the operational systems to a server under the domain of the Office of Planning and Budgeting.

University of Illinois Library
Digitized aerial photographs of the State of Illinois

Based on benchmarks previously established by the Map and Geography Library, the Illinois State Geological Survey, and Scantech Color Systems, Inc., of Champaign, the pilot worked toward developing protocols and an interface to make scanned aerial photography of Illinois flown between 1935 and 1955 available through the Web. Visitors are now able to view JPEG image surrogates resized from archival TIFFs, which were produced by scanning the photographs at approximately 720 dpi.

By the end of 1997, a test database of 270 photographs of central Will County, flown in 1939 and 1954, became available for access and evaluative purposes. The pilot project was supported by the Illinois State Library and Scantech Color Systems, Inc., of Champaign, Illinois. During the summer of 1998, the project was expanded to cover a wider area, including portions of Champaign, Cook, Fulton, Mason, and Peoria counties. The interface was also redesigned to allow more precise searching.

University of Illinois
National Center for Supercomputing Applications

Color Digital Orthophoto Quadrangle (DOQ) images of Lake County, IL

These images are natural color digital orthoimagery at 1'x1' ground spatial resolution, received by NCSA from the USGS site. They present a particular challenge to digital preservation: although they are publicly available for download, the access time is unacceptable to many scholars interested in working with this type of information. We have available, as a batch dump, images representing Lake County, IL, stored on 47 DVDs.

University of Illinois Library
Full-text scientific and technical journals from the Grainger DLI-I test bed

The DLI-I test bed is made up of articles marked up in SGML, representing materials from over 50 journals from 1995 to the present published by the following societies and institutes: American Institute of Physics, American Physical Society, American Society of Civil Engineers, and the Institute of Electrical Engineers. These materials form a corpus that is fully governed by copyright law and of great interest to scientific scholars.