The Practice of Digitization
The UNO Library undertakes digitization projects for one of three primary purposes: preservation, public access, or exhibition. These purposes are not mutually exclusive, but the primary purpose strongly influences which materials are selected and which procedures are followed during digitization.

Example of scanned text
Preservation is the primary purpose of digitization projects in the Arthur Paul Afghanistan Collection. Preservation should be understood as a scholarly activity, intended to safeguard rare or unique items that would otherwise be unavailable to researchers.
Digital preservation copies are not automatically available to the public. Copyright law allows libraries to make up to three backup copies of print materials only if replacement copies cannot be obtained. The same law allows public distribution of texts only if they have passed into the public domain.
The focus of digital preservation is on creating, storing, and maintaining high-resolution digital images of every element of the text, from cover to index. At the University Library, preservation images are stored in the uncompressed Tagged Image File Format ("TIFF" or "TIF") at the highest possible scan resolution (generally 400-600 dpi). The resulting files, called "master files," are very large and extremely faithful to the original.
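To give a rough sense of why master files are so large, the sketch below estimates the size of one uncompressed page image at the two ends of the stated resolution range. The page size (8.5 x 11 inches) and the 24-bit color depth are assumptions made for illustration; the Library's actual scan settings are not stated here, and the estimate ignores the small TIFF header.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Estimate the raw pixel data size of an uncompressed scan.

    bytes_per_pixel=3 assumes 24-bit RGB color (an assumption, not a
    documented Library setting).
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


# Hypothetical letter-size page scanned at 400 and 600 dpi.
for dpi in (400, 600):
    size = uncompressed_size_bytes(8.5, 11, dpi)
    print(f"{dpi} dpi: {size / 1_000_000:.1f} MB")
# prints:
# 400 dpi: 44.9 MB
# 600 dpi: 101.0 MB
```

Under these assumptions, a 300-page book scanned at 600 dpi would need roughly 30 GB of storage for its master files alone.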
Each master file (usually corresponding to a single page of text) is logged to a database, stored on a network server named vault, and burned to compact disk. Each disk is then cataloged in the Library catalog.
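The logging step might look something like the following minimal sketch. It is not the Library's actual database: the table name, column names, and the use of SQLite and a SHA-256 checksum are all assumptions chosen for illustration. Recording a checksum is a common way to confirm later that the server copy and the disk copies are still identical.

```python
import hashlib
import sqlite3


def log_master_file(conn, path, item_id, page_number):
    """Record one master file in a hypothetical 'master_files' table.

    Returns the SHA-256 checksum of the file contents.
    """
    sha = hashlib.sha256()
    size = 0
    # Read in chunks so that very large TIFF files are not loaded whole.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            sha.update(chunk)
            size += len(chunk)
    digest = sha.hexdigest()

    conn.execute(
        """CREATE TABLE IF NOT EXISTS master_files (
               item_id     TEXT    NOT NULL,
               page_number INTEGER NOT NULL,
               path        TEXT    NOT NULL,
               size_bytes  INTEGER NOT NULL,
               sha256      TEXT    NOT NULL
           )"""
    )
    conn.execute(
        "INSERT INTO master_files VALUES (?, ?, ?, ?, ?)",
        (item_id, page_number, str(path), size, digest),
    )
    conn.commit()
    return digest
```

A caller would open a connection with `sqlite3.connect(...)` and call `log_master_file` once per scanned page; the item identifier and file path shown in any such call would be placeholders.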

Shaista at the scanning station
If the Library has the legal right only to preserve the item, a second compact disk is generally created (making three copies in total). Researchers may then work from the compact disks rather than the original, protecting it from further damage.
If the Library has the legal right to redistribute the text, several additional steps are required. After the master file is created, it is used to generate additional formats. These additional formats usually include:
- a lower-resolution image file suitable for distribution over the Internet,
- a plain text version created using Optical Character Recognition (OCR) software, and
- an Adobe Acrobat (PDF) file configured to allow text searching and multi-page printing.
These additional files are loaded on the Library's digital repository server, Stax.
An important aspect of digital preservation is called "persistence." This involves ensuring that the digital files remain accessible as software and hardware change over time. Every master file is flagged with a "refresh interval" in years. This means that the files are evaluated and replaced or supplemented with new formats on a regular basis, usually every three to six years.
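As an illustration of how a refresh interval could be applied, the sketch below computes when a master file is next due for evaluation. The function names and the idea of tracking a "last reviewed" date are hypothetical; the actual procedure is defined in the Library's own documentation.

```python
from datetime import date


def next_review(last_reviewed: date, refresh_interval_years: int) -> date:
    """Return the date on which a file is next due for evaluation."""
    target_year = last_reviewed.year + refresh_interval_years
    try:
        return last_reviewed.replace(year=target_year)
    except ValueError:
        # February 29 has no counterpart in a non-leap target year.
        return last_reviewed.replace(year=target_year, day=28)


def is_due(last_reviewed: date, refresh_interval_years: int, today: date) -> bool:
    """True if the refresh interval has elapsed as of 'today'."""
    return today >= next_review(last_reviewed, refresh_interval_years)


# Hypothetical master file last reviewed on 1 June 2005, three-year interval.
print(next_review(date(2005, 6, 1), 3))           # 2008-06-01
print(is_due(date(2005, 6, 1), 3, date(2007, 1, 15)))  # False
```

A periodic job could run `is_due` over every logged master file and produce a list of files to evaluate for migration to newer formats.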
The governing policies and procedures followed in the digitization process are contained in a 30-page Digitization Manual written by Library staff.




