| Location: | District of Columbia |
|---|---|
| Posted: | Jun 17, 2025 |
| Agency: | LIBRARY OF CONGRESS |
| Type of Contract: | Awards |
| Type of Government: | Federal |
| Category: | |
| Solicitation No: | 2025-LGC-0006 |
The Library of Congress (LC) collects information resources from around the world and makes them known and available to potential users via bibliographic records that describe the resources. The bibliographic descriptions contain typical bibliographic data such as titles, names of creators and other associated entities, publication information such as publisher and place of publication, and other information. This information is presented in the publications in various scripts, including many non-Latin scripts such as Chinese, Korean, Cyrillic, Arabic, Greek, Hebrew, and over 30 additional scripts. The Library of Congress records descriptive information in the original script of the item being cataloged (technology allowing) but needs to transliterate certain data elements of the description into the Latin script for various processing components and in some cases to assist end users. When technology does not support non-Latin scripts, LC staff must manually transliterate more of the descriptive information into the Latin script.
The Library of Congress maintains transliteration tables for over 75 languages and scripts, the ALA-LC Romanization Tables: Transliteration Schemes for Non-Roman Scripts, and makes them available on its web site: http://www.loc.gov/catdir/cpso/roman.html. These tables are jointly maintained by LC and the American Library Association (ALA) and are used by United States libraries and many libraries outside the United States. The Library of Congress catalog contains several million bibliographic records for resources in non-Latin scripts, and the Library collects over 75,000 additional non-Latin resources each year. The Library of Congress requires a utility that can transliterate between non-Latin and Latin scripts using the transliteration tables approved by LC and ALA.
A utility called Scriptshifter has been developed for transliteration of 20+ scripts in the Balkan/Caucasian, Slavic, Turkic, and Chinese script families, and for Korean, Greek, Arabic, and Hebrew. Scriptshifter needs to be continually enhanced to incorporate additional scripts and to improve its handling of very complex scripts in which the Library receives resources, such as Arabic, Southeast Asian scripts, and other Asian scripts like Japanese. The software framework must also be continually updated to improve efficiency as technical possibilities and the transliteration tables change.
The Contractor shall design, code, test, and document additions to the Scriptshifter utility's capability to transliterate non-Latin data into the Latin alphabet according to the ALA-LC Romanization Tables and, where possible, to convert Latin-script transliteration back to non-Latin script. The work will focus on research and improvement of Indic and related scripts such as Devanagari and Brahmi; Southeast Asian scripts such as Thai, Laotian, Khmer, Burmese, and Tibetan; and Arabic-script languages such as Kurdish, Sindhi, Persian, Pushto, Urdu, and Moplah. In addition, the Contractor will research the development of a Japanese transliteration tool and refine support for other Asian scripts such as Korean and Chinese. More specifically, the contractor shall:
The utility must also carry out reverse transliteration, converting Latin transliterated strings into non-Latin strings, where feasible. The utility must remain adjustable as transliterations change.
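To illustrate the forward and reverse requirement, the following is a minimal sketch of table-driven transliteration. It is not Scriptshifter's code or data format: the names `TABLE`, `transliterate`, and `invert` are hypothetical, and the mapping is a tiny lowercase Cyrillic subset chosen for illustration, not the official ALA-LC table. Keeping the rules in a plain table is one way the utility could stay adjustable as transliterations change.

```python
# Hypothetical, illustrative subset of a Cyrillic-to-Latin mapping.
# NOT the official ALA-LC table and NOT Scriptshifter's data format.
TABLE = {
    "м": "m", "о": "o", "с": "s", "к": "k", "в": "v", "а": "a",
    "ж": "zh", "ч": "ch", "ш": "sh", "щ": "shch",
}


def transliterate(text, table):
    """Greedy longest-match replacement of table keys found in text.

    Characters with no matching key are copied through unchanged.
    """
    # Try longer keys first so "shch" is matched before "sh" or "ch".
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No key matched at position i: keep the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


def invert(table):
    """Build the reverse table, refusing mappings that are not one-to-one.

    Reverse transliteration is only feasible when each Latin string
    maps back to exactly one source string.
    """
    inverse = {}
    for src, dst in table.items():
        if dst in inverse:
            raise ValueError(f"ambiguous reverse mapping for {dst!r}")
        inverse[dst] = src
    return inverse


REVERSE = invert(TABLE)
```

Under these assumptions, `transliterate("москва", TABLE)` returns `"moskva"` and `transliterate("moskva", REVERSE)` returns `"москва"`. Real tables are context-sensitive and often not one-to-one, which is why reverse transliteration is required only where feasible.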
The contractor will review the software utility as a whole and make general improvements to the framework. One area of focus will be the Aksharamukha tool that has been incorporated for Asian scripts. The work will also address authentication and external use of the tool.
