Four years ago, I published a story in Technology Review about the National Virtual Translation Center, an FBI project to use technology to link contract linguists and give them tools to do translation and analysis jobs for the 16 intelligence agencies. I recently discovered that this piece (which to date is the only piece of journalism written about the NVTC) is finally available for free online here. I also looked at the FBI’s 2009 budget justification, which contains more info about the NVTC than documents from previous years (even Secrecy News calls the document “remarkably detailed”) and gives a sense of how much the NVTC has grown:
–“Since 2003, the NVTC has accepted over 1,200 requirements in 60 different languages. In FY 2007 NVTC translated over 350,000 text pages and 350 hours of audio material, a 40% increase over FY 2006. With regard to translations, in 2003 the NVTC performed 20 translation jobs for its customers. In FY 2007 [it] performed approximately 5,500 translation jobs for its customers. A 64 percent increase is expected between FY 2007 and FY 2008 based on the volume of incoming material from active military campaigns and the expansion of incoming Asian- and African-language materials.”
–“Over 73 percent of material collected by the IC and stored in the HARMONY database is untranslated. HARMONY is the IC’s centralized database for foreign military, technical and open-source documents and their translations. Overall, the IC backlog of untranslated material is growing exponentially, with an estimated five petabytes backlogged.”
–There are 54 full- and part-time independent contract linguists all over the country, connected virtually to the NVTC HQ and working through backlogged material. In 2009 the FBI wants to add 8 more contractors.
–These linguists are linked through TONS (Translator Online Network Support), an enterprise-scale computer system that gives them access to a variety of language-processing software (automatic optical character recognition, machine translation, named entity extraction, and transliteration) and other language tools for translators. The FBI wants $1.2 million to support TONS.
–Though the NVTC is under the FBI, the director is paid by the NSA, and four other employees are either paid by the NSA or CIA. Five others are FBI employees, which makes for a total management staff of 9, but in 2009 they want to add 2 more. These employees do “outreach, coordination, quality control and…provide agency-unique expertise in supporting IC clients.”
–The NVTC was mandated by Congress to be a “clearinghouse” for language resources for the intelligence community; the portal was built in 2006 but it has no content. The FBI wants $166,000 to support the portal.
All of which is interesting, and more detail than I was able to get four years ago (obviously, since they’d just opened their doors, they had no track record to refer to). In the budget justification, there’s no reference to past successes, only to imminent gaps if funding isn’t increased. (The 2008 justification is more direct: “Failure to fund this initiative could cause serious harm to national security should actionable intelligence remain un-translated.”)
A Google search for TONS turned up a more detailed description of TONS and NVTC vendors from 2004:
The FBI will be acquiring an estimated total of $7,000,000 in language software products, as identified by Lockheed Martin, directly from the following companies. BBN, Basis, Stellent, BlueShoe Technologies and Abbysoft will provide ingest, prioritization and retrieval capabilities, including language identification, audio processing, and optical character recognition; Virage will provide video processing; Trados will provide translation memory, translation tools and collaboration; Global Sight will provide task tracking, quality control tools, and workflow management.
Another tidbit from the budget justification: In 2007, the FBI reported over 21,000 “positive encounters” with suspected terrorists. “A positive encounter is one in which an encountered individual is positively matched with an identity in the Terrorist Screening Data Base.”