Multilingual Access (WP1)
The goal of this work package is to configure a standard information retrieval system (search engine) so that users can type a few words in their own language and retrieve all relevant books and texts in every available language. The system must be able to exploit all the information contained in the catalogues and texts. It must not confuse the user by proposing irrelevant translations. It must also be able to detect proper nouns, so that translation is suspended when a proper noun is ambiguous with a common noun.
From the infrastructure point of view, it must allow easy integration with several commercial OPACs (online public access catalogues).
The objectives will be achieved by implementing the following actions:
- Query enrichment via thesauri.
- Query enrichment via corpus-based expansion.
- Monolingual disambiguation via part-of-speech tagging.
- Cross-lingual translation disambiguation via catalogue classification.
- Proper noun identification.
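
As an illustration only, two of the actions above (thesaurus-based query enrichment and suspending translation for ambiguous proper nouns) could be sketched as follows. Everything in this sketch is an assumption made for the example: the toy `THESAURUS` and `BILINGUAL` tables, the function name `expand_query`, and the capitalisation heuristic standing in for a real proper noun detector. It does not describe the components actually chosen for the work package.

```python
# Hypothetical toy resources. A real system would load a full thesaurus
# and bilingual dictionary instead of these hand-written tables.
THESAURUS = {"book": ["volume"]}
BILINGUAL = {
    "book": ["libro"],
    "volume": ["volume", "tomo"],
    "rose": ["rosa"],
}


def expand_query(query: str) -> list[str]:
    """Return the enriched, cross-lingual term list for a user query."""
    terms = []
    for token in query.split():
        # Naive stand-in for proper noun identification (an assumption):
        # a capitalised token whose lowercase form is also a dictionary
        # entry is ambiguous with a common noun, so it is kept as typed
        # and not translated.
        if token[0].isupper() and token.lower() in BILINGUAL:
            terms.append(token)
            continue

        word = token.lower()
        # Query enrichment: the word itself plus its thesaurus entries.
        candidates = [word] + THESAURUS.get(word, [])
        for candidate in candidates:
            terms.append(candidate)
            # Translation: add every known target-language equivalent.
            terms.extend(BILINGUAL.get(candidate, []))

    # Remove duplicates while preserving the original order.
    return list(dict.fromkeys(terms))


if __name__ == "__main__":
    print(expand_query("Rose book"))
```

In this sketch the query "Rose book" keeps "Rose" untranslated, while "book" is enriched and translated. Part-of-speech tagging and catalogue classification would then be used to prune the candidate translations, which the sketch leaves out.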
Work package leader:
Deliverables:
- D 1.1. Configuration of CLIR (first release): M3 [confidential]
- D 1.2. Definition of the structure and programmatic interfaces for components access: M6 [public report]
- D 1.3. Integration of CLIR with enrichment/disambiguation/translation/identification modules: M9 [confidential]
- D 1.4. Fully integrated CLIR system: M12 [confidential]



