Search results
Results from the WOW.Com Content Network
Language resource management – Lexical markup framework (LMF; ISO 24613), produced by ISO/TC 37, is the ISO standard for natural language processing (NLP) and machine-readable dictionary (MRD) lexicons. [1] The scope is standardization of principles and methods relating to language resources in the contexts of multilingual communication.
Different standards for the machine-readable edition of lexical resources exist, e.g., Lexical Markup Framework (LMF) an ISO standard for encoding lexical resources, comprising an abstract data model and an XML serialization, [2] and OntoLex-Lemon, an RDF vocabulary for publishing lexical resources as knowledge graphs on the web, e.g., as Linguistic Linked Open Data.
OntoLex-Lemon is widely used for lexical resources in the context of Linguistic Linked Open Data.Selected applications include OASIS Lexicographic Infrastructure Data Model and API (LEXIDMA), a framework for internationally interoperable lexicographic work [14]
The Moby Project is a collection of public-domain lexical resources created by Grady Ward. The resources were dedicated to the public domain, and are now mirrored at Project Gutenberg . As of 2007 [update] , it contains the largest free phonetic database, with 177,267 words and corresponding pronunciations.
To facilitate the interoperability between lexical resources, linguistic annotations and annotation tools and for the systematic handling of linguistic categories across different theoretical frameworks, a number of inventories of linguistic categories have been developed and are being used, with examples as given below.
UBY-LMF [3] [4] is a format for standardizing lexical resources for Natural Language Processing (NLP). [5] UBY-LMF conforms to the ISO standard for lexicons: LMF, designed within the ISO-TC37, and constitutes a so-called serialization of this abstract standard. [6]
The integration is done using an automatic mapping and by filling in lexical gaps in resource-poor languages by using statistical machine translation. The result is an encyclopedic dictionary that provides concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations .
In a narrower sense, language resource is specifically applied to resources that are available in digital form, and then, "encompassing (a) data sets (textual, multimodal/multimedia and lexical data, grammars, language models, etc.) in machine readable form, and (b) tools/technologies/services used for their processing and management". [1]