In a narrower sense, the term language resource applies specifically to resources that are available in digital form, "encompassing (a) data sets (textual, multimodal/multimedia and lexical data, grammars, language models, etc.) in machine readable form, and (b) tools/technologies/services used for their processing and management". [1]
Different standards exist for the machine-readable edition of lexical resources, e.g., the Lexical Markup Framework (LMF), an ISO standard for encoding lexical resources that comprises an abstract data model and an XML serialization, [2] and OntoLex-Lemon, an RDF vocabulary for publishing lexical resources as knowledge graphs on the web, e.g., as Linguistic Linked Open Data.
Language resource management – Lexical markup framework (LMF; ISO 24613), produced by ISO/TC 37, is the ISO standard for natural language processing (NLP) and machine-readable dictionary (MRD) lexicons. [1] The scope is standardization of principles and methods relating to language resources in the contexts of multilingual communication.
The Moby Project is a collection of public-domain lexical resources created by Grady Ward. The resources were dedicated to the public domain, and are now mirrored at Project Gutenberg. As of 2007, it contains the largest free phonetic database, with 177,267 words and corresponding pronunciations.
lexical form: surface form of a particular lexical entry, e.g., its written representation; lexical sense: word sense of a particular lexical entry. Note that OntoLex-Lemon senses are lexicalized, i.e., they belong to exactly one lexical entry. For elements of meaning that can be expressed by different lexemes, use lexical concept.
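The relationship between entries, forms, senses and concepts described above can be sketched with plain Python data classes. This is only an illustration under stated assumptions: the class and attribute names below are hypothetical, are not the OntoLex-Lemon RDF vocabulary terms, and do not come from any library. The sketch shows one thing: each sense points back to exactly one lexical entry, while a lexical concept can be shared by senses of different entries.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LexicalConcept:
    """Element of meaning that several different lexemes can express."""
    label: str


@dataclass
class LexicalSense:
    """Word sense that belongs to exactly one lexical entry."""
    gloss: str
    # Back-reference to the owning entry; excluded from repr/eq to avoid cycles.
    entry: "LexicalEntry" = field(repr=False, compare=False)
    concept: Optional[LexicalConcept] = None


@dataclass
class LexicalEntry:
    """Lexical entry with its surface forms and its own senses."""
    lemma: str
    forms: List[str] = field(default_factory=list)  # written representations
    senses: List[LexicalSense] = field(default_factory=list)

    def add_sense(self, gloss: str, concept: Optional[LexicalConcept] = None) -> LexicalSense:
        # A sense is always created through its entry, so it has one owner.
        sense = LexicalSense(gloss=gloss, entry=self, concept=concept)
        self.senses.append(sense)
        return sense


# Toy data (made up for illustration): two entries, two senses, one shared concept.
feline = LexicalConcept("domestic feline")
cat = LexicalEntry("cat", forms=["cat", "cats"])
kitty = LexicalEntry("kitty", forms=["kitty", "kitties"])
cat_sense = cat.add_sense("small domesticated carnivore", feline)
kitty_sense = kitty.add_sense("informal word for a cat", feline)
```

Here the two senses are distinct objects owned by different entries, yet both point to the same concept object, which mirrors the advice to use a lexical concept for meaning that different lexemes can express.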
UBY also inspired other projects on automatic construction of lexical semantic resources. [12] Furthermore, lemonUby was used to improve machine translation results, in particular for finding translations of unknown words.
Simple word-sense induction algorithms boost Web search result clustering considerably and improve the diversification of search results returned by search engines such as Yahoo! [13] Word-sense induction has been applied to enrich lexical resources such as WordNet. [14]
Many techniques have been researched, including dictionary-based methods that use the knowledge encoded in lexical resources, supervised machine learning methods in which a classifier is trained for each distinct word on a corpus of manually sense-annotated examples, and completely unsupervised methods that cluster occurrences of words, thereby ...
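As a rough illustration of the dictionary-based family mentioned above, the following sketch picks the sense whose dictionary gloss shares the most words with the context, in the spirit of the simplified Lesk approach. The snippet itself does not name a specific algorithm, so this choice is an assumption, and the sense inventory, sense identifiers and stopword list are toy data invented for the example, not taken from any real lexical resource.

```python
import re

# Hypothetical toy sense inventory: sense id -> gloss (made up for illustration).
SENSES = {
    "bank": {
        "bank_financial": "an institution that accepts deposits of money and makes loans",
        "bank_river": "sloping land beside a river or other body of water",
    }
}

# Small ad hoc stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "and", "or", "that", "to",
             "in", "on", "by", "at", "i", "my", "we", "other"}


def tokens(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def simplified_lesk(word, context):
    """Return the sense id whose gloss overlaps most with the context words.

    Ties are resolved in favour of the sense listed first in SENSES.
    """
    context_words = tokens(context)
    best_sense, best_overlap = None, -1
    for sense_id, gloss in SENSES[word].items():
        overlap = len(context_words & tokens(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


financial = simplified_lesk("bank", "I went to the bank to deposit money")
river = simplified_lesk("bank", "We sat on the bank of the river near the water")
```

With this toy inventory the first context matches the financial gloss through "money", and the second matches the river gloss through "river" and "water". Exact word matching is a deliberate simplification; "deposit" does not match "deposits" here, which is the kind of gap that stemming or richer lexical resources are meant to close.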