Search results

  1. BLEU - Wikipedia

    en.wikipedia.org/wiki/BLEU

    BLEU (bilingual evaluation understudy) is an algorithm for evaluating the quality of text which has been machine-translated from one natural language to another. Quality is considered to be the correspondence between a machine's output and that of a human: "the closer a machine translation is to a professional human translation, the better it is" – this is the central idea behind BLEU.
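
    As a rough illustration of that idea, here is a minimal sketch of BLEU-style scoring for a single reference: clipped n-gram precision combined with a brevity penalty. The function names, the single-reference simplification, and the small epsilon guard are my own choices; real implementations add smoothing, tokenization, and multi-reference clipping.

```python
# Minimal sketch of BLEU-style scoring: clipped n-gram precision up to 4-grams,
# combined by geometric mean and scaled by a brevity penalty. Illustration only;
# production implementations (e.g. NLTK, sacrebleu) add smoothing, tokenization,
# and support for multiple references.
import math
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu_sketch(candidate, reference, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        # Clipped counts: a candidate n-gram is only credited up to the number
        # of times it appears in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = max(sum(cand.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # epsilon guard against log(0)
    # Brevity penalty: discourage candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / max(c, 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(bleu_sketch("the cat sat on the mat".split(),
                  "the cat sat on a mat".split()))  # ~0.54 for this pair
```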

  2. Evaluation of machine translation - Wikipedia

    en.wikipedia.org/wiki/Evaluation_of_machine...

    The METEOR metric is designed to address some of the deficiencies inherent in the BLEU metric. The metric is based on the weighted harmonic mean of unigram precision and unigram recall. The metric was designed after research by Lavie (2004) into the significance of recall in evaluation metrics.
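
    That weighted harmonic mean can be written out directly. The 9:1 weighting toward recall below follows the original Banerjee and Lavie formulation of METEOR's Fmean; the function name and the example numbers are illustrative.

```python
# METEOR-style Fmean: a harmonic mean of unigram precision and recall in which
# recall is weighted nine times more heavily than precision (per the original
# METEOR formulation). Example values are illustrative only.
def meteor_fmean(precision: float, recall: float) -> float:
    if precision == 0.0 or recall == 0.0:
        return 0.0
    return (10.0 * precision * recall) / (recall + 9.0 * precision)

print(meteor_fmean(0.8, 0.6))  # ~0.615, pulled toward the lower recall value
```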

  3. NIST (metric) - Wikipedia

    en.wikipedia.org/wiki/NIST_(metric)

    It is based on the BLEU metric, but with some alterations. Where BLEU simply calculates n-gram precision, giving each n-gram equal weight, NIST also calculates how informative a particular n-gram is: when a correct n-gram is found, the rarer that n-gram is, the more weight it is given. [1]
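
    The informativeness weighting can be sketched as follows. As I read the NIST definition, an n-gram's weight is the log ratio of the count of its (n-1)-gram prefix to its own count in the reference data, so rarer continuations score higher; the counts below are toy placeholders, not real corpus statistics.

```python
# Sketch of NIST-style information weights: rarer n-grams (relative to their
# (n-1)-gram prefix) receive higher weight. Counts here are toy placeholders
# standing in for frequencies gathered from the reference corpus.
import math
from collections import Counter

def info_weight(ngram, ngram_counts, prefix_counts, total_words):
    """log2(count(prefix) / count(ngram)); for unigrams the 'prefix' count
    is the total number of words in the reference data."""
    prefix = ngram[:-1]
    numer = prefix_counts[prefix] if prefix else total_words
    return math.log2(numer / ngram_counts[ngram])

# Toy reference statistics (assumed, for illustration only).
ngram_counts = Counter({("the",): 500, ("the", "translation"): 3})
prefix_counts = Counter({("the",): 500})
total_words = 10_000

print(info_weight(("the",), ngram_counts, prefix_counts, total_words))                 # frequent unigram -> lower weight
print(info_weight(("the", "translation"), ngram_counts, prefix_counts, total_words))   # rarer bigram -> higher weight
```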

  4. LEPOR - Wikipedia

    en.wikipedia.org/wiki/LEPOR

    Since IBM proposed and implemented BLEU [1] as an automatic metric for machine translation (MT) evaluation, [2] many other methods have been proposed to revise or improve it, such as TER and METEOR. [3] However, the traditional automatic evaluation metrics have some problems: some metrics perform well on certain ...

  5. METEOR - Wikipedia

    en.wikipedia.org/wiki/METEOR

    METEOR (Metric for Evaluation of Translation with Explicit ORdering) ... The metric was designed to fix some of the problems found in the more popular BLEU metric ...

  6. ROUGE (metric) - Wikipedia

    en.wikipedia.org/wiki/ROUGE_(metric)

    ROUGE, or Recall-Oriented Understudy for Gisting Evaluation, [1] is a set of metrics and a software package used for evaluating automatic summarization and machine translation software in natural language processing. The metrics compare an automatically produced summary or translation against a reference or a set of references (human-produced ...
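
    To make the recall orientation concrete, a ROUGE-N-style score can be sketched as the fraction of the reference's n-grams recovered by the system output. The single-reference simplification and the function name are mine; the real ROUGE package also reports precision and F-scores and handles multiple references.

```python
# Sketch of a ROUGE-N-style recall score: what fraction of the reference's
# n-grams are recovered by the system output? Simplified to one reference;
# real ROUGE tooling handles multiple references, stemming, and F-scores.
from collections import Counter

def rouge_n(candidate, reference, n=2):
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / max(sum(ref.values()), 1)

cand = "the cat sat on the mat".split()
ref = "the cat was sitting on the mat".split()
print(rouge_n(cand, ref, n=2))  # 0.5: three of the six reference bigrams are recovered
```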

  7. Paraphrasing (computational linguistics) - Wikipedia

    en.wikipedia.org/wiki/Paraphrasing...

    Metrics specifically designed to evaluate paraphrase generation include paraphrase in n-gram change (PINC) [21] and paraphrase evaluation metric (PEM), [22] along with the aforementioned ParaMetric. PINC is designed to be used with BLEU and to help cover its inadequacies.
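
    Roughly speaking, PINC rewards n-gram novelty with respect to the source sentence (the opposite direction from BLEU's overlap with a reference), which is why the two are used together. The sketch below follows that reading and is a simplification, not the reference implementation.

```python
# Sketch of a PINC-style score: the average fraction of the paraphrase's
# n-grams that do NOT occur in the source sentence, so higher means more
# lexical change. A simplification following the usual description of PINC.
def pinc(source, paraphrase, max_n=4):
    scores = []
    for n in range(1, max_n + 1):
        src = {tuple(source[i:i + n]) for i in range(len(source) - n + 1)}
        par = {tuple(paraphrase[i:i + n]) for i in range(len(paraphrase) - n + 1)}
        if par:
            scores.append(1 - len(par & src) / len(par))
    return sum(scores) / len(scores) if scores else 0.0

src = "the movie was really good".split()
par = "the film was excellent".split()
print(pinc(src, par))  # 0.875: most n-grams changed relative to the source
```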

  8. Europarl Corpus - Wikipedia

    en.wikipedia.org/wiki/Europarl_corpus

    Koehn uses the BLEU metric by Papineni et al. (2002) for this, which counts the n-gram coincidences between the two compared versions (SMT output and corpus data) and calculates a score on this basis. [4] The more similar the two versions are, the higher the score, and therefore the higher the estimated quality of the translation. [1]
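
    A corpus-level comparison in that spirit can be run with an off-the-shelf BLEU implementation. The snippet below uses NLTK's corpus_bleu purely as a stand-in for the scoring step; the sentences are invented examples, not Europarl data, and the smoothing choice is mine.

```python
# Corpus-level BLEU in the same spirit: compare system output against
# reference translations and get a single score (higher = more n-gram
# overlap). Sentences below are invented examples, not Europarl data.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

hypotheses = [
    "the parliament approved the proposal".split(),
    "members voted on the new budget".split(),
]
references = [  # one or more reference translations per hypothesis
    ["the parliament has approved the proposal".split()],
    ["the members voted on the new budget".split()],
]

score = corpus_bleu(references, hypotheses,
                    smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```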