Using Roget's Thesaurus to Determine the Similarity of Texts

by Jeremy Ellman
Softcover - 9783838338408
€79.00
  • Free shipping
  • Note: Print on demand. Available in 5 days.
  • Delivery time after dispatch: approx. 1-2 days
  • Incl. VAT and shipping costs (within Germany)

Description

This thesis addresses the problem of extracting a representation of a text's meaning from its content. The solution investigated is based on the use of Roget's thesaurus as an external knowledge source and can be used to analyse texts of any length or complexity. The resulting document representation can then be compared to others, producing a new method for text similarity assessment. All coherent texts contain embedded sequences of words that are related in meaning. These sequences can be detected by identifying simple relationships between the relevant thesaural entries in which the words are found. The identification of initial sequences drives the addition of further related words into conceptually related "lexical chains". Every coherent text contains many lexical chains of different lengths and strengths. These may be used to represent the broad subject matter of a text. By identifying the key concept of each chain and relating this to its presence, we may produce an attribute-value vector of concepts and their strengths. This may then be used to identify other texts as closer or further away in meaning.
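The pipeline described above (thesaurus lookup, lexical chains, a concept-strength vector, then comparison) can be illustrated with a minimal sketch. It is not the thesis's implementation: `TOY_THESAURUS` is a hypothetical stand-in for Roget's categories, chain strength is assumed to be chain length divided by text length, and cosine similarity is an assumed choice of comparison measure that the description does not specify.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy thesaurus: word -> category label.
# A stand-in for Roget's thesaurus entries, invented for illustration.
TOY_THESAURUS = {
    "ship": "travel", "voyage": "travel", "harbour": "travel",
    "bank": "money", "loan": "money", "interest": "money",
    "doctor": "health", "nurse": "health",
}

def lexical_chains(words):
    """Group words that share a thesaurus category into one chain per category."""
    chains = defaultdict(list)
    for word in words:
        category = TOY_THESAURUS.get(word.lower())
        if category is not None:
            chains[category].append(word.lower())
    return dict(chains)

def concept_vector(words):
    """Map each chain's concept to a strength.

    Assumption: strength = chain length / number of words in the text.
    """
    total = len(words) or 1
    return {concept: len(chain) / total
            for concept, chain in lexical_chains(words).items()}

def cosine_similarity(a, b):
    """Cosine similarity of two sparse concept vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

text_a = "the ship left the harbour on a long voyage".split()
text_b = "the voyage ended when the ship reached the harbour".split()
text_c = "the bank raised interest on the loan".split()

print(cosine_similarity(concept_vector(text_a), concept_vector(text_b)))
print(cosine_similarity(concept_vector(text_a), concept_vector(text_c)))
```

In this toy setup the first two texts share a single "travel" chain and score close to 1, while the third text has only a "money" chain and scores 0 against the first. A fuller version would presumably use the richer relationships between thesaural entries that the description mentions, not just exact category matches.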

A Thesis in Computational Linguistics

Details

Publisher LAP LAMBERT Academic Publishing
First published January 14, 2010
Dimensions 22 cm x 15 cm x 1.4 cm
Weight 358 g
Format Softcover
ISBN-13 9783838338408
Pages 228