Automatic construction of labeled clusters of named entities for IR

by Henock Tilahun Teffera

Softcover - 9783844334722

49,00 €

Free shipping


Note: Print on demand. Available in 5 days.

Delivery time after dispatch: approx. 1-2 days
incl. VAT and shipping costs (within Germany)


Description

In this study we harvest labeled clusters of semantically similar named entities, which can serve as a first step toward web document clustering. We first collect ~44,000 named entities from a thesaurus constructed by Dekang Lin using a word similarity measure based on distributional patterns. Using these similarity scores and the CLUTO clustering software, we create 2,000 clusters of semantically similar named entities. We then collect ~305,500 label-instance pairs from the 2007 English Wikipedia dump and implement a labeling algorithm presented by Benjamin Van Durme and M. Paşca (2008) to assign labels to the clusters. This automatic labeling task assigns a label that describes the majority of the named entities in 924 of the clusters, which is 46.2% of the total. Finally, we evaluate both the clustering and the labeling tasks on 86 randomly selected clusters, based on the subjective judgment of two native English-speaking evaluators. According to these evaluators, the clustering task has a purity score of 0.7, and 55% of the labels are acceptable with varying degrees of accuracy.
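As a rough illustration of the two quantities mentioned above, the following Python sketch assigns a label to a cluster by simple majority vote and computes a purity score. It is not the Van Durme and Paşca (2008) algorithm, which is more involved, and it is not the thesis code: the function names, the `min_share` threshold, and the toy data are all assumptions made for this example. In the thesis, purity was judged by human evaluators rather than computed against a gold standard as done here.

```python
from collections import Counter

def majority_label(cluster, labels_of, min_share=0.5):
    """Return the label shared by more than min_share of the cluster, else None.

    Simplified majority vote (an assumption for illustration only).
    """
    counts = Counter()
    for entity in cluster:
        # Count each label at most once per entity.
        for label in set(labels_of.get(entity, ())):
            counts[label] += 1
    if not counts:
        return None
    label, hits = counts.most_common(1)[0]
    return label if hits / len(cluster) > min_share else None

def purity(clusters, gold_class):
    """Fraction of entities that belong to the dominant gold class of their cluster."""
    total = sum(len(c) for c in clusters)
    dominant = sum(max(Counter(gold_class[e] for e in c).values()) for c in clusters)
    return dominant / total

# Hypothetical toy data: label-instance pairs and gold classes.
labels_of = {
    "Paris": ["city", "capital"],
    "Berlin": ["city", "capital"],
    "Rome": ["city"],
    "Nile": ["river"],
    "Amazon": ["river", "company"],
    "Danube": ["river"],
}
gold_class = {
    "Paris": "city", "Berlin": "city", "Rome": "city",
    "Nile": "river", "Amazon": "river", "Danube": "river",
}
clusters = [["Paris", "Berlin", "Rome", "Nile"], ["Amazon", "Danube"]]

for c in clusters:
    print(c, "->", majority_label(c, labels_of))
print("purity:", round(purity(clusters, gold_class), 3))
```

On this toy data the first cluster is labeled "city" (three of four members carry that label) and the second "river"; a cluster where no label covers more than half of the members stays unlabeled, which loosely mirrors the thesis result that only some clusters receive a majority-describing label.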

Thesis for European Master's in Language and Communication Technology

Details

Publisher	LAP LAMBERT Academic Publishing
First published	May 23, 2011
Dimensions	22 cm x 15 cm x 0.4 cm
Weight	113 g
Format	Softcover
ISBN-13	9783844334722
Pages	64

Keywords

Computer programming and software development, computer science
