Vision-Based Deep Web Data Extraction For Web Document Clustering

by M. Lavanya
Softcover - 9786204956060
79,90 €
  • Free shipping
  • Note: Print on demand. Available in 5 days.
  • Delivery time after dispatch: approx. 1-2 days
  • incl. VAT & shipping costs (within Germany)

Description

The VDEC approach comprises two phases: 1) vision-based web data extraction, and 2) web document clustering. In phase 1, the web page is segmented into chunks, and noisy and duplicate chunks are removed using three parameters: hyperlink percentage, noise score and cosine similarity. The relevant (main) chunks are then identified using three further parameters: title word relevancy, keyword frequency-based chunk selection and position features. A set of keywords is extracted from those main chunks. In phase 2, the extracted keywords are clustered using Fuzzy C-Means (FCM) clustering. The proposed vision-based deep web data extraction is implemented and tested on two different synthetic datasets. The results are compared with two existing algorithms: Vision-based Data Record Extraction (ViDE) and Mining Data Region (MDR). In these experiments, the proposed VDEC method achieved stable results, with precision of about 99.2% and 99.1% on the two datasets across the different threshold values used.
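The duplicate and noise removal step in phase 1 can be illustrated with a minimal Python sketch. It is not the book's implementation: the function names, the `(text, link_words)` chunk representation and both thresholds are assumptions chosen for illustration, hyperlink percentage is assumed to mean the share of a chunk's words that sit inside links, and the noise score is left out because the description does not define it.

```python
import math
import re
from collections import Counter


def hyperlink_percentage(link_words: int, total_words: int) -> float:
    """Share of a chunk's words that sit inside hyperlinks (assumed definition)."""
    return link_words / total_words if total_words else 0.0


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words term-frequency vectors."""
    a = Counter(re.findall(r"\w+", text_a.lower()))
    b = Counter(re.findall(r"\w+", text_b.lower()))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_chunks(chunks, max_link_share=0.5, dup_threshold=0.9):
    """Drop link-heavy chunks, then near-duplicates of chunks already kept.

    chunks: list of (text, link_words) tuples. Both thresholds are
    illustrative values, not taken from the book.
    """
    kept = []
    for text, link_words in chunks:
        total_words = len(re.findall(r"\w+", text))
        if hyperlink_percentage(link_words, total_words) > max_link_share:
            continue  # mostly navigation links: treated as noise
        if any(cosine_similarity(text, k) >= dup_threshold for k in kept):
            continue  # near-duplicate of an earlier chunk
        kept.append(text)
    return kept


chunks = [
    ("Home About Contact Login", 4),
    ("Fuzzy clustering groups web documents by extracted keywords", 0),
    ("Fuzzy clustering groups web documents by extracted keywords", 0),
    ("Vision based segmentation splits a page into chunks", 1),
]
print(filter_chunks(chunks))
```

In this toy input the navigation-only chunk and the repeated chunk are dropped, leaving two chunks for keyword extraction.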

An approach to vision-based deep web data extraction for web document clustering (VDEC)

Details

Publisher LAP LAMBERT Academic Publishing
First published June 22, 2022
Dimensions 22 cm x 15 cm x 1.1 cm
Weight 274 grams
Format Softcover
ISBN-13 9786204956060
Pages 172