{"product_id":"unsupervised-information-extraction-by-text-segmentation-von-eli-cortez-altigran-s-da-silva","title":"Unsupervised Information Extraction by Text Segmentation","description":"\n                                \n                \u003cp\u003e\n                                        A new unsupervised approach to the problem of Information Extraction by Text Segmentation (IETS) is proposed, implemented and evaluated herein. The authors’ approach relies on information available on pre-existing data to learn how to associate segments in the input string with attributes of a given domain relying on a very effective set of content-based features. The effectiveness of the content-based features is also exploited to directly learn from test data structure-based features, with no previous human-driven training, a feature unique to the presented approach. Based on the approach, a number of results are produced to address the IETS problem in an unsupervised fashion. In particular, the authors develop, implement and evaluate distinct IETS methods, namely \n                    \n                    \u003ci\u003eONDUX\u003c\/i\u003e\n                                        , \n                    \n                    \u003ci\u003eJUDIE\u003c\/i\u003e\n                                         and \n                    \n                    \u003ci\u003eiForm\u003c\/i\u003e\n                                        .\n                \n                \u003c\/p\u003e\n                                \n                \u003cp\u003e\n                                        \n                    \u003ci\u003eONDUX\u003c\/i\u003e\n                                         (On Demand Unsupervised Information Extraction) is an unsupervised probabilistic approach for IETS that relies on content-based features to bootstrap the learning of structure-based features. 
\n                    \n                    \u003ci\u003eJUDIE\u003c\/i\u003e\n                                         (Joint Unsupervised Structure Discovery and Information Extraction) aims at automatically extracting several semi-structured data records that appear as continuous text with no explicit delimiters between them. In comparison with other IETS methods, including \n                    \n                    \u003ci\u003eONDUX\u003c\/i\u003e\n                                        , \n                    \n                    \u003ci\u003eJUDIE\u003c\/i\u003e\n                                         faces a considerably harder task, that is, extracting information while simultaneously uncovering the underlying structure of the implicit records containing it.\n                    \n                    \u003ci\u003eiForm\u003c\/i\u003e\n                                         applies the authors’ approach to the task of Web form filling. It aims at extracting segments from a data-rich text given as input and associating these segments with fields from a target Web form.\n                \n                \u003c\/p\u003e\n                                \n                \u003cp\u003eAll of these methods were evaluated on different experimental datasets, which were used to perform a large set of experiments to validate the presented approach and methods. These experiments indicate that the proposed approach yields high-quality results when compared to state-of-the-art approaches and that it is able to properly support IETS methods in a number of real applications. 
The findings will prove valuable to practitioners in helping them to understand the current state-of-the-art in unsupervised information extraction techniques, as well as to graduate and undergraduate students of web data management.\u003c\/p\u003e\n                            \n            \u003cdiv class=\"aw-variant-hidden-subtitle-div\" id=\"aw-variant-subtitle-9783319025964\"\u003e\u003ch3\u003e\u003c\/h3\u003e\u003c\/div\u003e","brand":"Libri","offers":[{"title":"Softcover - 9783319025964","offer_id":39426180350045,"sku":"9783319025964","price":53.49,"currency_code":"EUR","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0940\/0622\/files\/96656227-0d95-43b8-b9bf-cbe8e1db0fd2.jpg?v=1772258254","url":"https:\/\/shop.autorenwelt.de\/en\/products\/unsupervised-information-extraction-by-text-segmentation-von-eli-cortez-altigran-s-da-silva","provider":"Autorenwelt Shop","version":"1.0","type":"link"}