The book is currently out of stock

Duplicate detection in XML data

Authors

Melanie Weis

More about the book

Duplicate detection is the task of identifying multiple representations of the same real-world object. Fast and correct duplicate detection on large amounts of data is essential in many applications. It helps avoid errors such as sending multiple copies of a catalog to the same customer or making critical business decisions based on the wrong number of products sold. Automatic duplicate detection is far from trivial, and the two major issues that detection methods need to address are the quality of the result and the runtime. So far, duplicate detection research has concentrated on detecting duplicates in a single table of a relational database. However, with the growth of the World Wide Web and the rise of XML as the de facto standard for data publishing and data exchange on the Web, we face the problem of identifying duplicates in data structures that are more complex than a single relational table. This book, which consists of four parts, describes duplicate detection solutions for XML data that obtain highly satisfactory results in terms of both quality and runtime. In the first part, the XML duplicate detection problem is formalized. In the second part, various solutions to the quality problem are presented and evaluated. In the third part, different algorithms for fast duplicate detection are described and evaluated. Finally, two systems that use duplicate detection are presented.

Parameters

ISBN: 9783865532633
Publisher: WiKu

Book variant

2008, paperback

Purchase

This book is currently out of stock.