MIKRO KNJIGA
    since 1984
    Representations and Metrics in High-Dimensional Data Mining
    Author: Miloš Radovanović
    Pages: 243
    The current "information age" presents numerous challenges in everyday life and work. One of the more serious is the sheer volume of information that must be processed and brought under control, a consequence of the rapid development of technology and ubiquitous computerization. On the one hand, computers help us cope with the information we find relevant (for example, through e-mail, social networks, and search engines). On the other hand, large volumes of information accumulate on computers in "raw" form, that is, in a form that is difficult to analyze, to use for purposes other than those originally intended, or to learn from. The computer-science discipline of data mining enables the extraction of interesting patterns and useful knowledge from large volumes of raw data. This book deals with the problems that arise from the large numbers of attributes in modern databases, often referred to as "the curse of dimensionality," and with the influence of this curse on different techniques and aspects of data mining.


    CONTENTS

    PREFACE, 5

    I PRELIMINARIES, 7
    1 INTRODUCTION, 9
    1.1 Book Outline, 11
    1.2 Contributions, 12

    2 MACHINE LEARNING, DATA MINING, AND INFORMATION RETRIEVAL, 14
    2.1 Data Representation, 17
    2.2 Distance and Similarity Measures, 23
    2.3 Classification, 27
    2.4 Semi-Supervised Learning, 45
    2.5 Clustering, 47
    2.6 Outlier Detection, 55
    2.7 Information Retrieval, 58
    2.8 Dimensionality Reduction, 62
    2.9 Summary, 72

    II METRICS, 73
    3 THE CONCENTRATION PHENOMENON, 75
    3.1 Concentration of Distances, 75
    3.2 Concentration of Cosine Similarity, 78
    3.3 Proofs of Theorems 7 and 8, 81

    4 THE HUBNESS PHENOMENON, 85
    4.1 Related Work, 86
    4.2 Observing Hubness, 87
    4.3 Explaining Hubness, 90
    4.4 Proof of Theorem 9, 95
    4.5 Discussion, 104

    5 HUBNESS AND MACHINE LEARNING, 112
    5.1 Related Work, 112
    5.2 Observing Hubness in Real Data, 113
    5.3 Explaining Hubness in Real Data, 116
    5.4 Hubs and Outliers, 117
    5.5 Hubness and Dimensionality Reduction, 119
    5.6 Impact of Hubness on Machine Learning, 120
    5.7 Summary and Future Work, 136

    6 HUBNESS AND TIME SERIES, 138
    6.1 Related Work, 140
    6.2 Observing Hubness in Time Series, 141
    6.3 Explaining Hubness in Time Series, 141
    6.4 Hubness and Dimensionality Reduction, 144
    6.5 Impact of Hubness on Time-Series Classification, 146
    6.6 Experimental Evaluation, 151
    6.7 Summary and Future Work, 155

    7 HUBNESS AND INFORMATION RETRIEVAL, 156
    7.1 Observing Hubness in Text Data, 157
    7.2 Explaining Hubness in Text Data, 160
    7.3 Hubness and Dimensionality Reduction, 166
    7.4 Impact of Hubness on Information Retrieval, 167
    7.5 Summary and Future Work, 171

    III DOCUMENT REPRESENTATION AND FEATURE SELECTION, 175
    8 TERM WEIGHTING FOR TEXT CATEGORIZATION, 177
    8.1 Related Work, 178
    8.2 The Experimental Setup, 178
    8.3 Results, 181
    8.4 Summary and Future Work, 191

    9 TERM WEIGHTING AND FEATURE SELECTION, 193
    9.1 The Experimental Setup, 194
    9.2 Results, 196
    9.3 Summary and Future Work, 204
    9.4 A Note on Hubness, Feature Selection, and Generation, 206

    10 CONCLUSION, 209
    A TERM WEIGHTING IN THE BOW REPRESENTATION, 213
    A1 Term Weighting Without Stemming, 213
    A2 Term Weighting With Stemming, 214

    BIBLIOGRAPHY, 217
    ABOUT THE AUTHOR, 235
    O AUTORU (About the Author, in Serbian), 236
    SAŽETAK (Summary, in Serbian), 237


    Book details
    Title: Representations and Metrics in High-Dimensional Data Mining
    Publisher: Izdavačka knjižarnica Zorana Stojanovića
    Pages: 243 (black and white)
    Binding: paperback
    Script: Latin
    Format: 22.5 x 14 cm
    Year of publication: 2011
    ISBN: 978-86-7543-231-9
    Price: 1,200 RSD
    Price for international orders: 12.00 EUR