Synopses & Reviews
Focuses on a few of the important clustering algorithms in the context of information retrieval.
Synopsis
As digital libraries and the World Wide Web continue to grow exponentially, the ability to find useful information increasingly depends on the indexing infrastructure or search engine. There is a growing need for a more automated system of partitioning data sets into groups, or clusters. Clustering techniques can be used in these information retrieval applications, as well as to discover natural groups in data sets without any background knowledge of the characteristics of the data. This book focuses on a few of the most important clustering algorithms, both classical algorithms and recent research, for computer scientists working in data-intensive areas.
About the Author
Jacob Kogan is an Associate Professor in the Department of Mathematics and Statistics at the University of Maryland, Baltimore County (UMBC). Dr. Kogan received his PhD in Mathematics from the Weizmann Institute of Science and has held teaching and research positions at the University of Toronto and Purdue University. His research interests include Text and Data Mining, Optimization, Calculus of Variations, Optimal Control Theory, and Robust Stability of Control Systems. Dr. Kogan is the author of Bifurcations of Extremals in Optimal Control and of Robust Stability and Convexity: An Introduction. Since 2001, he has also been affiliated with the Department of Computer Science and Electrical Engineering at UMBC. Dr. Kogan is a recipient of a 2004-2005 Fulbright Fellowship to Israel. Together with Charles Nicholas of UMBC and Marc Teboulle of Tel-Aviv University, he is co-editor of the volume Grouping Multidimensional Data: Recent Advances in Clustering.
Table of Contents
1. Introduction and motivation; 2. Quadratic k-means algorithm; 3. BIRCH; 4. Spherical k-means algorithm; 5. Linear algebra techniques; 6. Information-theoretic clustering; 7. Clustering with optimization techniques; 8. k-means clustering with divergence; 9. Assessment of clustering results; 10. Appendix: Optimization and Linear Algebra Background; 11. Solutions to selected problems.