Synopses & Reviews
Advances in technology are making massive data sets common in many scientific disciplines, such as astronomy, medical imaging, bioinformatics, combinatorial chemistry, remote sensing, and physics. To find useful information in these data sets, scientists and engineers are turning to data mining techniques. This book is a collection of papers based on the first two in a series of workshops on mining scientific datasets. It illustrates the diversity of problems and application areas that can benefit from data mining, as well as the issues and challenges that differentiate scientific data mining from its commercial counterpart. While the focus of the book is on mining scientific data, the work is of broader interest, as many of the techniques can be applied equally well to data arising in business and web applications.
Audience: This work would be an excellent text for students and researchers who are familiar with the basic principles of data mining and want to learn more about applying data mining to their problems in science or engineering.
Table of Contents
Foreword. List of Contributors. List of Reviewers. Preface.
1. On Mining Scientific Datasets; C. Kamath.
2. Understanding High Dimensional and Large Data Sets: Some Mathematical Challenges and Opportunities; J. Chandra.
3. Data Mining at the Interface of Computer Science and Statistics; P. Smyth.
4. Mining Large Image Collections; M.C. Burl.
5. Mining Astronomical Databases; R.M. Humphreys, et al.
6. Searching for Bent-Double Galaxies in the FIRST Survey; C. Kamath, et al.
7. A Dataspace Infrastructure for Astronomical Data; R. Grossman, et al.
8. Data Mining Applications in Bioinformatics; N. Ramakrishnan, A.Y. Grama.
9. Mining Residue Contacts in Proteins; M.J. Zaki, C. Bystroff.
10. KDD Services at the Goddard Earth Sciences Distributed Archive Center; C. Lynnes, R. Mack.
11. Data Mining in Integrated Data Access and Data Analysis Systems; R. Yang, et al.
12. Spatial Data Mining for Classification, Visualisation and Interpretation with ARTMAP Neural Network; W. Liu, et al.
13. Real Time Feature Extraction for the Analysis of Turbulent Flows; I. Marusic, et al.
14. Data Mining for Turbulent Flows; E.-H. Han, et al.
15. EVITA - Efficient Visualization and Interrogation of Tera-Scale Data; R. Machiraju, et al.
16. Towards Ubiquitous Mining of Distributed Data; H. Kargupta, et al.
17. Decomposable Algorithms for Data Mining; R. Bhatnagar.
18. HDDI®: Hierarchical Distributed Dynamic Indexing; W.M. Pottenger, et al.
19. Parallel Algorithms for Clustering High-Dimensional Large-Scale Datasets; H. Nagesh, et al.
20. Efficient Clustering of Very Large Document Collections; I.S. Dhillon, et al.
21. A Scalable Hierarchical Algorithm for Unsupervised Clustering; D. Boley.
22. High-Performance Singular Value Decomposition; D.B. Skillicorn, X. Yang.
23. Mining High-Dimensional Scientific Data Sets Using Singular Value Decomposition; E. Maltseva, et al.
24. Spatial Dependence in Data Mining; J.P. LeSage, R.K. Pace.
25. SPARC: Spatial Association Rule-Based Classification; J. Han, et al.
26. What's Spatial About Spatial Data Mining: Three Case Studies; S. Shekhar, et al.
27. Predicting Failures in Event Sequences; M.J. Zaki, et al.
28. Efficient Algorithms for Mining Long Patterns in Scientific Data Sets; R.C. Agarwal, C.C. Aggarwal.
29. Probabilistic Estimation in Data Mining; E.P.D. Pednault, C. Apte.
30. Classification Using Association Rules: Weaknesses and Enhancements; B. Liu, et al.