Database Mining References
KDD, Data Mining - overview
- Data Mining Techniques,
M. Berry and G. Linoff,
John Wiley, 1997
- a readable, if manager-oriented, overview of data mining
- or their second book: Mastering Data Mining: The Art and Science of Customer Relationship Management, Wiley and Sons, 1999
- KDNuggets: the best data mining site; has more pointers
- Data Preparation for Data Mining,
Morgan Kaufmann, 1999.
- Data Mining Solutions,
C. Westphal and T. Blaxton,
John Wiley, 1998
- E. Tufte,
The Visual Display of Quantitative Information,
Envisioning Information and
his other books (Graphics Press).
- These are wonderful books about how to present data graphically.
- Visual Revelations,
Clustering and Collaborative Filtering
- Recommender Systems
- Pointers to many companies and classic papers
Cluster Analysis, 3rd Edition,
Halsted Press, 1993.
- A very readable short overview of clustering methods.
- "Locally Weighted Learning",
C. G. Atkeson, S. A. Schaal and A. W. Moore,
AI Review, Volume 11, Pages 11-73 (Kluwer Publishers), 1997
- a detailed overview of K-nearest neighbor and related methods
- k-means clustering code
- with a cumbersome input format, but it runs well
- standard packages like R, Matlab, and all data mining software have many more options
Decision trees, CART and MARS
- Classification and regression trees,
Leo Breiman ... et al.,
Wadsworth International Group, 1984.
- The original CART book; a bit dated, but still a classic
- C4.5: Programs for Machine Learning,
- A modern presentation of decision tree methods. Very readable and
comes with code.
- "Multivariate adaptive regression splines,"
Annals of Statistics, 1991, 19(1), 1-141.
- A technical paper describing MARS
- CART and MARS software
- free Fortran version is apparently no longer available from Statlib
- commercial version available from
- Other versions (which I have not tested) include ones from
- A good free package for decision trees
- decision trees are also available in most statistics packages
- Neural Networks for Pattern Recognition,
Oxford University Press, 1995.
- An excellent overview of multilayer perceptron and radial basis
function neural networks from a statistician's point of view.
- Neural Networks: A Comprehensive Foundation,
- A good overview of neural nets from an electrical engineer's viewpoint;
covers a wide range of neural network types
- The Neural network FAQ
- overview of neural nets and pointers to software
- is one of the better free packages
More Neural net pointers [postscript]
- stepwise regression
- logistic regression
- Linear Statistical Methods,
- logistic regression is nicely covered on pp. 307-310.
- Statistical Models in S, Chambers and Hastie, Wadsworth, 1992
- covers a range of advanced statistical methods
Bayesian Belief Nets
Charniak, Eugene, "Bayesian Networks without tears", AI Magazine
12(4):50-63, Winter 1991.
- Intro to Bayesian networks for beginners.
Neapolitan, Richard E., "Probabilistic Reasoning in Expert Systems:
Theory and Algorithms", John Wiley and Sons, 1990.
- Practical guide to implementation.
- Finn V. Jensen, "Introduction to Bayesian Networks" 1996,
Springer Verlag; ISBN: 0387915028
Pearl, Judea, "Probabilistic Reasoning in Intelligent Systems:
Networks of Plausible Inference", Morgan Kaufmann, San Mateo,
- Theoretical framework for Bayesian networks; the book that got the whole field going
- Lots more references
- Bayesian networks
- What are belief nets good for and where to get code.
- other good free software: Netica
Belief Network software
- "Genetic Algorithms,"
Scientific American. July 1992. pp. 66-72.
- a nice overview of genetic algorithms
- Genetic Algorithms in Search, Optimization, and Machine Learning,
- An Introduction to Genetic Algorithms,
MIT Press, 1996
Hidden Markov Models and Speech
- Statistical Methods for Speech Recognition,
MIT Press, 1998
- Information Theory, T. M. Cover and J. A. Thomas.
- a solid introduction to information theory
Database Mining Companies
Many of these products - and others - are briefly described in an article by Two Crows.
Current Research - General