Data Mining with Python

Is python that useful?


Data mining is the process of obtaining valuable data reports from unorganized and organized sets of databases. It helps in discovering predictive information which could then help in summarizing a report to the upper management.

The desired outcome from data mining is to create a model from a given data set that can have its insights generalized to similar data sets. A real-world example of a successful data mining application can be seen in automatic fraud detection from banks and credit institutions. 

There are 5 techniques of data mining namely

1. Regression
2. Classification
3. Cluster analysis
4. Association & correlation analysis
5. Outlier analysis

Applications of data mining include genomic sequencing, social network analysis, or crime imaging – but the most common use case is for analyzing aspects of the consumer life cycle. Companies use data mining to discover consumer preferences, classify different consumers based on their purchasing activity, and determine what makes for a well-paying customer – information that can have profound effects on improving revenue streams and cutting costs.

All these applications can be prepared with help of few python based libraries and frameworks. The frameworks used to mine data are

1. NumPy
2. SciPy
3. Pandas
4. Seaborn