
CONSUMER SEGMENTATION

Companies now spend millions on Business Intelligence to support accurate decision-making: understanding how their customers behave, targeting new customers, and selling more of their products and services.


It also helps companies define their marketing campaigns. In offline shopping, retail chains often use market-basket analysis to understand their customers. For online sellers and e-commerce companies, segmenting customers is very important for maximizing revenue.

This project aims to categorize online customers into 5 clusters based on transactions per user.

Apart from transactions, eight other parameters are taken into consideration:

InvoiceNo: Invoice number. Nominal, a 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation.


StockCode: Product (item) code. Nominal, a 5-digit integral number uniquely assigned to each distinct product.


Description: Product (item) name. Nominal.


Quantity: The quantities of each product (item) per transaction. Numeric.


InvoiceDate: Invoice date and time. Numeric, the day and time when each transaction was generated.


UnitPrice: Unit price. Numeric, product price per unit in sterling.


CustomerID: Customer number. Nominal, a 5-digit integral number uniquely assigned to each customer.


Country: Country name. Nominal, the name of the country where each customer resides.
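As a minimal sketch of how these fields might be used, the snippet below drops cancelled invoices and totals the spend per customer as Quantity times UnitPrice. The sample rows are made up for illustration, the function names are hypothetical, and treating both 'c' and 'C' as the cancellation prefix is an assumption; this is not necessarily the preprocessing used in the project.

```python
from collections import defaultdict

# Hypothetical sample rows using the column names described above.
rows = [
    {"InvoiceNo": "536365", "Quantity": 6, "UnitPrice": 2.55, "CustomerID": "17850"},
    {"InvoiceNo": "536366", "Quantity": 2, "UnitPrice": 1.85, "CustomerID": "17850"},
    {"InvoiceNo": "C536379", "Quantity": -1, "UnitPrice": 1.85, "CustomerID": "14527"},
    {"InvoiceNo": "536380", "Quantity": 3, "UnitPrice": 5.95, "CustomerID": "14527"},
]


def is_cancellation(invoice_no):
    """Return True when the invoice code starts with 'c' or 'C' (assumed prefix)."""
    return str(invoice_no).upper().startswith("C")


def spend_per_customer(rows):
    """Sum Quantity * UnitPrice per CustomerID, skipping cancelled invoices."""
    totals = defaultdict(float)
    for row in rows:
        if is_cancellation(row["InvoiceNo"]):
            continue
        totals[row["CustomerID"]] += row["Quantity"] * row["UnitPrice"]
    return dict(totals)


totals = spend_per_customer(rows)
for customer_id, total in sorted(totals.items()):
    print(customer_id, round(total, 2))
```

Per-customer totals like these are one possible input when grouping customers by their transactions.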


Sample dataset:

Screenshot (39).png

The following models are used to classify customers into the 5 categories.

LOGISTIC REGRESSION

For this algorithm, the prediction accuracy is 86.29 %. The learning curve is also plotted.

cs_LR.png

K-NEAREST NEIGHBORS

Accuracy: 79.78 %
Learning curve:

CS_KNN.png
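To illustrate the idea behind k-nearest neighbors, here is a minimal pure-Python sketch: a new customer receives the most common label among the k closest training points. The two features (number of orders, total spend), the toy data, the choice of k = 3, and the function name knn_predict are all assumptions for illustration, not the project's actual setup.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical features: (number of orders, total spend in sterling).
train_X = [(1.0, 20.0), (2.0, 35.0), (1.0, 25.0),
           (10.0, 400.0), (12.0, 450.0), (11.0, 380.0)]
train_y = [0, 0, 0, 1, 1, 1]

low_spender = knn_predict(train_X, train_y, (2.0, 30.0))
high_spender = knn_predict(train_X, train_y, (9.0, 420.0))
print(low_spender, high_spender)
```

In practice the features would usually be scaled first, since Euclidean distance is dominated by whichever feature has the largest range.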

DECISION TREE

For this algorithm, the prediction accuracy is 83.24 %. The learning curve is also plotted.

CS_DT.png

RANDOM FOREST

Precision: 89.61 %

CS_RF.png
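The results on this page are reported as accuracy for some models and precision for others, and the two metrics are not interchangeable. The sketch below computes both on a small made-up set of labels. Macro-averaging the precision over classes is an assumption here, since the averaging used for the figures above is not stated.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision (classes never predicted score 0)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        # True labels of all samples that were predicted as `label`.
        predicted_as_label = [t for t, p in zip(y_true, y_pred) if p == label]
        if not predicted_as_label:
            per_class.append(0.0)
        else:
            hits = sum(t == label for t in predicted_as_label)
            per_class.append(hits / len(predicted_as_label))
    return sum(per_class) / len(per_class)


# Made-up labels for three customer categories.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

acc = accuracy(y_true, y_pred)
prec = macro_precision(y_true, y_pred)
print(f"accuracy={acc:.4f} macro precision={prec:.4f}")
```

On this toy example the two values differ, which is why scores labelled "Accuracy" and "Precision" should be compared with care.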

ADABOOST

Precision: 54.57 %
The relation between the number of training examples and the training and cross-validation scores is shown in the line chart.

CS_ADABOOST.png

GRADIENT BOOSTING CLASSIFIER

Precision: 89.47 %
Learning Curves:

CS_gradient boosting.png

Finally, Random Forest, Gradient Boosting and k-Nearest Neighbors are combined to make predictions, because this leads to a slight improvement:
Precision: 75.46 %
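The text does not say how the three models are combined. One common option, shown here purely as an assumption, is hard majority voting: each model predicts a category and the most frequent prediction wins. The prediction lists below are made up, and majority_vote is a hypothetical helper.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-model prediction lists by taking the most common label per sample.

    On a tie, the label seen first wins, so earlier models take priority.
    """
    combined = []
    for votes in zip(*model_predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Made-up predictions from the three models for five customers.
rf_pred = [0, 1, 2, 3, 4]
gb_pred = [0, 1, 2, 4, 4]
knn_pred = [1, 1, 0, 4, 2]

final_pred = majority_vote(rf_pred, gb_pred, knn_pred)
print(final_pred)
```

Listing the strongest model first means it decides the outcome whenever all three models disagree.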
