1 Answer
- Start with some exploratory data analysis to get a first insight:
  - Transactions by date
  - Count of customers vs. number of items bought
  - Total items vs. total baskets per customer
  - Total items vs. total baskets per area
- Create new features (per customer):
  - Counts:
    - Total baskets (unique days)
    - Total items
    - Total spent
    - Unique product IDs
  - Distributions:
    - Items per basket
    - Spent per basket
    - Product IDs per basket
    - Duration between visits
  - Product preferences: proportion of items per product category per basket
- Too many features? Consider dimension reduction, e.g. PCA.
- Clustering:
  - PCA
  - Interpreting model fit
  - View the clustering by pairs of principal-component axes, e.g. PC1 vs. PC2.
  - Interpret each principal component through the linear combination of features it is built from. Example: PC1 = "spendy" axis (proportion of baskets containing spendy items, raw counts of items and visits).
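
The outline above can be sketched end to end as follows. This is only a minimal illustration under stated assumptions: the transaction table is synthetic, the column names (`customer_id`, `date`, `product_id`, `quantity`, `unit_price`) are hypothetical, each distribution is summarised by its mean only, the product-category shares are left out for brevity, and a small hand-written k-means stands in for whichever clustering method you end up choosing.

```python
import numpy as np
import pandas as pd

# Synthetic transaction log; all column names are assumptions for illustration.
rng = np.random.default_rng(0)
n_rows = 600
tx = pd.DataFrame({
    "customer_id": rng.integers(0, 40, n_rows),
    "date": pd.Timestamp("2023-01-01")
            + pd.to_timedelta(rng.integers(0, 120, n_rows), unit="D"),
    "product_id": rng.integers(0, 25, n_rows),
    "quantity": rng.integers(1, 6, n_rows),
    "unit_price": rng.uniform(1.0, 50.0, n_rows).round(2),
})
tx["spent"] = tx["quantity"] * tx["unit_price"]

# A basket is taken to be one customer on one day ("unique days").
baskets = tx.groupby(["customer_id", "date"]).agg(
    items=("quantity", "sum"),
    spent=("spent", "sum"),
    products=("product_id", "nunique"),
).reset_index()
baskets = baskets.sort_values(["customer_id", "date"])
# Days since the same customer's previous basket (NaN for the first visit).
baskets["gap_days"] = baskets.groupby("customer_id")["date"].diff().dt.days

# Counts per customer.
features = tx.groupby("customer_id").agg(
    total_items=("quantity", "sum"),
    total_spent=("spent", "sum"),
    unique_products=("product_id", "nunique"),
)
features["total_baskets"] = baskets.groupby("customer_id").size()

# Distributions per customer, reduced here to their means.
per_basket = (baskets.groupby("customer_id")[["items", "spent", "products"]]
              .mean().add_suffix("_per_basket_mean"))
features = features.join(per_basket)
features["gap_days_mean"] = baskets.groupby("customer_id")["gap_days"].mean()
# Assumption: a customer with a single visit gets a gap of 0.
features = features.fillna(0.0)

# Standardise, then PCA through the SVD of the standardised matrix.
X = features.to_numpy(dtype=float)
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
explained = S**2 / np.sum(S**2)   # share of variance per component
scores = Z @ Vt.T                 # customer coordinates on the PC axes
loadings = pd.DataFrame(Vt[:2].T, index=features.columns,
                        columns=["PC1", "PC2"])


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; a stand-in for any clustering method."""
    gen = np.random.default_rng(seed)
    centers = points[gen.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty.
        centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers


# Cluster in the PC1/PC2 plane; k=3 is an arbitrary choice for the sketch.
labels, centers = kmeans(scores[:, :2], k=3)

print(explained.round(3))
print(loadings.round(2))
```

Reading the `loadings` table is the interpretation step: a component whose largest weights sit on `total_spent` and `spent_per_basket_mean` would be a candidate for the "spendy" axis. Plotting `scores[:, 0]` against `scores[:, 1]`, coloured by `labels`, gives the PC1 vs. PC2 view of the clusters.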