Business Analytics: Customer Predictive models

Introduction

Customer campaigns are among the largest expenses in Marketing any product or service. For CPG and FMCG companies in particular, the frequency and share of such spending are even higher. With most traditional Brick and mortar companies taking their businesses online and the explosion of digital connectivity, more and more Consumers rely on the internet for Product research before a purchase decision. This trend is prominent in Consumer durables, Apparel, and Grocery, and in service areas such as Beauty, Health care, and Rental, to name a few.

With advances in both Technology and delivery mechanisms in the connected world of the 21st century, Customer Analytics as a service is increasingly within the reach of Mom and Pop boutiques, huge retail chains, and corporations alike.

One of the core Analytics techniques used in Campaign planning and management is Predictive Analytics. There are multiple ways of Predicting customer responses to a given campaign, such as Logistic regression, Bayesian statistics, and CHAID. Even so, Business heads and Entrepreneurs face the dilemma of how best to interpret the results these models produce and how to use them in deploying future campaigns.

Let us explore some of the dilemmas Marketers face, not just in interpreting Predictive model outputs, but also in weighing the intangible components that are critical for optimizing Campaign plans.

Prediction model outputs

Almost all traditional Prediction models require a reasonable amount of historical data on which the mathematical models can be trained to predict Customer behavior. To be effective, the Historical data used in the Analysis needs to cover at least 2 full Time series cycles. If the minimum Time interval considered in the Analysis is a month, we need at least 24 months of history, and preferably 36 months.

Let us consider a Binary Predictive model such as a “Customer's propensity to rent in the next 30 days”. This model has a yes or no decision to make: whether or not a Customer will return in the next 30 days. To build such a model, we require a data set of the following form:

A unique identifier for identifying each customer

For each customer identified, a set of characteristics such as Customer demography, past buying pattern, etc.

The actual response behavior of the customer; in our case, simply whether the customer responded: yes or no.

The formulation of the final equation for the predictive model will look like:

Response by the customer ~ Independent variables that describe the customer demography + Independent variables for past customer buying pattern

Using the above data, a Prediction model built by an Analyst will output the Predicted return propensity. The predicted return propensity from any model is expressed as a Probability, or percentage chance, that a given customer will respond, given the traits described by the variables.
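As a minimal sketch of how such a probability arises, assume the model is a Logistic regression, one of the techniques mentioned in the introduction. The model combines a customer's variables into a linear score and maps that score to a value between 0 and 1. The coefficients and variable names below are hypothetical and chosen only for illustration; they are not fitted on any real data.

```python
import math

# Hypothetical coefficients for illustration only (not fitted on real data).
INTERCEPT = -1.5
COEFFICIENTS = {
    "recency_days": -0.01,           # longer since last rental -> lower propensity
    "rentals_past_24_months": 0.30,  # more past rentals -> higher propensity
}

def return_propensity(customer):
    """Logistic model: probability = 1 / (1 + exp(-linear_score))."""
    score = INTERCEPT + sum(
        weight * customer[name] for name, weight in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-score))

recent_regular = {"recency_days": 20, "rentals_past_24_months": 6}
lapsed_occasional = {"recency_days": 300, "rentals_past_24_months": 1}

print(f"Recent regular customer:    {return_propensity(recent_regular):.2f}")
print(f"Lapsed occasional customer: {return_propensity(lapsed_occasional):.2f}")
```

With these made-up coefficients, the recent, frequent renter receives a much higher propensity than the lapsed one, which is the kind of ranking a campaign would act on.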

To evaluate the performance of the model itself, we need both the actual response of each customer and the response predicted by the model, each expressed in binary format: 1 – Return / 0 – Non Return. The predicted probability is converted to this binary format by applying a cutoff; the examples in this article use a cutoff of 50%.

From these two binary columns, a summary table (the Confusion matrix, or Classification matrix) can be constructed to evaluate the performance of the Predictive model.
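The conversion from predicted probability to a binary prediction, and the tally of actual versus predicted responses, can be sketched as follows. The toy data is invented for illustration, and treating a probability exactly equal to the cutoff as a predicted Return is an assumption, not something the article specifies.

```python
def confusion_counts(actual, predicted_probability, cutoff=0.5):
    """Tally TP, FP, FN, TN after converting probabilities to 1/0 at the cutoff."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for y, p in zip(actual, predicted_probability):
        predicted = 1 if p >= cutoff else 0  # assumption: ties count as Return
        if predicted == 1 and y == 1:
            counts["TP"] += 1   # predicted Return, actually returned
        elif predicted == 1 and y == 0:
            counts["FP"] += 1   # predicted Return, did not return
        elif predicted == 0 and y == 1:
            counts["FN"] += 1   # predicted Non Return, actually returned
        else:
            counts["TN"] += 1   # predicted Non Return, did not return
    return counts

# Toy data for illustration: 1 = Return, 0 = Non Return.
actual = [1, 0, 1, 0, 0, 1]
propensity = [0.81, 0.35, 0.42, 0.66, 0.10, 0.55]
counts = confusion_counts(actual, propensity)
print(counts)
```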

The Science behind Predictive model selection

Reading the Confusion matrix or Classification matrix

Measuring the prediction model behavior

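The figures below use the counts from the Model 2 confusion matrix shown later in this article.

Return Prediction accuracy:

Sensitivity = True positives / Identified positives by the model

= 8,119 / 11,292

= 71.90%

Non-Return Prediction accuracy:

Specificity = True Negatives / Identified negatives by the model

= 128,828 / 140,652

= 91.59%

Overall Accuracy of the model = (True positives + True Negatives) / Total sample size

= (8,119 + 128,828) / 151,944

= 90.12%

Note that both ratios divide by the number of customers the model identified in each class. In standard terminology these are the positive predictive value (precision) and the negative predictive value; this article uses the labels Sensitivity and Specificity for them throughout.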
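The same arithmetic can be reproduced from the four cell counts of a confusion matrix. The sketch below uses the Model 2 counts from this article; the variable names are illustrative. For comparison it also computes the two ratios that divide by the actual class totals rather than by the model's predictions, since the two views can differ substantially.

```python
# Counts taken from the Model 2 confusion matrix in this article (50% cutoff).
TP, FP = 8119, 3173      # predicted Return: actually returned / did not return
FN, TN = 11824, 128828   # predicted Non Return: actually returned / did not return

total = TP + FP + FN + TN

# Ratios as defined in this article (denominator = customers the model identified).
return_prediction_accuracy = TP / (TP + FP)
non_return_prediction_accuracy = TN / (TN + FN)
overall_accuracy = (TP + TN) / total

# For comparison: the same counts divided by the actual class totals.
share_of_actual_returners_found = TP / (TP + FN)
share_of_actual_non_returners_found = TN / (TN + FP)

print(f"Return prediction accuracy:      {return_prediction_accuracy:.2%}")
print(f"Non-return prediction accuracy:  {non_return_prediction_accuracy:.2%}")
print(f"Overall accuracy:                {overall_accuracy:.2%}")
print(f"Actual returners identified:     {share_of_actual_returners_found:.2%}")
print(f"Actual non-returners identified: {share_of_actual_non_returners_found:.2%}")
```

Printed values may differ from the quoted figures in the last decimal place because of rounding.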

Let us consider for illustration purposes, two valid Predictive models built using two different sets of variables.

Sample Model 1: Variables significantly contributing to Prediction

Sl	Model metrics
1	Percentage times online forms used for Reservation past 12 months
2	Recency in days
3	Unique rental location counts past 12 months
4	Percentage out of state rentals past 18 months

Model 1 Confusion matrix @ prediction probability cutoff of 50%

Model performance: Sensitivity = 38.08%, Specificity = 94.7%, Accuracy = 81.14%

	Actual 1	Actual 0	Total
Predicted 1	13914	22618	36532
Predicted 0	6029	109383	115412
Total	19943	132001	151944

Sample Model 2: Variables significantly contributing to Prediction

Sl	Model metrics
1	Percentage times online forms used for Reservation past 12 months
2	Recency in days
3	Unique rental location counts past 12 months
4	Percentage out of state rentals past 18 months
5	Rental count past 24 months
6	Median time between rentals in days past 12 months

Model 2 Confusion matrix @ prediction probability cutoff of 50%

Model performance: Sensitivity = 71.90%, Specificity = 91.59%, Accuracy = 90.12%

	Actual 1	Actual 0	Total
Predicted 1	8119	3173	11292
Predicted 0	11824	128828	140652
Total	19943	132001	151944

Model comparison

	Sensitivity	Specificity	Accuracy
Model 1	38.08%	94.7%	81.14%
Model 2	71.90%	91.59%	90.12%

Model selection Dilemma

A Marketer responsible for rolling out campaigns needs to decide between the above models. The process for such a decision is not straightforward and is highly contextual. It is driven by the specific Business objectives the company sets.

In model 1, the sensitivity is 38.08%. This means that if we take 10 customers whom the prediction model has identified as return customers, we can expect only about 4 of them (rounding 38.08%) to be identified correctly; the remaining 6 might not return, that is, the model has incorrectly identified them as return customers.

Now consider model 2: a sensitivity of 71.90% means that out of 10 customers predicted by the model as returning customers, about 7 are predicted correctly and 3 are incorrect.

If a campaign is launched targeting customers who are more likely to return, model 1 is about 38% effective while model 2 is about 72% effective. Model 2 might be preferred over model 1 in this context.
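To make this comparison concrete in terms of contacts, the sketch below assumes the campaign contacts every customer a model predicts as a returner, and uses only the counts from the two confusion matrices. The dictionary keys are illustrative names.

```python
# Counts from the two confusion matrices at the 50% cutoff.
models = {
    "Model 1": {"predicted_return": 36532, "correct": 13914},
    "Model 2": {"predicted_return": 11292, "correct": 8119},
}
ACTUAL_RETURNERS = 19943  # column total shared by both matrices

summary = {}
for name, m in models.items():
    summary[name] = {
        "contacts": m["predicted_return"],
        "hit_rate": m["correct"] / m["predicted_return"],
        # Contacted customers who did not return.
        "wasted_contacts": m["predicted_return"] - m["correct"],
        # Customers who returned but were never contacted.
        "returners_missed": ACTUAL_RETURNERS - m["correct"],
    }

for name, row in summary.items():
    print(
        f"{name}: {row['contacts']:,} contacts, hit rate {row['hit_rate']:.1%}, "
        f"{row['wasted_contacts']:,} wasted, {row['returners_missed']:,} returners missed"
    )
```

Under this reading Model 2 wastes far fewer contacts (3,173 versus 22,618), while Model 1 reaches more of the customers who actually return (13,914 versus 8,119). This is one reason the choice depends on the campaign objective and the cost per contact.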

Now, let us look at the Specificity of the model.

In model 1, a specificity of 94.7% means that out of 10 customers the model has predicted as Non-returnees, between 9 and 10 are predicted correctly (close to 95%). A 91.59% specificity for model 2 means that model 2 would predict about 9 out of 10 Non-return customers correctly.

Suppose a campaign is targeted at Customers who are unlikely to return; then model 1, with 94.7% effectiveness, might be a better option than model 2, with 91.59% effectiveness.

The difference in Sensitivity between models 1 and 2 is wide enough that the selection looks clear-cut. But there are other areas to be considered as well.

One of them is the extent to which the explanatory variables used in the model explain the outcome, and whether the relationship between the two, as described by the model, makes Business sense. Evaluating the contributing variables and their causal effect on the outcome is subject to judgment and requires the Analyst to possess Domain expertise. That brings us to the art of Model evaluation.

The Art behind Predictive model selection

For models to be effective, Business objectives need to be spelt out as Precise and Specific low-level statements: for example, what Business objective the model is expected to achieve for a specific segment of Customers.

Common areas that will dictate the model selection process

1. Business objective: What is the pressing Business driver that led to the initiation of the model build? Is the objective to bring back Inactive customers, to improve business with Cross sell customers, or to enhance the service for gold class customers?

2. Spend per Customer: If a specific campaign is to be rolled out, how many Marketing dollars will be spent per individual customer? The spend per customer for different campaigns needs to be known or estimated upfront.

3. Return per customer: For every marketing dollar spent, what is the estimated return per customer? This is contextual, as the return depends heavily on the demography of the Customer segment. This leads to the need for Customer segmentation prior to building the Predictive model, which is an exciting and separate topic of discussion. In most contexts it is worthwhile to begin Customer segmentation with an RFM (Recency, Frequency, Monetary) analysis and/or Clustering techniques.

4. Customer base: This represents the segments within the customer base targeted with the intended campaign(s).

5. Marketing Budgets and the preferred allocation of these budgets: Organizations don’t have unlimited Marketing budgets for campaigns, so when the overall budget is constrained, it must be decided which campaigns take precedence over others. If there is a mix of different Campaigns targeting different Customer segments, with Budgetary constraints at Region, Brand, and Product Category levels, it calls for a structured allocation of funds using Optimization techniques like Linear programming.
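As a minimal sketch of such an allocation, assume a single overall budget, a spending cap per campaign, and a constant estimated return per dollar for each campaign. Under those simplifying assumptions the linear program can be solved exactly by funding campaigns in descending order of return per dollar. Multiple constraints at Region, Brand, or Product Category level would need a general Linear programming solver instead. All campaign figures below are hypothetical.

```python
def allocate_budget(campaigns, total_budget):
    """Maximize sum(return_per_dollar * spend) subject to
    sum(spend) <= total_budget and 0 <= spend <= cap for each campaign."""
    allocation = {name: 0 for name, _, _ in campaigns}
    remaining = total_budget
    # Fund the campaigns with the highest return per dollar first.
    for name, return_per_dollar, cap in sorted(
        campaigns, key=lambda c: c[1], reverse=True
    ):
        if return_per_dollar <= 0 or remaining <= 0:
            break  # nothing left to spend, or no remaining campaign pays back
        spend = min(cap, remaining)
        allocation[name] = spend
        remaining -= spend
    return allocation

# (campaign, estimated return per dollar, spending cap): hypothetical values
campaigns = [
    ("Bring home", 1.8, 40_000),
    ("Retention", 2.5, 30_000),
    ("Up sell / Cross sell", 3.1, 50_000),
    ("Gold class service", 1.2, 20_000),
]
plan = allocate_budget(campaigns, total_budget=100_000)
for name, spend in plan.items():
    print(f"{name}: ${spend:,}")
```

The greedy rule is only valid for this single-constraint case; it is shown here because it makes the precedence among campaigns explicit.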

If the customer base is very large, say a million plus, then a more accurate model needs to be considered, as even a small variation in the Accuracy of the model means a large difference in ineffective Marketing spend on targeted customers.
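A back-of-the-envelope calculation illustrates the point. The cost per contact and the two accuracy levels below are hypothetical, and the sketch assumes that every customer in the base is targeted and that each misclassified customer represents one ineffective contact.

```python
def ineffective_spend(customer_base, accuracy, cost_per_contact):
    """Spend attributed to misclassified customers, under the stated assumptions."""
    misclassified = customer_base * (1 - accuracy)
    return misclassified * cost_per_contact

COST_PER_CONTACT = 2.00  # hypothetical marketing dollars per customer

for base in (10_000, 1_000_000):
    gap = ineffective_spend(base, 0.89, COST_PER_CONTACT) - ineffective_spend(
        base, 0.90, COST_PER_CONTACT
    )
    print(f"{base:>9,} customers: one accuracy point is worth about ${gap:,.0f}")
```

The same one-point difference in Accuracy that costs a few hundred dollars on a base of ten thousand costs tens of thousands on a base of a million.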

If the target Customer base is relatively small, the Business can maximize the number of customers reached, provided the Cost benefit is substantial; this is typically a strategy for higher value Cross-sell / Up-sell customers.

Some of the typical Customer segments are as below:

1. Bring home: Customers who were once loyal but have no recent transactions.

2. Retention: These are fence sitters who engage with the Business adequately but are constantly looking out for alternatives. They are yet to fully buy into the benefits of the Brand.

3. Up sell / Cross sell: Typically these customers are regulars of the Business but bring Average returns. Sops can be offered to stimulate these Customers into transacting more often or into increasing the depth of each transaction by spending more per visit.

4. Gold class customers: This is the class of loyal customers who bring in the highest lifetime contribution to the business. They might continue patronizing the business even in the absence of any traditional Campaigns because of their past experience and loyalty to the Brand and its services. Depending on the type of Business, monetary incentives alone may not motivate this class of customers, but an enhanced Gold class service might.

5. Active / Inactive: Although this distinction sounds simple, considerable effort needs to go into understanding the threshold time period, in days, months, or years, after which a Customer can be classified as Inactive. These customers also need to be treated differently. Inactive customers still have some chance of returning and are very different from dormant customers, who are written off.

Focused targeting

A detailed segmentation exercise preceding the Prediction model build will give a deeper understanding of the Customers and enable their classification as above. Armed with such insights, Businesses can take contextual decisions for each class of Customers. A meaningful understanding of Customers comes not only from Demography but also from analyzing their past behavior. A wide set of well-thought-out metrics measuring the past behavioral traits of Customers is key to a successful Prediction model.

If Short term Monetary Return maximization is the primary Business objective, then given certain budgetary constraints, it is easy to see that the investment will go into the top x deciles of customers ranked by the model's predicted score, and the bottom (10 - x) deciles will be omitted from the campaign.

But most of the time, a Company’s Marketing objectives cover more than short term Return maximization: Retention of New customers, Enhanced Customer experience, Up sell, Cross sell, etc.
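Decile-based targeting can be sketched as follows, assuming customers are ranked by predicted propensity and that the budget determines how many top deciles (x) can be funded. The customer identifiers and scores are hypothetical.

```python
def assign_deciles(propensity_by_customer):
    """Rank customers by predicted propensity; decile 1 holds the highest scores."""
    ranked = sorted(
        propensity_by_customer, key=propensity_by_customer.get, reverse=True
    )
    n = len(ranked)
    return {
        customer: (rank * 10) // n + 1 for rank, customer in enumerate(ranked)
    }

def campaign_targets(propensity_by_customer, top_x):
    """Customers in the top x deciles; the rest are left out of the campaign."""
    deciles = assign_deciles(propensity_by_customer)
    return [c for c, d in deciles.items() if d <= top_x]

# Hypothetical scores for 20 customers: C01 has the highest propensity.
scores = {f"C{i:02d}": round(1 - i * 0.045, 3) for i in range(1, 21)}
targets = campaign_targets(scores, top_x=3)
print(targets)
```

With 20 customers each decile holds two of them, so funding the top 3 deciles selects the six highest-scoring customers.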

The overall objective will take the shape of optimizing the budget allocation across campaigns with both Tangible and Intangible results rather than maximizing short term monetary returns.

Most businesses also need to focus on Long term benefits. For example, in the Wrist watch or Apparel industry, Brand affinity needs to be built among teenagers, who might not currently be the top contributing segment. At some point in the future they will be, and it will be difficult to build Brand loyalty from scratch at that stage. Similarly, a Segment that is not among the current Top contributors cannot be written off yet and needs to be carefully considered in campaigns for longer term returns.

Business Analytics

Tuesday, September 10, 2013

Customer Predictive models - A Marketers’ Dilemma