Reducing Customer Churn: A Telecom Case Study Using Logistic Regression in SPSS
A mid-sized telecom provider was losing roughly 26% of its customers every year. Marketing kept spending on win-back offers, but nobody could say who was about to leave or why. This is the analysis we ran to answer both questions — and the steps are reproducible on any churn dataset you have.
The goal was simple: build a model that scores each active customer with a probability of leaving, so retention budget goes to the people actually at risk rather than to everyone.
The Dataset
We worked with 7,043 customer records exported from the billing system. The relevant fields:
- Churn (target): 1 = left in the last quarter, 0 = stayed
- Tenure: months the customer has been with the company
- MonthlyCharges: current monthly bill
- Contract: month-to-month, one-year, two-year
- TechSupport: yes / no
- PaymentMethod: electronic check, mailed check, bank transfer, credit card
Overall churn rate in the data was 26.5%, so the classes were imbalanced but workable.
Why Logistic Regression
The outcome is binary (churned or not), so ordinary linear regression is the wrong tool: it can predict probabilities below 0 or above 1, and a 0/1 outcome violates its assumptions of normally distributed, constant-variance errors. Binary logistic regression models the log-odds of churning and returns a probability between 0 and 1 for every customer. It also gives odds ratios, which translate into a sentence a manager understands: "month-to-month customers have about 3× the odds of leaving."
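For reference, the model being fitted can be written as below. The symbols ($p$, $\beta_j$, $x_j$, $k$) are notation introduced here for illustration and do not appear in the SPSS output.

```latex
\[
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
\qquad\Longleftrightarrow\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
\]
```

Here $p$ is the probability of churn, each $\beta_j$ is a coefficient SPSS labels B, and $e^{\beta_j}$ is the odds ratio SPSS labels Exp(B).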
Step 1: Prepare the Variables
Before running anything, two things mattered.
Recode categorical predictors. SPSS handles this inside the procedure, but you must tell it which variables are categorical. We also set a sensible reference category — for Contract we used two-year as the reference so every odds ratio reads as "compared to the most loyal customers."
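Check multicollinearity. Tenure and Contract are related (long-tenure customers tend to be on longer contracts). The Binary Logistic procedure does not report collinearity statistics, so we first ran a linear regression with the same predictors (categorical ones entered as dummy variables) purely to read the VIF values from Analyze > Regression > Linear > Statistics > Collinearity diagnostics. VIF depends only on the predictors, so the choice of dependent variable in that run does not matter. All VIFs were under 2.5, well below the common threshold of 10, so we kept both.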
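To make the VIF figure less of a black box, here is a minimal Python sketch of the simplest case, a model with exactly two predictors, where VIF reduces to 1 / (1 - r²). The function names and the sample values are hypothetical and are not drawn from the case-study data. With more than two predictors, r² is replaced by the R² from regressing each predictor on all the others, which is what SPSS computes.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def vif_two_predictors(x, y):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical values for illustration only, NOT the case-study data.
tenure_months = [2, 5, 9, 14, 20, 26, 33, 41, 50, 62]
long_contract = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]  # 1 = one- or two-year contract

vif = vif_two_predictors(tenure_months, long_contract)
print(f"VIF = {vif:.2f}")
```

A VIF of 1 means the two predictors are uncorrelated; the value grows without bound as the correlation approaches ±1.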
Step 2: Run the Logistic Regression
In SPSS:
Analyze > Regression > Binary Logistic
- Dependent: Churn
- Covariates: Tenure, MonthlyCharges, Contract, TechSupport, PaymentMethod
- Click Categorical and move Contract, TechSupport, and PaymentMethod into the categorical box. Set the reference category for each and click Change to apply it (we used First for Contract after reordering its codes so that two-year comes first, and Last for the others).
- Under Options, tick Hosmer-Lemeshow goodness-of-fit, Classification plots, and CI for exp(B) 95%.
Step 3: Read the Output (The Tables That Matter)
Omnibus Tests of Model Coefficients
This table tests whether the model as a whole beats a model with no predictors. Our Chi-square = 1,184.7, df = 7, p < .001, so the predictors collectively improve on the intercept-only model far more than chance alone would allow.
Model Summary
- -2 Log likelihood: 5,803.1
- Cox & Snell R²: 0.18
- Nagelkerke R²: 0.26
Nagelkerke R² is a pseudo-R², so 0.26 is best read as a rough analogue of explained variation (about 26%) rather than a literal share of variance as in linear regression. That is respectable for behavioral data: people are not perfectly predictable, and as a rule of thumb values above 0.20 are usually useful in practice.
Hosmer-Lemeshow Test
Chi-square = 7.9, p = .44. For this test you want a non-significant result: it means the predicted probabilities do not differ detectably from observed churn across risk groups. p = .44 gives no evidence of poor fit.
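The idea behind the statistic can be sketched in a few lines of Python. This is an illustrative reimplementation under simple assumptions (equal-sized groups, ties broken by sort order), not SPSS's exact routine. The function name and the sample data are hypothetical. It returns only the chi-square value; the p-value comes from a chi-square distribution with (groups - 2) degrees of freedom, which is not computed here.

```python
def hosmer_lemeshow_chi2(y_true, p_hat, groups=10):
    """Hosmer-Lemeshow chi-square: compare observed and expected churners
    within groups of customers sorted by predicted probability."""
    pairs = sorted(zip(p_hat, y_true))  # lowest predicted risk first
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)  # actual churners in the group
        expected = sum(p for p, _ in chunk)  # churners the model predicts
        mean_p = expected / size
        if 0.0 < mean_p < 1.0:  # skip groups where the variance term is zero
            chi2 += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return chi2

# Hypothetical outcomes and predicted probabilities, NOT the case-study data.
churned = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]
predicted = [0.05, 0.10, 0.20, 0.30, 0.35, 0.50, 0.60, 0.70, 0.80, 0.90]

print(f"HL chi-square = {hosmer_lemeshow_chi2(churned, predicted, groups=5):.2f}")
```

Small values mean observed and predicted churn counts agree within each risk band, which is why a non-significant result is the desired outcome.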
Variables in the Equation
This is the table you build the story from. The key column is Exp(B) — the odds ratio.
| Predictor | B | Exp(B) (Odds Ratio) | p |
|---|---|---|---|
| Tenure | -0.034 | 0.97 | <.001 |
| MonthlyCharges | 0.029 | 1.03 | <.001 |
| Contract (month-to-month) | 1.12 | 3.06 | <.001 |
| TechSupport (no) | 0.62 | 1.86 | <.001 |
| PaymentMethod (electronic check) | 0.48 | 1.62 | <.001 |
How to read these:
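- Tenure, Exp(B) = 0.97: each additional month with the company cuts the odds of churn by about 3%. The effect compounds: 22 extra months multiply the odds by exp(-0.034 × 22) ≈ 0.47, so a two-year customer has roughly half the odds of churning of a two-month one, other things equal.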
- Contract month-to-month, Exp(B) = 3.06: holding everything else constant, month-to-month customers have three times the odds of churning compared to two-year customers. This was the single biggest driver.
- TechSupport = no, Exp(B) = 1.86: customers without tech support had 86% higher odds of leaving than customers with it.
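The arithmetic behind these readings is short enough to check by hand. The Python sketch below recomputes Exp(B) from the B column of the table and shows why an odds ratio is not the same as a probability ratio. The helper names and the 10% baseline probability are assumptions chosen for illustration; the article does not report a baseline or an intercept.

```python
import math

# B coefficients copied from the Variables in the Equation table.
coefficients = {
    "Tenure": -0.034,
    "MonthlyCharges": 0.029,
    "Contract (month-to-month)": 1.12,
    "TechSupport (no)": 0.62,
    "PaymentMethod (electronic check)": 0.48,
}

def odds_ratio(b, units=1):
    """Odds ratio for a change of `units` in the predictor: exp(B * units)."""
    return math.exp(b * units)

def apply_odds_ratio(baseline_prob, ratio):
    """Multiply the baseline odds by `ratio` and convert back to a probability."""
    odds = baseline_prob / (1.0 - baseline_prob) * ratio
    return odds / (1.0 + odds)

for name, b in coefficients.items():
    print(f"{name}: Exp(B) = {odds_ratio(b):.2f}")

# 22 extra months of tenure (2 months vs 24 months), other predictors equal.
print(f"22 months of tenure: odds x {odds_ratio(-0.034, 22):.2f}")

# Hypothetical 10% baseline churn probability, chosen only for illustration.
p_mtm = apply_odds_ratio(0.10, odds_ratio(1.12))
print(f"10% baseline -> {p_mtm:.1%} for month-to-month")
```

Under that assumed baseline, an odds ratio of about 3 moves a 10% churn probability to roughly 25%, which is about 2.5 times the probability rather than 3 times. That gap is why the wording "times the odds" is safer than "times more likely."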
Step 4: Check Predictive Accuracy
The Classification Table showed 80.4% of cases correctly classified at the default 0.5 cutoff. But because churners are the minority, accuracy alone is misleading: with a 26.5% churn rate, a model that labels every customer a stayer would already score 73.5%. Ours was good at spotting "stayers" and weaker on "leavers."
So we lowered the cutoff to 0.30 (Options > Classification cutoff). This caught 72% of actual churners (sensitivity) at the cost of more false alarms, a sensible trade-off when a retention call costs $4 but a lost customer costs $900.
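The effect of moving the cutoff is easy to see on a toy example. The Python sketch below uses ten hypothetical scored customers, not the case-study data, and a hypothetical helper that flags a customer as a churner when the predicted probability exceeds the cutoff.

```python
def classification_metrics(y_true, p_hat, cutoff):
    """Flag a customer as a churner when p_hat exceeds the cutoff, then
    summarize the result. Assumes both classes appear in y_true."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, p_hat):
        flagged = p > cutoff
        if flagged and y == 1:
            tp += 1  # churner correctly flagged
        elif flagged and y == 0:
            fp += 1  # stayer flagged by mistake (false alarm)
        elif y == 1:
            fn += 1  # churner missed
        else:
            tn += 1  # stayer correctly left alone
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
        "false_alarms": fp,
    }

# Hypothetical scores for ten customers, NOT taken from the case-study data.
churned = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.81, 0.55, 0.42, 0.33, 0.46, 0.28, 0.22, 0.15, 0.09, 0.05]

at_050 = classification_metrics(churned, scores, 0.50)
at_030 = classification_metrics(churned, scores, 0.30)
print("cutoff 0.50:", at_050)
print("cutoff 0.30:", at_030)
```

In this toy data the lower cutoff catches every churner instead of half of them, and pays for it with one false alarm. Whether that trade is worth making depends on the relative cost of a wasted call and a lost customer.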
What the Business Did With It
The model produced two clear, actionable findings:
- Contract type was the lever. The company launched a campaign offering a one-month bill credit to move month-to-month customers onto one-year contracts. The predicted odds drop alone justified the offer.
- Bundling tech support reduced risk. Customers without support churned far more, so support was bundled free into the mid-tier plan.
Six months later, quarterly churn among the targeted high-risk segment fell from 42% to 29%. The model did not "stop" churn — it told the team where to aim, which is the entire point of predictive analytics.
Mistakes to Avoid (We Saw All of These)
- Reading B instead of Exp(B). The raw coefficient is in log-odds and means nothing to stakeholders. Always report the odds ratio.
- Trusting overall accuracy on imbalanced data. Look at sensitivity and specificity separately, and tune the cutoff to the business cost.
- Skipping the Hosmer-Lemeshow test. A model can have significant predictors and still fit poorly across risk bands.
- Including future information. Make sure no predictor "leaks" the outcome (e.g., a cancellation reason field). That inflates accuracy and is useless for prediction.
Reporting It in APA Format
A binary logistic regression was performed to predict customer churn from tenure, monthly charges, contract type, tech support, and payment method. The model was statistically significant, χ²(7, N = 7,043) = 1,184.7, p < .001, Nagelkerke R² = .26. Contract type was the strongest predictor; month-to-month customers had 3.06 times the odds of churning compared with two-year contract customers (95% CI [2.6, 3.6], p < .001).
Need help building a churn or prediction model on your own data? At Insighter Digital we run logistic regression, decision trees, and survival models in SPSS and Python, and we deliver the output as a plain-language recommendation your team can act on — not just a table. Get in touch.