This is a real HR Analytics project I developed and published on my GitHub. The goal: analyse turnover for a company with 1,470 employees, identify the factors that most influence employee attrition, and quantify the financial impact — all using Python, Jupyter Notebook, and Power BI.
The Problem: Turnover as a Hidden Cost
Turnover is one of the largest hidden costs in any organisation. Recruitment, selection, onboarding, the learning curve, and knowledge loss all add up: commonly cited estimates put each departure at between 50% and 200% of the replaced employee's annual salary. The challenge is that few businesses measure this cost clearly or act on it before people leave.
The dataset used in this project contains data on 1,470 employees across different departments, salary ranges, and demographic profiles. Of the 1,470, 237 left the company — an attrition rate of 16.1%. The analysis was built to answer: who leaves, why, and what does it cost?
Methodology: ETL, EDA, and Dashboard
The project was structured in three integrated stages:
Stage 1 — ETL with Python
A Python script using Pandas handles extraction, transformation, and loading of the raw data. Key transformations applied:
- Standardisation of data types and handling of missing values
- Creation of derived variables: age groups, salary brackets, tenure groups
- Encoding of categorical variables so they can be used in statistical analysis
- Export of the clean dataset for consumption by Power BI and the EDA notebook
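The transformations listed above could look roughly like the sketch below. This is an illustration, not the project's actual script: the column names (`Age`, `MonthlyIncome`, `YearsAtCompany`, `Attrition`), the bin edges, the new column names, and the choice to drop incomplete rows are all assumptions.

```python
import pandas as pd

def transform(raw: pd.DataFrame) -> pd.DataFrame:
    """Clean the raw frame and add derived grouping columns (illustrative)."""
    df = raw.copy()

    # Standardise types; values that cannot be parsed become NaN.
    numeric_cols = ["Age", "MonthlyIncome", "YearsAtCompany"]  # assumed names
    for col in numeric_cols:
        df[col] = pd.to_numeric(df[col], errors="coerce")

    # Missing-value policy (an assumption): drop incomplete rows.
    df = df.dropna(subset=numeric_cols + ["Attrition"])

    # Derived variables: age groups, tenure groups, salary brackets.
    # Bin edges are hypothetical; intervals are closed on the right.
    df["AgeGroup"] = pd.cut(
        df["Age"], bins=[0, 25, 35, 45, 120],
        labels=["<=25", "26-35", "36-45", "46+"],
    )
    df["TenureGroup"] = pd.cut(
        df["YearsAtCompany"], bins=[-1, 1, 4, 9, 100],
        labels=["<=1", "2-4", "5-9", "10+"],
    )
    df["SalaryBracket"] = pd.qcut(
        df["MonthlyIncome"], q=4, labels=["Q1", "Q2", "Q3", "Q4"]
    )

    # Encode the Yes/No target as 1/0 for statistical work.
    is_leaver = df["Attrition"].str.strip().str.lower() == "yes"
    df["AttritionFlag"] = is_leaver.astype(int)
    return df
```

The cleaned frame can then be written out with `DataFrame.to_csv` so that both the notebook and Power BI read the same file.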
Stage 2 — Exploratory Data Analysis (EDA) in Jupyter Notebook
Using Pandas, Matplotlib, and Seaborn, I analysed distributions, correlations, and attrition patterns by segment. This stage generated the insights that underpin the dashboard.
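A minimal sketch of how attrition by segment can be computed with Pandas. It assumes the cleaned data has a 0/1 attrition column; the name `AttritionFlag` and the helper `attrition_by` are hypothetical and not taken from the notebook.

```python
import pandas as pd

def attrition_by(df: pd.DataFrame, segment: str,
                 flag: str = "AttritionFlag") -> pd.DataFrame:
    """Headcount, leavers and attrition rate (%) per value of `segment`."""
    out = (
        df.groupby(segment, observed=True)[flag]
          .agg(headcount="count", leavers="sum")
    )
    out["attrition_pct"] = (100 * out["leavers"] / out["headcount"]).round(1)
    # Highest-risk segments first.
    return out.sort_values("attrition_pct", ascending=False)
```

Calling it once per grouping column (department, age group, tenure group, overtime) gives one small table per segment, which is also a convenient input for a Seaborn bar plot.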
Stage 3 — Power BI Dashboard with DAX
The interactive dashboard consolidates all analyses into visuals that allow filtering by department, age group, job role, and period. DAX measures were created to calculate dynamic attrition rates and financial impact estimates.
Results: The Real Numbers
Attrition by Department
Turnover rates vary significantly by area:
- Sales: 20.6% attrition — the most critical department
- Human Resources: 19.0%
- R&D: 13.8%
This data alone changes the focus of retention actions: there is no reason to implement generic measures across the entire company when the problem is concentrated in specific areas.
Attrition by Age Group
- Up to 25 years: 35.8% — more than 1 in 3 young employees leave
- 36 to 45 years: only 9.2%
Employees aged 25 or under are almost four times as likely to leave as those aged 36 to 45. This points directly to the need for development programmes and career planning aimed at early-career stages.
Attrition by Tenure
- Up to 1 year: 34.9% attrition — the most critical period
- 10 years or more: 8.1%
Nearly 35% of employees in their first year at the company leave. This reinforces that onboarding is where retention begins or fails.
Impact of Overtime
This was one of the most significant insights from the analysis:
- With overtime: 30.5% attrition
- Without overtime: 10.4%
Employees who work overtime have an attrition rate nearly three times higher. This is an association rather than proof of causation, but a gap of this size points to overload and burnout as a likely driver.
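The "times higher" comparisons in these results are plain ratios of the reported rates. The sketch below reproduces two of them using only percentages quoted in this article; the helper name `rate_ratio` is mine, for illustration.

```python
def rate_ratio(rate_group: float, rate_reference: float) -> float:
    """How many times higher the first attrition rate is than the second."""
    if rate_reference <= 0:
        raise ValueError("reference rate must be positive")
    return rate_group / rate_reference

overtime_ratio = rate_ratio(30.5, 10.4)  # with vs. without overtime
age_ratio = rate_ratio(35.8, 9.2)        # up to 25 vs. 36 to 45 years

print(f"overtime: {overtime_ratio:.2f}x, age: {age_ratio:.2f}x")
# prints "overtime: 2.93x, age: 3.89x"
```

A ratio like this describes how the groups differ; it does not by itself show that overtime or age causes the departures.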
Salary Impact
- Low salary quartile: 29.3% attrition
- High salary quartile: 10.3%
- Average monthly salary of those who left: $4,787
- Average monthly salary of those who stayed: $6,832
The salary gap between those who stay and those who leave is just over $2,000/month ($6,832 versus $4,787, a difference of $2,045). The data is consistent with what retention literature has long indicated: below-average compensation is a strong predictor of attrition.
The Real Financial Cost
Based on the 237 departures and an estimated replacement cost of 50% of annual salary, the total estimated turnover cost reaches $6.8 million.
The analysis went further: with a relative reduction of just 15% in attrition (roughly 36 fewer departures out of 237), an achievable number with targeted actions, the potential savings would be $1.02 million, or 15% of the $6.8 million total. This is the type of information that transforms an HR report into a conversation with the executive committee.
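The arithmetic behind these figures can be approximated as follows. This is a sketch under one assumption: that the total is based on the average monthly salary of leavers ($4,787) rather than on each leaver's individual salary. The function name and constants are illustrative, and the project's DAX measures may be defined differently.

```python
DEPARTURES = 237                    # employees who left
AVG_MONTHLY_SALARY_LEAVERS = 4_787  # USD, average for those who left
REPLACEMENT_COST_SHARE = 0.50       # 50% of annual salary per departure
ATTRITION_REDUCTION = 0.15          # relative reduction scenario

def turnover_cost(departures: int, avg_monthly_salary: float,
                  cost_share: float) -> float:
    """Estimated replacement cost: departures x annual salary x cost share."""
    return departures * avg_monthly_salary * 12 * cost_share

total = turnover_cost(DEPARTURES, AVG_MONTHLY_SALARY_LEAVERS,
                      REPLACEMENT_COST_SHARE)
savings = total * ATTRITION_REDUCTION

print(f"Estimated turnover cost: ${total:,.0f}")    # $6,807,114
print(f"Savings at 15% reduction: ${savings:,.0f}")  # $1,021,067
```

Because the cost scales linearly with the number of departures, the same function can be reused to test other scenarios, such as a 100% or 200% replacement-cost share.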
Project Deliverables
The complete project includes:
- ETL script in Python for data cleaning and transformation
- EDA notebook (Jupyter) with full analysis and visualisations
- Power BI dashboard (.pbix) with interactive filters and DAX measures
- Executive report (PDF), available in Portuguese and English
All code and documentation are available in the GitHub repository.
What This Project Demonstrates
Three principles that guided this work and that I apply to every project:
- Clean data before any analysis: ETL consumed most of the time, but without it the visuals would have been quick to build and misleading
- Segmentation is where the value lies: the global rate of 16.1% hides completely different patterns — young employees in Sales with overtime have an exit risk above 35%
- Translate data into decisions: the $1.02M potential saving is not decoration — it is the argument that justifies investment in retention
If your company faces similar challenges around turnover, absenteeism, or HR data analysis, the data analytics services at PC Data Insights include an initial diagnostic of your current process. See more projects in the portfolio and get in touch via the form or WhatsApp to discuss your case.