Most data science tutorials use clean, pre-packaged datasets. The Iris flower dataset. The Titanic passenger list. Carefully curated examples where the data behaves itself, the columns are labeled clearly, and the interesting patterns emerge on cue.
Real data doesn’t work like that.
Real data is messier, richer, and far more interesting – especially when it comes from something as personal and dynamic as your own body. Fitness data collected from a wearable device sits at a fascinating intersection: it’s granular enough to reveal genuine patterns, personal enough to make those patterns meaningful, and messy enough to require real analytical thinking.
This article walks through a complete, end-to-end analysis of personal fitness data – the kind you might export from an Apple Watch or similar device. We’ll move through every stage: understanding the dataset, checking quality, exploring distributions, investigating relationships, and extracting insights that actually mean something.
No toy examples. No pre-cleaned inputs. Just real data, real questions, and real analytical thinking from start to finish.

The Dataset: What We’re Working With
For this walkthrough, we’re working with cycling workout data exported from Apple Health – the kind of data our Apple Health Cycling Analyzer processes directly in your browser.
Imagine a dataset with the following structure:
- 180 rows – one per recorded cycling session over approximately eight months
- 14 columns – covering workout metadata, physiological measurements, and performance metrics
The columns include:
| Column | Type | Description |
| --- | --- | --- |
| date | Date | Date of the workout |
| duration_min | Numerical | Total ride duration in minutes |
| distance_km | Numerical | Distance covered in kilometers |
| avg_heart_rate | Numerical | Average heart rate during ride (bpm) |
| max_heart_rate | Numerical | Peak heart rate during ride (bpm) |
| avg_power_w | Numerical | Average power output in watts |
| elevation_gain_m | Numerical | Total elevation gained in meters |
| calories | Numerical | Estimated calories burned |
| avg_speed_kmh | Numerical | Average speed in km/h |
| temp_celsius | Numerical | Ambient temperature during ride |
| hr_drift_pct | Numerical | Heart rate drift percentage |
| efficiency_factor | Numerical | Power-to-heart-rate ratio |
| training_load | Numerical | Composite training stress score |
| performance_level | Categorical | Subjective rating: Low / Medium / High |
This is a realistic structure for exported Apple Health cycling data with derived metrics added. Before we touch a single analysis, let’s ask the first EDA question: what do we actually have, and does it match what we expected?
Step 1: First Look – Inventory and Sanity Check
The first thing any data scientist does with a new dataset is look at it. Not analyze it. Just look.
Scanning the first few rows immediately reveals a few things:
- Dates run from early spring through late autumn – a natural cycling season. This makes sense.
- Some avg_power_w values are zero. That’s a red flag – zero average power on a cycling ride isn’t physically possible unless the power meter failed to record.
- The temp_celsius column has a handful of values at -99. This is almost certainly a sensor or export error – a placeholder for missing temperature data.
- performance_level contains three values: “Low”, “Medium”, “High” – but also a handful of “Med” entries. Inconsistent formatting of the same category.
In the first five minutes, we’ve already found four data quality issues. This is normal. This is why you look before you analyze.
Shape Check
- 180 rows × 14 columns – a small but workable dataset
- No immediately obvious duplicate rows
- Missing values present in avg_power_w (zeros) and temp_celsius (-99 placeholders)
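In pandas, this first look takes only a few lines. A minimal sketch, assuming the export has already been converted to a table with the columns above (the inline mini-sample here is hypothetical, standing in for the real export):

```python
import pandas as pd

# Hypothetical mini-sample standing in for the real export
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-03-02", "2024-03-05", "2024-03-09"]),
    "duration_min": [48.0, 5.5, 105.0],
    "avg_power_w": [165.0, 0.0, 182.0],
    "temp_celsius": [12.0, -99.0, 15.0],
    "performance_level": ["Medium", "Med", "High"],
})

print(df.shape)   # rows x columns
print(df.dtypes)  # are the types what we expect?
print(df.head())  # just look

# Quantify the flags spotted by eye
zero_power = (df["avg_power_w"] == 0).sum()
bad_temp = (df["temp_celsius"] == -99).sum()
levels = df["performance_level"].unique()
```

The point is that the "just look" step is cheap to make systematic: counting the suspicious values turns an eyeball impression into a number you can act on.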
Action Items from Step 1:
- Flag and investigate zero-power rides
- Replace -99 temperature values with proper missing value markers
- Standardize “Med” to “Medium” in performance_level
Step 2: Data Quality – Finding What’s Broken
Now we go deeper on quality. Rather than just noting issues, we quantify them.
Missing and Invalid Values
Zero power readings: 11 out of 180 rides show avg_power_w of zero. Investigating the corresponding duration_min values reveals these are all under 8 minutes – likely accidental recordings or device startup glitches. Decision: remove these 11 rows. They don’t represent real rides.
Temperature placeholders: 23 rows contain -99 in temp_celsius. These should be treated as missing, not as data. Decision: replace with null. We can still use these rows for analysis – we just won’t include temperature in analyses that require it.
Formatting inconsistencies: 7 rows have “Med” instead of “Medium.” Decision: standardize to “Medium.”
After cleaning: 169 valid ride records remain.
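Those three decisions translate directly into a few lines of pandas. A sketch over a hypothetical mini-frame (column names from the table above, thresholds as described in the text):

```python
import numpy as np
import pandas as pd

# Hypothetical rows illustrating each quality issue
df = pd.DataFrame({
    "duration_min": [5.0, 48.0, 95.0, 6.5],
    "avg_power_w": [0.0, 170.0, 155.0, 0.0],
    "temp_celsius": [-99.0, 14.0, -99.0, 20.0],
    "performance_level": ["Low", "Med", "High", "Medium"],
})

# 1. Drop zero-power recordings (all under 8 minutes -> not real rides)
df = df[df["avg_power_w"] > 0].copy()

# 2. Treat the -99 placeholder as missing, not as data
df["temp_celsius"] = df["temp_celsius"].replace(-99, np.nan)

# 3. Standardize the inconsistent category label
df["performance_level"] = df["performance_level"].replace({"Med": "Medium"})
```

Note the order of operations: rows are dropped first, then placeholders become `NaN` so later analyses can include those rides while excluding the missing temperature values.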
Outlier Check
A quick scan of the numerical ranges surfaces one interesting case: a single ride shows avg_heart_rate of 201 bpm. For an adult cyclist, this is at the absolute physiological ceiling – possible during a maximal sprint but suspicious as an average for an entire ride. Cross-referencing with duration_min shows it was a 4-minute ride. This is plausible – a very short, maximal effort. Decision: keep it, but note it as an extreme case.
The efficiency_factor column shows a range from 1.1 to 2.8. Domain knowledge tells us that efficiency factor (power divided by heart rate, normalized) typically sits between 1.2 and 2.0 for trained recreational cyclists, with values above 2.0 indicating exceptional aerobic fitness. The upper values here are worth investigating further – are they genuine peak performances, or calculation artifacts?
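A simple way to surface cases like these is a range scan against domain-informed bounds. A sketch with illustrative bounds drawn from the ranges discussed above (the specific cutoffs are assumptions, not official thresholds):

```python
import pandas as pd

# Hypothetical cleaned data containing one extreme heart-rate ride
df = pd.DataFrame({
    "avg_heart_rate": [142, 155, 201, 138],
    "duration_min": [60, 90, 4, 45],
    "efficiency_factor": [1.5, 1.7, 2.8, 1.4],
})

# Domain-informed plausibility bounds (illustrative only)
bounds = {
    "avg_heart_rate": (80, 195),
    "efficiency_factor": (1.0, 2.0),
}

# Collect the row indices that fall outside each column's bounds
flags = {}
for col, (lo, hi) in bounds.items():
    mask = (df[col] < lo) | (df[col] > hi)
    flags[col] = df.index[mask].tolist()
```

Flagging rather than deleting is deliberate: as with the 201 bpm ride, an out-of-range value may be a genuine extreme effort that deserves a note, not removal.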
Step 3: Univariate Analysis – Understanding Each Variable
With clean data in hand, we examine each variable individually. What does the distribution of each measurement look like?
Ride Duration
The duration histogram shows a bimodal distribution – two distinct humps. One cluster sits around 45-60 minutes, another around 90-120 minutes. This is immediately interpretable: the rider has two distinct ride types. Shorter weekday sessions and longer weekend rides. The data is reflecting real behavior, not random variation.
Insight: This dataset contains two meaningfully different types of ride. Any analysis that treats all rides the same may be mixing apples and oranges.
Average Heart Rate
Heart rate distribution is roughly bell-shaped and centered around 142 bpm, with most rides falling between 128 and 158 bpm. The distribution is slightly right-skewed – rides at lower heart rates outnumber the rare high-intensity ones, leaving a thin tail toward high values. This is typical of a training-conscious rider who keeps most sessions in moderate aerobic zones.
Average Power Output
Power output shows a right-skewed distribution. Most rides cluster between 140 and 200 watts, but there’s a tail of higher-power rides stretching toward 280+ watts. These are almost certainly the harder efforts – interval sessions or races. The skew makes sense: hard efforts are rarer than easy ones in a balanced training program.
Performance Level
The categorical distribution:
- Low: 31 rides (18%)
- Medium: 94 rides (56%)
- High: 44 rides (26%)
Moderate class imbalance – “Medium” dominates. This is worth noting for any classification modeling: a naive model could achieve 56% accuracy by predicting “Medium” every time, which would be meaningless.
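That majority-class baseline is worth computing explicitly before any modeling. A sketch using the counts above:

```python
import pandas as pd

counts = pd.Series({"Low": 31, "Medium": 94, "High": 44})

# A naive classifier that always predicts the majority class
majority = counts.idxmax()
baseline_accuracy = counts.max() / counts.sum()

print(majority, round(baseline_accuracy, 3))  # any real model must beat this
```

Any classifier evaluated on this data should be compared against this number, not against 33% chance across three classes.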
Heart Rate Drift
HR drift – the percentage increase in heart rate from the first half to the second half of a ride at similar power output – is a key indicator of aerobic fitness and fatigue. Lower drift means the cardiovascular system is coping well with the exercise load.
The distribution here is right-skewed, with most values between 2% and 8%, but a tail extending to 18%+. High drift values cluster in summer months, suggesting heat and accumulated fatigue as contributing factors.
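The drift metric itself is straightforward to compute from heart-rate samples recorded over a ride: compare the mean of the second half to the first half. A sketch assuming a simple halfway split (real implementations often also control for power output, as the definition above implies):

```python
def hr_drift_pct(hr_samples):
    """Percent rise in mean HR from the first half of a ride to the second."""
    mid = len(hr_samples) // 2
    first = sum(hr_samples[:mid]) / mid
    second = sum(hr_samples[mid:]) / (len(hr_samples) - mid)
    return (second - first) / first * 100

# Hypothetical steady ride: HR creeps from ~140 to ~147 over the session
samples = [140, 141, 142, 143, 145, 146, 146, 147]
drift = hr_drift_pct(samples)
```

On this toy ride the drift comes out just over 3%, squarely in the "coping well" range described above.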
Step 4: Bivariate Analysis – How Variables Relate
This is where the real insights start to emerge. We move from looking at variables individually to examining how they interact.
Power vs. Heart Rate
The scatter plot of average power against average heart rate shows a clear positive relationship – as expected. Higher power output requires more cardiovascular effort.
But the scatter isn’t uniform. At higher power values, the relationship tightens considerably – high-power rides reliably produce high heart rates. At lower power values, there’s more spread – some low-power rides have surprisingly high heart rates. These are likely recovery rides done in hot conditions, or rides early in the season when fitness was lower.
Insight: The power-to-heart-rate relationship is real but context-dependent. Efficiency factor – the ratio of these two variables – captures this context more cleanly than either variable alone.
Efficiency Factor Over Time
Plotting efficiency factor as a line chart over the eight-month period reveals one of the most satisfying patterns in the entire dataset: a clear upward trend from spring to midsummer, followed by a plateau, then a slight decline in autumn.
This is the classic fitness arc of a cycling season. The rider built aerobic fitness through spring and early summer training, reached peak condition around July-August, then began the natural decline as volume reduced in autumn.
This pattern validates the efficiency factor as a meaningful metric. It’s tracking something real – not noise.
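One way to make the arc visible numerically is a monthly mean of efficiency factor. A sketch with hypothetical ride values shaped like the pattern described (rise, plateau, decline):

```python
import pandas as pd

# Hypothetical rides: one efficiency-factor value per ride across the season
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2024-03-10", "2024-04-12", "2024-05-15", "2024-06-20",
        "2024-07-18", "2024-08-14", "2024-09-10", "2024-10-08",
    ]),
    "efficiency_factor": [1.40, 1.50, 1.60, 1.72, 1.85, 1.84, 1.75, 1.65],
})

# Aggregate to calendar months, preserving chronological order
monthly = df.groupby(df["date"].dt.to_period("M"))["efficiency_factor"].mean()
peak_month = monthly.idxmax()
```

Aggregating by month smooths ride-to-ride noise enough that the seasonal trend, rather than individual good or bad days, drives the line.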
Temperature vs. Heart Rate Drift
The scatter plot of ambient temperature against HR drift shows a moderate positive relationship: hotter rides produce more heart rate drift. This makes physiological sense – heat stress adds cardiovascular load independently of exercise intensity.
The practical implication: high HR drift readings in summer should be partially attributed to heat, not solely to fitness or fatigue. Context matters when interpreting any single metric.
Performance Level vs. Key Metrics
Using box plots to compare distributions of key metrics across performance levels reveals:
Efficiency factor by performance level:
- Low performance: median ~1.45
- Medium performance: median ~1.65
- High performance: median ~1.85
A clear, consistent separation. Efficiency factor is a strong discriminator of performance level – the higher the ratio of power to heart rate, the better the performance outcome.
HR drift by performance level:
- Low performance: median ~9.2%
- Medium performance: median ~5.8%
- High performance: median ~3.1%
An equally clean separation, in the opposite direction. Lower drift consistently associates with higher performance. These two metrics – efficiency factor and HR drift – appear to be the most informative signals in the dataset for understanding performance quality.
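Both separations can be checked with a single groupby. A sketch over hypothetical rows shaped like the medians reported above:

```python
import pandas as pd

# Hypothetical rides, two per performance level
df = pd.DataFrame({
    "performance_level": ["Low", "Low", "Medium", "Medium", "High", "High"],
    "efficiency_factor": [1.40, 1.50, 1.60, 1.70, 1.80, 1.90],
    "hr_drift_pct": [9.0, 9.4, 5.6, 6.0, 3.0, 3.2],
})

medians = df.groupby("performance_level")[
    ["efficiency_factor", "hr_drift_pct"]
].median()

# Order the levels meaningfully before reading off the trend
medians = medians.reindex(["Low", "Medium", "High"])
```

Reindexing matters: groupby sorts categories alphabetically ("High, Low, Medium"), which would obscure the monotonic pattern that is the whole finding.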
Step 5: Putting It Together – What the Data Actually Says
After working through every stage of this analysis, a coherent picture emerges. This isn’t a collection of disconnected statistics – it’s a story about a real rider’s season.
Finding 1: Two distinct ride types exist in the data.
Weekday sessions (45-60 min) and weekend rides (90-120 min) have different characteristics and probably deserve separate analysis tracks for any performance modeling.
Finding 2: Fitness followed a classic seasonal arc.
Efficiency factor rose steadily from March through July, plateaued through August, and declined in September-October. The data validates the metric and confirms the rider was training purposefully.
Finding 3: Efficiency factor and HR drift are the two strongest performance signals.
Both variables cleanly separate Low, Medium, and High performance rides. If you were building a model to predict performance level, these two features would be your most valuable inputs.
Finding 4: Heat confounds heart rate metrics in summer.
Summer HR drift readings need temperature context to be interpreted correctly. A 9% drift in 32°C heat tells a different story than 9% drift at 15°C.
Finding 5: Data quality issues were real and consequential.
The 11 zero-power rides and 23 invalid temperature readings weren’t edge cases – they represented 19% of the original dataset. Without catching them, any analysis would have been partially corrupted.
From Analysis to Action
This is where data science earns its value – not in the analysis itself, but in what the analysis enables.
For this rider, the findings translate directly into training decisions:
- Monitor efficiency factor weekly as the primary fitness indicator – it’s the single most informative signal in the dataset
- Adjust HR drift interpretation by temperature – don’t panic about high drift readings on hot days
- Treat weekend and weekday rides separately when assessing fatigue and load
- Target the high-performance conditions: the data shows what a high-performance ride looks like – use that profile to design training sessions intentionally
This is the full journey of applied data analysis: from raw export, through quality checking and exploration, to insights that change how you train.
The dataset was 180 rows and 14 columns. The insights are actionable, specific, and grounded in evidence. That’s what good EDA delivers – not just description, but direction.
Summary: The End-to-End EDA Workflow
| Stage | What We Did | What We Found |
| --- | --- | --- |
| Inventory | Checked shape, types, first rows | 4 immediate data quality flags |
| Quality check | Investigated zeros, placeholders, duplicates | 11 invalid rides, 23 bad temps, formatting errors |
| Univariate analysis | Distributions of each variable | Bimodal duration, right-skewed power, class imbalance |
| Bivariate analysis | Relationships between key variable pairs | Efficiency arc, heat-drift link, performance separators |
| Synthesis | Connected findings into a coherent picture | 5 actionable insights about the rider’s season |
