What Pro Data Analysis can learn from Strava

 

If you are a cyclist or a runner, chances are that you use or have at least heard of Strava. Strava is a platform for athletes to analyze and share their activities’ data and virtually compete with one another.

With the rise of sports tracking devices that trace position and pace via GPS and additionally measure heart rate and elevation, Strava leverages the data that users upload to create a frame of reference for several types of activities. Founded in 2009, Strava has more than 10 million active members (the company makes a point of not calling them users) in more than 195 countries. In 2017 alone, cyclists shared over 7.3 billion km worth of rides.

In 2009, when Strava began as a typical Californian data startup, it was highly dependent on the hardware vendor Garmin: in the beginning, uploading data to Strava was only possible directly from a Garmin device, which left early Strava at the mercy of Garmin. Today, the power dynamics have changed a lot: it is now Strava compatibility that drives hardware sales. Automatic data synchronisation to Strava, or even live Strava-powered analytics during an activity, enables not only Garmin but also competitors like Polar and Wahoo to sell their newest generations of devices.

How do they make money

Strava’s main source of revenue is its premium membership (€59.99 a year or €7.99 a month).

Secondly, industry partners can sponsor challenges, that is, specific goals within a specific time frame that Strava members can commit to in order to motivate themselves. One example is the Rapha 500 Challenge [1] by the cycling brand Rapha, which challenges participants to ride 500 km between Christmas Eve and New Year’s Eve.

As a Big Data company, Strava also sells data to third parties. For now, it is committed to sharing data only in an aggregated and therefore anonymized form, and only with partners that are aligned with Strava’s vision of enabling and helping athletes. Notably, the project Strava Metro [2] aims to partner with city planners around the globe to make, for example, bike paths and the most frequently used bike routes safer. On their website, you can find a case study of the partnership with the Seattle Department of Transportation.

As of 2018, Strava has yet to become profitable.

My personal use cases

To navigate on my bike along pre-planned tracks, I bought a Garmin Edge 800 a few years ago. For each ride, I create a GPX (GPS Exchange Format) file of my route [3] and upload it onto my Garmin device.

GPX track covering both the Gampen Pass and the Mendel Pass in Italy, planned on GPSies

 

GPX is an XML schema for storing GPS-based waypoints, routes and tracks together with timestamps. The format can hold pre-planned routes for later navigation as well as recorded tracks, that is, the timestamps at which each point was passed on a bike ride or a run. I use my Garmin for both navigation and recording, but any smartphone can do the trick as well (within its battery limitations).
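To make the format a bit more tangible, here is a minimal sketch in Python of how recorded track points could be read from a GPX file. It assumes a GPX 1.1 document with whole-second UTC timestamps and uses only the standard library. The sample track, its coordinates and the function name `parse_track_points` are made up for illustration and are not taken from any of my rides or from Garmin's or Strava's software.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace of the GPX 1.1 schema (GPX 1.0 files use a different one).
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# Made-up sample track with three recorded points (not a real ride).
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk>
    <name>Sample ride</name>
    <trkseg>
      <trkpt lat="46.5000" lon="11.2000"><ele>1200.0</ele><time>2018-07-14T08:00:00Z</time></trkpt>
      <trkpt lat="46.5010" lon="11.2005"><ele>1212.5</ele><time>2018-07-14T08:00:30Z</time></trkpt>
      <trkpt lat="46.5020" lon="11.2010"><ele>1225.0</ele><time>2018-07-14T08:01:00Z</time></trkpt>
    </trkseg>
  </trk>
</gpx>"""


def parse_track_points(gpx_text):
    """Return the track points of a GPX 1.1 document as a list of dicts."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iterfind(".//gpx:trkpt", GPX_NS):
        ele = trkpt.find("gpx:ele", GPX_NS)
        time = trkpt.find("gpx:time", GPX_NS)
        points.append({
            # Latitude and longitude are attributes of the point element.
            "lat": float(trkpt.get("lat")),
            "lon": float(trkpt.get("lon")),
            # Elevation and time are optional child elements.
            "ele": float(ele.text) if ele is not None else None,
            # Assumption: timestamps look like 2018-07-14T08:00:00Z.
            "time": datetime.strptime(time.text, "%Y-%m-%dT%H:%M:%SZ")
            if time is not None else None,
        })
    return points


points = parse_track_points(SAMPLE_GPX)
print(len(points), "points, first at", points[0]["lat"], points[0]["lon"])
```

A pre-planned route looks much the same, except that the points usually carry no timestamps, which is why the sketch treats `time` and `ele` as optional.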

When I ride, my track is shown as a purple line embedded into a map [4] that I can follow to pick the right turns. After my ride, I use Garmin’s own software, Garmin Express, to read out the recorded GPS and time data as well as my heart rate. The data is automatically transferred to Garmin’s platform, Garmin Connect. Garmin Connect offers features similar to Strava’s but is restricted to Garmin’s own devices. In my opinion, its dashboard used to be a bit messy. The new, modern look has improved matters quite a lot, but it came too late: many users, myself included, had already gone looking for alternatives.

Garmin exposes the data of newly created activities to Strava via an API, so every ride is uploaded automatically and shared with my community of friends and acquaintances. On Strava, I get an instant analysis of how I performed on pre-defined segments during my ride: Did I hit a personal best? Did I make the top 10 for women? How do I rank compared to my friends who have ridden this particular segment as well? Subsequently, my activity becomes visible to my friends (or to the world, if I so choose).

Activity on Strava

 

From a data analysis perspective, Strava does a few things well from which the pro data analysis world could benefit as well.

 

Easy and powerful visualizations and tracking tools geared to its user base yield powerful business intelligence

One of the main challenges for most amateur athletes is keeping up the motivation to continue with their sport. On the one hand, there is the community aspect: you can share your passion, as well as your challenges, with friends in real life, such as your bike club, or with like-minded people you only know online.

On the other hand, you can follow your own progress and try to beat your past self. I particularly like the feature that tracks the number of weekly activities, the total distance and the elevation gain, which motivates me to keep up my rides and training.

Strava, of course, benefits from the fact that there are a lot of canonical KPIs for sports activities, such as distance covered, speed, heart rate and elevation gain, which make it comparatively easy for a sports tracking platform to offer insights that are relevant and meaningful for the user.
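To illustrate how canonical these KPIs are, here is a rough sketch of how distance, elevation gain and average speed could be derived from a list of recorded track points. This is my own simplified illustration, not Strava's actual computation: it uses the haversine formula on a spherical Earth, sums every positive elevation difference without any smoothing, and ignores pauses. The function names and the tuple layout of the points are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def ride_kpis(points):
    """Compute basic KPIs from points given as (lat, lon, elevation_m, seconds_since_start)."""
    distance_m = 0.0
    elevation_gain_m = 0.0
    for (lat1, lon1, ele1, _), (lat2, lon2, ele2, _) in zip(points, points[1:]):
        distance_m += haversine_m(lat1, lon1, lat2, lon2)
        # Only climbing counts towards elevation gain; descents are ignored.
        elevation_gain_m += max(0.0, ele2 - ele1)
    elapsed_s = points[-1][3] - points[0][3]
    avg_speed_kmh = (distance_m / 1000) / (elapsed_s / 3600) if elapsed_s > 0 else 0.0
    return {
        "distance_km": distance_m / 1000,
        "elevation_gain_m": elevation_gain_m,
        "avg_speed_kmh": avg_speed_kmh,
    }


# Made-up points: a short climb followed by a small descent.
sample = [
    (46.5000, 11.2000, 1200.0, 0),
    (46.5010, 11.2005, 1212.5, 30),
    (46.5020, 11.2010, 1205.0, 60),
]
print(ride_kpis(sample))
```

Real devices record noisy elevation data, so a production system would likely smooth the elevation profile before summing the gain. The sketch leaves that out on purpose.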

Neatly visualizing this data in a way that is adapted to the needs of a particular type of sport is, on the other hand, much more difficult. In my opinion, Strava’s success is mainly due to its strength in this area, where it outperforms Garmin with its uncluttered interface and visuals.

Well-incentivized community dynamics keep the platform and its data relevant

At the heart of Strava’s data analysis capabilities are segments. A segment is a stretch of variable length between two points that covers part of a road or route. A typical example would be a climb from the foot of a mountain to its highest point. Users can create segments themselves, but they can also flag segments as duplicates or as irrelevant (for example, sprints of only a couple of meters).

Even though Strava has recently invested in getting rid of the most obvious duplicate segments, it mostly relies on its communities to do their own cleanups: a lot of Strava members are quite enthusiastic about curating the most relevant segments on their routes in order to track and showcase their performances.

The same is true for fraudulent data: if you track your “performance” on an e-bike or a motorcycle in order to score a good ranking on an at least moderately frequented segment, you can be sure that other top 10 candidates will be quick to report the activity in order to restore a “fair” ranking.

This principle of community policing as curation helps avoid one of the most common threats to any Big Data endeavour, namely data losing its meaning due to spam and irrelevance.
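The interplay of segment rankings and community flagging can be sketched in a few lines of Python. This is purely my own illustration of the principle, not how Strava implements it: the `Effort` class, its `flagged` field and the `leaderboard` function are hypothetical names, and the athletes and times are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Effort:
    """One timed pass over a segment (hypothetical data model)."""
    athlete: str
    seconds: int
    flagged: bool = False  # set when the community has reported the activity


def leaderboard(efforts, top_n=10):
    """Rank athletes by their best unflagged time on a segment."""
    best = {}
    for effort in efforts:
        if effort.flagged:
            continue  # reported efforts do not count towards the ranking
        if effort.athlete not in best or effort.seconds < best[effort.athlete]:
            best[effort.athlete] = effort.seconds
    # Fastest time first; ties are broken alphabetically for a stable order.
    ranked = sorted(best.items(), key=lambda item: (item[1], item[0]))
    return ranked[:top_n]


efforts = [
    Effort("anna", 600),
    Effort("ben", 580),
    Effort("anna", 590),                 # a new personal best for anna
    Effort("moto", 300, flagged=True),   # suspiciously fast, reported by others
]
print(leaderboard(efforts))
```

The point of the sketch is that the platform itself needs no notion of what a plausible time is: as long as members can flag an effort, the ranking stays meaningful.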

Gather your stars as marketers

On the one hand, some KOMs and QOMs (short for King of the Mountain and Queen of the Mountain, denoting the respective leader on a segment) are pretty much completely out of reach once a major competition has passed through one’s territory. On the other hand, it is undeniably cool that one can follow people like Romain Bardet (who is competing in the Tour de France at the moment) and see how they perform on one’s favourite segment.

Below is a screenshot of a segment that I just rode. The segment was part of this year’s edition of the Giro d’Italia, which makes it possible to compare the world-class cyclists Romain Bardet and Vincenzo Nibali.

Having the stars of the sport on one’s platform is a great marketing coup: it showcases the platform’s functionality and gives professional athletes a place to interact with their fans.

A well-thought-out premium membership model

Currently, I use the free membership option of Strava. The premium option would allow me to get more detailed analysis such as power meter analysis, live feedback and personalized coaching to reach more customized goals.

One fun example of what could be gained from a premium membership is live segment information during my ride: I would see exactly how I’d need to perform to score a good ranking on, say, my favorite hill. Strava’s philosophy is that most people will sign up for the free option and gradually move to the premium option once they have been with Strava for a while.

And even if you are not paying, you are still contributing to the richness of the data accumulated and curated on Strava. While Strava has yet to reach profitability, this balance seems to be quite a powerful way for Strava to generate value for both members and partners.

The big keyword for the future of Strava is ‘Discovery’: assume you travel to a new city and want to go for a run. Strava knows your typical distance and whether you prefer hilly or flat terrain, and can recommend routes that other athletes just like you run in this particular city. To what extent this will be part of Strava’s premium offering is not yet clear, but to me, these kinds of recommendations would be very valuable and something I would definitely consider paying for.

Grow with challenges

As a data company and social media platform, you are under constant public scrutiny. At the beginning of the year, a story broke about a secret US military air base [5] being exposed on a Strava heat map: soldiers had been recording their training as ‘public’ on Strava. Even though the data was anonymized, a well-frequented running course in the middle of a desert in Afghanistan left little room for doubt.

Even though one can clearly argue that this incident was largely due to the carelessness of people uploading their data publicly without a second thought, it remains a challenge for a community to educate its members about the consequences of their privacy choices. This holds for Strava itself, but also for mainstream journalism, which mistakenly called this a ‘data leak’ or ‘data breach’, which it most definitely wasn’t.

Strava itself took action to avoid such incidents: it highlighted the opt-out possibilities in detail and introduced a minimum number of activities that a path needs before it shows up on any heat map. Furthermore, heat maps are refreshed regularly so that activities that are later made private no longer show up. This means that even if a group of soldiers mistakenly uploads an activity at a secret location, they can still take action to have it hidden, and further damage can be avoided.
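Both safeguards, the minimum number of activities and the regular refresh, can be illustrated with a small sketch. It is an assumption-laden toy model, not Strava's implementation: I bucket GPS points into a coarse grid, count distinct public activities per grid cell, and only keep cells that reach a threshold. The threshold of 3, the grid resolution and all names are made up for illustration.

```python
import math
from collections import defaultdict


def heatmap_cells(activities, min_activities=3, cells_per_degree=1000):
    """Return {grid_cell: activity_count} for cells with enough public activities.

    Each activity is a dict with the hypothetical keys "id", "public" and
    "track" (a list of (lat, lon) tuples).
    """
    cell_to_ids = defaultdict(set)
    for activity in activities:
        if not activity["public"]:
            continue  # private activities never contribute to the heat map
        for lat, lon in activity["track"]:
            cell = (math.floor(lat * cells_per_degree), math.floor(lon * cells_per_degree))
            cell_to_ids[cell].add(activity["id"])
    # Suppress cells that too few distinct activities have passed through.
    return {cell: len(ids) for cell, ids in cell_to_ids.items() if len(ids) >= min_activities}


popular_road = [(46.5005, 11.2005)]
remote_loop = [(34.0005, 65.0005)]
activities = [
    {"id": 1, "public": True, "track": popular_road},
    {"id": 2, "public": True, "track": popular_road},
    {"id": 3, "public": True, "track": popular_road},
    {"id": 4, "public": True, "track": remote_loop},  # a single activity stays hidden
]
print(heatmap_cells(activities))

# A "refresh" simply recomputes the map from the current visibility settings.
activities[0]["public"] = False
print(heatmap_cells(activities))
```

In this toy model, a refresh is nothing more than rerunning the aggregation, so an activity that has been switched to private disappears from the next version of the map.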

Another ongoing discussion that concerns a far larger group of users is the possibility of opting out of certain aspects of data sharing. Unfortunately, many athletes, women in particular, are not comfortable sharing timed location data of their runs publicly, since it could be very easy for stalkers or even attackers to infer patterns and pose a serious threat. For now, the only option is to not share an activity publicly at all. Strava has said that it is currently exploring how to make only parts of an activity publicly visible while still integrating the other relevant parts of the activity.

 

Notes

[1] Rapha 500 Challenge, see e.g. the 2016 version here: https://www.strava.com/challenges/rapha-festive500-2016

[2] Strava Metro: Insights business for city planning https://metro.strava.com/

[3] For planning, I use www.gpsies.com. Strava offers its own planning tool, Strava Routes, https://www.strava.com/routes. For somewhat arbitrary historical reasons, however, this tool has not yet gained much traction in my circle of friends: it once recommended a less than optimal gravel path to one of our more passionate road cyclists and his precious road bike.

[4] OpenStreetMap, https://www.openstreetmap.org/, is an open-source mapping database that allows downloading any map selection in Garmin-compatible formats

[5] BBC article on the Strava Military Base incident from 28/01/2018 https://www.bbc.com/news/technology-42853072