How to Model Viral Growth: Retention & Virality Curves | LinkedIn

URL:http://www.linkedin.com/today/post/article/20130402154324-18876785-how-to-model-viral-growth-retention-virality-curves


This post is the third in a series in which I show you how to model viral growth. We started with the simplest possible model, and we are working our way up to a model that simultaneously accounts for non-viral channels, how you retain users over time, and even how a user's virality changes over time. A model like this can arm you with realistic expectations, and can give you a predictive tool that you can keep up to date with real data.

You can read the first post here and you can read the second post here.

Recap

In the first post, we built a hybrid model of growth that accounts for both viral and non-viral channels. We saw that when our viral factor v is less than one, we can interpret it in terms of an amplification factor a = 1/(1-v). To calculate our total number of users, all we have to do is multiply the number of users acquired through non-viral channels by our amplification factor.
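To see where the amplification factor comes from, here is a minimal Python sketch. It assumes each generation of users brings in v times as many users as itself, and compares the explicit sum over generations with the closed form 1/(1-v). The function names are illustrative and do not come from the spreadsheet.

```python
def amplification_factor(v):
    """Return a = 1 / (1 - v) for a viral factor 0 <= v < 1."""
    if not 0 <= v < 1:
        raise ValueError("the amplification interpretation needs 0 <= v < 1")
    return 1.0 / (1.0 - v)


def total_users(non_viral_users, v, generations=200):
    """Add up the viral generations explicitly: n + n*v + n*v**2 + ..."""
    total = 0.0
    cohort = float(non_viral_users)
    for _ in range(generations):
        total += cohort
        cohort *= v  # each generation invites the next, smaller one
    return total


if __name__ == "__main__":
    # 1,000 users from non-viral channels with a viral factor of 0.2
    print(total_users(1000, 0.2))
    print(1000 * amplification_factor(0.2))
```

Both lines should print approximately the same number, because the generations form a geometric series.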

In the second post, we rebuilt the model with a simple representation of user loss: we assumed that there was a fixed chance of losing a user every month. In this model, we saw that our user base grows to a fixed number U = a•g/l = g/(l•(1-v)), where g is our non-viral growth rate and l is the rate at which we lose users.
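The fixed number U can be checked numerically. The sketch below uses an assumed monthly update rule (new signups are g plus v times last month's signups, and a fraction l of the current user base is lost each month). The second post's exact formulae may differ, but this rule settles at the same U = g/(l•(1-v)).

```python
def simulate_with_loss(g, v, l, months=600):
    """Iterate a simple monthly update and return the final user count.

    g -- users gained per month through non-viral channels
    v -- viral factor (new users brought in by each new user), 0 <= v < 1
    l -- fraction of the current user base lost each month, 0 < l <= 1
    """
    users = 0.0
    new_prev = 0.0
    for _ in range(months):
        # Non-viral signups plus invites from last month's signups.
        new = g + v * new_prev
        # Add this month's signups, remove this month's losses.
        users += new - l * users
        new_prev = new
    return users


def steady_state(g, v, l):
    """Closed form from the text: U = g / (l * (1 - v))."""
    return g / (l * (1.0 - v))


if __name__ == "__main__":
    print(simulate_with_loss(10000, 0.2, 0.15))
    print(steady_state(10000, 0.2, 0.15))
```

The values g = 10,000, v = 0.2 and l = 0.15 are hypothetical inputs chosen only for illustration.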

In the first model, we never lost any users. In the second model, we rapidly lose them all. Reality, as ever, lies somewhere in between.

Let's rebuild our model once again. This time, we'll create a much more detailed representation of our users' virality and retention.

Follow along and experiment for yourself with this Excel spreadsheet: "How to Model Viral Growth.xlsx"

The Retention Curve

Let's assume that we've made a product that's really great. People use it regularly, find it indispensable, and would be extremely disappointed if it were to disappear tomorrow. It's the kind of product that people keep using for months, or even years, after they start. I personally would put products like Uber, Dropbox, and Evernote in this category.

For a really great product, our loss model is too harsh. Sure, we might have a 15% chance of losing a user in their first month. And yes, we might even have a 15% chance of losing that user in their second month. But as users continue to use our product, we tend to better retain them. Why? Several self-reinforcing effects can happen:

  • Users invest more data in our product, and it can become harder to switch to a competitor (e.g. Dropbox, Evernote — see this talk by Phil Libin, CEO of Evernote)
  • Users invest more time in our product, and can develop a reflexive habit to use it (e.g. Uber)
  • As a result of both the above, users can develop an emotional connection with our product (e.g. Rapportive — see our wall of love)

In practice, our users will exhibit a retention curve. The retention curve describes how likely a user is to be still using our product at a given point in time.

The retention curve will be determined by the type and quality of our product, as well as by how well we target our marketing channels. For example, let's consider browser extensions. From working in the space and benchmarking with other entrepreneurs, I've learnt that good browser extensions have retention curves which look a lot like this:

After the first week, we've retained 80% of our users. After the first month, we've retained 65% of our users. After two months, we've retained 55% of our users. And in the long term, we retain about 40% of our users, with a very gradual monthly decline.
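One way to read these numbers is to ask how likely we are to lose a user who is still with us at the start of each period. The Python sketch below does that for the figures quoted above. Treating one and two months as 30 and 60 days is an assumption made here for convenience.

```python
# Retention figures quoted in the text for a good browser extension.
# Each pair is (days since signup, fraction of users still active).
# 30 and 60 days stand in for "one month" and "two months" (an assumption).
retention_points = [(0, 1.00), (7, 0.80), (30, 0.65), (60, 0.55)]


def conditional_loss(points):
    """For each consecutive pair of points, return the share of users who
    were still active at the first point but gone by the second."""
    losses = []
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        losses.append(((d0, d1), 1.0 - r1 / r0))
    return losses


if __name__ == "__main__":
    for (start, end), loss in conditional_loss(retention_points):
        print(f"days {start}-{end}: lose {loss:.1%} of remaining users")
```

A fixed monthly loss rate would make every full-month figure identical. A flattening retention curve shows up as a conditional loss that shrinks as users age.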

The Virality Curve

Before we update our model to account for a retention curve, let's first consider the effect of a retention curve on virality.

So far, we've assumed that our users will only invite people in their first month. But if 40% of our users continue to use our product in the long term, and if through using the product our users continue to invite people, then our users will be viral over time!

In other words, our users will also exhibit a virality curve. The virality curve describes how the viral factor of an average user changes over time.

Why would a user's viral factor change over time? The answer depends heavily on the product, but consider the following story:

  • Initially, the user hesitates to invite people, because they are still testing our product
  • Once the user is in love with our product, they quickly invite a bunch of people
  • Soon, the user runs out of people they know who they want to invite
  • Occasionally, the user invites new people that they happen to meet

In this example, we might expect a short initial delay, followed by a rapid spike, then a quick decline to a steady but low rate.

We could try and model each part of this curve, but let's focus on what is most likely the main effect: the viral factor will reduce over time because users will run out of people they want to invite. Let's model this with a geometric decay: each month, the viral factor is half that of the previous month. For example, the viral factor may be 0.2 in the first month, 0.1 in the second month, 0.05 in the third month, and so on:

If we sum these viral factors over the lifetime of a user, we get the lifetime viral factor v'. In this case, the lifetime viral factor is 0.2 + 0.1 + 0.05 + … = 0.4.

Rather helpfully, our previous intuitions continue to apply:

  • For a consumer internet product, a sustainable lifetime viral factor v' of 0.15 to 0.25 is good, 0.4 is great, and around 0.7 is outstanding
  • Our amplification factor a is now 1/(1-v')
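The geometric decay and the lifetime viral factor can be sketched in a few lines of Python. The six-month cut-off mirrors the model's assumption that users stop being viral after six months. The function names are illustrative.

```python
def virality_curve(first_month, decay=0.5, months=6):
    """Viral factor for each month of a user's life, halving by default."""
    return [first_month * decay ** m for m in range(months)]


def lifetime_viral_factor(curve):
    """Sum the monthly viral factors to get v'."""
    return sum(curve)


def lifetime_viral_factor_limit(first_month, decay=0.5):
    """Closed-form sum of the decay if it continued forever."""
    return first_month / (1.0 - decay)


if __name__ == "__main__":
    curve = virality_curve(0.2)
    v_six_months = lifetime_viral_factor(curve)
    v_limit = lifetime_viral_factor_limit(0.2)
    print(curve)
    print(v_six_months, v_limit)
    print(1.0 / (1.0 - v_limit))  # amplification factor a = 1 / (1 - v')
```

Truncating at six months gives a value slightly below the infinite sum of 0.4, which is why a spreadsheet with a finite curve will report "around" the headline figure.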

The Combined Model

We've updated our model to combine the effects of non-viral channels, retention curves, and virality curves. Check it out on the sheet labelled "7. Hybrid+Retention+Virality". The formulae are a little more complex than before, but after some inspection it should all be straightforward.
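For readers who prefer code to spreadsheet cells, here is a rough Python sketch of a cohort model of this kind. It is not a transcription of the spreadsheet. It assumes that invitations sent in one month produce signups in the next, that a cohort's viral output is scaled by how many of its users are still active, that retention stays flat after the last point on the curve, and that virality is zero once its curve runs out.

```python
def retained(retention, age):
    """Fraction of a cohort still active at a given age in months.
    The last value of the curve is held flat for older cohorts."""
    return retention[min(age, len(retention) - 1)]


def simulate(non_viral, retention, virality):
    """Cohort simulation; returns one dict of results per month.

    non_viral[t] -- users acquired through non-viral channels in month t
    retention[a] -- fraction of a cohort still active a months after joining
                    (retention[0] should be 1.0)
    virality[a]  -- new users brought in per active user during month a of
                    that user's life (a = 0 is the first month)
    """
    new_users = []  # cohort sizes, indexed by the month they were acquired
    rows = []
    prev_active = 0.0
    for t, organic in enumerate(non_viral):
        # Invitations sent by existing cohorts last month convert now.
        viral = 0.0
        for s, cohort in enumerate(new_users):
            age = t - 1 - s  # age of cohort s when it sent those invitations
            if age < len(virality):
                viral += cohort * retained(retention, age) * virality[age]
        new_users.append(organic + viral)
        active = sum(
            cohort * retained(retention, t - s)
            for s, cohort in enumerate(new_users)
        )
        rows.append({
            "month": t,
            "non_viral": organic,
            "viral": viral,
            "lost": prev_active + organic + viral - active,
            "active": active,
        })
        prev_active = active
    return rows
```

As a usage example, `simulate([10000] * 24, [1.0, 0.65, 0.55, 0.5, 0.45, 0.42, 0.40], [0.2, 0.1, 0.05, 0.025, 0.0125, 0.00625])` projects two years of growth. The 10,000 users per month and the retention values for months three to five are hypothetical placeholders, while the other figures echo the examples in the text.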

As well as the usual graph of user growth, we've also included a graph that compares our various growth channels, and stacks them up against user loss:

The best way to gain intuition for how these factors interact is to play with the numbers and watch the graphs. Whilst watching the comparison of growth channels and loss, let's try a few things:

1. Let's improve the retention curve. Set the month 1 retention to 90%, the month 2 retention to 80%, and the month 6 retention to 60% (the intervening months will automatically interpolate).

Not only do we see loss decrease, but we also see viral growth increase. This happens because when users stick around longer they can send more invitations.

2. Let's improve the virality curve too. Set the month 1 viral factor to 0.35; this will give us a lifetime viral factor of around 0.7.

This has a dramatic effect on our viral growth channel, which almost doubles from about 20,000 users per month to about 40,000 users per month. It has a less dramatic effect on the total size of the user base, because in the long term we still lose 40% of our users.

3. Let's add another big press spike. Set month 6 under the "Launch press" column to 100,000.

We can clearly see the spike on the channel comparison, where it causes a corresponding spike in loss. Shortly afterwards, we see a quick jump in viral growth, which then decays over time as we lose our new users and as our new users run out of people to invite.
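The first experiment relies on the intervening months being interpolated. A minimal Python sketch of that step follows, assuming straight-line interpolation between the months you set (the spreadsheet may use a different scheme). The function name is illustrative.

```python
def interpolate_curve(anchors):
    """Fill in a monthly curve from a few anchor points.

    anchors -- dict mapping month number to value; it must include month 0
    Returns a list whose index is the month number, using straight-line
    interpolation between neighbouring anchors.
    """
    months = sorted(anchors)
    if not months or months[0] != 0:
        raise ValueError("anchors must include month 0")
    curve = []
    for m0, m1 in zip(months, months[1:]):
        v0, v1 = anchors[m0], anchors[m1]
        for m in range(m0, m1):
            curve.append(v0 + (v1 - v0) * (m - m0) / (m1 - m0))
    curve.append(anchors[months[-1]])  # the final anchor itself
    return curve


if __name__ == "__main__":
    # The improved retention curve from the first experiment.
    print(interpolate_curve({0: 1.0, 1: 0.9, 2: 0.8, 6: 0.6}))
```

With these anchors, months 3 to 5 fall on the straight line between 80% and 60%.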

Incorporating Historical Data

The spreadsheet is designed to form the basis of a predictive tool that easily incorporates historical data. For example, if you have 6 months of past data, enter your real growth and loss figures as the first 6 months in the "Output" section, and then enter the virality and retention curves that your users exhibit in the "Variables" section. This will help you visualize your trajectory, and could help you discover high impact changes.
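If you track signups by cohort, the retention curve your users exhibit can be estimated directly from that history. The Python sketch below shows one simple way to do it, pooling every cohort that is old enough to have been observed at each age. The cohort numbers are made up for illustration, and the input layout is an assumption rather than the spreadsheet's format.

```python
def retention_from_cohorts(cohorts):
    """Estimate a retention curve from historical cohort counts.

    cohorts -- list of lists; cohorts[i][a] is the number of users from
               cohort i who were still active a months after joining.
               Newer cohorts simply have shorter lists.
    Returns a list where index a is the pooled fraction retained at age a.
    """
    max_age = max(len(c) for c in cohorts)
    curve = []
    for age in range(max_age):
        # Only cohorts that have actually reached this age can be counted.
        observed = [c for c in cohorts if len(c) > age]
        started = sum(c[0] for c in observed)
        still_active = sum(c[age] for c in observed)
        curve.append(still_active / started)
    return curve


if __name__ == "__main__":
    # Hypothetical history: three monthly cohorts, oldest first.
    history = [
        [1000, 700, 600],
        [2000, 1300],
        [1500],
    ]
    print(retention_from_cohorts(history))
```

Later points on the curve rest on fewer cohorts, so they are noisier than the early ones and worth revisiting as more data arrives.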

Limitations, Further Work & Further Reading

We should never fall in love with any model, as they all have their limitations. The areas where our model could be improved include the following:

  • We assume our ongoing non-viral channels are constant over time. This isn't the case: platform growth, new competitors, and word of mouth can all have a big effect.
  • We deliberately consider a very limited number of channels. In practice, we would likely have several more non-viral channels. Our product may also have more viral channels.
  • We assume that we stop losing users after 6 months. Sadly, we will always lose users, due to everything from natural attrition through to users switching to competitors. Fortunately, this is easy to model as we acquire the data: all we have to do is extend our retention curve beyond 6 months.
  • We assume, somewhat conservatively, that our users are not viral after 6 months. Again, this is easy to model as we acquire the data: all we have to do is extend our virality curve.
  • We assume that the virality and retention curves themselves will not change over time. This hopefully isn't the case: as we improve our product through testing and iteration, our virality and retention curves will also improve over time. As mentioned above, I don't think historical virality and retention curves need to live in the model: we should describe the past with data, and use our current curves to model the future.

Whilst we've covered a lot of ground, we haven't touched on some very important topics, such as what makes a great viral product, and the magical effect of a short viral cycle time. If you're interested in learning more, and if you haven't already, do check out these articles:

Summary

  • In practice, the retention and virality of your users will both change over time. This can be modeled with retention and virality curves
  • The virality curve describes how the viral factor of your average user changes over time
  • The sum of the virality curve over time (i.e. integrating it with respect to time) is the lifetime viral factor v'
  • For a consumer internet product, a sustainable lifetime viral factor v' of 0.15 to 0.25 is good, 0.4 is great, and around 0.7 is outstanding
  • In practice, your retention and virality curves will also change over time! Use a spreadsheet like the final model to help tease this apart — enter real data for the past, and your current retention and virality curves to visualize the future

If you like this post, then follow me on LinkedIn! In addition to this series on virality, I'll share interesting links about the web, technology, and entrepreneurship :)

Thank you to Andrew Chen, George Zachary, Elliot Shmukler, and Reid Hoffman for reading drafts and debating these ideas.

'Til next time!

