Sundar’s experiMENTAL

Hello experiMENTAList, it’s Sundar 👋

I’m a former Head of Marketing Science at Uber where I optimized $1Bn+ in spend across Brand, Performance, and Lifecycle.

Now, I share weekly playbooks top Marketers use to prove, optimize, and scale ROI.

Thank you for being a valuable part of this growing community of 2.6K+ marketing leaders from Uber, DoorDash, Google, Spotify, and many more.

Now let’s step into the lab!

PS: If you know a colleague or friend who might benefit from this newsletter, send them here.

What we’ll dive into today

Everyone knows the benefits of segmentation:

  • stronger personalization

  • better customer experience

  • improved profitability.

But what most people don't recognize is that there are three types of segmentation, each with its own pros and cons.

Let's dive in.

The 3 types of segmentation

  1. Demographic

  2. Lifecycle + value

  3. Algorithmic

All segmentation is based on data. Pretty obvious statement, but the data you have available determines the segmentation that you can build.

Let’s start with Demographic.

Demographic

Demographic segmentation is a wide-ranging system that groups users by demographic information. What counts as demographic information is quite broad, but in every case it's an attribute of the user themselves. It doesn't describe their actions; it simply describes the user.

The classic attributes are:

  • Age

  • Gender

  • Household income

With demographic segmentation, we often end up creating personas. We use demographic information to build a structure that captures who a user is, and then use it to represent what they'll do and why.

Let’s take a quick look at the example above. Now what can you assume about Stacy?

She’s in her early 30s.

She’s female.

She’s got a bachelor’s degree and makes $75-100K.

Based on this, you can likely make some inferences about her lifestyle, likes, and dislikes to a reasonable degree, because we all have lived experience. We can likely articulate what stage of life she's in, what her experiences might be as a woman, and what she can or can't buy with $75,000 to $100,000. For example, we know she likely can't afford a Lamborghini, but she's also likely not buying a completely broken-down junk car.

“But Sundar, what if her parents are super wealthy and she drives a Lambo?!”

“Cool. That’s a hyper edge case.”

Pros

  1. More personal information

  2. More targetable with ads

Cons

  1. Expensive data to acquire

  2. Leads to generalizations not related to behavior

The meme above is a great example of where demographic segmentation can break down. Yes, it's an edge case, but it reflects the fact that demographic data doesn't say anything about behaviors.

Now, let’s move on to Lifecycle segmentation.

Lifecycle + value

Lifecycle segmentation looks at where in the journey a customer is.

Here, users are bucketed in relation to each other based on behavior alone, without assuming any attributes about the user.

A good example of this is what we did at Uber.

As customers took more or fewer trips, they progressed through our lifecycle segmentation and we would send appropriate comms and promotions. The advantage of this approach was that we could do it at scale across many countries. It was also easy to understand because we had labels for each segment.

A downside is that this is a broad hammer approach that ignores key attributes about a user (even if we have them). For example, a user taking 5 trips to the airport is different from a user taking 5 trips to a bar, but they'd be in the same lifecycle segment.
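To make the idea concrete, here's a minimal sketch of lifecycle bucketing in Python. The segment names and trip thresholds are illustrative assumptions, not the actual definitions used at Uber. Notice that the airport rider and the bar rider land in the same bucket, because trip counts are the only input.

```python
def lifecycle_segment(trips_last_28_days: int, lifetime_trips: int) -> str:
    """Assign a lifecycle label from trip counts alone.

    All labels and thresholds below are hypothetical.
    """
    if lifetime_trips == 0:
        return "new"       # signed up but never took a trip
    if trips_last_28_days == 0:
        return "churned"   # has history, but no recent activity
    if trips_last_28_days < 4:
        return "casual"
    return "power"


# Made-up users for illustration.
users = {
    "airport_rider": {"trips_last_28_days": 5, "lifetime_trips": 40},
    "bar_rider": {"trips_last_28_days": 5, "lifetime_trips": 12},
    "lapsed_rider": {"trips_last_28_days": 0, "lifetime_trips": 9},
}

segments = {name: lifecycle_segment(**u) for name, u in users.items()}
print(segments)
```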

Pros

  1. Easy to set up and configure

  2. Easy to understand

Cons

  1. Doesn’t touch on the why

  2. Uses a broad hammer approach

Now, a value-based segmentation can also be combined with lifecycle.

A “value” based segmentation looks only at the value a customer brings. It doesn’t matter where in their journey they are. We also took this approach at Uber: for active customers, we segmented them based on value.

If you look at “Monetary” (last row), you’ll see how we defined low, medium, and high and made a distinction per user based on that. Again, it’s a simple framework, but it ignores where users are in their journey. A person spending $100 in their first week vs $100 in their 3rd year are wildly different. This second user sounds like a High LTV power user and should be protected at all costs!
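Here's a rough sketch of that blind spot in Python. The spend cutoffs and the 30-day "new" window are hypothetical numbers picked for illustration. Both customers get the same value tier, and only adding a lifecycle label tells them apart.

```python
def value_tier(spend: float, low_max: float = 50.0, high_min: float = 200.0) -> str:
    """Bucket a customer by spend alone (cutoffs are hypothetical)."""
    if spend < low_max:
        return "low"
    if spend < high_min:
        return "medium"
    return "high"


def tenure_stage(tenure_days: int) -> str:
    """Very rough lifecycle label; the 30-day window is an assumption."""
    return "new" if tenure_days < 30 else "established"


# Two made-up customers with identical spend but very different tenure.
customers = [
    {"id": "week_one", "spend": 100.0, "tenure_days": 7},
    {"id": "year_three", "spend": 100.0, "tenure_days": 3 * 365},
]

# Value only: both customers look the same.
value_only = {c["id"]: value_tier(c["spend"]) for c in customers}

# Lifecycle + value: the same spend now lands in different segments.
combined = {
    c["id"]: f"{tenure_stage(c['tenure_days'])}/{value_tier(c['spend'])}"
    for c in customers
}
print(value_only)
print(combined)
```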

Now let’s move on to the last type.

Algorithmic

Algorithmic is a catch-all term I’m using for all segmentation that uses Machine Learning or other Data Sciency type activities to segment users.

The reason it’s different from the other 2 is that instead of starting with the attributes you want to use to identify users (either Demographic or Lifecycle), you start with the outcome you want and then look at how to segment users.

Here are two examples that commonly come up:

  1. K-means clustering

  2. Scoring

K-means clustering

K-means clustering is a way to discover natural customer segments in your data without pre-labeling them. Instead of saying “segment by age or channel”, you say: “Let the data tell us which customers behave similarly.”

The algorithm:

  1. Looks at customer features (e.g., frequency, AOV, discount usage, channel mix)

  2. Groups customers so that:

    1. Customers within a cluster are as similar as possible

    2. Clusters are as different as possible from each other

    3. Each customer belongs to one and only one cluster

Each cluster then becomes its own segment, and you can then decide how to label these customers. An important note on the visualization is that you do the algorithmic clustering based on a variety of factors BUT to visualize it you pick 2.

To be honest, I find it a confusing visualization, but alas it is what it is.
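If you want to see the mechanics, here's a bare-bones K-means sketch in plain Python. It's a toy version under a few assumptions: the six customers and their three features (frequency, AOV, discount usage) are made up and already scaled to 0-1, and the number of clusters is fixed at 2. In practice you'd likely reach for a library implementation and scale your features first.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Toy K-means: returns (cluster label per point, final centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen customers
    labels = [0] * len(points)
    for _ in range(iterations):
        # Step 1: assign each customer to the nearest centroid
        # (squared Euclidean distance across all features).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Step 2: move each centroid to the mean of its assigned customers.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical customers: (frequency, AOV, discount usage), each scaled to 0-1.
customers = [
    (0.90, 0.80, 0.10), (0.85, 0.90, 0.05), (0.80, 0.75, 0.15),  # frequent, full price
    (0.10, 0.20, 0.90), (0.15, 0.10, 0.80), (0.20, 0.25, 0.95),  # rare, discount driven
]

labels, centroids = kmeans(customers, k=2)
print(labels)
```

Each customer ends up with exactly one label, and it's then up to you to look at each cluster's centroid and decide what to call it.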

Scoring

Another type of algorithmic segmentation is known as scoring. Essentially, an algorithm assigns each user a score, and the scores are then split into deciles.

Let’s break it down a bit more.

Example

We want to send out a promotion to our users to increase their conversion. Now, we could send the promotion to everyone BUT that would also be incentivizing users who would normally purchase anyway. Instead, we want to create an algorithm that will identify the users who are more likely to purchase because of the promotion.

Let’s look at the chart above and use it as an example. On the X axis are the deciles and on the Y axis are the scores ranked on likelihood to use the promotion. As you can see, deciles 8, 9, and 10 have significantly higher likelihoods than the rest.

So, we set a cutoff and only send to users whose score falls in the 8th, 9th, or 10th decile.

What often happens with this type of scoring algorithm is that you become more efficient. You get the same impact in terms of increased conversion while spending way less on promotions because you’re cutting out the users who would have converted anyway. At scale, this incrementality becomes huge.
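Here's a small sketch of the decile cutoff in Python. The scores are fabricated stand-ins for a model's output, and the helper name assign_deciles is my own invention for illustration. The scoring model itself is out of scope here; this only shows the ranking and the cut.

```python
def assign_deciles(scores: dict[str, float]) -> dict[str, int]:
    """Rank users by score and split them into deciles (1 = lowest, 10 = highest)."""
    ranked = sorted(scores, key=scores.get)  # lowest score first
    n = len(ranked)
    return {user: (rank * 10) // n + 1 for rank, user in enumerate(ranked)}


# Hypothetical model output: predicted likelihood of using the promotion.
scores = {f"user_{i:02d}": i / 20 for i in range(20)}

deciles = assign_deciles(scores)

# Only send the promotion to users in the 8th, 9th, and 10th deciles.
targeted = sorted(user for user, decile in deciles.items() if decile >= 8)
print(targeted)
```

With 20 users, that keeps the 6 highest-scoring users and skips the other 14.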

We see this type of algorithmic segmentation very commonly when we have to “score” users. Lead qualification (common in B2B) uses this type of algorithm.

Pros

  1. More data driven

Cons

  1. Requires strong data foundations

  2. More complex to explain


That’s it for this week!

Stay experiMENTAL,

Sundar
