What I learned from Netflix about A/B Testing

Every user experience (UX) designer wants to know that their decisions actually move users toward an interaction. Too often, however, we rely too heavily on our own instinct and fail to validate how effective the design really is.

Todd Yellin, VP of Product Innovation at Netflix, spoke to a packed session at South by Southwest Interactive (SXSW) about the lessons Netflix has learned in 10 years of A/B testing. While the session didn’t dive into any powerful new mechanisms for A/B testing, Yellin provided plenty of practical examples and advice on how Netflix has used A/B testing to make strategic decisions based on quantified user behavior rather than design instinct.

What is A/B testing? 

The concept of A/B testing is simple: test design changes against specific results. Randomly selected users are shown different versions of a design, with measurements in place to determine which version was most successful in moving users toward a defined goal or a specific action.
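To make that concrete, here is a minimal sketch of the two moving parts: putting each user into a variant, and comparing how often each variant reaches the goal. This is my own illustration, not anything Yellin showed. The function names are hypothetical, and hashing the user ID is a swapped-in stand-in for random assignment that keeps each user in the same variant on every visit.

```python
import hashlib
from collections import Counter


def assign_variant(user_id: str, variants=("A", "B")) -> str:
    """Bucket a user into a variant; the same ID always gets the same variant."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def conversion_rates(events, variants=("A", "B")) -> dict:
    """Return the share of users per variant who reached the goal.

    events: iterable of (user_id, reached_goal) pairs (hypothetical log format).
    """
    shown = Counter()
    reached = Counter()
    for user_id, reached_goal in events:
        variant = assign_variant(user_id, variants)
        shown[variant] += 1
        if reached_goal:
            reached[variant] += 1
    # Only report variants that were actually shown to someone.
    return {variant: reached[variant] / shown[variant] for variant in shown}
```

The variant with the higher rate is only a candidate winner; whether the gap is real or noise is a separate question.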

As Yellin talked through their testing practices, I came to a few conclusions.

1 – Always know what you are measuring toward:

When A/B testing, the Netflix design team always measured against 2 metrics:

  • What impact did this decision have on our user accounts? (If it damaged user retention, if we lost users, it was not a good design decision.)
  • What impact did this decision have on user viewing? (We have paid accounts… now are they watching more? Increased user viewing was a success.)

In some cases, Netflix would identify a tertiary measurement, but it only came into play when those two measurements were flat.

2 – A/B Testing is the great democratizer

Disagreements come up in the design process, and quite often the loudest or most senior person in the room wins. Quite often, that same person is the least qualified to make the design decision. A/B testing your design decisions gives the users a voice at your table.

3 – Leverage the data you collect (and collect only what you’ll leverage):

Data, Yellin explains, is “Piles of excrement, with a little bit of gold”. Rather than collecting mounds and mounds of user data, they focused on:

  • Age
  • Gender
  • Location

But even then, they increasingly found that Age and Gender were less important than actual viewing habits. Age and gender were demographics, but became somewhat useless for content discovery.

Organizations should determine what data ACTUALLY matters to their user experience, and personalization, and make that data easy to collect. Netflix actually made their key demographic data a part of their credit-card form. They were clear that it was not for credit card purposes, and were transparent on how it would be used. But they realized that making it a part of a larger, already painful, process made it easier to collect.

4 – Don’t listen to your users… watch them…

Yellin shared a real-life example from the Netflix design table. Many passionate users were writing… calling… pleading for the ability to give ratings in 1/2 star increments. To that point, a user could only give 1 – 5 stars. Netflix heard from thousands of users who said the ability to do a 1/2 star rating would really help the accuracy of their ratings.

So… they tested it.

While the loud users appreciated it, the silent majority did not.

Netflix looked at its core metrics of user retention and view time, and saw no statistically significant impact from this decision. This is where they added a third metric: actually completing the review process. They saw a significant drop in completed reviews among users given the ability to rate in 1/2 star increments. They dropped the 1/2 stars.
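Yellin didn’t share how Netflix decides that a drop is “significant”, so treat the following as a sketch of one common approach rather than their method: a two-proportion z-test on completion rates. The function name and the numbers in the usage note are made up for illustration.

```python
import math


def two_proportion_z_test(completed_a, total_a, completed_b, total_b):
    """Compare completion rates of control (a) and treatment (b).

    Returns (z, p_value) for a two-sided test using the pooled proportion.
    """
    rate_a = completed_a / total_a
    rate_b = completed_b / total_b
    pooled = (completed_a + completed_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

With invented numbers, say 800 of 1,000 users completing a rating with whole stars versus 700 of 1,000 with 1/2 stars, `two_proportion_z_test(800, 1000, 700, 1000)` gives a negative z and a p-value far below 0.05, which is the kind of result that would justify dropping the change.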

5 – The smartest mind at your design table is still an idiot.

Yellin showed a very specific example: 3 treatments of cover art for the Breaking Bad series. The 3rd option was a very compelling close-up of the main character, Walter White. The other two… just weren’t as compelling. Yellin asked the room at SXSW which they thought would perform best. The room, full of design professionals from around the world, overwhelmingly agreed that the compelling face shot of White would win. Yellin said his design team agreed. They A/B(/C) tested the artwork and found that it wasn’t even close. The winner was a far less compelling image of an RV in the desert.

The smartest person on your design team is still less smart than your user behavior.

Yellin is quick to point out that it isn’t wise to test EVERY design decision. Small incremental changes are probably not worth the investment and potential user frustration to test. There is a point at which designers still need to be empowered to make design decisions in the absence of empirical evidence. But continued testing and analysis of user behavior can help those designers make better decisions when the data isn’t there.

Photo: Flickr: Mike K: CC

The U.N.’s play on Internet control

The UN may be poised to take over Internet regulation.

Much of the success of the Internet can be attributed to its open nature. No single government controls it. Since its inception, the Internet has been self-regulated, with some of the earliest engineers still involved in setting standards.

At its purest, it can cross geographic and ideological boundaries. That can cause a problem if you run a nation that tries to set some pretty strict ideological boundaries (or even tax boundaries).

A number of UN nations would love to see that change.

From the Wall Street Journal:

For more than a year, these countries have lobbied an agency called the International Telecommunications Union to take over the rules and workings of the Internet. Created in 1865 as the International Telegraph Union, the ITU last drafted a treaty on communications in 1988, before the commercial Internet, when telecommunications meant voice telephone calls via national telephone monopolies.

 

Having the Internet rewired by bureaucrats would be like handing a Stradivarius to a gorilla. 

via: Crovitz: The U.N.’s Internet Sneak Attack – WSJ.

What “Do not track” really means

When you say “Do not track”… the web hears… well… nothing apparently.

Yahoo joined a list of, well, a lot of companies not paying heed to browser “Do not track” settings. Yahoo initially stopped honoring Internet Explorer DNT requests when Microsoft turned the feature on by default.

Even Google, whose Chrome browser includes DNT settings, says “At this time, most web services, including Google’s, do not alter their behavior or change their services upon receiving Do Not Track requests.”
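For context, the request itself is tiny: a browser with the setting turned on sends a `DNT: 1` header with each request, and it is entirely up to the site to look for it. A minimal sketch of what honoring it could start with is below. The function name is hypothetical, and the headers are assumed to arrive as a plain dictionary.

```python
def wants_do_not_track(headers: dict) -> bool:
    """Return True when the request carries a 'DNT: 1' header."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): str(value).strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"
```

A site that wanted to respect the setting would check this before loading any tracking scripts. The hard part is not the code, it’s deciding to act on the answer.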

From Ars Technica:

This probably shouldn’t be much of a surprise. Lorrie Faith Cranor, who led development of P3P more than a decade ago, told Ars in 2012 that “every time we come up with a technical solution that protects privacy, the websites come up with something they want to do that is broken by this privacy protection.”

Via: Yahoo is the latest company ignoring Web users’ requests for privacy | Ars Technica.

TV is Dying (And so are Internet TV Subscriptions)

From Business Insider: TV providers lost 113,000 subscribers last month INCLUDING Internet subscribers.

  • Increasingly, prime time is being given to the tablet.
  • 40% of all YouTube traffic comes from mobile.
  • Ad revenue is up (and that’s hiding the trends)
  • Fewer than 1/2 of broadband subscribers subscribe to cable TV (for the first time ever)

Probably the most interesting trend is the decline in paid broadband subscriptions. Increasingly, consumers are relying on the free connections at work or at the coffee shop to deliver their content.

So if fewer people are watching cable TV and fewer people are paying for Internet service, does that mean that we just don’t care about watching our favorite shows anymore?

Not necessarily.

Free wifi — at work, in coffee shops, and on campuses — is making it easier for consumers to get the shows, movies and videos they want without subscribing to any kind of cable or broadband service

via: Business Insider

(Photo: Flickr: KB35)

Your next phone may have a curved screen

Photo: Flickr – Janitor – Creative Commons License

Bloomberg is citing a source indicating a couple of new technologies to be introduced in future Apple iPhone models.

The first is a curved screen, and the second is a pressure-sensitive touch interface.

Two models planned for release in the second half of next year would feature larger displays with glass that curves downward at the edges, said the person, declining to be identified because the details aren’t public. Sensors that can distinguish heavy or light touches on the screen may be incorporated into subsequent models, the person said.

 

Via: Bloomberg

Online Video Usage – The Numbers

U.S. consumers watched 42.6 Billion online videos in October, according to comScore Video Metrix data.

Quick Numbers:

  • 42.6 Billion Videos in October
  • 20 Billion content videos through Google properties
  • 184 Million Internet consumers watched online video
  • Average consumption of 21.1 hours per user
More online

Where You At? Adding Simple Geo-targeting to Your Site

I am working on a site that needs to target 7 specific markets with unique content. I was looking for a simple way to target content by geographic location.

GeoPlugin.com has a simple API that makes it easy to plug geo-targeted content into your application or site. I was able to get started by playing with their examples.

Passing an IP address to their API returns some great information…
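Once a lookup comes back, the targeting itself can be a simple table. Here is a rough sketch, assuming the response has already been parsed into a dictionary. The `geoplugin_city` field name is my assumption about the response shape, so check it against GeoPlugin’s documentation, and the market table is entirely hypothetical.

```python
DEFAULT_MARKET = "national"

# Hypothetical market table: lowercase city name -> market key.
MARKETS = {
    "chicago": "chicago",
    "evanston": "chicago",
    "dallas": "dallas",
}


def pick_market(geo: dict) -> str:
    """Choose a content market from a parsed geo-lookup response.

    Falls back to DEFAULT_MARKET when the city is missing or unknown.
    """
    # 'geoplugin_city' is an assumed field name, not confirmed by this post.
    city = (geo.get("geoplugin_city") or "").strip().lower()
    return MARKETS.get(city, DEFAULT_MARKET)
```

Falling back to a default market matters here, since IP-based lookups will sometimes return nothing useful.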

7 (Missing) Things That Will Make Me Love Google+

I’m playing around a bit with Google+ this week.

As I’m writing this, it’s still in a rapidly expanding private beta.

Let me start by saying I will never rip on a product’s stability or functionality while it’s still in beta.

That being said, there are a few things that could tip me over the edge and make me fall in love with G+.

1 – iOS app

It is, apparently, on its way! The mobile-web version is nice, but I’m just missing too much functionality. I’m seeing a lot more interaction from my Android friends (…wait… friends with Android devices… I don’t actually have any android friends… that’d be cool).

2 – API

Also… on its way. The beauty of Facebook and Twitter right now is my ability to leverage other applications (Tumblr, Instagram, SocialCam, WordPress, etc…) to organize my social media flow. Right now that development really hasn’t happened for G+. When it does… I’ll be a little more on board.

3 – Mobile Hangouts

My phone has a better camera than my laptop. I’d love to hang out… on the go.

4 – Broader friend finder

I’ve exhausted my Google Contacts, and really don’t have any on Yahoo and Hotmail. I’d love to mine my other socnets.

The other issue I’ve come across is that most of my G+ friends are of the “tech-set”. The more of them I add to my circles, the more my suggested follows center around that area.

5 – +1 Trends

Google has added the items I’ve “+1’d” across the Internet, to my profile. I’d love to see what’s trending. Not only globally, but in my circles!

6 – A place for brands

Again, Google says this is coming. I’ve seen a few brands (primarily content publishers) jump on the account creation process already.

I have been a part of migrating too many brands from Facebook profiles, to groups, to pages, to want to jump on this TOO quickly. Hopefully we will avoid significant brands trying to fit their round persona into the square hole of a personal profile.

I’m glad they are recognizing this need, and are developing a place for it.

7 – My friends

Most of you who read this are still following a link from Twitter or Facebook. Frankly… I have no loyalty to any social network… I just hang out wherever YOU do.

What about you? If you are on Google+, what will prevent this from being “just another status update I have to worry about”?

My Twitter Follow Algorithm


Every week I take a look at my new Twitter followers to see who I want to follow back. I want to follow as many genuine followers as possible.

So I realized I go through a simple assessment each time I look at a new follower. It goes something like this:

  • Do I know who you are? (+10 pts)
  • Do I like you? (+5 pts)
  • Do you know who I am? (+5 pts)
  • Do you live within 50 miles of my home? (+10 pts)
  • Have we had a meal together? (+5 pts)
  • Was it at one of our homes? (another +10 pts)
  • Does your Twitter description include the word “coach”? (-10 pts)
  • Wait… do you actually coach an athletic team? (+10 pts)
  • Does your Twitter description include any of the following: “Global”, “Strategist”, “Thought-leader”, “Client”? (-15 pts each)
  • Did we attend the same conference or seminar? (+5 pts)
  • Do you @reply people? (+5 pts)

Score Assessment

45-65 – We’re buds! I probably have you in a Twitter list, and we text message each other more than tweet.

20-45 – My network! We’re probably not getting the kids together this weekend, but we’d drop each other a DM if we needed an opinion. I will probably follow your links. You just make Twitter better.

0-20 – You were at that thing I went to that one time.

Less than 0 – Sorry. I wish you the best in your upwardly mobile endeavors. I just can’t come along for the ride.
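For the literal-minded, the assessment above can be sketched in a few lines of Python. The class, field, and tier names are my own inventions for illustration. Since the score ranges share their endpoints (20 and 45), I’m assuming a boundary score lands in the friendlier tier.

```python
from dataclasses import dataclass

# Description words that cost 15 points each.
PENALTY_WORDS = ("global", "strategist", "thought-leader", "client")


@dataclass
class Follower:
    description: str = ""
    i_know_them: bool = False
    i_like_them: bool = False
    they_know_me: bool = False
    within_50_miles: bool = False
    shared_meal: bool = False
    meal_at_home: bool = False
    coaches_athletic_team: bool = False
    same_conference: bool = False
    replies_to_people: bool = False


def follow_score(f: Follower) -> int:
    """Add up the points from the checklist for one new follower."""
    desc = f.description.lower()
    score = 0
    score += 10 if f.i_know_them else 0
    score += 5 if f.i_like_them else 0
    score += 5 if f.they_know_me else 0
    score += 10 if f.within_50_miles else 0
    if f.shared_meal:
        score += 5
        # The home bonus only applies if there was a meal at all.
        score += 10 if f.meal_at_home else 0
    score -= 10 if "coach" in desc else 0
    score += 10 if f.coaches_athletic_team else 0
    score -= 15 * sum(word in desc for word in PENALTY_WORDS)
    score += 5 if f.same_conference else 0
    score += 5 if f.replies_to_people else 0
    return score


def assessment(score: int) -> str:
    """Map a score to a tier; boundary scores go to the higher tier (assumption)."""
    if score < 0:
        return "pass"
    if score < 20:
        return "acquaintance"
    if score < 45:
        return "network"
    return "buds"
```

A follower who ticks every positive box with a clean description tops out at 65, which matches the top of the “We’re buds!” range.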

 

Anything else? What is it that makes a good Social follow for you?

 

(photo: flickr)