Very often your current users get used to your design and hate it when you change something. Is there a way to do A/B testing on your current users, or should you try it only on new ones?
You can, in theory, run A/B tests against only new users, though most A/B testing tools do not support this behavior out of the box.
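If you do want to restrict a test to new users, the usual workaround is to gate variant assignment on signup date yourself. A minimal sketch in Python; the names and cutoff date here are illustrative assumptions, not features of any particular tool:

    import hashlib
    from datetime import datetime, timezone

    # Hypothetical cutoff: only accounts created after the experiment
    # launches are eligible for the test.
    EXPERIMENT_START = datetime(2013, 8, 1, tzinfo=timezone.utc)

    def variant_for(user_id: str, signed_up_at: datetime) -> str:
        """Existing users always see the control; new users get
        bucketed. Assumes timezone-aware datetimes."""
        if signed_up_at < EXPERIMENT_START:
            return "control"
        # Hash the user ID so a returning user sees the same variant
        # on every visit.
        bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 2
        return "A" if bucket == 0 else "B"

Hashing the user ID keeps assignment deterministic, so a new user who comes back tomorrow still sees the same side of the test.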
The one time I made a change which was drastic enough to consider doing that, I just put a revert button on the interface (on both sides) and got ready to tell people how to use it. It turns out nobody asked.
In general, though, many of the things which are most valuable to test border on imperceptible for long-term users of the site. For example, do you use Dropbox? (Picking a well-known example locally.) Can you identify the H1 on dropbox.com? Can you identify the button copy on dropbox.com? Most seasoned users can't, and they won't notice changes to these elements, yet those elements strongly influence free trial signups.
Can you freehand sketch the Dropbox credit card form? How about identifying the button copy on it? These have substantial impact on purchases at the margins. People typically see them only once (at least in the typical case for software), and nobody remembers them for more than a few minutes.
Think back to your first run experience with Dropbox. How many steps did it have? What were their names? I'm going to bet you that the onboarding process for Dropbox has had more dedicated optimization effort than everything else the company does combined, but the median number of exposures per user is one.
Great insight! Though I'd expand on your bet and claim that at almost every major site, a majority of testing is channeled into onboarding. It's certainly the case at Twitter and Facebook that a lot of work goes into optimizing a user's first steps.
Great question. It's a long answer and it gets sort of involved.
1) It is easier to get adoption of A/B testing -- which many people have heard of, which many agree they should be doing, and which captures substantially all of the benefits of bandit testing -- than of bandit testing, at the typical company. E.g., if I ask a software company's CMO "Do you know what A/B testing is?" and the answer is "No", then that CMO is not quite top drawer. "No, and there is no reason why I should know that" is a perfectly acceptable answer for bandit testing.
2) There are some subtleties about actually administering bandit tests, for example in how tests interact with each other or with exogenous trends in your traffic mix, which sound like they could cause operational nightmares. A/B testing does not have one-to-one analogues to these problems, and many of the theoretical problems with A/B testing are addressable in practice via e.g. good software and good implementation practices, both of which exist in quantity.
3) A/B testing has vastly better tool support than bandit testing, for which I am personally aware of exactly one SaaS startup and zero OSS frameworks.
4) On a purely selfish note which I'd be remiss in not mentioning, I'm personally identified with A/B testing in a way that I am not with bandit testing.
5) Again, convincing people to start A/B testing will be better 100 times out of 100 than failing to convince people to start bandit testing, which is the default result. Consider how operationally well-supported A/B testing is for software companies in August 2013, then look at the empirical results: very few companies actually test every week. The binding constraint is getting people to test at all, not which algorithm they use.
(There is also a zeroth answer, which is "I have reviewed the arguments for doing bandit algorithms over A/B testing and frankly don't find them all that credible" but for the purpose of the above answers I assumed that we both agreed bandit was theoretically superior.)
A/B testing is a catch-all term for multi-variant experimentation. Multi-armed bandit is a specific approach to testing[1], and even though most frameworks provide A/B/N testing -- that is, not necessarily just two variants -- it is easier to say 'A/B' instead of 'A/B/N'.
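For concreteness, here is roughly what the simplest bandit algorithm (epsilon-greedy) does, as opposed to an A/B test's fixed split. This is an illustrative sketch (the variant names and the 10% exploration rate are arbitrary choices), not production code:

    import random

    class EpsilonGreedy:
        """Show the best-performing variant most of the time; explore
        a random variant epsilon of the time."""

        def __init__(self, variants, epsilon=0.1):
            self.epsilon = epsilon
            self.shows = {v: 0 for v in variants}
            self.wins = {v: 0 for v in variants}

        def choose(self):
            if random.random() < self.epsilon:
                return random.choice(list(self.shows))  # explore
            # Exploit: pick the arm with the best observed conversion rate.
            return max(self.shows, key=lambda v:
                       self.wins[v] / self.shows[v] if self.shows[v] else 0.0)

        def record(self, variant, converted):
            self.shows[variant] += 1
            if converted:
                self.wins[variant] += 1

Note the operational subtlety from point 2 above: allocation depends on accumulated history, so concurrent tests and shifts in your traffic mix interact with a bandit in ways that a fixed 50/50 split avoids.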
Indeed, Google Analytics has added the multi-armed bandit approach in Content Experiments. It's quite slick, btw, but definitely more difficult to implement than traditional split testing.
Do you have something that isn't quite as technically deep as the slides you linked to but still explains the math behind A/B testing? Statistics were never my strong suit.
Also, where did you learn all this A/B testing stuff? I had never heard of it before I started reading^H^H^H^H^H^H^H^H stalking you.
Like most things, I learned the first 10% by reading on the Internet (A/B testing is very much not the new hotness that was just discovered by software companies in 2008) and the next 90% by throwing stuff at the wall and taking notes on what sort of stuff tended to stick.
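For what it's worth, the core math behind classic A/B testing is just a check of whether two observed conversion rates differ by more than chance would explain. A minimal sketch with made-up numbers, using the standard chi-squared test:

    from scipy.stats import chi2_contingency

    # Made-up numbers for illustration only.
    control = [180, 1820]  # [conversions, non-conversions]: 9.0% of 2,000
    variant = [221, 1779]  # 11.05% of 2,000

    chi2, p, dof, expected = chi2_contingency([control, variant])
    print(f"p-value: {p:.3f}")  # conventionally, p < 0.05 counts as significant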
> 60% yearly revenue increase in 2012 on the strength of a brief series of A/B tests...
I appreciate your transparency about Bingo Card Creator, but sometimes you make it sound like such easy money, do you ever worry about your openness encouraging direct competitors?
It had competitors before it launched (about a dozen of them) and has been cloned at least 3 times due to my forum participation over the years.
At the risk of stating the obvious:
a) If one is sufficiently skilled to duplicate e.g. the Bingo Card Creator SEO strategy, all one has to do is apply it to a higher-value niche, e.g. distressed real estate, as one of the gents on the Bootstrapped With Kids podcast did, and make radically more money.
b) There are probably easier competitors in the world to take money from than me.
c) Even supposing BCC revenue were to be materially impaired, I'd barely notice. In point of fact, it is (down 40% or so this year), and I am still only peripherally aware of it.