Nor me, to collect a 1% change: > power.prop.test(p1=0.1, p2=0.11, power=0.8,sig...

noelwelsh · on Aug 5, 2013

I normally consider the relative minimum discernible effect. Your 5% absolute change is a 50% increase in the base rate. I also typically go for a higher power (e.g. 0.9). Under these conditions 60K samples is more typical.

Sample size calculator here: http://www.evanmiller.org/ab-testing/sample-size.html

You'll need to change the defaults to match the above to get the figures I mention.

throwawayg99 · on Aug 5, 2013

You're right. That's a bit more reasonable, so back to my 10% relative change in the base rate (1% absolute), but with 90% power:

  > power.prop.test(p1=0.1, p2=0.11, power=0.9,sig.level=0.05)

     Two-sample comparison of proportions power calculation

              n = 19746.62
             p1 = 0.1
             p2 = 0.11
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

  NOTE: n is number in *each* group

That works out to about 40,000 samples per test (roughly 19,750 in each of the two groups). I would strongly recommend that anyone serious about doing this look into MAB (multi-armed bandit) testing, as A-B testing is way too expensive for reasonable-scale testing (unless you have a strong a priori hypothesis to test).
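
For anyone who wants to see where the n = 19746.62 in the R output above comes from, here is a minimal Python sketch of the standard normal-approximation formula for comparing two proportions. It is assumed (not confirmed in this thread) to be the same relationship that power.prop.test solves numerically. The function name samples_per_group and its argument names are made up for illustration; only the standard library is used.

```python
from math import ceil, sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def samples_per_group(p1, p2, power=0.9, sig_level=0.05):
    """Approximate per-group sample size for a two-sided test of two proportions.

    Hypothetical helper: uses the usual normal approximation, with a pooled
    variance under the null and separate variances under the alternative.
    """
    z = NormalDist().inv_cdf            # standard normal quantile function
    z_alpha = z(1 - sig_level / 2)      # two-sided critical value
    z_beta = z(power)                   # quantile for the desired power
    p_bar = (p1 + p2) / 2               # pooled proportion under the null
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return ((z_alpha * sd_null + z_beta * sd_alt) / (p1 - p2)) ** 2


# Same inputs as the R call above: 10% base rate, 11% variant, 90% power.
n = samples_per_group(0.10, 0.11, power=0.9, sig_level=0.05)
print(f"per group: {ceil(n)}, total across both groups: {2 * ceil(n)}")
```

The result should land within about one sample of R's per-group figure. Because the difference p1 - p2 is squared in the denominator, halving the detectable difference roughly quadruples the required sample, which is why small relative lifts get expensive so quickly.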