

moultano on May 27, 2019, on: The multi-armed bandit problem (2012)

Does anyone have a reference for solving multi-armed bandit problems with a finite time horizon? I would like something that derives rules or heuristics for how your explore/exploit tradeoff changes as the horizon approaches. This seems like an obvious extension, and something that someone should have worked on given how long this problem has been around, but I've been unable to find anything on it. Any pointers?

bhl on May 28, 2019

What do you mean? Most analyses of multi-armed bandit algorithms assume a finite time horizon. When they don't, they use the doubling trick to handle an unknown or infinite horizon.
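For readers unfamiliar with the doubling trick, here is a minimal Python sketch of the idea: run a policy that needs a known horizon in epochs of length 1, 2, 4, ..., restarting it at each epoch with the epoch length as its horizon. The fixed-horizon policy used here (explore-then-commit), its exploration length, and all function names are assumptions chosen for illustration, not something taken from the thread.

```python
import random


def explore_then_commit(pull, n_arms, horizon):
    """Fixed-horizon policy: explore round-robin, then commit to the best arm.

    The exploration budget depends on `horizon`, which is why this policy
    needs the horizon up front. The horizon**(2/3) scaling is only a rough
    illustrative choice (an assumption of this sketch).
    """
    per_arm = max(1, int(horizon ** (2 / 3) / n_arms))
    totals = [0.0] * n_arms
    counts = [0] * n_arms
    reward = 0.0
    t = 0
    # Exploration phase: cycle through the arms.
    while t < horizon and t < per_arm * n_arms:
        arm = t % n_arms
        r = pull(arm)
        totals[arm] += r
        counts[arm] += 1
        reward += r
        t += 1

    # Commit phase: play the empirically best arm for the remaining steps.
    def mean(a):
        return totals[a] / counts[a] if counts[a] else 0.0

    best = max(range(n_arms), key=mean)
    while t < horizon:
        reward += pull(best)
        t += 1
    return reward


def epoch_lengths(total_steps):
    """Epoch lengths 1, 2, 4, ..., with the last one cut to fit total_steps."""
    lengths = []
    remaining = total_steps
    k = 0
    while remaining > 0:
        length = min(2 ** k, remaining)
        lengths.append(length)
        remaining -= length
        k += 1
    return lengths


def doubling_trick(pull, n_arms, total_steps):
    """Run the fixed-horizon policy in doubling epochs.

    `total_steps` only decides when the simulation stops; the policy itself
    is told nothing but the current epoch length.
    """
    reward = 0.0
    for length in epoch_lengths(total_steps):
        # Restart from scratch, using the epoch length as the assumed horizon.
        reward += explore_then_commit(pull, n_arms, length)
    return reward


if __name__ == "__main__":
    rng = random.Random(0)
    means = [0.2, 0.8]  # hypothetical Bernoulli arms
    total = doubling_trick(lambda a: float(rng.random() < means[a]), 2, 1000)
    print(total)
```

Each restart discards what earlier epochs learned, which looks wasteful, but because the epoch lengths grow geometrically the final epochs account for most of the steps.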

moultano on May 29, 2019

Thank you. I now realize that I had misunderstood the notation.
