Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

You are correct. MCTS + UCB and other variants were state of the art leading up to AlphaGo. And even then, MCTS was also used in AlphaGo.

The main change in AlphaGo was using a deep learning network to encode a value network for fast rollouts and a policy network for move selection (rather than using the UCB rule). They later removed the value network and rollouts entirely, but even AlphaZero uses MCTS.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: