MereGurudev · on May 17, 2024

Thanks for the feedback! It's a simple modification to guarantee things like this – for this prototype I thought it interesting though to expose the "raw" output of the neural network, including crazy things like sometimes placing two kings of the same color. I even force the neural network to place more pieces than it intends, if you put the piece slider on max, to see what happens (and often the result is very bad)
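For illustration, a guarantee like the one-king-per-color rule could be a small post-processing check on the generated position. This is only a minimal sketch, assuming the position is available as the piece-placement field of a FEN string; the function name `king_counts_ok` is hypothetical and not taken from the prototype.

```python
def king_counts_ok(placement: str) -> bool:
    """Return True if a FEN piece-placement field has exactly one king per color.

    Hypothetical helper for illustration; not the prototype's actual code.
    """
    # In FEN, 'K' is the white king and 'k' is the black king.
    return placement.count("K") == 1 and placement.count("k") == 1
```

A position failing this check could be regenerated or repaired instead of being shown raw.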

MereGurudev · on May 17, 2024

Thanks! Ah, sorry for that. Just to clarify, do you mean for the semi-transparent pieces (the generated ones without "locking in") or even the manually placed pieces?

impendia · on May 17, 2024

I tried it out, and also couldn't distinguish the colors.

> Just to clarify, do you mean for the semi-transparent pieces (the generated ones without "locking in") or even the manually placed pieces?

Are these visually distinguished somehow? Are there four different colors of pieces?

The piece color also seems to be different depending on the background square color, and I wasn't able to distinguish what is what.

MereGurudev · on May 17, 2024

Yes, they're distinguished: the pieces you place manually ("fixed pieces") are opaque and thus should have the same color, off-white or dark grey, no matter the background square.

The pieces the AI adds when you press generate are translucent (and as such will be affected by the background color). You can then press "Lock in" to convert the generated pieces to fixed pieces, in which case they become opaque.

I realize that for anyone less used to the color scheme than me, this distinction is problematic since there will be no fewer than six different piece colors (white, black, translucent-white-on-light, translucent-white-on-dark, translucent-black-on-light, translucent-black-on-dark).
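The count of six follows from ordinary alpha compositing: an opaque piece keeps its own color, while a translucent one is mixed with whatever square it sits on. A minimal sketch of that arithmetic, where every RGB value and the 0.5 opacity are assumed for illustration and are not the site's actual palette:

```python
def blend(piece_rgb, square_rgb, alpha):
    """Standard 'over' compositing of a piece color onto a square color."""
    return tuple(
        round(alpha * p + (1 - alpha) * s) for p, s in zip(piece_rgb, square_rgb)
    )

# Assumed example colors, not taken from the actual board theme.
PIECES = {"white": (240, 240, 230), "black": (60, 60, 60)}
SQUARES = {"light": (240, 217, 181), "dark": (181, 136, 99)}

def visible_colors(alpha=0.5):
    """Map each piece appearance to the color the viewer actually sees."""
    colors = dict(PIECES)  # opaque fixed pieces: two colors
    for piece_name, piece_rgb in PIECES.items():
        for square_name, square_rgb in SQUARES.items():
            key = f"translucent-{piece_name}-on-{square_name}"
            colors[key] = blend(piece_rgb, square_rgb, alpha)
    return colors
```

With these assumed values, `visible_colors()` returns six distinct entries, matching the six cases listed above.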

MereGurudev · on May 17, 2024

Hi, thanks for trying Noctie!

If you did the rating test, Noctie tries to adapt to your strength while you're playing, so if you play at 2574 level for a while, eventually Noctie will also play at that level. Since Noctie has no idea about your rating when the game starts, it might be that it took some time for the rating to adapt and therefore you found that the AI played much weaker.

The max strength of the AI is about 2700–2800 FIDE (level "Queen 4" inside the app if you have an account).

Per this site, https://chessgoals.com/rating-comparison/#lichessotb, 2809 Lichess is equal to around 2550 FIDE so if that's your lichess rating (wow btw!) maybe Noctie wasn't so far off. (EDIT: Ah I see, that's not your rating, sorry)

Obviously, you might get a different result next time – one game is very little information to base an accurate estimate on, especially when the AI has to adapt its playing strength as we go.

I don't use accuracy for the rating estimation BTW, I use custom neural networks that observe patterns in how humans at various rating levels play chess.

somenameforme · on May 27, 2024

I found an interesting example of an equal but opposite problem. After this game:

1. e4 c5 2. Nc3 g6 3. f4 Bg7 4. Nf3 d6 5. Bb5+ Bd7 6. Bc4 Nc6 7. O-O Qb6 8. d3 Nf6 9. e5 Ng4 10. Bxf7+ Kxf7 11. e6+ Bxe6 12. Ng5+ Kf6 13. Nxe6 Kxe6 14. Qxg4+ Kf7 15. f5 Ne5 16. fxg6+ Ke8 17. Qe6 Rf8 18. Nd5 Rxf1+ 19. Kxf1 Qd8 20. Bg5 Nxg6 21. Re1 Kf8 22. Nxe7 Nxe7 23. Bxe7+ Qxe7 24. Qxe7+ Kg8 25. Qxb7 Rf8+ 26. Kg1 Be5 27. Rxe5 dxe5 28. Qd5+ Kh8 29. Qxe5+ Kg8 30. Qxc5 Rf7 31. d4 Rf8 32. d5 Rf5 33. Qe7 Rf7 34. Qe8+ Rf8 35. Qe6+ Kg7 36. d6 Rf6 37. Qe7+ Rf7 38. Qe5+ Kg6 39. h4 Rf5 40. Qe8+ Kf6 41. d7 Rd5 42. d8=Q+ Rxd8 43. Qxd8+ Kf5 44. Qd7+ Kf4 45. Kf2 Ke4 46. Qxh7+ Kd4 47. Qd3+ Kc5 48. Ke3 Kb6 49. Qd6+ Kb7 50. Kd4 Kc8 51. Qe7 Kb8 52. Kc5 Ka8 53. Kc6 a6 54. Qb7#

The LLM has decided I'm rated 1784 in what was probably the most one-sided game I've played against it.

somenameforme · on May 26, 2024

Okay, I played it a few more times and it definitely does not seem to be scaling up properly. Here is a game where it was getting outplayed pretty substantially in the opening with equal material, but it severely misplayed even in tactical situations once things started to explode later on, when it should have long since ramped up.

---

1. e4 e6 2. d3 c6 3. Nf3 d5 4. Nbd2 Nf6 5. e5 Nfd7 6. Be2 Be7 7. O-O O-O 8. Re1 f6 9. d4 fxe5 10. dxe5 Qc7 11. Bd3 Bc5 12. Nf1 Qb6 13. Qe2 Na6 14. a3 Nc7 15. b4 Be7 16. Bg5 Bxg5 17. Nxg5 h6 18. Nh7 Rf7 19. Bg6 Re7 20. Kh1 Nb5 21. f4 Nf8 22. Nxf8 Kxf8 23. Ng3 Bd7 24. Qg4 Nd4 25. Bd3 Be8 26. c3 Nb5 27. Bxb5 cxb5 28. f5 exf5 29. Nxf5 Rf7 30. e6 Rxf5 31. Qxf5+ Ke7 32. Rf1 Qxe6 33. Rae1 Qxe1 34. Rxe1+ Kd6 35. Qe6+ Kc7 36. Qxd5 Bc6 37. Re7+ Kb6 38. Qc5+ Ka6 39. c4 Rf8 40. h3 Rf1+ 41. Kh2 Rf5 42. cxb5+ Bxb5 43. Re6+ Bc6 44. Rxc6+ b6 45. b5+ Kb7 46. Rc7+ Kb8 47. Rc8+ Kb7 48. Qc7#

---

somenameforme · on May 17, 2024

Interesting! I just played it again and can definitely see what you mean. But it keeps running into the same issue. It ends up giving itself lost positions early on which it's not really capable of defending. I was about to ask why you didn't go the other way (strong at first then gradually handicapping) but on the other hand I've never had anywhere near this much fun playing a bot, and maybe this is part of the reason why?

Well another obvious factor is that it plays in an extremely human-like fashion. I'm a relatively strong player and have been the reason for plenty of (C) labels but I would never, in a million years, think I was playing a bot here. Anyhow, awesome job.

MereGurudev · on May 17, 2024

Hey, you're right, adjusting the piece count is a local modification, so opening the same dreamId in multiple windows would let you view the position with different piece counts (and the link to Noctie uses the local state).

MereGurudev · on May 17, 2024

Thanks! Good idea, I could add FEN / PGN export in addition to lichess. I haven't tried conditioning the network on opening but given my training data, that should be possible. For now, you can simulate openings somewhat, by trying to fix pawns and pieces in their typical positions, but it would be better if I could do it like you suggested "QGA, 15–20 moves" etc. I might try this out!
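As a rough idea of what a FEN export involves, here is a minimal sketch. It assumes the position is held as a mapping from square names to FEN piece letters; the function name `to_fen` and that data layout are hypothetical, and castling rights, en passant, and move counters are filled with placeholder defaults because a generated position carries no game history.

```python
def to_fen(pieces: dict[str, str], side_to_move: str = "w") -> str:
    """Build a FEN string from a mapping such as {"e1": "K", "e8": "k"}."""
    ranks = []
    for rank in range(8, 0, -1):  # FEN lists rank 8 first, rank 1 last
        row, empty = "", 0
        for file in "abcdefgh":
            piece = pieces.get(f"{file}{rank}")
            if piece is None:
                empty += 1  # count consecutive empty squares
            else:
                if empty:
                    row += str(empty)
                    empty = 0
                row += piece
        if empty:
            row += str(empty)
        ranks.append(row)
    # Assumed defaults: no castling rights, no en passant square, move 1.
    return f"{'/'.join(ranks)} {side_to_move} - - 0 1"
```

For example, `to_fen({"e1": "K", "e8": "k"})` gives `4k3/8/8/8/8/8/8/4K3 w - - 0 1`.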

MereGurudev · on Nov 23, 2022

Hey, that's really cool, will check it out! The backend was experiencing some issues leading to laggy play by Noctie, I hope it's working better now.

MereGurudev · on Nov 23, 2022

Thanks! Yes, the traffic from here exposed a few issues & bottlenecks in the backend infrastructure. I have resolved most of them and it seems to be running OK now (?).

To answer your questions: Noctie takes every move into account. Especially in short games or very one-sided games, a lot of the rating estimate will come from your opening play. If you play a bad opening, Noctie adapts and you will get an easier game, which may lead to simpler positions where you have less risk of making a mistake that would lower your rating. So what opening you play definitely has a big impact when trying to judge the rating from one game only.

If you create an account and log in, Noctie will accumulate information from several games to give a more balanced difficulty and more accurate rating estimate (and show how it changes over time).

Noctie's involvement isn't strictly required for the rating estimate although it creates good conditions for it by making the game balanced, not dominated by time trouble or trying to flag the opponent, etc.

desmosxxx · on Nov 24, 2022

Awesome work & thank you for the replies!

MereGurudev · on Nov 23, 2022

Good suggestions, thanks.

MereGurudev · on Nov 23, 2022

Thanks for the valuable feedback!

MereGurudev · on Nov 23, 2022

If the opening is rated as sufficiently bad, it would adjust its play accordingly and perhaps not challenge you sufficiently to be able to change its rating estimate. I.e. if you get a very easy, tactical game after that, the game might not be testing your abilities enough to revise the estimate from one game.