Thanks for the feedback! It would be a simple modification to guarantee things like this – for this prototype, though, I thought it would be interesting to expose the "raw" output of the neural network, including crazy things like sometimes placing two kings of the same color. I even force the neural network to place more pieces than it intends if you put the piece slider on max, to see what happens (and often the result is very bad).
Thanks! Ah, sorry for that. Just to clarify, do you mean for the semi-transparent pieces (the generated ones without "locking in") or even the manually placed pieces?
Yes, they're distinguished: the pieces you place manually ("fixed pieces") are opaque and thus should have the same color, off-white or dark-grey, no matter the background square.
The pieces the AI adds when you press generate are translucent (and as such will be affected by the background color). Then you can press "Lock in" to convert the generated pieces to fixed pieces, in which case they will become opaque.
I realize that for anyone less used to the color scheme than me, this distinction is problematic since there will be no fewer than six different piece colors (white, black, translucent-white-on-light, translucent-white-on-dark, translucent-black-on-light, translucent-black-on-dark).
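To illustrate where the six colors come from, here is a minimal sketch of ordinary alpha blending. The RGB values and the 0.5 opacity are made-up placeholders, not the values the site actually uses, and the function names are hypothetical.

```python
# Hypothetical colours and opacity, chosen only for illustration.
PIECES = {"white": (240, 240, 230), "black": (60, 60, 60)}
SQUARES = {"light": (240, 217, 181), "dark": (181, 136, 99)}
GENERATED_ALPHA = 0.5  # assumed opacity of generated (not locked-in) pieces


def blend(piece_rgb, square_rgb, alpha):
    """Composite a piece colour over an opaque square ('over' operator)."""
    return tuple(
        round(alpha * p + (1 - alpha) * s)
        for p, s in zip(piece_rgb, square_rgb)
    )


def visible_colors():
    """Collect every piece colour a user can end up seeing on the board."""
    colors = {}
    for piece_name, piece_rgb in PIECES.items():
        # Fixed pieces are opaque, so the square underneath has no effect.
        colors[piece_name] = piece_rgb
        # Generated pieces are translucent, so each square gives a new shade.
        for square_name, square_rgb in SQUARES.items():
            key = f"translucent-{piece_name}-on-{square_name}"
            colors[key] = blend(piece_rgb, square_rgb, GENERATED_ALPHA)
    return colors


if __name__ == "__main__":
    for name, rgb in visible_colors().items():
        print(name, rgb)
```

With two piece colors and two square colors, the opaque case gives two shades and the translucent case gives four more, which is the six listed above.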
If you did the rating test, Noctie tries to adapt to your strength while you're playing, so if you play at 2574 level for a while, eventually Noctie will also play at that level. Since Noctie has no idea about your rating when the game starts, it might be that it took some time for the rating to adapt and therefore you found that the AI played much weaker.
The max strength of the AI is about 2700–2800 FIDE (level "Queen 4" inside the app if you have an account).
Per this site, https://chessgoals.com/rating-comparison/#lichessotb, 2809 Lichess is equal to around 2550 FIDE, so if that's your Lichess rating (wow btw!) maybe Noctie wasn't so far off. (EDIT: Ah I see, that's not your rating, sorry)
Obviously, you might get a different result next time – one game is very little information to base an accurate estimate on, especially when the AI has to adapt its playing strength as we go.
I don't use accuracy for the rating estimation BTW, I use custom neural networks that observe patterns in how humans at various rating levels play chess.
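As a toy illustration of why the adaptation lag mentioned above can make the AI feel weak early in a game: this is not Noctie's actual code. The starting strength of 1500, the update rate of 0.2, and the function name are all hypothetical, and the per-move estimates are simply assumed to be given.

```python
def adapt_strength(per_move_estimates, start=1500.0, rate=0.2):
    """Return the bot's playing strength before each move.

    After every move the strength moves a fraction `rate` of the way
    toward the latest estimate of the opponent's rating (a simple
    exponential moving average, assumed here purely for illustration).
    """
    strength = start
    history = []
    for estimate in per_move_estimates:
        history.append(strength)  # strength used for this move
        strength += rate * (estimate - strength)
    return history


if __name__ == "__main__":
    # An opponent who is consistently estimated at 2574 for 20 moves.
    for move, strength in enumerate(adapt_strength([2574.0] * 20), start=1):
        print(f"move {move:2d}: bot plays at about {strength:.0f}")
```

In this sketch the bot starts far below the opponent and only approaches 2574 after many moves, so the first part of the game is played at a much lower level than the final estimate.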
Okay, I played it a few more times and it definitely does not seem to be scaling up properly. Here is a game where it was getting outplayed pretty substantially in the opening with equal material, but it severely misplayed even in tactical situations once things started to explode later on, when it should have long since ramped up.
Interesting! I just played it again and can definitely see what you mean. But it keeps running into the same issue. It ends up giving itself lost positions early on which it's not really capable of defending. I was about to ask why you didn't go the other way (strong at first then gradually handicapping) but on the other hand I've never had anywhere near this much fun playing a bot, and maybe this is part of the reason why?
Well another obvious factor is that it plays in an extremely human-like fashion. I'm a relatively strong player and have been the reason for plenty of (C) labels but I would never, in a million years, think I was playing a bot here. Anyhow, awesome job.
Hey, you're right, adjusting the piece count is a local modification, so opening the same dreamId in multiple windows would let you view the position with different piece counts (and the link to Noctie uses the local state).
Thanks! Good idea, I could add FEN / PGN export in addition to Lichess. I haven't tried conditioning the network on the opening, but given my training data, that should be possible. For now, you can simulate openings somewhat by fixing pawns and pieces in their typical positions, but it would be better if I could do it as you suggested ("QGA, 15–20 moves", etc.). I might try this out!
Thanks! Yes, the traffic from here exposed a few issues & bottlenecks in the backend infrastructure. I have resolved most of them and it seems to be running OK now (?).
To answer your questions: Noctie takes every move into account. Especially in short games or very one-sided games, a lot of the rating estimate will come from your opening play. If you play a bad opening, Noctie adapts and you will get an easier game, which may lead to simpler positions where you have less risk of making a mistake that would lower your rating. So what opening you play definitely has a big impact when trying to judge the rating from one game only.
If you create an account and log in, Noctie will accumulate information from several games to give a more balanced difficulty and more accurate rating estimate (and show how it changes over time).
Noctie's involvement isn't strictly required for the rating estimate, although it creates good conditions for it by making the game balanced, not dominated by time trouble or attempts to flag the opponent, etc.
If the opening is rated as sufficiently bad, it would adjust its play accordingly and perhaps not challenge you sufficiently to be able to change its rating estimate. I.e. if you get a very easy, tactical game after that, the game might not be testing your abilities enough to revise the estimate from one game.