Mate in 1 missed tens of times in training game #644
Comments
More examples from Discord: Something is clearly broken. Maybe there is an overflow in the temperature code. |
Actually, I was wrong about 0.25. With many move candidates, that 0.25 is split among all of them. But even with a probability of 0.01 or so, it should find the mate. |
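A minimal sketch of how a fixed noise weight gets split among candidates, assuming the 0.25 above refers to an AlphaZero-style Dirichlet noise weight mixed into the root priors. The function name, `alpha=0.3`, and the example priors are illustrative assumptions, not values taken from lczero.

```python
import random


def mix_with_dirichlet_noise(priors, epsilon=0.25, alpha=0.3, rng=None):
    """Return (1 - epsilon) * prior + epsilon * noise for each move.

    The noise vector is a Dirichlet(alpha) sample, built here by
    normalizing independent gamma draws. epsilon and alpha are assumed
    values for illustration only.
    """
    rng = rng or random.Random()
    gammas = [rng.gammavariate(alpha, 1.0) for _ in priors]
    total = sum(gammas)
    noise = [g / total for g in gammas]
    return [(1 - epsilon) * p + epsilon * n for p, n in zip(priors, noise)]


# Hypothetical priors for a position with 20 legal moves.
priors = [0.05] * 20
mixed = mix_with_dirichlet_noise(priors, rng=random.Random(1))

# The noise sums to 1, so on average each move receives epsilon / len(priors).
average_noise_share = 0.25 / len(priors)
print(sum(mixed), average_noise_share)
```

With 20 candidates the average noise share per move is 0.0125, which is the kind of "0.01 or so" figure mentioned above.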
Three examples were generated by three different users, all had client v0.10, |
Looked at the first game you posted (id 14027814) and found a position where there was a mate in 1 with 2 different moves (but it didn't pick either). Started
./lczero -w weights_324.txt.gz -v800 --fpu_reduction=0
The selected move, Kb3, is far down the list of top moves:
info string Qe6 -> 1 (V: 50.00%) (N: 2.93%) PV: Qe6
The 2 mating moves are the 2 top picks. Interestingly enough, I first tried this with network 237:
info string Kb3 -> 26 (V: 98.87%) (N: 5.47%) PV: Kb3 Kb1 Qe1+
So while 237 was better at picking the 2 mating moves, it was also more likely to pick the move that was played in this training game? |
So the NN is sane. It seems that it's either a problem with temperature or maybe with the promoted queen? Could you run it several times with the whole move history and --randomize to check which bestmove is returned? Moves in UCI format:
e2e4 d7d6 d2d4 g8f6 b1c3 b8d7 g2g4 h7h6 h2h3 e7e5 g1e2 b7b6 f1g2 c8b7 e1g1 e5d4 e2d4 g7g6 f1e1 d8c8 f2f4 f6h7 e4e5 d6e5 f4e5 b7g2 e5e6 d7e5 e6f7 e8f7 d4e2 g2b7 e1f1 f7g8 e2f4 h7g5 c3d5 g8h7 f4g6 h7g6 c2c4 b7d5 c4d5 g5h3 g1g2 c8g4 d1g4 e5g4 g2h3 g4f6 c1f4 f8d6 h3h4 d6e7 a1e1 f6d5 h4g3 a8f8 f4e5 f8f1 e1f1 h8b8 f1e1 b8c8 e1d1 c7c6 g3h2 e7h4 d1a1 g6f5 e5h8 c8h8 a2a4 f5e5 h2h3 h4f6 a1f1 d5f4 h3g3 h8g8 g3f3 f6h4 f1h1 g8g3 f3f2 g3h3 f2g1 e5d4 h1h3 f4h3 g1h1 h3f4 b2b4 c6c5 h1h2 h4g5 h2g3 g5f6 b4c5 d4c5 g3f2 c5d4 f2f3 f4e6 f3f2 e6c7 a4a5 b6a5 f2f3 c7a8 f3g2 f6d8 g2f2 d4e5 f2e3 e5d5 e3d2 a5a4 d2d3 d8b6 d3d2 d5c4 d2e1 b6d4 e1d1 a7a5 d1c2 h6h5 c2b1 d4a1 b1a1 c4d4 a1a2 d4c3 a2a3 a8b6 a3a2 c3d4 a2b2 a4a3 b2b3 b6d7 b3a4 d7c5 a4b5 d4d5 b5a5 d5e4 a5b6 e4e5 b6a7 c5d3 a7b6 d3b4 b6c7 e5f4 c7d8 f4e5 d8c8 h5h4 c8d7 e5d5 d7d8 d5c5 d8c8 c5d4 c8d8 b4a2 d8e8 d4e4 e8d7 a2c3 d7e6 e4f4 e6f6 f4e4 f6e7 c3a2 e7e6 e4f3 e6d7 f3e4 d7c7 e4e5 c7d7 e5f6 d7d8 f6f5 d8e7 f5g6 e7d8 g6f7 d8c8 a2b4 c8b7 b4d3 b7c7 f7g6 c7d7 d3b2 d7c7 g6f6 c7d6 f6f5 d6e7 h4h3 e7f8 f5f6 f8g8 b2d3 g8h7 h3h2 h7h8 f6e6 h8g7 h2h1 g7g8 h1e1 g8h7 e1h4 h7g8 e6f6 g8f8 h4h6 f8e8 h6g7 e8d8 g7a7 d8c8 d3e5 c8d8 a7a8 d8c7 e5c6 c7b6 a8a7 b6b5 c6a5 b5b4 a5b3 b4c3 a7e7 c3b3 e7e6 b3a4 e6b6 a4a3 f6e5 a3a2 e5f4 a2a1 f4g5 a1a2 g5g4 a2a3 g4g3 a3a2 g3f3 a2a1 b6b7 a1a2 f3e3 a2a3 e3d2 a3a2 d2c2 a2a1 b7b5 a1a2 b5e8 a2a3 e8c6 a3a2 c6e8 a2a1 c2b3 a1b1 b3c3 b1a1 e8a8 a1b1 a8a4 b1c1 a4a1 |
training.14027814 is missing
Update: see post below. Mistake in the decode script. |
Added --randomize and -n (like training games): it only picked a mating move in 3 of 10 attempts on the same position. |
@gyathaar can you do FEN + 8 moves just to eliminate doubts about that? Also please run with |
temperature.cmd:
Extra debug code in randomize_first_proportionally
rm log.txt; cat temperature.cmd | ./lczero -w ~/lcnetworks/id317 -s1 -t1 -n --randomize -l log.txt
|
OK, FEN with 8 moves:
./lczero -w weights_324.txt.gz --fpu_reduction=0 --randomize -n -t1 -v800 -l issue644.txt
Here are 20 attempts in the log file:
grep bestmove issue644.txt | sort | uniq -c | sort -n
It picked a checkmating move in 9 of those 20 attempts. |
I think the repeated numbers coming out of the RNG must be due to the thread_local in the code below. Each search is created on its own thread. My computer must start reusing the same thread_id, and therefore re-seeding the RNG with the same seed every time? This is maybe not ideal, but it's not the cause of this issue.
|
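The re-seeding effect described above can be illustrated in isolation. This is a Python sketch, not the lczero C++ code: it assumes a generator that is seeded from a thread identifier, and `reused_thread_id` and `first_draws` are hypothetical names introduced only for this example.

```python
import random


def first_draws(seed, n=5):
    """Create a fresh generator from `seed` and return its first n draws."""
    rng = random.Random(seed)
    return [rng.randrange(100) for _ in range(n)]


# Hypothetical: every new search thread happens to get the same identifier,
# and that identifier is used as the seed for a per-thread generator.
reused_thread_id = 1234

search_1 = first_draws(reused_thread_id)
search_2 = first_draws(reused_thread_id)

# Same seed, same sequence: each "search" repeats the previous one's draws.
print(search_1 == search_2)
```

Seeding once per process, or mixing a source of entropy into the seed, would avoid the repetition.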
After deleting the thread_local from the RNG code above, I now see the moves are being picked randomly. It looks to be working as intended. I also ran with |
@killerducky I've checked training data of that move, and there are many moves with probabilities in range [0%; 1%), and I believe d7d8 is one of them. So it looks like it's an issue in your tool (something like |
decode_training.py had this code:
|
FYI, I ran this 800 times:
$ cat /tmp/uci | ./lczero -w ~/my/lc0/nets/id324.gz --randomize -t 1 2> /dev/null | grep bestmove
(without Dirichlet noise, but with temperature)
While an immediate checkmate has a win probability of 100%, being a queen up gives the other moves a win probability of ~99.9%, which is good enough to attract many visits and get one of those moves picked. I expect a similar thing happens in QRK vs K games: it's fine to drop the queen if the rook is enough to win.
Contents of /tmp/uci:
|
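As a rough illustration of the explanation above, here is a sketch of picking a move in proportion to its visit count, which is what the name randomize_first_proportionally suggests happens with temperature. The selection rule, the function name, and the visit counts are assumptions made for this example; they are not taken from lczero or from the game in this issue.

```python
import random
from collections import Counter


def pick_proportionally(visits, rng):
    """Pick a move with probability visits[move] / sum(visits.values())."""
    total = sum(visits.values())
    threshold = rng.randrange(total)  # uniform integer in [0, total)
    running = 0
    for move, n in visits.items():
        running += n
        if threshold < running:
            return move
    raise AssertionError("unreachable when total > 0")


# Hypothetical visit counts: two mating moves plus two moves that are "only"
# completely winning and therefore still attract many visits.
visits = {"mate_a": 300, "mate_b": 250, "winning_1": 200, "winning_2": 49}

rng = random.Random(0)
trials = 10000
counts = Counter(pick_proportionally(visits, rng) for _ in range(trials))
mate_share = (counts["mate_a"] + counts["mate_b"]) / trials
print(counts, mate_share)
```

Under these made-up counts the mating moves hold 550 of 799 visits, so a non-mating move is still chosen in roughly three picks out of ten.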
I did take a look at 10k training games (latest batch at the time - ~16 hours ago), and found that no moves were selected with 0 probability (i.e., 0 node counts)... One thing that is probably irrelevant, but just a little interesting, is that there seemed to be a lot of probabilities of 1/799 - not 1/800 as I'd expect for nodes visited 800 times. More rarely, there were some probabilities less than 1/800 -- I think the smallest was like 1/885, suggesting a node that was visited about 885 times... I plan on looking more closely at all of this this evening, and verifying that the distribution of selected moves matches what I'd expect based on the probabilities - but a quick check showed that it all looked reasonable. |
799 is expected: the first visit expands the root node, and the remaining
799 visits are real visits to the root's children.
I have no explanation for 885, though.
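The arithmetic behind 1/799 can be written out as a small sketch. It assumes a budget of 800 playouts per move (as in the -v800 commands earlier in the thread) and a training policy equal to child visits divided by total child visits; the visit counts are made up for the example.

```python
def policy_from_visits(child_visits):
    """Normalize root child visit counts into a probability distribution."""
    total = sum(child_visits)
    return [n / total for n in child_visits]


playouts = 800                 # assumed per-move budget
child_total = playouts - 1     # one playout is spent expanding the root

# Hypothetical split of the 799 child visits; the least visited move has 1.
child_visits = [500, 298, 1]
policy = policy_from_visits(child_visits)

# When the least visited move has exactly one visit, 1/min(policy)
# recovers the total number of child visits.
estimated_visits = round(1 / min(policy))
print(child_total, estimated_visits)
```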
|
799 is definitely expected; it's actually the 800s that have been confusing me. I couldn't work out why those would happen: even with tree reuse enabled and a sequence of completely forced moves, it shouldn't have started accumulating beyond the limit unless there were multiple threads. |
Yeah, I do see some outliers on either side (node visits)... I'm looking at games14280000.tar.gz
The outliers do seem to happen in groups -- so most likely same user, same games, same engine... It looks like that would signal a possible engine bug, but maybe pretty minor in severity given the counts. |
Here's an example of one of the groups of outliers. It has both high and low visit counts. It's one of only 4 groups like this in the 1.3M positions from 10k games in training data I looked at. 1st number is position index, 2nd number is calculated number of visits, then the minimum and maximum of the policy distribution.
|
And here's the code I used to calculate node visits. About 70% of them can be found by just doing 1/min(policy)... But if that number turns out to be too small (suggesting multiple visits to the least visited node), it starts to get a bit more complicated. Don't know if this is right, but it's at least close.

    from itertools import chain

    import numpy as np
    import pandas as pd

    # all_probs: one policy distribution per position, loaded elsewhere
    # (assumed to contain only the nonzero probabilities).
    visits_list = []
    eps = 0.00005
    one_visit_count = 0
    for idx, dist in enumerate(all_probs):
        # If the least visited move got exactly one visit, this is the total.
        visits = round(1/min(dist))
        if 799 <= visits:
            one_visit_count += 1
        else:
            # Otherwise look for a total that makes every prob*visits an integer.
            arr = np.array(dist)
            for visits in chain(range(799, 900),
                                range(700, 799),
                                range(900, 1000),
                                range(600, 700),
                                range(1000, 1100),
                                range(500, 600)):
                err = np.sqrt(np.mean((np.round(arr*visits) - (arr*visits))**2))
                if err < eps:
                    break
            else:
                # No candidate total fit this distribution.
                raise Exception("oops")
        if visits < 799 or visits > 850:
            print(idx, visits)
            print("==> [max,min] probs: {}".format(np.array([max(dist), min(dist)])))
        visits_list.append(visits)
    print("Total positions: {}".format(len(visits_list)))
    print(pd.DataFrame(visits_list, columns=['visits']).groupby('visits').size())
|
Ok, more data points... The game indexes with the outliers were as follows: 4479, 4513, 7156, and 8439... That should correspond to: Although the last two look normal, the first two appear to have moves encoded in UCI, whereas everything else is SAN... I don't think this is right?? |
Those are games by http://lczero.org/user/ignore But it shows that there is error in Slightly above 800 values can also be explained with default batch size 256 (will also be changed). |
@mooskagh I did run a couple of games with --smart-pruning |
Ah nice - makes sense then :). Also, glad to hear lc0 has this type of pruning feature - though I’m curious how that might be used during training without excessively flattening out policies. And I think there is a mistake in the numbers above. Some of the ones above 1000 should actually be half of what I calculated. You can verify this simply by doing 1/(min_policy)... This keeps everything consistent with pruning and 256 batch size, I think.... Reason for this is I searched for candidate visit counts between 1000-1100 before 500-600. |
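The doubling described above follows from the fact that any multiple of the true total also turns every probability into a whole number of visits. A minimal sketch, with hypothetical visit counts and helper names, of searching candidate totals in ascending order so that the smallest consistent total is returned:

```python
def fit_error(policy, visits):
    """RMS distance of policy * visits from the nearest whole numbers."""
    squares = [(p * visits - round(p * visits)) ** 2 for p in policy]
    return (sum(squares) / len(squares)) ** 0.5


def smallest_consistent_visits(policy, low=500, high=1100, eps=5e-5):
    """Return the smallest total in [low, high) that fits, or None."""
    for visits in range(low, high):
        if fit_error(policy, visits) < eps:
            return visits
    return None


# Hypothetical position: 543 child visits, least visited move has 2 visits.
counts = [301, 200, 40, 2]
total = sum(counts)
policy = [c / total for c in counts]

print(smallest_consistent_visits(policy))  # ascending search finds 543
print(fit_error(policy, 2 * total) < 5e-5)  # 1086 fits just as well
```

A search that tries 1000-1100 before 500-600 would stop at 1086 for this position, twice the real total.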
This game:
http://lczero.org/game/14027814
At the end of the game, a mate in one was missed about 20 times.
While I understand that temperature can make it play bad moves, that is surely too many misses for one game, so something is certainly going wrong.
It would be nice to look into it and try to reproduce it.
Btw, the gzip file for this particular game was corrupted. I'm not sure whether that is related; it could be written off as faulty hardware, but that doesn't seem very plausible.
The username of the person who generated this game is Wil54.