Duplicate nodes in diff #1136

Closed
joto opened this issue Aug 6, 2024 · 5 comments

joto commented Aug 6, 2024

See report here https://community.openstreetmap.org/t/duplicate-nodes-in-osc-downloaded-from-osm/117107

v21 of node 10767916505 and v13 of node 10824026405 each appear twice in https://planet.openstreetmap.org/replication/day/000/004/346.osc.gz .

The nodes are in two minutely diffs: https://planet.osm.org/replication/minute/006/205/197.osc.gz and https://planet.osm.org/replication/minute/006/205/198.osc.gz . This should not happen. What is conspicuous is that around that time there were no minutely diffs for a while.

198.state.txt            2024-08-05 07:49   86   
198.osc.gz               2024-08-05 07:49  2.5M  
197.state.txt            2024-08-05 06:36   86   
197.osc.gz               2024-08-05 06:36  805   
196.state.txt            2024-08-05 06:35   86   
196.osc.gz               2024-08-05 06:35  851   
195.state.txt            2024-08-05 02:24   86   
195.osc.gz               2024-08-05 02:24  8.7K 
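
For anyone who wants to check a diff for this kind of duplication, here is a minimal sketch. It assumes the file is a standard osmChange document and treats two entries as duplicates when type, id and version all match. The function name `find_duplicates` and the sample data are illustrative only and not part of any OSM tooling.

```python
import gzip
import io
import xml.etree.ElementTree as ET
from collections import Counter

OSM_TYPES = ("node", "way", "relation")


def find_duplicates(osc_file):
    """Return sorted (type, id, version) keys that occur more than once."""
    counts = Counter()
    # Stream the XML so large daily diffs do not need to fit in memory at once.
    for _, elem in ET.iterparse(osc_file, events=("end",)):
        if elem.tag in OSM_TYPES:
            counts[(elem.tag, elem.get("id"), elem.get("version"))] += 1
            elem.clear()  # drop tags/refs we no longer need
    return sorted(key for key, n in counts.items() if n > 1)


# Synthetic sample (made-up ids): node 1 v2 appears in two <modify> blocks.
SAMPLE = b"""<osmChange version="0.6">
  <modify><node id="1" version="2"/><node id="2" version="1"/></modify>
  <modify><node id="1" version="2"/></modify>
</osmChange>"""

print(find_duplicates(io.BytesIO(SAMPLE)))  # [('node', '1', '2')]

# For a downloaded diff (path is an assumption):
# with gzip.open("346.osc.gz", "rb") as f:
#     print(find_duplicates(f))
```
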
@tomhughes
Member

Well yes, that's when the site was down due to a database issue.

Slpy-API commented Aug 6, 2024

I'm assuming users should fix the file manually, but would appreciate an update beforehand on whether the issue will be fixed in the stream or if any other checks need to be done first. Thanks!

@tomhughes
Member

Well if somebody can confirm what actually needs to be done then maybe we can do something...

Is the whole of 197 duplicated in 198 or just some of it?
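
One way to answer that question is to compare the object versions in the two files. The sketch below assumes both are osmChange documents and that matching type, id and version means the same change. The names `object_versions` and `overlap` are made up for illustration, and the sample data is synthetic.

```python
import io
import xml.etree.ElementTree as ET

OSM_TYPES = ("node", "way", "relation")


def object_versions(osc_file):
    """Collect the set of (type, id, version) keys in an osmChange stream."""
    keys = set()
    for _, elem in ET.iterparse(osc_file, events=("end",)):
        if elem.tag in OSM_TYPES:
            keys.add((elem.tag, elem.get("id"), elem.get("version")))
            elem.clear()
    return keys


def overlap(first, second):
    """Return (shared, only_in_first) key sets for two osmChange streams."""
    a, b = object_versions(first), object_versions(second)
    return a & b, a - b


# Synthetic stand-ins for an earlier and a later diff.
EARLIER = b"""<osmChange version="0.6">
  <modify><node id="1" version="2"/><node id="2" version="5"/></modify>
</osmChange>"""
LATER = b"""<osmChange version="0.6">
  <modify><node id="1" version="2"/><node id="2" version="5"/></modify>
  <create><way id="7" version="1"/></create>
</osmChange>"""

shared, only_in_first = overlap(io.BytesIO(EARLIER), io.BytesIO(LATER))
# An empty only_in_first means the earlier diff is wholly repeated in the later one.
print(len(shared), len(only_in_first))  # 2 0
```
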

@tomhughes
Member

Oh in fact those are the only two changes in 197 so it looks like osm-repl-2024-08-05T06:35:46Z-lsn-124BF-C70BD160.log got included in both.

I know I had to kill the minutely replication because it got wedged when the database machine had to be restarted, but that kill happened around the time of 198, and 197 was uploaded an hour before that. That means the segment was finished and should have been renamed, so I don't really understand how it got repeated. There is an odd delay of several minutes between the local version of 197 and the S3 version though, suggesting that osmdbt-create-diff was quite slow to exit after generating the file.

@tomhughes
Member

I've removed the duplicates from the minutely, hourly and daily diffs now.
