
[FEATURE] onReadItem and onReadTile events #4

Open
gesior opened this issue Mar 12, 2019 · 8 comments

@gesior commented Mar 12, 2019

The current version first reads the whole map into RAM (up to a few GB of RAM), then lets me do some advanced operations on the map, and then I can save my modified map.

There are many functionalities for which I don't need the whole map, e.g. reading house tiles or finding the position of some item. It would be nice if I could run it with a few MB of RAM (remove a 'tile' from memory just after it has been loaded with its items).

1. After loading an item:

(...)
const mapReader = new MapReader();
mapReader.on('readItem', function(tile, item) {
  if (item.id == 1387) {
    console.log('found teleport', tile.pos);
  }
});
mapReader.process('input.otbm');

2. After loading a tile with its items and properties (a filled-in sketch follows below):

mapReader.on('readTile', function(tile) {});
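For example, the empty 'readTile' handler in point 2 could cover the house-tile case mentioned above. This is only a sketch against the proposed (not yet existing) API; MapReader, the 'readTile' event, tile.houseId and tile.pos are assumptions:

// Sketch against the proposed streaming API (hypothetical): collect house
// tile positions while each tile can be freed right after its callback returns.
const housePositions = [];
mapReader.on('readTile', function(tile) {
  if (tile.houseId) {
    housePositions.push(tile.pos);
  }
});
mapReader.process('input.otbm');
console.log('found', housePositions.length, 'house tiles');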
@Inconcessus (Owner)

Hi, reading the map chunked is quite trivial I think, but transforming, including writing, is harder. I started on this branch: https://github.com/Inconcessus/OTBM2JSON/tree/add-stream-reader. See examples/stream for a demo.

@Inconcessus (Owner)

I made some more changes using a simple synchronous callback instead of an event emitter. The problem I encountered is that the application uses a depth-first approach: we get the top-level element last but have to write it to the file first, so just piping it doesn't really work out.

This transformation approach takes every OTBM_TILE_AREA, OTBM_WAYPOINTS, and OTBM_TOWNS node and converts it back to its binary representation before continuing. In the end everything is concatenated and written to file. That should save you a lot of memory.
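Roughly, the idea looks like this (a minimal sketch only; readNodes, nodeToJSON, transformCallback and jsonToOTBM are hypothetical names, not the actual functions in the add-stream-reader branch):

const fs = require('fs');

// Sketch: convert each top-level feature back to binary as soon as it is
// completed, instead of keeping the whole JSON tree in memory.
const chunks = [];
for (const node of readNodes('input.otbm')) {      // OTBM_TILE_AREA, OTBM_TOWNS, ...
  const feature = nodeToJSON(node);                // expand the node to JSON
  const transformed = transformCallback(feature);  // user callback may modify it
  chunks.push(jsonToOTBM(transformed));            // straight back to binary
}
// In the end everything is concatenated and written to file.
fs.writeFileSync('output.otbm', Buffer.concat(chunks));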

If you want to give it a test, just clone the git repository and check out the new branch:

git checkout add-stream-reader

@Inconcessus (Owner)

Look for examples/stream for a demonstration of the transformation API. You have to define a transformation callback function that is applied for every feature encountered. Remember to return the feature after modifying it!
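Concretely, a transformation callback might look roughly like this (a sketch only; the function name and how it is registered are assumptions, see examples/stream in the branch for the real demo):

// Sketch of a transformation callback: it is invoked once per feature
// (e.g. each OTBM_TILE_AREA) and must return the feature, modified or not.
function transformFeature(feature) {
  // ... inspect or modify the feature here, e.g. its tiles and items ...
  return feature; // returning it is required, per the note above
}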

@Inconcessus (Owner)

I tested this on a 25MB OTBM file and the memory usage went from 300MB to 50MB. Let me know what your findings are!

@JSkalskiSBG commented Mar 13, 2019

> I tested this on a 25MB OTBM file and the memory usage went from 300MB to 50MB. Let me know what your findings are!

Just tested with a 120MB OTBM file on [email protected] [all cores], 32 GB RAM, Samsung NVMe SSD (read/write 1.5-3.5 GB/s).

'time' command with RAM and CPU usage measurement:

/usr/bin/time -v node big.js

Basic (for that version I had to increase the default ~2GB Node RAM limit: node --max_old_space_size=8000 big.js):
CPU use: 139% (1.39 of 1 core)
Elapsed time: 48.23 sec
Peak RAM: 2987 MB

Stream:
CPU use: 113% (1.13 of 1 core)
Elapsed time: 41.61 sec
Peak RAM: 1628 MB

I added some 'global.gc()' calls for the test, but it only increased execution time. Almost zero change in peak RAM usage.
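Side note (an assumption about the test setup, not stated above): global.gc() is only defined when Node is started with the garbage collector exposed, e.g.:

node --expose-gc big.js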

@Inconcessus (Owner)

So there's a big improvement, but not as much as I would expect. The streaming function handles one OTBM_TILE_AREA at a time (and when it is completed, it is converted back to OTBM). Within a tile area there may be many items, and perhaps (protection) zones that take up a lot of space, since in the JSON representation they have very long keys:

// Read individual tile flags using bitwise AND &
return {
  "protection": flags & HEADERS.TILESTATE_PROTECTIONZONE,
  "noPVP": flags & HEADERS.TILESTATE_NOPVP,
  "noLogout": flags & HEADERS.TILESTATE_NOLOGOUT,
  "PVPZone": flags & HEADERS.TILESTATE_PVPZONE,
  "refresh": flags & HEADERS.TILESTATE_REFRESH
}

Can you add this in the transformation routine:

console.log(JSON.stringify(feature).length);

Then we can get an estimate of the size of a single tile area in JSON representation.
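To get a summary instead of one line per feature, the same measurement could be accumulated inside the callback (a sketch; transformFeature is an assumed name for the transformation routine):

// Sketch: track the total and the largest JSON size across all features.
let totalSize = 0;
let maxSize = 0;

function transformFeature(feature) {
  const size = JSON.stringify(feature).length;
  totalSize += size;
  maxSize = Math.max(maxSize, size);
  return feature;
}

process.on('exit', function() {
  console.log('total JSON size:', totalSize, 'chars; largest feature:', maxSize, 'chars');
});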

@Inconcessus (Owner)

If memory remains a huge problem, we can always write the completed features to disk and compile them afterwards using fs.createReadStream. As of now they are kept fully in memory for the sake of simplicity.
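A rough sketch of that fallback using only Node's built-in fs module (the chunk file naming and the writeFeature/compile split are assumptions, not code from the branch):

const fs = require('fs');

// Write each completed (binary) feature to its own temporary chunk file
// instead of keeping it in memory.
function writeFeature(index, buffer) {
  fs.writeFileSync('chunk-' + index + '.bin', buffer);
}

// Afterwards, stream the chunks one by one into the final OTBM file.
function compile(count, output) {
  const out = fs.createWriteStream(output);
  let i = 0;
  (function next() {
    if (i === count) return out.end();
    const chunk = fs.createReadStream('chunk-' + (i++) + '.bin');
    chunk.pipe(out, { end: false });
    chunk.on('end', next);
  })();
}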

@JSkalskiSBG commented Mar 13, 2019

There must be some memory leak in the stream algorithm. It's not just one jump to 1.6GB: it starts at 130MB and grows slowly to 1.6GB.
That's why I tried 'global.gc()' every 500 TILE_AREAs, to make sure that GC is actually run.
Is there any other array that stores some information about TILE_AREAs or TILEs?

I ran the Inspector, and it showed that the app spends most of its time copying FastBuffer, and some FastBuffer 'buffer' is the thing that grows to 1.6GB in RAM.
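For anyone reproducing this, the kind of inspection described above can be done with Node's built-in inspector (the exact flags used here are an assumption; these are the standard ones):

# Attach the inspector and open chrome://inspect in Chrome to take heap
# snapshots or record an allocation timeline:
node --inspect big.js

# Or pause before the first line so you can attach before any work starts:
node --inspect-brk big.js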
