-
Notifications
You must be signed in to change notification settings - Fork 1k
Experiment: Compact LineString geometry
The proposed optimization of replacing LineString allow a graph size reduction (for graph with street data only) of approximatively 20%.
For a small representative graph, below are the changes regarding the memory usage and objects count:
Original CompactLineString
--------------------------------------------------
Graph heap size 155.756.000 124.685.000
# Envelopes 496.393 296.374
Envelope size 23.826.000 14.225.000
# double[] 200.045 -
double[] size 14.960.000 -
# LineString 200.045 -
LineString size 6.401.000 -
# CompactLineString - 101.586
CompactLineString size - 1.625.000
# byte[] (CompactLineStr) - 101.585
byte[] size - 3.064.000
--------------------------------------------------
The optimized version (called "CompactLineString") act on the following:
- The start/end points are provided externally (by the from/to vertices of the relevant edge), so they are not stored anymore.
- For straight-line geometries (no intermediate points), a singleton instance is shared (this allow the same size reduction as setting the geometry to NULL, and have the benefit of allowing NULL geometries if needed, and do not need special tests in the client code).
- The intermediate coordinates are delta-coded (only the delta from the previous point is stored) using a fixed-point value, stored using a variable-length encoding. Most values fits in 1 byte (for delta < 7m) or 2 bytes (delta < 900m). The precision is 0.11 metre (or better).
- The CompactLineString class store only one byte[] array for storing the packed coordinates. The coding/decoding conversion functions are a set of static methods, i.e. all the CompactLineString methods just return or operate on a byte[] that is stored directly in the edges themselves.
- Keeping directly the byte[] array in the edge instead of a wrapper object save around 1.3% of memory (124685k down to 123060k in the same sample as above). This is not really much, but as the code change is minimal and does not add complexity, so it still worth it.
- We are incurring extra overhead decoding the packed geometries any time they are used, but we're fairly certain that the memory bus is our bottleneck rather than processor usage, and the reduced graph size is probably worth it.
- We are still wondering if the Duglozs packed byte[] version is really worth it vs the int[] version, but the decoding is quite fast. That can change however if we switch the spatial index to a hash grid version; false positive number would grow, and each of them require a geometry decoding to be eliminated. We are currently testing to see if this is really relevant.
- All those changes are really drop-in replacements that can be reverted back quite easily. With a static method version one can choose quite easily between a packed and a non-packed version (with 4 methods, 2 for decoding, 2 for encoding). The public interface of Edge is kept the same: public LineString getGeometry(); so the compact geometry version can be easily switched on and off if needed.
For more information the code is available here: https://github.com/laurentg/OpenTripPlanner-plannerstack/compare/geometry-optimizations
The remaining Envelopes are mostly used by the STRTrees of the StreetVertexIndex. QuadTree uses a lot less, but is slower. I've made some quick tests using the custom OTP "HashGrid", it seems to be almost as fast and consume problably less memory, maybe we should give it a more serious try (I had some errors during linking using it, I did not investigated it further).
For the record, with this optimized version the remaining memory usage per class is the one below (objects retained by the graph itself, with the same small example):
Class Name | Objects | Shallow Heap
-----------------------------------------------------------------
...er.routing.edgetype.PlainStreetEdge | 200,019 | 17,601,672
char[] | 309,125 | 15,656,000
com.vividsolutions.jts.geom.Envelope | 296,374 | 14,225,952
java.util.HashMap$Entry | 422,836 | 13,530,752
...outing.util.ElevationProfileSegment | 200,019 | 12,801,216
java.lang.String | 309,125 | 7,419,000
...ons.jts.index.strtree.ItemBoundable | 274,084 | 6,578,016
java.lang.Object[] | 170,474 | 6,120,536
...ent.locks.ReentrantLock$NonfairSync | 148,111 | 4,739,552
...uting.vertextype.IntersectionVertex | 73,927 | 4,731,328
java.lang.Integer | 273,862 | 4,381,792
...til.concurrent.CopyOnWriteArrayList | 148,110 | 3,554,640
byte[] | 101,585 | 3,064,384
java.util.HashMap$Entry[] | 612 | 2,670,144
...util.concurrent.CopyOnWriteArraySet | 148,110 | 2,369,760
...util.concurrent.locks.ReentrantLock | 148,110 | 2,369,760
...r.common.geometry.CompactLineString | 101,586 | 1,625,376
java.util.ArrayList | 22,365 | 536,760
...s.index.strtree.STRtree$STRtreeNode | 22,282 | 534,768
java.util.HashMap | 636 | 30,528
java.util.TreeMap | 608 | 29,184
Total: 21 of 70 entries; 49 more | 3,375,567 | 124,685,376
-----------------------------------------------------------------
unless you are intentionally working with legacy versions of OpenTripPlanner. Please consult the current documentation at readthedocs