Opening this issue with the suggestion that we include support for Arrow in r5r.
As documented on their website, Arrow specifies a standardized, language-independent columnar memory format for flat and hierarchical data. This would bring two obvious advantages: (1) passing data from Java to R (from R5 to r5r) would become seamless, and (2) outputs could be saved in .parquet format. Both would probably make r5r substantially faster, with larger efficiency gains for large-scale analyses.
There are robust implementations of Arrow in Java, R and also in Python (in case we want to implement this in r5py).
I'm not sure whether this could be done entirely within the Java side of r5r or whether it would require some change to R5 upstream. In any case, this might be something that the @conveyal team would be interested in, since it would speed up the process of passing R5 results to interactive visualization in Conveyal Analysis.
I've been working with the CSV output of travel_time_matrix for a large region. Many of the CSV files contain only a few lines, and performance when reading 30k files is poor. It would be a bit more work, but writing the matrices to Parquet files, with multiple from_ids aggregated into each file, would be hugely helpful.
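Until something like this exists in r5r, a rough workaround is to combine the small CSV files after the fact. The sketch below is not part of r5r: the function names are made up for illustration, it assumes every file shares the same header, and the Parquet step assumes the pyarrow package is installed.

```python
import csv
from pathlib import Path


def combine_csv_files(csv_dir):
    """Read every *.csv in csv_dir and return (header, rows) for the combined table."""
    header = None
    rows = []
    for path in sorted(Path(csv_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            reader = csv.reader(handle)
            file_header = next(reader, None)
            if file_header is None:
                continue  # skip completely empty files
            if header is None:
                header = file_header
            elif file_header != header:
                raise ValueError(f"{path} has a different header: {file_header}")
            rows.extend(reader)
    return header, rows


def write_parquet(header, rows, out_path):
    """Write the combined table to a single Parquet file (requires pyarrow)."""
    import pyarrow as pa
    import pyarrow.parquet as pq

    if header is None:
        raise ValueError("no CSV data found, nothing to write")
    # Values stay as strings here (assumption); a real pipeline would cast
    # columns such as travel times to numeric types before writing.
    columns = {name: [row[i] for row in rows] for i, name in enumerate(header)}
    pq.write_table(pa.table(columns), out_path)
```

Calling `combine_csv_files("output_dir")` and passing the result to `write_parquet(header, rows, "ttm.parquet")` would replace thousands of tiny files with one file that can be read in a single call.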
Yes, this could be a great improvement to the package. @botanize, if you're familiar with Java and would like to have a look at this, we would appreciate a PR from collaborators.