On Demand API #53
Comments
@jkeiser, @lemire Following up on your suggestion from here after I played with it a bit. Are there any examples of iterating through arrays of numbers, strings, bools, etc.? I can only seem to find examples accessing the scalar values in objects using their keys. As an example (I was trying this out on some GeoJSON), how would we access the individual points in the following (after accessing the coordinates with something like
{
"type": "LineString",
"coordinates": [
[1, 2], [3, 4], [5, 6]
]
}
For some R context, we would need to turn
geojson <- '{
"type": "LineString",
"coordinates": [
[1, 2], [3, 4], [5, 6]
]
}'
parsed <- RcppSimdJson::fparse(geojson)
parsed
#> $type
#> [1] "LineString"
#>
#> $coordinates
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    3    4
#> [3,]    5    6
... and since R is column-major, we're really building the following (with attached dimension attributes):
as.vector(parsed$coordinates)
#> [1] 1 3 5 2 4 6 |
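To make the column-major point concrete, here is a minimal sketch of the reordering in plain C++, assuming the points have already been collected row by row into an STL container. The helper name `to_column_major` is hypothetical; it is not part of simdjson, Rcpp, or RcppSimdJson.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: flatten rows (one inner vector per point) into the
// column-major order that R uses for matrix storage.
std::vector<double> to_column_major(const std::vector<std::vector<double>>& rows) {
    const std::size_t nrow = rows.size();
    const std::size_t ncol = nrow == 0 ? 0 : rows[0].size();

    // A matrix needs every row to have the same width.
    for (const auto& row : rows) {
        if (row.size() != ncol) {
            throw std::invalid_argument("ragged rows cannot form a matrix");
        }
    }

    std::vector<double> out;
    out.reserve(nrow * ncol);
    // Walk column by column so each column's values end up contiguous.
    for (std::size_t j = 0; j < ncol; ++j) {
        for (std::size_t i = 0; i < nrow; ++i) {
            out.push_back(rows[i][j]);
        }
    }
    assert(out.size() == nrow * ncol);
    return out;
}
```

Calling `to_column_major({{1, 2}, {3, 4}, {5, 6}})` yields the values in the order 1 3 5 2 4 6, matching the `as.vector()` output above.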
I'd expect it to look like:
int i = 0;
for (auto point : doc["coordinates"]) {
    int j = 0;
    for (double val : point) {
        matrix[i][j] = val;
        j++;
    }
    i++;
} |
Thank you. Is there a way to get an array's size (so we can tell how big |
@knapply rbind? |
No, for two reasons. Those things tend to be performance killers as the underlying R data structures don't grow easily. So a general Rcpp pattern is to collect everything in C++ (hello old friend STL) and then convert at end. Plus, |
I was half kidding. |
(And @lemire, for example, one of my all-time fave little tools in R is |
Hah! Point taken with a grin! |
:-) |
Edit... welp, I typed too slow 😬 If only. I think I overcomplicated the question with a matrix. Let's say we have an array...
[1, 2, 3, 4, 5, 6]
... and we want to insert the data into a vector (R/Rcpp or STL) equivalent to this...
std::vector<double>{1, 2, 3, 4, 5, 6};
We would need to know the size of the array...
std::vector<double> out(std::size(array)); // how do we do this?
int i = 0;
for (double element : array) {
    out[i++] = element;
} |
std::vector<double> out;
for (double element : array) { out.push_back(element); } |
... still half kidding? :D (I can't tell if knowing the size would even be possible with the On Demand API, but it sorta makes this a non-starter). |
The above code is standard C++, although there is room for optimizations.... |
Yeah, using push_back is necessary, though we could probably expose an upper bound on the array size if you are willing to risk overallocation. Still, I'd test the push_back method first; the cost of vector growth may well be lower than the cost of materializing a DOM. |
For performance, you almost always want to overallocate if only temporarily. |
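As a rough sketch of the push_back-plus-overallocation idea discussed above, written against plain standard C++ rather than simdjson: the function name `collect` and its `upper_bound` parameter are hypothetical, and the hint stands in for whatever size estimate a caller might have (the thread only says such a bound could probably be exposed, not that it exists).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy a single-pass range of numbers into a vector
// without knowing its exact size up front.
// `upper_bound` is an assumed size hint; 0 means "no hint".
template <typename Range>
std::vector<double> collect(Range&& range, std::size_t upper_bound = 0) {
    std::vector<double> out;
    out.reserve(upper_bound);  // overallocate once, temporarily
    assert(out.capacity() >= upper_bound);

    for (double element : range) {
        out.push_back(element);  // still correct if the hint was too small
    }

    out.shrink_to_fit();  // non-binding request to release unused capacity
    return out;
}
```

With no hint this degrades to the plain push_back loop, so it seems reasonable to measure that version first and only add a hint if vector growth shows up in profiles.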
As discussed in #52 (review)
How best to leverage simdjson's On Demand API?