Pagination not working more than once #956
Comments
In console, |
We use python-style slicing so it is |
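To illustrate the point about Python-style slicing, here is a minimal sketch of how a slice could map onto `from` and `size`, assuming `size` is derived as `stop - start`. `slice_to_from_size` is a hypothetical helper written for this illustration, not a function from the library.

```python
def slice_to_from_size(s: slice) -> dict:
    """Translate a Python slice into from/size style paging values.

    Hypothetical helper for illustration only; it assumes that
    size is computed as stop - start.
    """
    start = s.start or 0
    if s.stop is None:
        raise ValueError("an open-ended slice has no page size")
    # size is the number of hits requested, not the end offset
    return {"from": start, "size": max(0, s.stop - start)}


print(slice_to_from_size(slice(0, 1000)))     # {'from': 0, 'size': 1000}
print(slice_to_from_size(slice(1000, 2000)))  # {'from': 1000, 'size': 1000}
print(slice_to_from_size(slice(1000, 1000)))  # {'from': 1000, 'size': 0}
```

Under that assumption, the second page of 1000 hits is `[1000:2000]`, while `[1000:1000]` asks for zero hits. The reporter's loop is not shown in this thread, so whether that is what happened there is only a guess.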
Yes, thanks @honzakral, I realized that soon after opening this issue. I have another query related to this. Could you tell me what your thoughts are on estimated time for |
@ivmarkp that is not supported - by default the maximum size of a page is 10K. If you want to retrieve more documents I recommend you look at the scan helper - http://elasticsearch-py.readthedocs.io/en/master/helpers.html#scan
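As a rough sketch of the 10K limit mentioned above: to my understanding the default limit applies to `from + size` together rather than to `size` alone, so a check could look like the following. `MAX_RESULT_WINDOW` and `fits_in_window` are hypothetical names used only for this illustration.

```python
# Assumed default limit (the "10K" mentioned in the comment above).
MAX_RESULT_WINDOW = 10_000


def fits_in_window(start: int, size: int, limit: int = MAX_RESULT_WINDOW) -> bool:
    """Return True if a page starting at `start` with `size` hits stays within the limit."""
    return start + size <= limit


print(fits_in_window(0, 1000))     # True
print(fits_in_window(9000, 1000))  # True
print(fits_in_window(9500, 1000))  # False
```

Any page that fails this check would have to be fetched some other way, for example with the scan helper linked above.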
I've actually set I tried |
Pagination over millions of documents is definitely not a good idea, since every consecutive page costs more memory and computation than the previous one (for the last page, all of the millions of documents will need to be sorted in memory). That is why we have the scan API, which circumvents this by storing the intermediate state between requests. You can also have a look at using |
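To put a number on why deep pagination gets expensive, here is a back-of-the-envelope sketch. It assumes a simplified cost model in which fetching a page means sorting every document up to the end of that page (`from + size`), and it ignores shards and other details. `docs_sorted_while_paging` is a hypothetical helper for this illustration.

```python
def docs_sorted_while_paging(total_docs: int, page_size: int) -> int:
    """Total number of documents sorted when paging through everything.

    Simplified model (an assumption): each page sorts all documents
    up to its own end offset, i.e. from + size.
    """
    cost = 0
    for start in range(0, total_docs, page_size):
        # The last page may be shorter than page_size.
        cost += min(start + page_size, total_docs)
    return cost


total = 1_000_000
paged = docs_sorted_while_paging(total, 1000)
print(paged)          # 500500000
print(paged / total)  # 500.5
```

Under this model, paging through one million documents in pages of 1000 sorts about 500 times as many documents as a single pass over the data, which is the kind of overhead that keeping intermediate state between requests avoids.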
Thanks @honzakral for your suggestions.
Are you talking about this method? Its documentation says...
I don't find it very helpful, and an example of its usage is also missing. I searched through the documentation but couldn't find anything that talks about using the params method to specify additional arguments. Could you show me with an example how this method may be used for my use case? On a side note, I had another query: since we are going to export millions of docs, I was wondering if there is a way to continue a scan from a particular point, like we are able to do easily with pagination / slicing? (As you can imagine, continuing like this would save us a ton of time.)
For that use the |
Slicing seems to work only once for me. After debugging, I found that the `size` is set to `0` in `s` even though I assign it to 1000 in the while loop below. Is this a bug I just found, or is something wrong with my code itself?