Increase speed of reading Solr data by spark #351

Open
bplaye opened this issue May 9, 2022 · 0 comments
Hello everyone,

I am using spark-solr to fetch 2 or 3 attributes (id and date attributes) from Solr, but it takes tens of seconds to fetch a few hundred thousand documents.

My Solr collections have around 10 shards, each with 4 replicas. The collections contain from tens of millions to hundreds of millions of documents.
For the Lucidworks spark-solr connector, I set rows to 10000 and splits to true.
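
For reference, the read described above can be sketched roughly as follows in PySpark. This is only an illustration of the setup, not the exact job: the ZooKeeper address, the collection name, the field name "date", and the helper functions `solr_read_options` and `read_from_solr` are all hypothetical, and the sketch assumes the spark-solr jar is on the Spark classpath.

```python
from typing import Dict, List


def solr_read_options(zkhost: str, collection: str, fields: List[str],
                      rows: int = 10000, splits: bool = True) -> Dict[str, str]:
    """Build the spark-solr read options as a string map (hypothetical helper)."""
    return {
        "zkhost": zkhost,              # ZooKeeper connect string of the SolrCloud cluster
        "collection": collection,      # collection to read from
        "fields": ",".join(fields),    # only fetch the few attributes that are needed
        "rows": str(rows),             # page size per request, 10000 as in this issue
        "splits": str(splits).lower()  # intra-shard splitting enabled, as in this issue
    }


def read_from_solr(spark, options: Dict[str, str]):
    """Load a DataFrame through the spark-solr data source (needs a live cluster)."""
    return spark.read.format("solr").options(**options).load()


if __name__ == "__main__":
    # Placeholder values; replace with the real ZooKeeper address and collection.
    opts = solr_read_options("zk1:2181/solr", "my_collection", ["id", "date"])
    print(opts)
```

Only `solr_read_options` runs without a cluster; `read_from_solr` would be called with an active SparkSession.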

Is this the expected behavior? (I mean, is Solr inherently slow at fetching data?) Or could you help me understand how to configure Solr and the Lucidworks connector to increase the fetch speed? I could hardly find any answers on the internet.

Thank you for your help :)
