New db to speed up full text queries and library updates #114
First, let me say a huge thank you for the work you are putting into this and the other related PR. I really like it and appreciate it. I have, however, one suggestion to make. I suggest removing We have a SQLite DB that has all the components stored in a raw format. From that, we generate the per-category JSON files. We already have a library abstraction that hides the DB and allows you to operate in "part" and "category" terms. The whole generator is in jlcparts/jlcparts/datatables.py (lines 351 to 384 in 1a41275).
I have two suggestions:
What do you think?
Most welcome. I have been using your page quite often recently, and I love it, so thank you for creating it.
If we remove the generateJsonlinesDatabaseFiles.js file and put the processing in the Python script, then there's no need for the original subcategory JSON files to be created. If they're not used by anything else, then of course they can go. I wrote it in JS because that's easier for me; I don't program in Python. I can add it to the Python script; it shouldn't be too complex.
I think the first option is the best (i.e. change the output format of the generator), as it's a simple change and leverages your existing code, which has proven to work and handles a bunch of edge cases.
The Python script has been updated to generate the single output file without creating the *.json.gz and *.stock.json files, and the JavaScript file has been nuked.
@yaqwsx Is there anything further I should do with this PR (or the others)?
It's my turn; I'll review them ASAP. I just ran out of the time I had allocated for OSS project maintenance, as there was a lot of work on KiKit with the recent release of KiCAD v8.
Ok, no worries. I'm not trying to push you (I can use my test site); I just wasn't sure if I'd done it wrong. I did notice that the auto-merge of the different PRs makes bad merges (they touch similar parts of the same files). If you can say which PRs you want to include, I can rebase those on each other, sequentially, and correct the bad merges.
All the .json.gz and .stock.json files have been processed into three compressed tables/files, downloadable as a single 20 MB tar archive; these tables are copied directly into IndexedDB without decompression. The database update now takes ~150 ms plus the time to download the 20 MB archive.
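To make the archive layout concrete, here is a minimal sketch of how a generator could gzip each table as JSON lines and bundle the results into one uncompressed tar file. It is an illustration under assumptions, not the code in this PR: the function names `pack_tables` and `unpack_tables`, the `.jsonlines.gz` member suffix, and the row fields are all hypothetical.

```python
import gzip
import io
import json
import tarfile


def pack_tables(tables: dict[str, list[dict]]) -> bytes:
    """Gzip each table as JSON lines and bundle the results in one tar archive.

    Table names and the ".jsonlines.gz" suffix are hypothetical.
    """
    archive = io.BytesIO()
    # Members are already gzip-compressed, so the tar itself stays uncompressed.
    with tarfile.open(fileobj=archive, mode="w") as tar:
        for name, rows in tables.items():
            jsonl = "\n".join(json.dumps(row, separators=(",", ":")) for row in rows)
            payload = gzip.compress(jsonl.encode("utf-8"))
            info = tarfile.TarInfo(name=f"{name}.jsonlines.gz")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return archive.getvalue()


def unpack_tables(blob: bytes) -> dict[str, bytes]:
    """Return each member's bytes, still compressed, keyed by member name."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        return {m.name: tar.extractfile(m).read() for m in tar.getmembers()}
```

Because `unpack_tables` never decompresses anything, splitting the archive is cheap, which is consistent with the short update time reported above; a client would store each returned blob as-is.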
Running queries on IndexedDB for full text search previously took around 15 seconds for "Select". These queries on this new db take around 2 seconds for "Select", and act on the compressed tables. Note: the query takes ~1.5 seconds, but an extra variable amount of time is lost to allow aborting the query, and yielding to the UI.
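The idea of querying a compressed table while staying abortable can be sketched as follows. This is a Python illustration of the technique only; the PR itself runs in the browser, and the function name `search_compressed`, the `should_abort` callback, and the `check_every` interval are assumptions introduced for this example.

```python
import gzip
import io
import json
from typing import Callable, Iterator


def search_compressed(table_gz: bytes, needle: str,
                      should_abort: Callable[[], bool] = lambda: False,
                      check_every: int = 1000) -> Iterator[dict]:
    """Yield rows whose raw JSON line contains needle (case-insensitive).

    The gzip stream is decompressed line by line, so the whole table is
    never held in memory in decompressed form.
    """
    needle = needle.lower()
    with gzip.open(io.BytesIO(table_gz), mode="rt", encoding="utf-8") as lines:
        for i, line in enumerate(lines):
            # Periodic abort check: the overhead traded for responsiveness.
            if i % check_every == 0 and should_abort():
                return
            # Matching on the raw line also matches JSON keys, which is
            # acceptable for a sketch; parse only the lines that hit.
            if needle in line.lower():
                yield json.loads(line)
```

Parsing only the matching lines keeps the per-row cost low, and a smaller `check_every` makes an abort take effect sooner at the price of more frequent checks.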
The code to process the .json.gz/.stock.json files is in JavaScript, and takes around 15 seconds to complete.
I'm currently using gzip for compression. I tried lz4, which was 3x faster until I changed the code to allow streaming decompression. It could be worth looking at in future, when lz4 is natively supported by browsers.