Items table population is a single transaction doing 100000 inserts #500
-
Context While building the schema, i noticed that the population of items table takes a huge amount of time ~1.5 hours. On debugging the above i found out that this is a single transaction which is doing 100000 inserts into the items table at once. Hence this is very slow, on the other hand in other table populations such as stock the commits are happening at some intervals. Please let me know if this is deliberately done in this way. Also is there any way to make it quicker ? |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment
-
OK so as it is open source, we can do a quick test by modifying the build script to time the item load:
On a test laptop, it takes 11 seconds to load the items table: Also if you look in the source you can see that it is not loading all the rows in one insert, however instead it is reporting the rows loaded only at 50K and 100K, it does in fact load each row with a separate insert.
Actually, this is the smallest table so we let the monitor virtual user take care of it as the other virtual users will take longer loading the other tables that scale in size with the warehouse count. Considering your load is almost 500X slower than the example, it is possible that you are making a basic error of attempting to load a database in the cloud from a local client. Doing this means that each and every insert needs to do a round trip from client to database, if network latency is high then the load is going to be slow. Consequently, you have 2 options. Firstly, locate your client in the cloud in the same region as your database. Secondly, do a bulk load from flat files using the datagen feature, the documentation is here https://www.hammerdb.com/docs/ch13.html and as it says: This option may be preferred for example where the target database to load is located in the cloud or where the target database has a column structure meaning that load performance using batch inserts is poor. Note, however, that if you are also planning to test from a remote location, then network latency is going to have an impact here as well. |
Beta Was this translation helpful? Give feedback.
OK so as it is open source, we can do a quick test by modifying the build script to time the item load:
On a test laptop, it takes 11 seconds to load the items table:
Also if you look in the source you can see that it is not loading all the rows in one insert, however instead it is reporting the rows loaded only at 50K and 100K, it does in fact load each row with a separate insert.