Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review gist of CrateDB adapter #63

Open
amotl opened this issue Jul 22, 2024 · 1 comment
Open

Review gist of CrateDB adapter #63

amotl opened this issue Jul 22, 2024 · 1 comment

Comments

@amotl
Copy link
Member

amotl commented Jul 22, 2024

Hi there,

we just talked about TSPERF again, because both TSBS and RockBench will no longer be maintained going forward, so TSPERF could be considered as a good alternative.

Can I humbly ask you to evaluate and/or confirm this is the optimal / most-performant way to insert data into CrateDB?

# Create table.
stmt = f"""CREATE TABLE {self.table_name} ("ts" TIMESTAMP WITH TIME ZONE,
"g_ts_{self.partition}" TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('{self.partition}', "ts"),
"payload" OBJECT(DYNAMIC))
CLUSTERED INTO {self.shards} SHARDS
PARTITIONED BY ("g_ts_{self.partition}")
WITH (number_of_replicas = {self.replicas})"""
logger.info(f"Preparing database with statement:\n{stmt}")
self.cursor.execute(stmt)
@timed_function()
def insert_stmt(self, timestamps: list, batch: list):
stmt = f"""INSERT INTO {self.table_name} (ts, payload) (SELECT col1, col2 FROM UNNEST(?,?))""" # noqa: S608
self.cursor.execute(stmt, (timestamps, batch))

With kind regards,
Andreas.

/cc @WalBeh, @proddata, @hammerhead, @hlcianfagna

@hammerhead
Copy link
Member

According to the documentation, UNNEST is still faster than multi-value statements. But it also says that the HTTP bulk endpoint is comparable to UNNEST:

The advantages are the same as using the UNNEST method:

From that perspective, I think we can use the more common self.cursor.executemany over UNNEST. Which is also what we typically use in production (for Python environments).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants