-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: migrate from icelake to iceberg-rust #19887
Conversation
We should merge risingwavelabs/iceberg-rust#10 first, and after that, I will rebase this PR and convert it into the normal state. |
Is wrong PR link in the description? |
Thanks! Good catch!🤣 |
deb061e
to
84d2463
Compare
b5dac34
to
2a93eba
Compare
async fn update_table(self: Arc<Self>, update_table: &UpdateTable) -> icelake::Result<Table> { | ||
execute_with_jni_env(self.jvm, |env| { | ||
let request_str = serde_json::to_string(&CommitTableRequest::try_from(update_table)?)?; | ||
|
||
let request_jni_str = env.new_string(&request_str).with_context(|| { | ||
format!("Failed to create jni string from request json: {request_str}.") | ||
})?; | ||
|
||
let result_json = | ||
call_method!(env, self.java_catalog.as_obj(), {String updateTable(String)}, | ||
&request_jni_str) | ||
.with_context(|| { | ||
format!( | ||
"Failed to update iceberg table: {}", | ||
update_table.table_name() | ||
) | ||
})?; | ||
|
||
let rust_json_str = jobj_to_str(env, result_json)?; | ||
|
||
let response: CommitTableResponse = serde_json::from_str(&rust_json_str)?; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation should be migrated to iceberg-rust JNI as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM. Thank you so much for this PR!!!
@@ -402,8 +312,57 @@ impl CatalogV2 for JniCatalog { | |||
} | |||
|
|||
/// Update a table to the catalog. | |||
async fn update_table(&self, _commit: TableCommit) -> iceberg::Result<TableV2> { | |||
todo!() | |||
async fn update_table(&self, mut commit: TableCommit) -> iceberg::Result<TableV2> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TableV2
now can be rename to Table
@@ -82,7 +82,7 @@ impl IcebergProperties { | |||
java_catalog_props.insert("jdbc.password".to_owned(), jdbc_password); | |||
} | |||
// TODO: support java_catalog_props for iceberg source | |||
self.common.load_table_v2(&java_catalog_props).await | |||
self.common.load_table(&java_catalog_props).await |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Methods contain v2
can be renamed as well
} | ||
|
||
#[allow(clippy::type_complexity)] | ||
enum IcebergWriterDispatch { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember we had some discussion on the writer trait. Why do we now use enum instead?
db488e8
to
6ff4d64
Compare
32d5892
to
c942311
Compare
let table_metadata = response.metadata; | ||
|
||
let file_io = FileIO::from_path(&response.metadata_location)? | ||
.with_props(self.config.table_io_configs.iter()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we don't need self.config
(BaseCatalogConfig is a struct from icelake) anymore, we can simplify it to a map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work!!
BTW I think splitting this PR into smaller ones (separate refactor and feat) can make it easier to review more carefully:
- migrate sink to iceberg-rust, without touching legacy code
- deleting icelake code
- rename
*_v2
to*
But I'm also ok to merge it as long as ci passes since iceberg connector is under heavy development status. 🤪
Let's move fast and get rid of icelake as soon as possible, because right now we have some known issues can't be resolved in icelake (like statistics). By migrating to iceberg-rust, we can support more complex nested data type for iceberg engine table, google cloud store integration, and so on. |
Thanks @chenzl25 ! I will send PR to get rid of all icelake later. |
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
This PR migrate icelake to iceberg-rust.
After risingwavelabs/iceberg-rust#10, iceberg-rust can support write, so we can migrate all things from icelake to iceberg-rust.
Checklist
Documentation
Release note