Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented custom SQL DB registration #917

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from
Draft

Conversation

zaleslaw
Copy link
Collaborator

@zaleslaw zaleslaw commented Oct 10, 2024

This pull request introduces several enhancements to the dataframe-jdbc module, primarily focusing on adding support for custom database types (DbType) in various methods. Additionally, it includes a minor change to make the H2 class extensible.

Also, this PR in the client application is based on 0.15-dev built on this branch

Enhancements to database type support:

  • dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/readJdbc.kt:
    • Added optional dbType parameter to multiple methods, allowing custom database type objects to be passed. This parameter defaults to null and, if not provided, the database type is determined from the connection.

Minor changes:

  • dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/db/H2.kt:
    • Changed H2 class to be open, allowing it to be extended.

For finalization, it requires

This commit introduces an optional `dbType` parameter to various JDBC read functions. It allows for specifying the database type directly, enhancing flexibility and control over query execution. This change ensures backward compatibility by defaulting to type inference when `dbType` is not provided.
This update introduces the ability to specify custom database types in the data schema generation process. The changes include the addition of a `dbTypeClassName` field and modifications to pass `DbType` to relevant functions. This enhances flexibility in handling various database types beyond the default configurations.
This update refactors the DataSchemaGenerator to use ServiceLoader for loading DbTypeProvider and removes hard-coded class loading. It also changes the annotation parameter from dbTypeClassName to dbTypeKClass for better type safety.
Copy link
Contributor

Generated sources will be updated after merging this PR.
Please inspect the changes in here.

@Jolanrensen Jolanrensen marked this pull request as ready for review October 14, 2024 11:02
@@ -113,16 +113,18 @@ public data class DbConnectionConfig(val url: String, val user: String = "", val
* @param [tableName] the name of the table to read data from.
* @param [limit] the maximum number of rows to retrieve from the table.
* @param [inferNullability] indicates how the column nullability should be inferred.
* @param [dbType] the type of database, could be a custom object, provided by user, optional, default is `null`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe specify what happens when it's null, aka, it will be inferred from dbConfig

@@ -142,12 +145,13 @@ public fun DataFrame.Companion.readSqlTable(
tableName: String,
limit: Int = DEFAULT_LIMIT,
inferNullability: Boolean = true,
dbType: DbType? = null,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While having a dbType argument here is great for building and debugging support for new DbTypes, I do feel like it's not "complete custom support" of new datatypes.

To be fully extensible, like with DataFrame.read(), I think we should support having service-registered DbTypes. Then a user should be able to do:

implementation("org.jetbrains.kotlinx.dataframe...")
implementation("com.custom.dbtype-for-df")

and DataFrame.readSqlTable(MyCustomDbConnection) should work, recognizing the custom DBType that's on the classpath.

Still, it's just an idea and having this dbType argument here is still a must for when the automatic dbtype detection fails.

@Jolanrensen Jolanrensen marked this pull request as draft October 14, 2024 11:02
@@ -967,4 +972,127 @@ class JdbcTest {
}
exception.message shouldBe "H2 database could not be specified with H2 dialect!"
}

// helper object created for API testing purposes
object CustomDB: H2(MySql)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's only for testing, i'd recommend to keep H2 final and use object CustomDB : DbType by H2(MySql)


@Test
fun `read from table from custom database`() {
val tableName = "Customer"
Copy link
Collaborator

@koperagen koperagen Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of assertions test correct reading of data, but actually functionality under test is correct propagation of DbType, as i understand. So most of it is duplicated code? In any case, changes that will trigger H2 tests assertions to fail, will fail assertions here too. For this reason please consider a different approach here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants