
Add batch bulk insert, refactor insert #14

Draft · wants to merge 4 commits into base: dev-0.5.0
Conversation

Sylvyrfysh
Contributor

This begins adding batch and bulk insert operations which, on a data size of 250 items, increase performance by 5% and 5000%, respectively. Since bulk does so much better, I considered dropping batch, but it may be necessary in some situations I do not yet know about.
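For readers unfamiliar with the distinction, a minimal sketch of the two SQL shapes involved (the helper names `buildBatchSql`/`buildBulkSql` and the table/column names are hypothetical, not part of this PR): a batch insert re-executes one single-row statement via the JDBC batch API, while a bulk insert packs all rows into a single multi-row `VALUES` clause, which is why it scales so much better over the wire.

```kotlin
// Hypothetical helpers contrasting the SQL generated for batch vs. bulk inserts.

// Batch: one single-row statement, bound and queued repeatedly with
// PreparedStatement.addBatch() and sent once with executeBatch().
fun buildBatchSql(table: String, columns: List<String>): String {
    val placeholders = columns.joinToString(", ") { "?" }
    return "INSERT INTO $table (${columns.joinToString(", ")}) VALUES ($placeholders)"
}

// Bulk: a single statement carrying every row in one multi-row VALUES clause,
// so only one round trip and one statement execution is needed.
fun buildBulkSql(table: String, columns: List<String>, rowCount: Int): String {
    val row = "(" + columns.joinToString(", ") { "?" } + ")"
    val rows = List(rowCount) { row }.joinToString(", ")
    return "INSERT INTO $table (${columns.joinToString(", ")}) VALUES $rows"
}

fun main() {
    println(buildBatchSql("users", listOf("id", "name")))
    // INSERT INTO users (id, name) VALUES (?, ?)
    println(buildBulkSql("users", listOf("id", "name"), 3))
    // INSERT INTO users (id, name) VALUES (?, ?), (?, ?), (?, ?)
}
```

The batch form keeps one prepared statement cached and lets the driver pipeline the executions; the bulk form trades statement reuse for a single round trip, which matches the performance numbers above.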

I have marked this as not ready to pull for two reasons:

  1. Tests are not complete
  2. These are probably not where they should be yet. I kept them alongside the other inserts, but they're really not on one model.

Let me know if you'd like anything changed, or feel free to make them yourself.

@Sylvyrfysh
Contributor Author

This is built on my other request branch, so it needs to be pulled after that or refactored.

@Sylvyrfysh
Contributor Author

In my testing, if using HikariCP, batch is by far the fastest.

@vsch
Owner

vsch commented Oct 12, 2019

@Sylvyrfysh, this is awesome. Give me until tomorrow to merge it in. I was doing some fire-hose cleanup in 0.5.0, thinking I had a day or two before releasing a new version.

I left some of the original code from KotliQuery and focused on adding much-needed features. Going through the code now, I decided it was time to clean up its limitations and unnecessary overheads, take care of kludges I had added, and add some syntactic sugar (or so I thought), which turned out to be an improved parameter-type preservation mechanism.

I also found that listQuery generated wrong SQL. The code looked like an unfinished cut/paste, with a few places failing to pass the quote or alias to downstream implementations. I really need to add tests for all methods and parameter variations, and will focus more on tests.

Here is the extract from version history for the version committed to dev-0.5.0:

0.5.0-beta-8

  • Fix: SqlCall out params are now passed by index instead of name, to make them independent of the actual SQL procedure parameter names.

  • Fix: param() to work for non-null values when the calling context's capture type is *; the previous implementation would always result in Any type, while the fixed implementation uses the value's class as the type's class.

  • Fix: param() no longer constructs a new instance if the passed value is already a Parameter<*>.

  • Fix: SQL generation for model listQuery with where clause and/or alias

  • Add: InOut type to Parameter(); defaults to InOut.IN, so a parameter carries its direction as well as its type.

  • Add: inTo, outTo and inOutTo infix functions to use instead of to for param creation. These create parameters with in/inOut/out direction, eliminating the need for multiple params() functions to carry directional information. The default direction is in if not provided.

    They also create an instance of Parameter when the captured type is known. Functions using param() from a collection iteration have * for the type capture and lose the actual type, resorting to Any (i.e. Object).

    These also have Collection<T> versions which create a Parameter() with a collection value whose type is T::class.java. When generating parameters for a prepared statement, this type is used to determine the SQL type of the collection's elements instead of the generic Any used before, again because of the * captured type from a collection.

  • Fix: extract SqlQueryBase<T> for SqlQuery and SqlCall to eliminate the need to override functions just to cast them to the right class. SqlCall no longer extends SqlQuery.

  • Fix: SqlQueryBase now stores each parameter in Parameter() data-class format. This eliminates creating a new instance with param(), since if the value is already a parameter the same instance is returned.

  • Fix: deprecate old methods in favor of directional parameter declarations.

  • Fix: change Session methods to use SqlQueryBase<*> when either SqlQuery or SqlCall can be used. For some, like updates with get keys, SqlCall never returns any keys even when the stored procedure executes DML that does, so these are now SqlQuery only. Similarly, those only applicable to SqlCall are now SqlCall typed and will not take SqlQuery.

  • Fix: deprecate Session.forEach(SqlCall, (stmt: CallableStatement) -> Unit, (rs: ResultSet, index: Int) -> Unit)

  • Add: Session.executeCall(SqlCall, (results: SqlCallResults) -> Unit) to replace Session.forEach(SqlCall, (stmt: CallableStatement) -> Unit, (rs: ResultSet, index: Int) -> Unit). Use SqlCallResults.forEach to process result sets returned by the procedure call. To get values of out params use SqlCallResults.get{Type}(paramName) to get non-null values or SqlCallResults.get{Type}OrNull(paramName) to get nullable values.

    For example, when using old forEach the code was:

    val sqlQuery = sqlCall("""CALL processInstances(:clientId, :types)""")
      .inParams("clientId" to 35)
      .inOutParams("types" to "")
    
    val jsonResult = MutableJsObject()
    var types: List<String> = listOf()
    
    session.forEach(sqlQuery, { stmt ->
        types = stmt.getString("types").split(',')
    }) { rs, index ->
        val key = types[index]
        jsonResult[key] = MutableJsArray(Rows(rs).map(toJsonObject).toList())
    }

    Using the new executeCall and directional param declarations, the code changes to:

    val sqlQuery = sqlCall("""CALL processInstances(:clientId, :types)""",
              mapOf("clientId" inTo 35, "types" inOutTo ""))
    
    val jsonResult = MutableJsObject()
    
    session.executeCall(sqlQuery) { results: SqlCallResults ->
        val types = results.getString("types").split(',')
        results.forEach { rs, index ->
            val key = types[index]
            jsonResult[key] = MutableJsArray(Rows(rs).map(toJsonObject).toList())
        }
    }

I will continue working on it tomorrow and merge your changes into the code.

@Sylvyrfysh
Contributor Author

Sounds good! I am no SQL expert myself, and will trust you to update things as they come up. I'm mostly adding things that are of use in my Database Systems class for this one project, and will continue contributing as long as I can. I've found a couple of issues around HikariCP and default schemas that I might contribute fixes for if I can both find the cause and fix it; otherwise I'll open an issue. Thank you for creating as much as you have, though; this is much nicer than a standard Java API!

@vsch
Owner

vsch commented Oct 13, 2019

> I'm mostly adding things that are of use in my Database Systems class for this one project

This is how this project got started and evolved. I did it originally trying to convince a client to ditch Scala Slick, because it is to SQL what pig-latin is to English. Unfortunately, they did not go for it, thinking it would delay the project. The reality is the reverse: in the last two years I wasted a lot of (billable) time trying to figure out how to do a simple SQL query in Scala Slick. In the end it always comes out as undecipherable gibberish that only a Scala disciple can unravel.

I appreciate all input and contributions. My time these days is scarce, and I wish I could spend more of it on this. Kotlin has enough flexibility to make a very intuitive SQL API.

@vsch
Owner

vsch commented Dec 17, 2019

@Sylvyrfysh, I will add my comments and questions. I like the idea, because bulk and batch inserts work so much faster. This is especially noticeable over remote DB connections.

However, I am not sure if the current implementation addresses the general use cases.

@Sylvyrfysh
Contributor Author

I don't know that it does, but with 0.5.0 and releasing the companion, I'm not sure of a better way.

@vsch
Owner

vsch commented Dec 19, 2019

@Sylvyrfysh, I will think about it, but one way I can see is to group the model instances by matched insert column list. Then all the models in a group can be bulk inserted, and the code will work for all model definitions.

I am not sure about the batch insert, since I have never used it. Is it just a driver optimization to send multiple inserts as a single request, or does it too have one insert column list followed by multiple sets of insert values?
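The grouping idea above can be sketched as follows (a hypothetical illustration, not code from this PR; models are represented here as plain column-to-value maps rather than the library's model classes): grouping instances by their set of populated columns yields groups that each share one insert column list, so each group can be bulk inserted with a single multi-row statement.

```kotlin
// Hypothetical sketch: group model instances by their insert column list.
// Each resulting group shares identical columns, so a single multi-row
// INSERT statement can cover the whole group.
fun <V> groupByInsertColumns(models: List<Map<String, V>>): Map<Set<String>, List<Map<String, V>>> =
    models.groupBy { it.keys }

fun main() {
    val models = listOf(
        mapOf("id" to 1, "name" to "a"),
        mapOf("id" to 2),                // no "name": different column list
        mapOf("id" to 3, "name" to "c"),
    )
    val groups = groupByInsertColumns(models)
    println(groups.size) // 2 groups: {id, name} and {id}
}
```

This would let the bulk path work for any mix of model definitions: models with fully matching column lists get the fast multi-row insert, and odd ones out simply form their own (possibly single-row) groups.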
