Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pipelining mode #160

Merged
merged 36 commits into from
Apr 28, 2024
Merged

Pipelining mode #160

merged 36 commits into from
Apr 28, 2024

Conversation

nikita-volkov
Copy link
Owner

@nikita-volkov nikita-volkov commented Apr 20, 2024

This is a draft of integration with the pipelining mode provided in libpq >14. So far I haven't done any testing at all, so it might just not work at all yet :)

The purpose of this PR is to provide a basis for all the other necessary stuff for the release to be added to. I've managed to make the API backwards compatible.

What's lacking of the necessary stuff:

  • Tests
  • Docs
  • Solution for distributing the libpq-wrapper lib

What would be nice to have for the release:

  • Benchmarks
  • Profiling

Everything is up for discussion here including the current design and implementation. In particular I think there is some space for performance improvement in the way the Pipeline abstraction is implemented, but it might do okay for the initial release.

Any contribution is welcome. If you decide to jump in on any of the mentioned stuff please make PRs to this branch. Also prior to jumping on any work please notify in this thread to avoid stepping on each others toes.

FYI @GulinSS, @steve-chavez, @robx, @wolfgangwalther, @domenkozar, @avanov

@nikita-volkov nikita-volkov marked this pull request as draft April 20, 2024 09:03
* master:
  Isolate testing utils
  Format
  Update the actions
* master:
  Rename runSession to runSessionOnLocalDb
  Factor constants out
@nikita-volkov
Copy link
Owner Author

Made a bunch of tests. It doesn't work yet. Has to do with the processing of results. I'll work on this further.

@nikita-volkov
Copy link
Owner Author

Okay. I've managed to resolve this. However the error processing is now a bit dirty during pipeline execution. The error model needs revision unfortunately, so the changes will likely be backward incompatible. Working on it.

@avanov
Copy link

avanov commented Apr 21, 2024

@nikita-volkov thanks for looking into this! I personally anticipate it a lot, been hoping for it since your old experiments with the binary protocol, as I've seen a few cases in a commercial setting where Haskell data processors were losing perf to Python because the latter had support for pipelining, so this can level the playing field in future debates over picking a tech stack for a service :) In fact, I haven't heard of any other Haskell PG driver that would provide support for pipelining, so this could be the first one. 🚀

Regarding the error handling, FWIW, the Python one essentially has two different interfaces: the plain old regular interface for .execute() and the pipelining mode via .executemany(), and the error/exception recovery assumptions for these modes are different too. Have you considered keeping the pipelining interface as a separate entity with a minimal level of primitives sharing between the current hasql-transaction statements and the pipelining statements (let's say, sharing only at a level of Connection)?

@nikita-volkov
Copy link
Owner Author

nikita-volkov commented Apr 22, 2024

Alright guys, It seems to be fully working now! It took some revision of the error model unfortunately. I think I'll polish it in the coming days.

I'll now extend the task list on this feature with the following requirement:

  • Provide extensive instructions on the migration of the error model in the changelog

Another good news is that it's now covered with a bunch of tests. Following are the results. PRs with more tests are welcome!

Hasql.Pipeline
  Single-statement
    Unprepared
      Collects results and sends params [✔]
    Prepared
      Collects results and sends params [✔]
  Multi-statement
    On unprepared statements
      Collects results and sends params [✔]
    On prepared statements
      Collects results and sends params [✔]
    When a part in the middle fails
      With query error
        Captures the error [✔]
        Leaves the connection usable [✔]
      With decoding error
        Captures the error [✔]
        Leaves the connection usable [✔]

@avanov Unfortunately looking for ways to isolate it completely would have caused complications in both the API and the implementation.

@nikita-volkov
Copy link
Owner Author

Alright. Another milestone has been reached! I've bundled the necessary updates to libpq into Hasql, so that we neither get blocked by having them merged into postgresql-libpq or having somebody maintain a fork just because of that.

Updated the changelog too.

@nikita-volkov nikita-volkov marked this pull request as ready for review April 28, 2024 08:19
@nikita-volkov nikita-volkov changed the title Pipelining mode (WIP) Pipelining mode Apr 28, 2024
@nikita-volkov
Copy link
Owner Author

nikita-volkov commented Apr 28, 2024

Docs are now there. Changelog is updated. I consider the absence of criticism during this PR a positive feedback :) Merging!

@nikita-volkov nikita-volkov merged commit 5b6dd9d into master Apr 28, 2024
4 checks passed
@nikita-volkov nikita-volkov deleted the pipelining branch April 28, 2024 13:01
@nikita-volkov
Copy link
Owner Author

Released in 1.7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants