Sync riverqueue with upstream #4

tigrato · 2025-01-15T19:05:23Z

No description provided.

These were swapped and pointing at each other's PRs.

… counts (#698) This one's here to address #697, where's it been reported that it may be possible for retry schedules to overflow to the past when reaching degenerately high numbers of attempts. I couldn't reproduce the reported problem, but it is possible for `time.Duration` to overflow, so here we shore up `DefaultClientRetryPolicy` so that we're explicitly defining what behavior should occur under these conditions. The maximum length of time that can go in a `time.Duration` is about 292 years. Here's a sample program that demonstrates an overflow happening: func main() { const maxDuration time.Duration = 1<<63 - 1 var maxDurationSeconds = float64(maxDuration / time.Second) notOverflowed := time.Duration(maxDurationSeconds) * time.Second fmt.Printf("not overflowed: %+v\n", notOverflowed) overflowed := time.Duration(maxDurationSeconds+1) * time.Second fmt.Printf("overflowed: %+v\n", overflowed) } not overflowed: 2562047h47m16s overflowed: -2562047h47m16.709551616s Here, modify `DefaultClientRetryPolicy` so that if it were to return a next retry duration greater than what would fit in `time.Duration`, we just return 292 years instead. The maximum bound occurs at 310 retries, so every retry after 310 increments by 292 years. I didn't bother putting a maximum bound on the time returned by `NextRetry` because the maximum `time.Time` in Go is somewhere in the year 219250468, so even in 292 year increments, you'll never get there.

While looking into #695 the other day, I was reminded that the handling of `time.Duration` in things like logging is potentially not very good. `time.Duration` has a good `String()` override that tends to get used with text handlers, but for various legacy reasons it doesn't have a `MarshalJSON` implementation, so when dumped to JSON it gets put in nanoseconds, which isn't very readable for human or computer. This is relevant to use because slog's `JSONHandler` and `TextHandler` will probably be the most common logging instruments for River. e.g. func main() { dur := 500 * time.Millisecond loggerText := slog.New(slog.NewTextHandler(os.Stdout, nil)) loggerText.Info("with text handler", "sleep_duration", dur) loggerJSON := slog.New(slog.NewJSONHandler(os.Stdout, nil)) loggerJSON.Info("with JSON handler", "sleep_duration", dur) } time=2024-12-19T21:36:41.948-07:00 level=INFO msg="with text handler" sleep_duration=500ms {"time":"2024-12-19T21:36:41.949239-07:00","level":"INFO","msg":"with JSONtext handler","sleep_duration":500000000} Here, change the various places that we log sleep to use a more definitive value for purposes of the log. In this case a duration string representation like `10s`. I debated making this a float instead that'd be represented in seconds because the float would be more parseable for ingestion engines like Splunk. I went with the former option in the end though because it's probably better for humans and most other cases, and we could always add an alternative `sleep_seconds` field that's a float later if someone asks for it.

Bumps the go-dependencies group with 1 update: [github.com/jackc/pgx/v5](https://github.com/jackc/pgx). Updates `github.com/jackc/pgx/v5` from 5.7.1 to 5.7.2 - [Changelog](https://github.com/jackc/pgx/blob/master/CHANGELOG.md) - [Commits](jackc/pgx@v5.7.1...v5.7.2) --- updated-dependencies: - dependency-name: github.com/jackc/pgx/v5 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: go-dependencies ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

This one's related to #676, in which it was reported that database URLs work somewhat suboptimally when dealing with passwords that contain characters that traditionally require encoding in URLs (e.g. various types of brackets). I was originally going down a path wherein we'd offer an alternative that'd involve adding additional CLI parameters that could work as an alternative to `--database-url` based on these ones from psql: Connection options: -h, --host=HOSTNAME database server host or socket directory (default: "local socket") -p, --port=PORT database server port (default: "5432") -U, --username=USERNAME database user name (default: "brandur") -w, --no-password never prompt for password -W, --password force password prompt (should happen automatically) However, as I implemented it, I found that pgx somewhat surprisingly doesn't allow you to instantiate your own config struct. If not parsed from a URL, it requires you to assemble a more traditional connection string like `user=XX pass=YY database=ZZ` and parse that, which introduces its own special character encoding challenges when using something like a space, equal, or quotation mark. Digging further I realized that pgx automatically supports the whole set of PG* environmental variables like `PGHOST`, `PGPORT`, `PGUSER`, and supporting these would add a solid way of using the River CLI without a database URL using a common interface that's also supported by psql and a variety of other pieces of related software. Even better, since we expect most people to be using a database URL most of the time, env vars give us a way to provide an alternative, but one which doesn't add any complication to the existing CLI interface. In general fewer parameters is better because it keeps the help docs easy to understand. Another advantage is that this will technically let us support more complex connection setups involving things like TLS certificates. All of these vars are supported, and none of which had a great answer with a database URL: * `PGSSLCERT` * `PGSSLKEY` * `PGSSLROOTCERT` * `PGSSLPASSWORD` Fixes #676.

Just reduce some really easy to avoid slice allocations for every unique insertion: * No reason to have `requiredV3states` allocated every time a new unique opts is validated. It never changes. * Leave `missingStates` unallocated by default. Most of the time a unique insertion will have have no missing unique states, so this skips a unique allocation completely the overwhelmingly majority of the time.

…ptsByStateDefault()` (#707) Two small ones related to unique job insertion: * Adding a missing doc comment on `JobRow.UniqueStates`. * As discussed in #704, reveal a way for end users to get the default set of unique job states via `rivertype.UniqueOptsByStateDefault()`. Similar to `rivertype.JobStates()`, this is revealed as a function so that it's not possible to accidentally mutate a global slice. (Go: readonly variables would sure be pretty nice.)

We've got a few changes queued up in `master` and it's been a while since the last release so go for a Boxing Day gift with 0.15.0. [skip ci]

…711) One nice thing about the use of a Go workspace is that it's no longer necessary to use `replace` directives in the River CLI module in `./cmd/river`, but it's still able to use latest code from the other modules. This means that we no longer need the finnicky two-phase release process that was previously necessary for the CLI. Here, correct development instructions to detail only the single unified release flow. Note the use of `[skip ci]` so as not to pollute the Go Module cache with the release PR's CI run.

A tiny change to add a missing parenthesis to one of the CLI help scripts (for `--database-url`). Also mention libpq when talking about `PG*` env vars (since that's how the Postgres docs mention them), and tighten up the language slightly.

I'd like to propose that we remove `go.work.sum` from version control. I'd say in general going to workspaces has improved 95%+ of workflows, but there are a couple annoying sharp edges. The main noticeable one is that on any new release versions in `go.work.sum` get updated, forcing the change to be checked in with a separate PR. Reading more about `go.work.sum`, I found that the Go developers didn't even bother to document it for two full years, only doing so reluctantly earlier this year [1]. Its purpose is to track dependencies that are used in the workspace but not part of any normal modules [2]: > The go command will maintain a go.work.sum file that keeps track of > hashes used by the workspace that are not in collective workspace > modules’ go.sum files. In River, only the modules, not the workspace, are public entities used in other projects, and all of their builds are fully reproducible because they all have `go.sum`. `go.work.sum` tracks some extra stuff, but I don't think any of it is crucial to what River's users are actually working with. Furthermore, I learned that the Go docs actually recommend not checking in `go.work` in many cases: > It is generally inadvisable to commit go.work files into version control systems, for two reasons: > > * A checked-in go.work file might override a developer’s own go.work > file from a parent directory, causing confusion when their use > directives don’t apply. > > * A checked-in go.work file may cause a continuous integration (CI) > system to select and thus test the wrong versions of a module’s > dependencies. CI systems should generally not be allowed to use the > go.work file so that they can test the behavior of the module as it > would be used when required by other modules, where a go.work file > within the module has no effect. > > That said, there are some cases where committing a go.work file makes > sense. For example, when the modules in a repository are developed > exclusively with each other but not together with external modules, > there may not be a reason the developer would want to use a different > combination of modules in a workspace. In that case, the module author > should ensure the individual modules are tested and released properly. I think the last paragraph applies more to River -- it's a clear UX improvement and we're all working in a fairly consistent environment without using any other external modules, but even so, if it's often safe that a `go.work` could be omitted, I think there's a good argument that `go.work.sum` could be omitted too. I tried this in a branch and `go get`ing River from another test project, and the removal of `go.work.sum` doesn't seem to have any negative effect. I'd like to propose that we try removing it from Git for a while and just see if there are any noticeable problems. I don't think there will be, and this will purely be beneficial as it lets us avoid needing to bump the file on every release, but we can always add it back in case that turns out to be wrong. [1] golang/go#51941 [2] https://go.dev/ref/mod#workspaces

bgentry and others added 13 commits December 19, 2024 10:06

Fix v0.14.3 changelog PR links

13155b7

These were swapped and pointing at each other's PRs.

add line name to migration comment (#703)

81ca3b9

correct wording on UniqueOpts description (#706)

fc61933

Prepare release 0.15.0 (#708)

58d30fb

We've got a few changes queued up in `master` and it's been a while since the last release so go for a Boxing Day gift with 0.15.0. [skip ci]

r0mant approved these changes Jan 15, 2025

View reviewed changes

tigrato merged commit bbbac33 into gravitational:teleport Jan 15, 2025
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sync riverqueue with upstream #4

Sync riverqueue with upstream #4

tigrato commented Jan 15, 2025

Sync riverqueue with upstream #4

Sync riverqueue with upstream #4

Conversation

tigrato commented Jan 15, 2025