
tt: add command tt upgrade #936

Open
mandesero wants to merge 1 commit into master from mandesero/gh-924-upgrade-db-schema

Conversation

@mandesero (Contributor) commented Sep 3, 2024

The tt upgrade command performs the following steps:

  • For each replicaset:
    • On the master instance:
      1. Execute the following commands sequentially:
        box.schema.upgrade()
        box.snapshot()
    • On each replica:
      1. Wait until the replica applies all transactions produced by box.schema.upgrade() on the master by comparing the vector clocks (vclock).
      2. Once synchronization is confirmed, execute the following command on the replica:
        box.snapshot()

If any error occurs during the upgrade, the process stops and an error report is generated.


Each replica waits for synchronization for up to Timeout seconds. The default Timeout is 5 seconds; it can be set manually using the --timeout option.

$ tt upgrade [<APP_NAME>] --timeout 10
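
Below is a minimal Lua sketch of the replica-side wait described above. It is illustrative only and not the actual tt implementation; master_id and master_lsn stand for the master's box.info.id and box.info.lsn captured right after the upgrade, and the timeout corresponds to the --timeout option.

```lua
-- Minimal sketch of the replica-side wait (not the actual tt code).
local fiber = require('fiber')

-- master_id and master_lsn are assumed to be the master's box.info.id and
-- box.info.lsn, captured right after box.schema.upgrade() + box.snapshot().
local function wait_master_lsn(master_id, master_lsn, timeout)
    local deadline = fiber.clock() + timeout
    -- box.info.vclock[master_id] is this replica's applied LSN for the master.
    while (box.info.vclock[master_id] or 0) < master_lsn do
        if fiber.clock() > deadline then
            error(('time quota %d seconds exceeded'):format(timeout))
        end
        fiber.sleep(0.1)
    end
end

-- On each replica (5 is the default --timeout):
-- wait_master_lsn(master_id, master_lsn, 5)
-- box.snapshot()
```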

You can also specify which replicaset(s) to upgrade by using the --replicaset option.

$ tt upgrade [<APP_NAME>] --replicaset <RS_NAME_1> -r <RS_NAME_2> ...

@mandesero force-pushed the mandesero/gh-924-upgrade-db-schema branch 3 times, most recently from b10d9bd to 34a3589 on September 6, 2024 15:31
@mandesero (Contributor, Author) commented Sep 6, 2024

Examples:

Case 1: OK

INSTANCE            STATUS   PID    MODE
 app2:storage-002-b  RUNNING  12556  RO
 app2:router-001-a   RUNNING  12560  RW
 app2:storage-001-a  RUNNING  12567  RO
 app2:storage-001-b  RUNNING  12554  RO
 app2:storage-002-a  RUNNING  12555  RW
$ tt upgrade app2
• storage-001: ok
• storage-002: ok
• router-001: ok

Case 2: More than one master in the same replicaset

INSTANCE            STATUS   PID    MODE
app2:storage-002-b  RUNNING  12556  RO
app2:router-001-a   RUNNING  12560  RW
app2:storage-001-a  RUNNING  12567  RO
app2:storage-001-b  RUNNING  12554  RW
app2:storage-002-a  RUNNING  12555  RW
$ tt upgrade app2
• storage-001: error
  ⨯ [storage-001]: app2:storage-001-a and app2:storage-001-b are both masters

Case 3: LSN didn't update

$ tt upgrade app2
• storage-001: error
   ⨯ [storage-001]: LSN wait timeout: error waiting LSN 2003085 in vclock component 1 on app2:storage-001-b: time quota 5 seconds exceeded

Case 4: There is a replicaset that does not have a master

 INSTANCE            STATUS   PID    MODE
 app2:storage-002-b  RUNNING  12556  RO
 app2:router-001-a   RUNNING  12560  RW
 app2:storage-001-a  RUNNING  12567  RO
 app2:storage-001-b  RUNNING  12554  RO
 app2:storage-002-a  RUNNING  12555  RO
$ tt upgrade app2
• storage-001: error
   ⨯ [storage-001]: has not master instance

Case 5: A non-existent replicaset was specified

$ tt upgrade app2 --replicaset foo
   ⨯ replicaset with alias "foo" doesn't exist

@mandesero force-pushed the mandesero/gh-924-upgrade-db-schema branch 4 times, most recently from 2b33e94 to f2406ee on September 8, 2024 12:06
@mandesero changed the title from "tt: add command tt upgrade" to "tt: add command tt upgrade [WIP]" on Sep 9, 2024
@mandesero force-pushed the mandesero/gh-924-upgrade-db-schema branch 3 times, most recently from 234b980 to e12fe07 on September 10, 2024 11:52
cli/upgrade/upgrade.go: outdated review thread (resolved)
@mandesero force-pushed the mandesero/gh-924-upgrade-db-schema branch 8 times, most recently from a30a1e5 to 45fc816 on September 17, 2024 09:59
@mandesero changed the title from "tt: add command tt upgrade [WIP]" to "tt: add command tt upgrade" on Sep 17, 2024
@mandesero marked this pull request as ready for review on September 17, 2024 10:14
cli/upgrade/upgrade.go: 4 outdated review threads (resolved)
Comment on lines 105 to 108
conn, err := connector.Connect(connector.ConnectOpts{
    Network: "unix",
    Address: run.ConsoleSocket,
})
Member commented:

Does it mean that tt upgrade can be used only if all the instances of the given replicaset(s) are started on the same machine where the command is executed?

If so, I have two questions:

  • I think that we should at least make this constraint clear in the documentation.
  • Is it complicated to implement evaluation of the needed commands on remote instances using iproto (assuming that we have permission to evaluate arbitrary code)?
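
For reference, here is a hedged Lua sketch of evaluating the upgrade commands on a remote instance over iproto using the standard net.box connector; the URI and credentials are made up, and tt itself would go through its own Go connector rather than net.box.

```lua
-- Hedged illustration only: evaluating the upgrade commands on a remote
-- instance over iproto with net.box. Requires a user that is allowed to
-- evaluate arbitrary code (e.g. one with the super role).
local netbox = require('net.box')

local conn = netbox.connect('client:[email protected]:3301')
conn:eval('box.schema.upgrade() box.snapshot()')
conn:close()
```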

@mandesero (Contributor, Author) replied:

Subject to discussion. See my comment below.

Member replied:

NB: I filed #968 in case we implement the main logic here and remote instance support separately. We're free to solve them both at once if we want/agree.

cli/upgrade/upgrade.go: 3 outdated review threads (resolved)
Comment on lines 165 to 172
err = WaitLSN(conn, masterUpgradeInfo, lsnTimeout)
if err != nil {
    printReplicasetStatus(rs.Alias, "error")
    return fmt.Errorf("[%s]: LSN wait timeout: error waiting LSN %d "+
        "in vclock component %d on %s: time quota %d seconds "+
        "exceeded", rs.Alias, masterUpgradeInfo.LSN,
        masterUpgradeInfo.IID, fullInstanceName, lsnTimeout)
}
Member commented:

@psergee Let's discuss it in a thread (to ease navigation over the discussion later).

⨯ [storage-001]: LSN wait timeout: error waiting LSN 2003085 in vclock component 1

This message about LSN seems a bit confusing to me. Maybe add something like "Schema upgrade replication timeout exceeded:" before "error waiting LSN..." to make it clearer what has just happened?

I agree that it may be convenient to state the reason why the LSN waiting is needed in the first place. However, 'schema upgrade replication timeout' seems a bit vague to me. How about the following variant?

⨯ [storage-001-replica]: can't ensure that upgrade operations performed on "storage-001-master" are replicated to "storage-001-replica" to perform snapshotting on it: error waiting LSN 2003085 in vclock component 1: <..last error from eval or actual LSN..>

@mandesero (Contributor, Author) replied:

I chose the last of the suggested variants.

Member replied:

Thanks!

@psergee Is it OK for you?

Comment on lines 11 to 12
@pytest.mark.skipif(tarantool_major_version < 3,
                    reason="skip test with cluster config for Tarantool < 3")
Member commented:

There are no tests without the cluster configuration. However, I see nothing specific to tarantool 3.x in the implementation. What is the reason? Is it easier to write a test for tarantool 3.x only?

@psergee Should we add some test that works on tarantool 2.11?

@mandesero (Contributor, Author) replied:

Still in progress.


@pytest.mark.skipif(tarantool_major_version < 3,
                    reason="skip test with cluster config for Tarantool < 3")
def test_upgrade_all_replicasets(tt_cmd, tmp_path):
Member commented:

All the test cases perform a no-op upgrade: the instances are already on the latest database schema version. As a result, an implementation that doesn't call box.schema.upgrade() at all would pass this test. A no-op WaitLSN would also pass it.

@psergee Should we add scenarios that perform a real upgrade from, say, a 2.11.1 schema? Or is it considered too complicated in the given testing infrastructure?

I see two variants of how to implement it:

  • Save a pre-generated 2.11 snapshot in the repository and copy it into the appropriate instance directories before starting the cluster.
  • Downgrade the database schema to 2.11.1 on running instances before calling the tt upgrade command.

Downgrading may need extra tricks before tarantool/tarantool#10150 is solved if a cluster configuration is in use.
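
A minimal sketch of the second variant, assuming box.schema.downgrade() is available on the instances under test:

```lua
-- Roll the schema back to the 2.11.1 version on every instance and snapshot,
-- so that a subsequent `tt upgrade` has real work to do.
box.schema.downgrade('2.11.1')
box.snapshot()
-- box.space._schema:get('version') should now report 2.11.1.
```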

@mandesero (Contributor, Author) commented Oct 5, 2024:

Tests (downgrade -> 2.11 -> upgrade 3.x) added for single instance and vshard cluster.

@oleg-jukovec (Contributor) left a comment:

We have a special subcommand tt replicaset for such stuff. I propose to add a tt replicaset upgrade command instead of tt upgrade.

@Totktonada (Member) commented:

We have a special subcommand tt replicaset for such stuff. I propose to add a tt replicaset upgrade command instead of tt upgrade.

@oleg-jukovec Is it a decision or a proposal to discuss?

If the former, OK.

If the latter I would highlight my opinion on it.

  • We also have tt cluster and the command relates to the cluster in some sense -- it may work with several replicasets. From a user point of view tt cluster and tt replicaset look very close, and where a particular subcommand resides seems arbitrary. And there is tt cluster replicaset...
  • As a user who wishes to perform an upgrade I would look for it on the top level first and, if it is not there, would look in tt cluster, tt replicaset, tt cluster replicaset. The verb (the action) looks more important here. I'm thinking of 'upgrading a cluster', not of 'doing something with a cluster where this action is upgrade' (it reminds me of the find command syntax :)).
  • I see no point in these three namespaces, TBH. tt status is just tt status and nobody complains that it is not tt <..something..> status. I'm not a big fan of making things less obvious for a user just because things were done in a non-obvious way before. Maybe it is a good time to start changing this approach piece by piece?
  • As a meaningful alternative I can propose tt admin upgrade, because the upgrade looks like a typical administration task. OTOH, tt is all about administration and development tasks. So tt upgrade is still my favorite.

@oleg-jukovec (Contributor) commented Sep 19, 2024

We have a special subcommand tt replicaset for such stuff. I propose to add a tt replicaset upgrade command instead of tt upgrade.

@oleg-jukovec Is it a decision or a proposal to discuss?

If the former, OK.

If the latter I would highlight my opinion on it.

* We also have `tt cluster` and the command relates to the cluster in some sense -- it may work with several replicasets. From a user point of view `tt cluster` and `tt replicaset` look very close, and where a particular subcommand resides seems arbitrary. And there is `tt cluster replicaset`...

* As a user who wishes to perform an upgrade I would look for it on the top level first and, if it is not there, would look in `tt cluster`, `tt replicaset`, `tt cluster replicaset`. The verb (the action) looks more important here. I'm thinking of 'upgrading a cluster', not of 'doing something with a cluster where this action is upgrade' (it reminds me of `find` command syntax :)).

* I see no point in these three namespaces, TBH. `tt status` is just `tt status` and nobody complains that it is not `tt <..something..> status`. I'm not a big fan of making things less obvious for a user just because things were done in a non-obvious way before. Maybe it is a good time to start changing this approach piece by piece?

* As a meaningful alternative I can propose `tt admin upgrade`, because the upgrade looks like a typical administration task. OTOH, `tt` is all about administration and development tasks. So `tt upgrade` is still my favorite.

I agree that the current naming of tt replicaset, tt cluster, and tt cluster replicaset is confusing. This is a separate issue that we will try to fix in the next major release.

Currently, tt replicaset is used for managing a local cluster, including actions like bootstrap, vshard bootstrap, and rebootstrap. The upgrade command, in its current context, logically complements this set of commands.

I don't see a point in introducing further confusion with individual commands or another approach, or in adding new sets of subcommands. This should be a centralized design decision that is outside the scope of this pull request.

@Totktonada (Member) commented:

Hm. My expectation is that the upgrade command is suitable for an arbitrary cluster, not only a local cluster. (AFAIU, tt can switch to iproto for remote instances, at least in some commands.) Does it change its category from tt replicaset to something different?

This should be a centralized design decision that is outside the scope of this pull request.

OK. I'm a bit worried about the need for compatibility aliases even for such new functionality. However, I understand the wish to separate the problems.

It sounds like a decision, so I'll not insist on my feelings here anymore (unless the discussion gains new points from others).

@psergee (Collaborator) commented Sep 20, 2024

Regarding the tt upgrade command compared to the tt replicaset upgrade command, I agree with Oleg that it should be a subcommand of the replicaset command. I still find it confusing to use the cluster, cluster replicaset, and replicaset commands, and I hope that they will be reorganized in a more user-friendly manner in the future. Yes, it could even be called the tt admin command or something similar.

@mandesero self-assigned this on Oct 3, 2024
@mandesero force-pushed the mandesero/gh-924-upgrade-db-schema branch 3 times, most recently from 97ed05d to dc16c99 on October 4, 2024 14:06
@mandesero (Contributor, Author) commented Oct 4, 2024

The command is now called tt replicaset upgrade [<APP_NAME> | <URI>] [flags] (No tests yet).
The command works similarly to tt replicaset status [..args..], collecting information about replicasets using the Discovery mechanism.

Changes:

  • The structure has been refactored to maximize reuse of the new functionality in tt replicaset downgrade.
  • An initial draft has been created for working with remote instances (getInstanceConnector). However, further discussion is needed to determine the best approach for implementing this functionality. The following issues are debatable:
    • To connect to a remote instance using iproto, the connecting user needs to be granted the super or lua_eval role.
    • Configuration of the remote cluster needs to be specified (likely requiring a configuration for Tarantool 3.x).
    • There may be other factors to consider.

In the current version, there is already a (workaround-based) ability to perform an upgrade on a remote cluster (each replicaset separately). This uses a mechanism similar to tt replicaset status <URI>, where replicaset information is obtained from box.info.replication of the instance it connects to.
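
For context, a small illustrative Lua snippet of what that discovery relies on: box.info.replication on any member describes the whole replicaset (illustrative only, using the documented field names).

```lua
-- box.info.replication is keyed by instance id; each entry describes one
-- replicaset member as seen from the connected instance.
for id, r in pairs(box.info.replication) do
    print(id, r.uuid, r.upstream and r.upstream.peer or 'self')
end
```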


For example, the application is deployed on the server 10.0.10.123 with the following configuration:

Config.yaml
credentials:
  users:
    client:
      password: 'secret'
      roles: [super]
    replicator:
      password: 'secret'
      roles: [replication]
    storage:
      password: 'secret'
      roles: [sharding]

iproto:
  advertise:
    peer:
      login: replicator
    sharding:
      login: storage

sharding:
  bucket_count: 3000

groups:
  storages:
    app:
      module: storage
    sharding:
      roles: [storage]
    replication:
      failover: manual
    replicasets:
      storage-001:
        leader: storage-001-a
        instances:
          storage-001-a:
            iproto:
              listen:
                - uri: 10.0.10.123:3301
          storage-001-b:
            iproto:
              listen:
                - uri: 10.0.10.123:3302
      storage-002:
        leader: storage-002-a
        instances:
          storage-002-a:
            iproto:
              listen:
                - uri: 10.0.10.123:3303
          storage-002-b:
            iproto:
              listen:
                - uri: 10.0.10.123:3304
  routers:
    app:
      module: router
    sharding:
      roles: [router]
    replicasets:
      router-001:
        instances:
          router-001-a:
            iproto:
              listen:
                - uri: 10.0.10.123:3305

Upgrade commands:

$ tt replicaset upgrade tcp://client:[email protected]:3301
• storage-001: ok

$ tt replicaset upgrade tcp://client:[email protected]:3303
• storage-002: ok

$ tt replicaset upgrade tcp://client:[email protected]:3305
• router-001: ok

However, this is likely not the ideal solution that we aim for, since this method requires a lot of manual work when upgrading a large cluster.

We can mock the TCP connection in this patch and support only upgrades on local instances. However, since the mechanism for connecting to remote instances already exists, we might want to implement the full upgrade functionality right away. I would like to hear your thoughts on this @oleg-jukovec @psergee.

@mandesero force-pushed the mandesero/gh-924-upgrade-db-schema branch 2 times, most recently from 019d9f5 to da3946b on October 5, 2024 19:05
@mandesero force-pushed the mandesero/gh-924-upgrade-db-schema branch from da3946b to 49a7cb1 on October 7, 2024 16:41
Comment on lines +64 to +81
// This may be a single-instance application without Tarantool-3 config
// or instances.yml file.
if len(discoveryCtx.RunningCtx.Instances) == 1 {
    // Create a dummy replicaset
    var replicasetList []replicaset.Replicaset
    var dummyReplicaset replicaset.Replicaset
    var instance replicaset.Instance

    instance.InstanceCtx = discoveryCtx.RunningCtx.Instances[0]
    instance.Alias = running.GetAppInstanceName(instance.InstanceCtx)
    instance.InstanceCtxFound = true

    dummyReplicaset.Alias = instance.Alias
    dummyReplicaset.Instances = append(dummyReplicaset.Instances, instance)
    replicasetList = append(replicasetList, dummyReplicaset)

    return internalUpgrade(replicasetList)
}
@mandesero (Contributor, Author) commented Oct 7, 2024:

I'm not sure if this functionality is really needed, but I thought about it when writing tests (see the last one in test_replication_upgrade.py). (In short, this is a Tarantool 2.11 application with a single instance.)

@mandesero force-pushed the mandesero/gh-924-upgrade-db-schema branch 2 times, most recently from 956c134 to 5b37eec on October 8, 2024 12:52
@mandesero force-pushed the mandesero/gh-924-upgrade-db-schema branch from 5b37eec to 5168866 on October 8, 2024 13:10
Part of tarantool#924

@TarantoolBot document
Title: `tt replicaset upgrade` upgrades database schema.

The `tt replicaset upgrade` command allows for an automated upgrade of each
replicaset in a Tarantool cluster. The process is performed sequentially on
the master instance and its replicas to ensure data consistency. Below are
the steps involved:

For Each Replicaset:
- **On the Master Instance**:
  1. Run the following commands in sequence to upgrade the schema and take
  a snapshot:
     ```lua
     box.schema.upgrade()
     box.snapshot()
     ```

- **On Each Replica**:
  1. Wait for the replica to apply all transactions produced by the
  `box.schema.upgrade()` command executed on the master. This is done
  by monitoring the vector clocks (vclock) to ensure synchronization.
  2. Once the replica has caught up, run the following command to take
  a snapshot:
     ```lua
     box.snapshot()
     ```

> **Error Handling**: If any errors occur during the upgrade process, the
operation will halt, and an error report will be generated.

---

- Timeout for Synchronization

Replicas will wait for synchronization for a maximum of `Timeout` seconds.
The default timeout is set to 5 seconds, but this can be adjusted manually
using the `--timeout` option.

**Example:**
```bash
$ tt replicaset upgrade [<APP_NAME>] --timeout 10
```

- Selecting Replicasets for Upgrade

You can specify which replicaset(s) to upgrade by using the `--replicaset`
or `-r` option to target specific replicaset names.

**Example:**
```bash
$ tt replicaset upgrade [<APP_NAME> | <URI>] --replicaset <RS_NAME_1> -r <RS_NAME_2> ...
```

This provides flexibility in upgrading only the desired parts of the cluster
without affecting the entire system.