978 Migrate Local Users, Datasets, Folders and Files #1058

tcnichol · 2024-05-20T18:01:58Z

To test:

If you have a V1 instance running locally, you can use that; see the notes below for how to set it up properly.
Otherwise, use the demo instance if possible.

Create a .env file and place it under the migration folder.
The .env file should look like this:

CLOWDER_V1=https://clowder.ncsa.illinois.edu/clowder
ADMIN_KEY_V1={v1 api key}

CLOWDER_V2=http://0.0.0.0:8000
ADMIN_KEY_V2={v2 api key}
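
A minimal sketch of how a script could read and sanity-check these four values using only the Python standard library. The `load_env` and `check_env` helpers are hypothetical and are not necessarily how the migration script loads its configuration; a library such as python-dotenv would work equally well.

```python
from pathlib import Path

# The four settings shown in the sample .env above.
REQUIRED = ("CLOWDER_V1", "ADMIN_KEY_V1", "CLOWDER_V2", "ADMIN_KEY_V2")


def load_env(path):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return values


def check_env(values):
    """Return the names of required settings that are missing or empty."""
    return [name for name in REQUIRED if not values.get(name)]
```

Calling `check_env(load_env(".env"))` before starting a migration returns an empty list when all four settings are present.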

Adjust the v1 users line for testing

This is the first part of user migration: it migrates users from a Clowder v1 instance into a Clowder v2 instance. It works for local users, datasets, folders, and files.

I have refactored this so that there is a 'migrate_user' method. This will probably be a useful pattern to follow since we might want to handle CILogon users differently. We also might want to make migration something a user can initiate from another instance, where there might be options they need to select. This might become important once we get into collection hierarchies, how to handle spaces and sharing, metadata, and then things like licenses.
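
To illustrate why a single `migrate_user` entry point is a convenient pattern, here is a rough sketch of dispatching on account type. Only the name `migrate_user` comes from this PR; the handler functions, the `provider` field, and the payload shape are hypothetical and do not reflect the actual script.

```python
def migrate_local_user(user_v1):
    """Build an illustrative v2 payload for a local (username/password) account."""
    return {"email": user_v1["email"], "kind": "local"}


def migrate_cilogon_user(user_v1):
    """Placeholder: CILogon accounts are left for a later pull request."""
    raise NotImplementedError("CILogon migration is not covered here")


# Map an account type label to the function that handles it (labels are assumed).
HANDLERS = {"local": migrate_local_user, "cilogon": migrate_cilogon_user}


def migrate_user(user_v1):
    """Dispatch one v1 user record to the handler for its account type."""
    provider = user_v1.get("provider", "local")
    handler = HANDLERS.get(provider)
    if handler is None:
        raise ValueError(f"unknown account type: {provider}")
    return handler(user_v1)
```

With this shape, supporting a new account type means adding one handler rather than touching the loop that walks over all users.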

A later pull request will address collection hierarchies, metadata, CILogon accounts, and other features.

To test this, you will want to run a Clowder v1 instance. Since v1 and v2 each need a different version of MongoDB, change the mongo service in v1's .yml file for the dependencies so that it runs on its own port:

  # database to hold metadata (required)
  mongo:
    image: mongo:3.6
    restart: unless-stopped
    networks:
      - clowder
    command: mongod --port 27018
    ports:
      - '27018:27018'
    volumes:
      - mongo:/data/db

Also change this in application.conf:

mongodbURI = "mongodb://127.0.0.1:27018/clowder"
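
A small sketch for confirming that the URI in application.conf points at the port the v1 MongoDB container exposes (27018 here). The helper names are hypothetical, and the parser only handles a simple single-host `mongodb://` URI.

```python
import socket
from urllib.parse import urlparse


def mongo_host_port(uri):
    """Extract (host, port) from a simple single-host mongodb:// URI."""
    parsed = urlparse(uri)
    # 27017 is MongoDB's default port when the URI omits one.
    return parsed.hostname, parsed.port or 27017


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_listening(*mongo_host_port("mongodb://127.0.0.1:27018/clowder"))` should return `True` once the mongo container is up.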

You will also need to add this line to application.conf so that emails are printed in the terminal:

smtp.mock=true

It can be added anywhere.

To run v1, use

docker-compose up -d mongo

Then run Clowder v1 from IntelliJ.
If you add entries to the v1 instance, try migrating them.

You will also need an admin user on the v1 instance and that user's API key, which goes in the .env file.

This can also be tested with another instance. The migration is handled using the API so you can use an existing instance and try to migrate.
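
Because the migration goes through the API, any reachable instance can act as the source. The sketch below prepares, without sending, an authenticated request to a v1 instance. The `X-API-Key` header name and the `users` path are assumptions about the v1 API and should be checked against your instance; `build_v1_request` is a hypothetical helper.

```python
import urllib.request


def build_v1_request(base_url, api_key, path):
    """Prepare (but do not send) an authenticated GET request for the v1 API."""
    url = f"{base_url.rstrip('/')}/api/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        # Header name is an assumption; some deployments may expect a query key.
        headers={"X-API-Key": api_key, "Accept": "application/json"},
        method="GET",
    )
```

Passing the result to `urllib.request.urlopen` would perform the call, for instance with `build_v1_request(CLOWDER_V1, ADMIN_KEY_V1, "users")` using the values from the .env file.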

TODO:


longshuicy · 2024-08-26T14:32:03Z

Move descriptions from other PRs over there

This pull request was created against @longshuicy's branch for the migration.
To migrate spaces to groups, I have left the migration of datasets, files, etc. as before. I added two dictionaries: one maps v1 user ids to v2 user ids, and the other maps v1 dataset ids to v2 dataset ids.
After the datasets and users are migrated, I go through the users' spaces in v1 and use them to create groups in v2, map the right users into the groups, and then share the groups with the new datasets migrated to v2. This way we don't do anything with spaces until all the users and data are moved, so we don't miss anything.
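
The two-pass idea can be sketched with plain dictionaries. This is an illustration only: `build_groups`, the record fields (`name`, `users`, `datasets`), and the sample ids are hypothetical and do not mirror the real v1 or v2 payloads.

```python
def build_groups(spaces_v1, user_ids, dataset_ids):
    """Translate v1 spaces into v2 group descriptions using v1 -> v2 id maps."""
    groups = []
    for space in spaces_v1:
        # Keep only members and datasets that already exist in v2.
        members = [user_ids[u] for u in space["users"] if u in user_ids]
        datasets = [dataset_ids[d] for d in space["datasets"] if d in dataset_ids]
        groups.append({"name": space["name"], "users": members, "datasets": datasets})
    return groups


# Pass 1: the maps are filled while users and datasets are migrated.
user_ids = {"u1": "A", "u2": "B"}   # v1 user id -> v2 user id
dataset_ids = {"d1": "X"}           # v1 dataset id -> v2 dataset id

# Pass 2: only now are spaces turned into groups.
spaces_v1 = [{"name": "lab", "users": ["u1", "u2", "u3"], "datasets": ["d1"]}]
groups = build_groups(spaces_v1, user_ids, dataset_ids)
```

Running the second pass last means every id lookup can succeed for anything that was migrated; here `u3` is skipped because it has no v2 counterpart.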

I am working on a branch that I created from this one for spaces.
Something I noticed: right now I turn a user's spaces into groups while processing that user. This means that if a user created a space, all the datasets that user created will be added properly.
But if a dataset in the space belongs to a user who hasn't been migrated yet, it won't end up in the space. And we cannot add a user who isn't in v2.
So I'm thinking we should temporarily save the API keys for users and do spaces after everything else. I'll be changing my branch to fit that.

This pull request is for migrating spaces. It's from this branch.
I tested and so far, datasets, files, folders, users are migrated correctly, and this one will change spaces to groups. I had to handle the groups after all users and datasets were created, because otherwise the users and datasets that need to be in the group in v2 might not exist yet.
#1186


tcnichol · 2024-08-26T19:19:27Z

On spaces, right now if you want to test spaces you'll need to do it locally with this branch:

clowder-framework/clowder#453

Also add the changes to v1 mentioned above.

Currently the v1 API does not return the creator of a space in its space API calls. The above pull request adds that.


* match other branch
* adding new collection metadata
* str not float
* getting collections for a dataset in v1, fixing metadata for collections
* posts collection name and id
* adding routes for getting collections
* making a method like the one in v1 for self and ancestors. it will be easier to build a collection hierarchy from this
* sample json for mdata
* posts collection name and id
* building the data for collections
* something works now
* matching with other branch
* methods for migrating collections as metadata
* need to post it as metadata
* change name
* adding the metadata for collections
* adding context url and right endpoint
* getting spaces as well as collections
* change name
* remaning method
* created v2 license based on v1 license details (#1193)
  Co-authored-by: Chen Wang <[email protected]>
* removing print statements
* better error logging

---------

Co-authored-by: Dipannita <[email protected]>
Co-authored-by: Chen Wang <[email protected]>

tcnichol added 12 commits March 28, 2024 13:00

adding new dependency and class

511923d

starting a user migration

f763cb4

adding script for migrate metadata definitions

84d43bc

adding option of temporary password (for use with users who are being…

e82e023

… migrated from old clowder instance

reverting changes

dcdbea1

temporary - for users that are migrated

79e56b4

creates users

33f1cab

NOTE - we need a feature that says 'reset password' if 'reset password' is set, then you have to reset password on logging in

does not actually create the dataset

93509e7

will require some port changes for the running v1 instance (or v2) will work out a strategy

using beanie might not work here due to async issues

aba43a3

using nest asyncio to fix problem with even loop closing

727685f

entry from right table for file bytes

d4e9526

need to move to new db and minio

dependencies for migration, adding file entry and indexing files

8c7d6bd

tcnichol linked an issue May 20, 2024 that may be closed by this pull request

migrate local users, datasets, folders and files #978

Open

tcnichol requested a review from lmarini May 20, 2024 18:02

tcnichol added 16 commits May 20, 2024 14:51

wrong api key

2ba7df5

api key works for user,

f624784

file upload broken, need to fix

file upload works, sloppy needs fixed

adc6c63

we are only getting partial files, not sure how to fix yet

171b2b3

files upload right now

1f6a2ed

delete file

2bf100c

adding folders

4eccae1

folder hierarchy - this should work

c01ce96

works for now, folders need to be fixed

c2ab289

previous is no longer selectable when moving to my datasets. we start…

2d7cdce

… on the first page

Merge branch 'main' into 978-migrate-users

65599bb

new folder hierarchy method works

5be97bf

need a license for datasets

f479284

removing unused code

method for getting all the folders

cb4b835

need to remove logging, but folders are created and files uploaded to…

e885a2a

… correct folders

should work now

b4db0f4

tcnichol added 2 commits June 3, 2024 13:34

fixing typos

72e0845

formatting

a6cd25a

tcnichol changed the title ~~978 migrate users~~ 978 Migrate Local Users, Datasets, Folders and Files Jun 3, 2024

tcnichol added 2 commits June 3, 2024 13:38

sample .env

54747ca

formatting

55ff21a

tcnichol marked this pull request as ready for review June 3, 2024 18:45

tcnichol requested review from max-zilla and longshuicy as code owners June 3, 2024 18:45

tcnichol added 3 commits July 13, 2024 11:28

Merge branch 'main' into 978-migrate-users

e216405

refactoring migrate_user

851503e

this might need to be modified for CILogon, but this will work for local users

formatting

074154e

longshuicy assigned tcnichol Aug 1, 2024

Merge branch 'main' into 978-migrate-users

909b8f6

tcnichol changed the base branch from main to release/v2.0-beta-3 August 6, 2024 15:47

tcnichol and others added 5 commits August 6, 2024 10:48

Merge branch 'release/v2.0-beta-3' into 978-migrate-users

b309023

Merge remote-tracking branch 'origin/main' into 978-migrate-users

58ada46

rename

62d5b02

Merge branch 'release/v2.0-beta-3' into 978-migrate-users

8772325

migrate metadata definition; modify migration script (#1184)

ec56aa4

* migrate metadata definition; modify migration script * local migrate changes (#1186) --------- Co-authored-by: Todd Nicholson <[email protected]>

longshuicy reviewed Aug 26, 2024

scripts/migration/migrate.py (outdated)

longshuicy reviewed Aug 26, 2024

scripts/migration/migrate.py

longshuicy and others added 4 commits August 27, 2024 11:26

fix clowder user api key creation issue

8050460

fix user api key if user already exists

c936760

Merge remote-tracking branch 'origin/978-migrate-users' into 978-migr…

5547715

…ate-users # Conflicts: # scripts/migration/migrate.py

switch headers and files

86c8493

ddey2 reviewed Aug 27, 2024

scripts/migration/migrate.py

longshuicy and others added 2 commits August 29, 2024 09:12

Migrate metadata (#1192)

2a52c7c

* dataset metadata is working * register migration extractor and successfully migrate machine metadata
