Feat: release 2.9.0 - adds workflow import (#164)
* first implementation

* fixing params

* improve error message

* improve error message2

* improve error message3

* adding pytests

* docs

* typo

* typo

* doc improvement

* doc improvement

* doc improvement

* doc improvement

* improve docs

* docs

* docs

* improving docs

* improve help
dapineyro authored Apr 11, 2024
1 parent c169460 commit 79a651e
Showing 11 changed files with 359 additions and 2 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,9 @@
## lifebit-ai/cloudos-cli: changelog

## v2.9.0 (2024-04-09)

- Adds `workflow import` command, allowing users to import Nextflow workflows to CloudOS.

## v2.8.0 (2024-04-05)

- Adds support for using CloudOS HPC executor.
143 changes: 143 additions & 0 deletions README.md
@@ -476,6 +476,149 @@ Executing list...
Workflow list saved to workflow_list.csv
```

#### Import a Nextflow workflow to a CloudOS workspace

You can import new workflows to your CloudOS workspaces. The only requirements are:

- The workflow is a Nextflow pipeline.
- The workflow repository is hosted on GitHub or Bitbucket Server.
- If your repository is private, you have access to it and you have linked your GitHub or Bitbucket Server account to CloudOS.
- You have the `repository_id` and the `repository_project_id` (for Bitbucket Server repositories, only the `repository_project_id` is needed).

**How to get `repository_id` and `repository_project_id` from a GitHub repository**

**Option 1: searching in the page source code**

1. Go to the repository URL, right-click anywhere on the page and select "View Page Source" from the context menu.

![Github Repo right click](docs/github_right_click.png)

2. To get the `repository_project_id`, search for the `octolytics-dimension-user_id` string in the source code. The `content` value is your `repository_project_id` (`30871219` in the example image).

![Github Repo owner id](docs/github_user_id.png)

3. To get the `repository_id`, search for the `octolytics-dimension-repository_id` string in the source code. The `content` value is your `repository_id` (`122059362` in the example image).

![Github Repo id](docs/github_repository_id.png)

**Option 2: using the GitHub CLI**

If you have access to the repository, you can use the following tools to collect the required values:

- [gh](https://cli.github.com/)
- [jq](https://jqlang.github.io/jq/download/)

For collecting the `repository_project_id`:

```bash
# If your repo URL is https://github.com/lifebit-ai/DeepVariant
OWNER="lifebit-ai"
REPO="DeepVariant"
repository_project_id=$(gh api -H "Accept: application/vnd.github+json" "repos/$OWNER/$REPO" | jq .owner.id)
echo "$repository_project_id"
# Output: 30871219
```

For collecting the `repository_id`:

```bash
# If your repo URL is https://github.com/lifebit-ai/DeepVariant
OWNER="lifebit-ai"
REPO="DeepVariant"
repository_id=$(gh api -H "Accept: application/vnd.github+json" "repos/$OWNER/$REPO" | jq .id)
echo "$repository_id"
# Output: 122059362
```
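
The same two values can also be read with Python's standard library. The following is a minimal sketch and is not part of cloudos-cli: the function names `extract_github_ids` and `fetch_github_ids` are hypothetical, and it assumes the public GitHub REST endpoint `https://api.github.com/repos/{owner}/{repo}`, whose JSON response carries the `id` and `owner.id` fields used by the `jq` filters above. The `token` argument is an assumption for private repositories.

```python
import json
import urllib.request


def extract_github_ids(payload):
    """Return (repository_project_id, repository_id) from a GitHub repo payload."""
    # owner.id is the repository_project_id, id is the repository_id
    return payload["owner"]["id"], payload["id"]


def fetch_github_ids(owner, repo, token=None):
    """Query the GitHub REST API for a repository and extract both IDs.

    `token` is only needed for private repositories (assumption: a personal
    access token with read access to the repository).
    """
    request = urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}",
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return extract_github_ids(json.load(response))


if __name__ == "__main__":
    # Offline demonstration with a trimmed-down payload shaped like the API response
    sample = {"id": 122059362, "owner": {"login": "lifebit-ai", "id": 30871219}}
    project_id, repo_id = extract_github_ids(sample)
    print(project_id, repo_id)
```

Calling `fetch_github_ids("lifebit-ai", "DeepVariant")` performs the network request; the `__main__` section only exercises the parsing step.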

**How to get `repository_project_id` from a Bitbucket Server repository**

For Bitbucket Server repositories, only `repository_project_id` is required. To collect it:

**Option 1: using the REST API from your browser**

1. Create a REST API URL from your repo URL by adding `/rest/api/latest` to the URL:

```
Original URL: https://bitbucket.com/projects/MYPROJECT/repos/my-repo
REST API URL: https://bitbucket.com/rest/api/latest/projects/MYPROJECT/repos/my-repo
```

> IMPORTANT NOTE: As the original repository URL, do not use the "clone" URL provided by Bitbucket (the one with the `.git` extension). Use the actual browser URL, removing the trailing `/browse`.

2. Open the REST API URL in a browser. It will return a JSON document.

3. Your `repository_project_id` is the value of the `project.id` field.

![bitbucket project id](docs/bitbucket_project_id.png)
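
The URL rewrite from step 1 can be sketched in Python as follows. This is an illustration only: `bitbucket_rest_api_url` is a hypothetical helper, not part of cloudos-cli, and it assumes that your Bitbucket Server instance is served from the root of the host (no extra context path before `/projects`).

```python
from urllib.parse import urlsplit, urlunsplit


def bitbucket_rest_api_url(browser_url):
    """Build the REST API URL for a Bitbucket Server repository browser URL."""
    parts = urlsplit(browser_url)
    path = parts.path.rstrip("/")
    # Drop a trailing /browse, as the browser URL often ends with it
    if path.endswith("/browse"):
        path = path[: -len("/browse")]
    # The clone URL (ending in .git) is not a valid starting point
    if path.endswith(".git"):
        raise ValueError("Use the browser URL, not the clone (.git) URL")
    # Insert /rest/api/latest between the host and the repository path
    return urlunsplit(
        (parts.scheme, parts.netloc, "/rest/api/latest" + path, "", "")
    )


if __name__ == "__main__":
    print(bitbucket_rest_api_url(
        "https://bitbucket.com/projects/MYPROJECT/repos/my-repo/browse"))
```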

**Option 2: using cURL**

If you have access to the repository, you can use the following tools to collect the required value:

- [cURL](https://curl.se/)
- [jq](https://jqlang.github.io/jq/download/)

For collecting the `repository_project_id`:

```bash
BITBUCKET_TOKEN="xxx"
repository_project_id=$(curl https://bitbucket.com/rest/api/latest/projects/MYPROJECT/repos/my-repo -H "Authorization: Bearer $BITBUCKET_TOKEN" | jq .project.id)
echo "$repository_project_id"
# Output: 1234
```
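
A Python equivalent of the cURL call, using only the standard library, could look like the sketch below. The helper names `extract_project_id` and `fetch_bitbucket_project_id` are hypothetical and not part of cloudos-cli. The sketch assumes the same REST API URL and bearer-token authentication as the cURL example.

```python
import json
import urllib.request


def extract_project_id(payload):
    """Return the repository_project_id (project.id) from a repository payload."""
    return payload["project"]["id"]


def fetch_bitbucket_project_id(rest_api_url, token):
    """Request the repository JSON from Bitbucket Server and read project.id."""
    request = urllib.request.Request(
        rest_api_url,
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_project_id(json.load(response))


if __name__ == "__main__":
    # Offline demonstration with a trimmed-down payload; only project.id matters here
    sample = {"slug": "my-repo", "project": {"key": "MYPROJECT", "id": 1234}}
    print(extract_project_id(sample))
```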

#### Usage of the workflow import command

To import GitHub workflows to CloudOS, you can use the following command:

```bash
# Example workflow to import: https://github.com/lifebit-ai/DeepVariant
WORKFLOW_URL="https://github.com/lifebit-ai/DeepVariant"

# You will need the repository_project_id and repository_id values explained above
REPOSITORY_PROJECT_ID=30871219
REPOSITORY_ID=122059362

cloudos workflow import \
--cloudos-url $CLOUDOS \
--apikey $MY_API_KEY \
--workspace-id $WORKSPACE_ID \
--workflow-url $WORKFLOW_URL \
--workflow-name "new_name_for_the_github_workflow" \
--repository-project-id $REPOSITORY_PROJECT_ID \
--repository-id $REPOSITORY_ID
```

The expected output will be:

```console
CloudOS workflow functionality: list and import workflows.

Executing workflow import...

[Message] Only Nextflow workflows are currently supported.

Workflow new_name_for_the_github_workflow was imported successfully with the following ID: 6616a8cb454b09bbb3d9dc20
```

To import Bitbucket Server workflows, the `--repository-id` parameter is not required:

```bash
WORKFLOW_URL="https://bitbucket.com/projects/MYPROJECT/repos/my-repo"

# You will need only the repository_project_id
REPOSITORY_PROJECT_ID=1234

cloudos workflow import \
--cloudos-url $CLOUDOS \
--apikey $MY_API_KEY \
--workspace-id $WORKSPACE_ID \
--workflow-url $WORKFLOW_URL \
--workflow-name "new_name_for_the_bitbucket_workflow" \
--repository-project-id $REPOSITORY_PROJECT_ID
```

> NOTE: importing workflows using cloudos-cli is not yet available in all CloudOS workspaces. If you try to use this feature in a workspace where it is not enabled, you will get the following error message: `It seems your API key is not authorised. Please check if your workspace has support for importing workflows using cloudos-cli`.

#### Get a list of all available projects from a CloudOS workspace

60 changes: 59 additions & 1 deletion cloudos/__main__.py
@@ -68,7 +68,7 @@ def job():

 @run_cloudos_cli.group()
 def workflow():
-    """CloudOS workflow functionality: list workflows in CloudOS."""
+    """CloudOS workflow functionality: list and import workflows."""
     print(workflow.__doc__ + '\n')


@@ -778,6 +778,64 @@ def list_workflows(apikey,
print(f'\tWorkflow list saved to {outfile}')


@workflow.command('import')
@click.option('-k',
              '--apikey',
              help='Your CloudOS API key',
              required=True)
@click.option('-c',
              '--cloudos-url',
              help=('The CloudOS url you are trying to access to. ' +
                    'Default=https://cloudos.lifebit.ai.'),
              default='https://cloudos.lifebit.ai')
@click.option('--workspace-id',
              help='The specific CloudOS workspace id.',
              required=True)
@click.option('--workflow-url',
              help=('URL of the workflow to import. Please, note that it should ' +
                    'be the URL shown in the browser, and it should come without ' +
                    'any of the .git or /browse extensions.'),
              required=True)
@click.option('--workflow-name',
              help="The name that the workflow will have in CloudOS",
              required=True)
@click.option('--repository-project-id',
              type=int,
              help="The ID of your repository project",
              required=True)
@click.option('--repository-id',
              type=int,
              help="The ID of your repository. Only required for GitHub repositories")
@click.option('--disable-ssl-verification',
              help=('Disable SSL certificate verification. Please, remember that this option is ' +
                    'not generally recommended for security reasons.'),
              is_flag=True)
@click.option('--ssl-cert',
              help='Path to your SSL certificate file.')
def import_workflows(apikey,
                     cloudos_url,
                     workspace_id,
                     workflow_url,
                     workflow_name,
                     repository_project_id,
                     repository_id,
                     disable_ssl_verification,
                     ssl_cert):
    """Imports workflows to CloudOS."""
    verify_ssl = ssl_selector(disable_ssl_verification, ssl_cert)
    print('Executing workflow import...\n')
    print('\t[Message] Only Nextflow workflows are currently supported.\n')
    cl = Cloudos(cloudos_url, apikey, None)
    workflow_id = cl.workflow_import(workspace_id,
                                     workflow_url,
                                     workflow_name,
                                     repository_project_id,
                                     repository_id,
                                     verify=verify_ssl)
    print(f'\tWorkflow {workflow_name} was imported successfully with the ' +
          f'following ID: {workflow_id}')


@project.command('list')
@click.option('-k',
'--apikey',
2 changes: 1 addition & 1 deletion cloudos/_version.py
@@ -1 +1 @@
-__version__ = '2.8.0'
+__version__ = '2.9.0'
77 changes: 77 additions & 0 deletions cloudos/clos.py
@@ -532,3 +532,80 @@ def process_project_list(r, all_fields=False):
        else:
            df = df_full.loc[:, COLUMNS]
        return df

    def workflow_import(self, workspace_id, workflow_url, workflow_name,
                        repository_project_id, repository_id=None, verify=True):
        """Imports workflows to CloudOS.
        Parameters
        ----------
        workspace_id : string
            The CloudOS workspace id to import the workflow into.
        workflow_url : string
            The URL of the workflow. Only GitHub or Bitbucket Server are allowed.
        workflow_name : string
            A name for the imported pipeline in CloudOS.
        repository_project_id : int
            The repository project ID.
        repository_id : int
            The repository ID. Only required for GitHub repositories.
        verify: [bool|string]
            Whether to use SSL verification or not. Alternatively, if
            a string is passed, it will be interpreted as the path to
            the SSL certificate file.
        Returns
        -------
        workflow_id : string
            The newly imported workflow ID.
        """
        platform_url = workflow_url.split('/')[2].split('.')[0]
        repository_name = workflow_url.split('/')[-1]
        if platform_url == 'github':
            platform = 'github'
            repository_project = workflow_url.split('/')[3]
            if repository_id is None:
                raise ValueError('Please, specify --repository-id when importing a GitHub repository')
        elif platform_url == 'bitbucket':
            platform = 'bitbucketServer'
            repository_project = workflow_url.split('/')[4]
            repository_id = repository_name
        else:
            raise ValueError(f'Your repository platform is not supported: {platform_url}. ' +
                             'Please use either GitHub or BitbucketServer.')
        repository_name = workflow_url.split('/')[-1]

        data = {
            "apikey": self.apikey,
            "workflowType": "nextflow",
            "repository": {
                "platform": platform,
                "repositoryId": repository_id,
                "name": repository_name,
                "owner": {
                    "login": repository_project,
                    "id": repository_project_id},
                "isPrivate": True,
                "url": workflow_url,
                "commit": "",
                "branch": ""
            },
            "name": workflow_name,
            "description": "",
            "isPublic": False,
            "mainFile": "main.nf",
            "defaultContainer": None,
            "processes": [],
            "docsLink": "",
            "team": workspace_id
        }
        r = retry_requests_post("{}/api/v1/workflows?teamId={}".format(self.cloudos_url,
                                                                       workspace_id),
                                json=data, verify=verify)
        if r.status_code == 401:
            raise ValueError('It seems your API key is not authorised. Please check if ' +
                             'your workspace has support for importing workflows using cloudos-cli')
        elif r.status_code >= 400:
            raise BadRequestException(r)
        content = json.loads(r.content)
        return content['_id']
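
The URL handling at the top of `workflow_import` can be exercised in isolation. The sketch below is a standalone restatement of that logic for illustration only; `parse_workflow_url` is a hypothetical helper and does not exist in cloudos-cli. It shows which parts of the URL end up as platform, owner and repository name.

```python
def parse_workflow_url(workflow_url, repository_id=None):
    """Split a workflow URL the same way workflow_import does (illustrative)."""
    parts = workflow_url.split('/')
    # 'https://github.com/owner/repo' -> parts[2] is 'github.com'
    platform_url = parts[2].split('.')[0]
    repository_name = parts[-1]
    if platform_url == 'github':
        if repository_id is None:
            raise ValueError('repository_id is required for GitHub repositories')
        return {'platform': 'github',
                'owner': parts[3],
                'name': repository_name,
                'repository_id': repository_id}
    if platform_url == 'bitbucket':
        # For Bitbucket Server the repository name doubles as the repository id
        return {'platform': 'bitbucketServer',
                'owner': parts[4],
                'name': repository_name,
                'repository_id': repository_name}
    raise ValueError(f'Unsupported repository platform: {platform_url}')


if __name__ == '__main__':
    print(parse_workflow_url('https://github.com/lifebit-ai/DeepVariant', 122059362))
    print(parse_workflow_url('https://bitbucket.com/projects/MYPROJECT/repos/my-repo'))
```

Two consequences of this approach are worth noting. The platform is decided by the first label of the host name, so a Bitbucket Server instance hosted under a name that does not start with `bitbucket` is rejected. A trailing slash in the URL also yields an empty repository name, which is one reason the `--workflow-url` help text asks for the plain browser URL.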
Binary file added docs/bitbucket_project_id.png
Binary file added docs/github_repository_id.png
Binary file added docs/github_right_click.png
Binary file added docs/github_user_id.png
74 changes: 74 additions & 0 deletions tests/test_clos/test_workflow_import.py
@@ -0,0 +1,74 @@
import mock
import json
import pytest
import responses
from cloudos.clos import Cloudos
from cloudos.utils.errors import BadRequestException
from tests.functions_for_pytest import load_json_file

OUTPUT = "tests/test_data/workflows/workflow_import.json"
APIKEY = 'vnoiweur89u2ongs'
CLOUDOS_URL = 'http://cloudos.lifebit.ai'
WORKSPACE_ID = 'lv89ufc838sdig'
WORKFLOW_URL = 'https://github.com/lifebit-ai/repo'
WORKFLOW_NAME = 'test-repo'
REPOSITORY_PROJECT_ID = 1234
REPOSITORY_ID = 567


@mock.patch('cloudos.clos', mock.MagicMock())
@responses.activate
def test_workflow_import_correct():
    """
    Test 'workflow_import' to work as intended
    API request is mocked and replicated with json files
    """
    create_json = load_json_file(OUTPUT)
    search_str = f"teamId={WORKSPACE_ID}"
    # mock POST method with the .json
    responses.add(
        responses.POST,
        url=f"{CLOUDOS_URL}/api/v1/workflows?{search_str}",
        body=create_json,
        status=200)
    # start cloudOS service
    clos = Cloudos(apikey=APIKEY, cromwell_token=None, cloudos_url=CLOUDOS_URL)
    # get mock response
    workflow_id = clos.workflow_import(WORKSPACE_ID,
                                       WORKFLOW_URL,
                                       WORKFLOW_NAME,
                                       REPOSITORY_PROJECT_ID,
                                       REPOSITORY_ID)
    # check the response
    assert isinstance(workflow_id, str)
    assert workflow_id == '66156ba61d5f06a39b1da573'


@mock.patch('cloudos.clos', mock.MagicMock())
@responses.activate
def test_workflow_import_incorrect():
    """
    Test 'workflow_import' to fail with '400' response
    """
    # prepare error message
    error_message = {"statusCode": 400, "code": "BadRequest",
                     "message": "Bad Request.", "time": "2022-11-23_17:31:07"}
    error_json = json.dumps(error_message)
    search_str = f"teamId={WORKSPACE_ID}"
    # mock POST method with the .json
    responses.add(
        responses.POST,
        url=f"{CLOUDOS_URL}/api/v1/workflows?{search_str}",
        body=error_json,
        status=400)
    # raise 400 error
    with pytest.raises(BadRequestException) as error:
        # check if it failed
        clos = Cloudos(apikey=APIKEY, cromwell_token=None,
                       cloudos_url=CLOUDOS_URL)
        clos.workflow_import(WORKSPACE_ID,
                             WORKFLOW_URL,
                             WORKFLOW_NAME,
                             REPOSITORY_PROJECT_ID,
                             REPOSITORY_ID)
    assert "Server returned status 400." in (str(error))
1 change: 1 addition & 0 deletions tests/test_data/workflows/workflow_import.json
@@ -0,0 +1 @@
{"_id":"66156ba61d5f06a39b1da573"}
