Feature/metax integration #356

Merged
merged 79 commits into from
Apr 7, 2022
61dfa3c
Add metax mock api.
genie9 Jan 5, 2022
e5f745b
Add functionality for creating dataset to Metax
genie9 Jan 19, 2022
0762395
Add functionality for updating dataset to Metax.
genie9 Jan 20, 2022
f54acf1
Delete object from Metax service
genie9 Feb 16, 2022
b8dc5f9
Update integration tests for basic crud with metax
genie9 Jan 21, 2022
08a6091
Add Metax datasets publishing on folder publish
genie9 Jan 21, 2022
e8d439b
Update changelog and pyspell wordlist
genie9 Feb 23, 2022
ece28ae
Add integration test trying to update metax id
genie9 Feb 24, 2022
9197574
Update env vars to building configs
genie9 Feb 24, 2022
9d18e61
Update metax related confs and take them in use
genie9 Mar 1, 2022
20be8ce
Change location of metax_api_handler
genie9 Mar 1, 2022
995bcbe
Fix extra DB read during Metax data creation
genie9 Mar 4, 2022
77d864a
Fix extra DB read during Metax data update
genie9 Mar 4, 2022
baac01b
Fix private methods names with underscore
genie9 Mar 4, 2022
b871758
add study and dataset identifiers to folder schema
blankdots Jan 10, 2022
9ec3bbe
make dataset description mandatory
blankdots Jan 25, 2022
de388c9
make study abstract mandatory
blankdots Jan 25, 2022
842040f
add keywords to folder doi info
blankdots Jan 25, 2022
5dd22d4
not all functions need to be async
blankdots Feb 15, 2022
0fe5c22
correct error reasons for operators
blankdots Feb 16, 2022
882ed07
create draft dois for study and datasets
blankdots Feb 16, 2022
7ed442f
Resource Type formatted for front-end display
blankdots Feb 17, 2022
2a6a2af
multilevel patch objects
blankdots Feb 17, 2022
c9a58b6
refactor doi in objects to make it easier to test
blankdots Feb 17, 2022
3e33fe7
mock doi API PUT request for publishing DOI
blankdots Feb 17, 2022
30fd3d5
Publish Study and Dataset when publishing folders
blankdots Feb 17, 2022
0999393
integration tests for publishing folder adjusted
blankdots Feb 17, 2022
c1fe225
adjust message logs for integration tests
blankdots Feb 17, 2022
e8e46d2
Attempt to curb race condition with study POST
blankdots Feb 18, 2022
3197ea4
not all functions need to be async
blankdots Feb 15, 2022
efa1901
create draft dois for study and datasets
blankdots Feb 16, 2022
3935764
publish Study and Dataset when publishing folders
blankdots Feb 17, 2022
895dba1
attempt to curb race condition with study POST
blankdots Feb 18, 2022
2efa393
check values in set instead of list
blankdots Feb 15, 2022
1bc4c40
add to set instead of extending list
blankdots Feb 28, 2022
4ff0e0e
integrating doi with metax
blankdots Mar 2, 2022
7c09c82
rebase with metax integrations
blankdots Mar 7, 2022
e3874df
Small refactoring
genie9 Mar 7, 2022
ba49229
Remove reading DB on replace object to find metax id exists
genie9 Mar 7, 2022
92a5809
Update how DB is read on object patch
genie9 Mar 8, 2022
3593807
Remove status from metax info in object
genie9 Mar 9, 2022
908cd78
Remove unnecessary metax service calling functions
genie9 Mar 9, 2022
529c309
Update changelog
genie9 Mar 9, 2022
e7f71b8
add option to delete doi on object delete
blankdots Mar 10, 2022
d833057
update changelog with changes from #332
blankdots Mar 10, 2022
c818abf
streamline doi config into a dict
blankdots Mar 10, 2022
d882f2a
specify which schema the object was not found in
blankdots Mar 10, 2022
9609293
publish metax ids no iterations
blankdots Mar 10, 2022
ea1f911
don't update the folder when identifiers created
blankdots Mar 10, 2022
6c82185
prepare DOI data for publishing and move to folder
blankdots Mar 10, 2022
9bd4672
some metax functions don't need to be async
blankdots Mar 10, 2022
3d27fda
fix typo for set state doi raises docs
blankdots Mar 11, 2022
d8a7bf1
Add bulk patching to Metax mock api
genie9 Mar 16, 2022
11f7026
Add required fields from doi info
genie9 Mar 16, 2022
4005a50
Update integration tests
genie9 Mar 16, 2022
51a2199
Add metax patching on object creation
genie9 Mar 14, 2022
f8a74bd
Fix integration tests
genie9 Mar 16, 2022
ad33e81
Simplify aiohttp calls in metax_api_handler
genie9 Mar 16, 2022
05cab09
Fix bug with multipart content filename extraction
genie9 Mar 10, 2022
f003db9
Update submit endpoint
genie9 Mar 10, 2022
e362c0c
Update changelog and openapi specs
genie9 Mar 10, 2022
91691d5
Fix merge related bugs
genie9 Mar 29, 2022
4cd36fe
Clean metax handler from TODOs
genie9 Mar 29, 2022
199285a
keywords as list instead of array
blankdots Apr 1, 2022
87f204f
Merge pull request #390 from CSCfi/feature/keywords-string
blankdots Apr 1, 2022
edb7992
Update schemas with required fields
genie9 Apr 4, 2022
341482c
Add DISCOVERY_URL to env vars
genie9 Apr 5, 2022
be1beb7
restrict the use of doi key for study and dataset
blankdots Apr 6, 2022
856be45
Merge pull request #392 from CSCfi/feature/restrict-doi-key
blankdots Apr 6, 2022
a21814e
Update loggings to track Metax calls
genie9 Mar 30, 2022
f444f1a
Add mapper class
genie9 Mar 30, 2022
6a7f2da
Add optional fields to mapper
genie9 Mar 30, 2022
a691b7d
Update tests
genie9 Mar 30, 2022
c5825e1
Update dates mapping for several same date type
genie9 Mar 30, 2022
01845aa
Fix configs for new json folder
genie9 Mar 31, 2022
dbdcf40
Clean mapper class
genie9 Apr 5, 2022
3ac7543
Update changelog
genie9 Apr 5, 2022
92bf5f4
Fix logging print format
genie9 Apr 6, 2022
4e5b528
Merge pull request #387 from CSCfi/feature/map-datacite-metadata-to-m…
genie9 Apr 6, 2022
6 changes: 6 additions & 0 deletions .env.example
@@ -31,3 +31,9 @@ DOI_API=http://mockdoi:8001/dois
DOI_PREFIX=10.xxxx
DOI_USER=user
DOI_KEY=key
DISCOVERY_URL=https://etsin.demo.fairdata.fi/dataset/

# metax
METAX_USER=sd
METAX_PASS=demo_pass
METAX_URL=http://mockmetax:8002
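
The new variables could be consumed on the backend roughly as follows. This is a minimal sketch, not the project's actual configuration code: `load_metax_config` and `discovery_link` are hypothetical helpers, the dict keys are illustrative, and the defaults simply mirror `.env.example`. Building the dataset link as `DISCOVERY_URL` plus the Metax id is an assumption based on the changelog description.

```python
import os
from typing import Dict, Mapping


def load_metax_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect Metax related settings from the environment (hypothetical helper)."""
    return {
        "username": env.get("METAX_USER", "sd"),
        "password": env.get("METAX_PASS", "demo_pass"),
        "url": env.get("METAX_URL", "http://mockmetax:8002"),
        "discovery_url": env.get("DISCOVERY_URL", "https://etsin.demo.fairdata.fi/dataset/"),
    }


def discovery_link(config: Mapping[str, str], metax_id: str) -> str:
    """Build a link to a dataset from its Metax id (assumed to be URL + id)."""
    return f"{config['discovery_url'].rstrip('/')}/{metax_id}"
```

Passing the environment in as a mapping keeps the helper easy to test without touching the real process environment.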
6 changes: 6 additions & 0 deletions .github/config/.wordlist.txt
@@ -165,6 +165,7 @@ dockerfile
docstrings
doi
doiinfo
dois
dt
dzongkha
ean
@@ -196,6 +197,7 @@ experimentref
experimenttype
externalid
extrainfo
fairdata
faire
familyname
faroese
@@ -342,6 +344,9 @@ metagenomic
metagenomics
metatranscriptome
metatranscriptomic
metax
metaxidentifier
metaxservicehandler
methylation
methylcytidine
mf
@@ -609,6 +614,7 @@ txt
ui
ujson
umi
uncomment
unencryptedchecksum
uniqueitems
unlocalised
8 changes: 3 additions & 5 deletions .github/workflows/int.yml
@@ -43,10 +43,7 @@ jobs:
env:
BASE_URL: http://localhost:5430
OIDC_URL: http://localhost:8000
MONGO_HOST: localhost:27017
MONGO_DATABASE: default
MONGO_AUTHDB: admin


- name: Collect logs from docker
if: ${{ failure() }}
run: docker-compose logs --no-color -t > tests/dockerlogs || true
@@ -78,6 +75,7 @@ jobs:
MONGO_HOST: localhost:27017
MONGO_DATABASE: default
MONGO_AUTHDB: admin
MONGO_SSL: True

- name: Run Integration test
run: |
@@ -86,7 +84,7 @@ jobs:
BASE_URL: http://localhost:5430
OIDC_URL: http://localhost:8000
MONGO_SSL: True

- name: Collect logs from docker
if: ${{ failure() }}
run: docker-compose logs --no-color -t > tests/dockerlogs || true
28 changes: 25 additions & 3 deletions CHANGELOG.md
@@ -9,15 +9,31 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- Add patching of folders after object save and update operations
- Submission endpoint update #371
- Adds mandatory query parameter `folder` for submit endpoint POST
- On actions add and modify, the object is added or updated to the folder (submission) where it belongs with its accession ID, schema, submission type, title and filename
- Adds metax integration to submit endpoint
- Integration with Metax service #356 #387
- Adds new local container for testing against mocked Metax API
- Introduces new env vars: METAX_USER, METAX_PASS, METAX_URL for connection to Metax service
- Introduces new env var DISCOVERY_URL for creating link to dataset inside Fairdata SD catalog
- Adds new key metaxIdentifier to Study and Dataset collections containing metax id returned from Metax API
- Adds new handler MetaxServiceHandler to take care of mapping Submitter metadata to Metax metadata and to connect to Metax API
- Adds new mapper class to adjust incoming metadata to Metax schema
- Add patching of folders after object save and update operations #354
- Adds mandatory query parameter `folder` for objects endpoint POST
- Object is added or updated to the folder (submission) where it belongs with its accession ID, schema, submission type, title and filename in the case of CSV and XML upload
- Adds configuration for mypy linting to VScode devcontainer setup
- Templates API #256
- use `ujson` as default json library
- Creating draft Datacite DOI for folders #257
- Creating draft Datacite DOI for folders #257 #332
- created a mock web app, which would act similarly to DataCite REST API
- altered `publish_folder` endpoint so that `extraInfo` containing the DOI data is added upon publishing
- added `datePublished` key to folders which takes in the date/time, when folder is published
- DOI Publishing and deletion to Datacite #332 #369
- create draft DOIs for both Study and Datasets and add them to the folder `extraInfo` when published
- delete draft DOIs on object delete
- update DOI info at Datacite when folder is published
- VScode Dev environment #287
- Add VS Code development container
- Update docker for development
@@ -46,6 +62,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- new endpoint `GET /templates` to replace `GET /users/current` `{"templates":[...]}`
- new JSON keys `index` and `tags` to `PATCH /templates/schema/templateId`, same values as were previously used in `PATCH /user` which is now removed
- WARNING: breaking change that requires fresh database, because "project" is new information that did not exist before, and it can't be migrated to existing user-owned hierarchy
- Multilevel add patch objects to support `/extraInfo/datasetIdentifiers/-` which needs dot notation for mongodb to work e.g. `extraInfo.datasetIdentifiers` #332
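
The path conversion described in the last entry can be sketched as below. This is only an illustration under stated assumptions: `json_patch_path_to_dot` is a hypothetical name, and the backend's real patch handling is not shown in this diff.

```python
def json_patch_path_to_dot(path: str) -> str:
    """Convert a JSON Patch path to MongoDB dot notation (hypothetical helper)."""
    parts = [part for part in path.split("/") if part]
    # In JSON Patch a trailing "-" means "append to the array";
    # it is not part of the field name, so it is dropped here.
    if parts and parts[-1] == "-":
        parts = parts[:-1]
    return ".".join(parts)


field = json_patch_path_to_dot("/extraInfo/datasetIdentifiers/-")
```

With the example path from the changelog, `field` becomes `extraInfo.datasetIdentifiers`, which is the form MongoDB update operators expect for nested keys.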

### Changed

@@ -59,7 +76,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- README updated with tox command, development build instructions, and prettify Dockerfile.
- Update ENA XML and JSON schemas #299
- Github actions changed the use of https://git.io/misspell to rojopolis/spellcheck-github-actions #316
- Separated most of the handlers to own files inside the handlers folder #319
- Separated most of the handlers to own files inside the handlers folder #319
- allow inserting only one study in folder #332
- JSON schemas #332
- introduce `keywords` required for Metax in `doiInfo`
- dataset `description` and study `studyAbstract` are now mandatory
- `keywords` will be comma separated values, that will require splitting when adding to Metax API
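
The splitting mentioned for `keywords` might look like the following. A minimal sketch, assuming a hypothetical `split_keywords` helper; how the backend actually prepares keywords for the Metax API is not part of this diff.

```python
from typing import List


def split_keywords(keywords: str) -> List[str]:
    """Split a comma separated keywords string into a clean list (hypothetical helper)."""
    # Strip surrounding whitespace and skip empty entries such as "a,,b".
    return [word.strip() for word in keywords.split(",") if word.strip()]
```

Dropping empty entries avoids sending blank keywords onward when the input has stray commas.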

### Fixed

2 changes: 1 addition & 1 deletion README.md
@@ -35,7 +35,7 @@ Server can then be found from `http://localhost:5430`.
- Suitable mongodb instance can be launched with Docker by running `docker-compose up database`
- After installing and setting up database, server can be launched with `metadata_submitter`

If you also need frontend for development, check out [frontend repository](https://github.com/CSCfi/metadata-submitter-frontend/).
If you also need frontend for development, check out [frontend repository](https://github.com/CSCfi/metadata-submitter-frontend/). You will also need to uncomment `REDIRECT_URL` environment variable from .env file.

## Tests

24 changes: 23 additions & 1 deletion docker-compose-tls.yml
@@ -4,6 +4,7 @@ services:
build:
dockerfile: Dockerfile-dev
context: .
target: develop
image: cscfi/metadata-submitter-dev
container_name: "metadata_submitter_backend_dev"
volumes:
@@ -14,6 +15,7 @@ services:
- database
- mockauth
- mockdoi
- mockmetax
restart: on-failure
environment:
- "MONGO_HOST=${MONGO_HOST}"
@@ -34,6 +36,10 @@ services:
- "DOI_PREFIX=${DOI_PREFIX}"
- "DOI_USER=${DOI_USER}"
- "DOI_KEY=${DOI_KEY}"
- "DISCOVERY_URL=${DISCOVERY_URL}"
- "METAX_USER=${METAX_USER}"
- "METAX_PASS=${METAX_PASS}"
- "METAX_URL=${METAX_URL}"
database:
image: "mongo"
container_name: "metadata_submitter_database_dev"
@@ -54,6 +60,7 @@ services:
build:
dockerfile: Dockerfile-dev
context: .
target: develop
image: cscfi/metadata-submitter-dev
environment:
- "LOG_LEVEL=${LOG_LEVEL}"
@@ -71,6 +78,7 @@ services:
build:
dockerfile: Dockerfile-dev
context: .
target: develop
image: cscfi/metadata-submitter-dev
hostname: mockdoi
expose:
@@ -80,5 +88,19 @@ services:
volumes:
- ./tests/integration/mock_doi_api.py:/mock_doi_api.py
entrypoint: ["python", "/mock_doi_api.py", "0.0.0.0", "8001"]
mockmetax:
build:
dockerfile: Dockerfile-dev
context: .
target: develop
image: cscfi/metadata-submitter-dev
hostname: mockmetax
expose:
- 8002
ports:
- 8002:8002
volumes:
- ./tests/integration/mock_metax_api.py:/mock_metax_api.py
entrypoint: ["python", "/mock_metax_api.py", "0.0.0.0", "8002"]
volumes:
data:
data:
23 changes: 21 additions & 2 deletions docker-compose.yml
@@ -13,6 +13,7 @@ services:
- database
- mockauth
- mockdoi
- mockmetax
restart: on-failure
environment:
- "MONGO_HOST=${MONGO_HOST}"
@@ -29,6 +30,10 @@ services:
- "DOI_PREFIX=${DOI_PREFIX}"
- "DOI_USER=${DOI_USER}"
- "DOI_KEY=${DOI_KEY}"
- "DISCOVERY_URL=${DISCOVERY_URL}"
- "METAX_USER=${METAX_USER}"
- "METAX_PASS=${METAX_PASS}"
- "METAX_URL=${METAX_URL}"
database:
image: "mongo"
container_name: "metadata_submitter_database_dev"
@@ -60,7 +65,7 @@ services:
- 8000:8000
volumes:
- ./tests/integration/mock_auth.py:/mock_auth.py
entrypoint: [ "python", "/mock_auth.py", "0.0.0.0", "8000" ]
entrypoint: ["python", "/mock_auth.py", "0.0.0.0", "8000"]
mockdoi:
build:
dockerfile: Dockerfile-dev
@@ -74,6 +79,20 @@ services:
- 8001:8001
volumes:
- ./tests/integration/mock_doi_api.py:/mock_doi_api.py
entrypoint: [ "python", "/mock_doi_api.py", "0.0.0.0", "8001" ]
entrypoint: ["python", "/mock_doi_api.py", "0.0.0.0", "8001"]
mockmetax:
build:
dockerfile: Dockerfile-dev
context: .
target: develop
image: cscfi/metadata-submitter-dev
hostname: mockmetax
expose:
- 8002
ports:
- 8002:8002
volumes:
- ./tests/integration/mock_metax_api.py:/mock_metax_api.py
entrypoint: ["python", "/mock_metax_api.py", "0.0.0.0", "8002"]
volumes:
data:
13 changes: 8 additions & 5 deletions docs/specification.yml
@@ -21,6 +21,14 @@ paths:
tags:
- Submission
summary: XML submission endpoint, will also trigger validation.
parameters:
- name: folder
in: query
schema:
type: string
description: The folder ID where object belongs to.
required: true
example: "folder=12345"
requestBody:
content:
multipart/form-data:
@@ -293,11 +301,6 @@ paths:
- Submission
summary: Submit data to a specific schema
parameters:
- in: query
name: folder
schema:
type: string
description: The folder ID where object belongs to.
- name: schema
in: path
description: Name of the Metadata schema.
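
Since `folder` is now a required query parameter on the submission endpoint, a request without it should be rejected. The check can be sketched with the standard library as below. Assumptions: `require_folder_id` is a hypothetical helper, the `/submit` path in the usage line is assumed rather than shown in this hunk, and `ValueError` stands in for whatever HTTP error the real handler raises.

```python
from urllib.parse import parse_qs, urlsplit


def require_folder_id(url: str) -> str:
    """Return the `folder` query parameter or raise if it is missing or empty."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("folder", [])
    if not values:
        # parse_qs drops blank values by default, so "?folder=" also ends up here.
        raise ValueError("Mandatory query parameter 'folder' is missing.")
    return values[0]


folder_id = require_folder_id("http://localhost:5430/submit?folder=12345")
```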
16 changes: 8 additions & 8 deletions metadata_backend/api/handlers/common.py
@@ -15,7 +15,7 @@

 async def multipart_content(
     req: Request, extract_one: bool = False, expect_xml: bool = False
-) -> Tuple[List[Tuple[Any, str]], str, str]:
+) -> Tuple[List[Tuple[Any, str, str]], str]:
     """Get content(s) and schema type(s) of a multipart request (from either csv or xml format).

     Note: for multiple files support check: https://docs.aiohttp.org/en/stable/multipart.html#hacking-multipart
@@ -26,8 +26,8 @@ async def multipart_content(
     :raises: HTTPBadRequest for multiple different reasons
     :returns: content and schema type for each uploaded file and file type of the upload
     """
-    xml_files: List[Tuple[str, str]] = []
-    csv_files: List[Tuple[Dict, str]] = []
+    xml_files: List[Tuple[str, str, str]] = []
+    csv_files: List[Tuple[Dict, str, str]] = []
     try:
         reader = await req.multipart()
     except AssertionError:
@@ -59,20 +59,20 @@ async def multipart_content(
         if expect_xml or part.headers[hdrs.CONTENT_TYPE] == "text/xml":
             content, schema_type = await _extract_upload(part)
             _check_xml(content)
-            xml_files.append((content, schema_type))
+            xml_files.append((content, schema_type, filename))
         elif part.headers[hdrs.CONTENT_TYPE] == "text/csv":
             content, schema_type = await _extract_upload(part)
             _check_csv(content)
             csv_content = CSVToJSONParser().parse(schema_type, content)
             for row in csv_content:
-                csv_files.append((row, schema_type))
+                csv_files.append((row, schema_type, filename))
         else:
             reason = "Submitted file was not proper XML nor CSV."
             LOG.error(reason)
             raise web.HTTPBadRequest(reason=reason)

     # Return extracted content
-    return _get_content_with_type(xml_files, csv_files) + (filename,)
+    return _get_content_with_type(xml_files, csv_files)


 async def _extract_upload(part: BodyPartReader) -> Tuple[str, str]:
@@ -137,8 +137,8 @@ def _check_xml(content: str) -> bool:


 def _get_content_with_type(
-    xml_files: List[Tuple[str, str]], csv_files: List[Tuple[Dict, str]]
-) -> Tuple[List[Tuple[Any, str, str]], str]:
+    xml_files: List[Tuple[str, str, str]], csv_files: List[Tuple[Dict, str, str]]
+) -> Tuple[List[Tuple[Any, str, str]], str]:
     """Return either list of XML or CSV files with the file type info.

     :param xml_files: List of xml contents with schema types
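
The effect of the signature change in `common.py` is that the filename now travels with each extracted item instead of being appended once to the overall result. The sketch below shows how a caller might unpack the new shape. It is a simplified stand-in: the body of `get_content_with_type` is an assumption (the real `_get_content_with_type` body is not part of this diff), and `ValueError` replaces the HTTP error the backend would raise.

```python
from typing import Any, Dict, List, Tuple


def get_content_with_type(
    xml_files: List[Tuple[str, str, str]], csv_files: List[Tuple[Dict, str, str]]
) -> Tuple[List[Tuple[Any, str, str]], str]:
    """Return the non-empty list together with its file type (simplified stand-in)."""
    if xml_files and csv_files:
        raise ValueError("Request contained both XML and CSV files.")
    if xml_files:
        return list(xml_files), "xml"
    return list(csv_files), "csv"


# Each item is now (content, schema_type, filename).
files, file_type = get_content_with_type([("<STUDY/>", "study", "study.xml")], [])
for content, schema_type, filename in files:
    print(schema_type, filename, file_type)
```

Keeping the filename per item matters for CSV uploads, where one file can yield several rows that all share the same source filename.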