sort authors using list order field (#217)
Sorts authors for all pages using the `author_list_order` field. I also
updated the distinct field to match the sort field, because [it is
recommended](https://hasura.io/docs/latest/queries/postgres/distinct-queries/)
and Apollo throws an error if they do not match.
codemonkey800 authored Dec 6, 2023
1 parent f1ec1a2 commit 7cb4dea
Showing 4 changed files with 57 additions and 26 deletions.
```diff
@@ -43,7 +43,14 @@ const GET_DATASETS_DATA_QUERY = gql(`
       key_photo_thumbnail_url
       related_database_entries
-      authors {
+      # TODO Remove distinct_on when data is verified to be unique
+      authors(
+        distinct_on: name,
+        order_by: {
+          author_list_order: asc,
+          name: asc,
+        },
+      ) {
         name
         primary_author_status
       }
```
frontend/packages/data-portal/app/routes/datasets.$id.tsx (8 additions, 1 deletion)
```diff
@@ -50,7 +50,14 @@ const GET_DATASET_BY_ID = gql(`
       sample_type
       tissue_name
-      authors(distinct_on: name) {
+      # TODO Remove distinct_on when data is verified to be unique
+      authors(
+        distinct_on: name,
+        order_by: {
+          author_list_order: asc,
+          name: asc,
+        },
+      ) {
         name
         email
         primary_author_status
```
frontend/packages/data-portal/app/routes/runs.$id.tsx (8 additions, 1 deletion)
```diff
@@ -75,7 +75,14 @@ const GET_RUN_BY_ID_QUERY = gql(`
       tissue_name
       title
-      authors(distinct_on: name) {
+      # TODO Remove distinct_on when data is verified to be unique
+      authors(
+        distinct_on: name,
+        order_by: {
+          author_list_order: asc,
+          name: asc,
+        },
+      ) {
         name
         email
         primary_author_status
```
website-docs/faq.mdx (33 additions, 23 deletions)
@@ -6,8 +6,8 @@ We hope these answers will help you get the most out of the CryoET Data Portal!

Did you encounter a bug, error, or other issue while using the portal? [Submit an issue on Github](https://github.com/chanzuckerberg/cryoet-data-portal/issues) to let us know!

To submit an issue, you'll need to create a [free GitHub account](https://github.com/signup?ref_cta=Sign+up&ref_loc=header+logged+out&ref_page=%2F&source=header-home).
This allows our team to follow up with you on GitHub if we have a question about the problem you encountered. Then, [fill out this form](https://github.com/chanzuckerberg/cryoet-data-portal/issues/new).
We suggest you use a descriptive title, paste any error messages using the `<>` icon on the form, and provide as many details as possible about the problem, including what you expected to happen and what type of machine you were using.

For more information about submitting issues on GitHub, please refer to [GitHub's documentation](https://docs.github.com/en/issues/tracking-your-work-with-issues/creating-an-issue#creating-an-issue-from-a-repository).
@@ -16,8 +16,9 @@ For more information about submitting issues on GitHub, please refer to [GitHub's

Title: Tomogram TS_026 cannot be downloaded

Body:
I have the AWS CLI tool installed on a Mac computer. I copied the download command from the prompt on the tomogram page. Instead of downloading, I received this error message:

```
ERROR MESSAGE COPIED FROM TERMINAL
```
@@ -38,11 +39,12 @@ Descriptions of all terminology and metadata used in the Portal is provided [her

<Accordion title="How do I download data using Amazon Web Services (AWS)?">

**The Data Portal's S3 bucket is public**, so it can be used without sign-in credentials by specifying `--no-sign-request` in your commands. We recommend following our [Quickstart Guide](#quickstart) to get started downloading data in only a few minutes.

For more details or to troubleshoot, refer to the [Installation](#installation), [Download Data](#download-data), and [Optimize Download Speed](#optimize-download-speed) in-depth explanations.

## Quickstart

1. Download the installer: [MacOS Installer Download](https://awscli.amazonaws.com/AWSCLIV2.pkg) / [Windows Installer Download](https://awscli.amazonaws.com/AWSCLIV2.msi)
2. Open the installer and complete installation following the prompts. (No further steps, since credentials ARE NOT needed to use the tool.)
3. Open terminal (MacOS) or command prompt (Windows).
@@ -58,6 +60,7 @@ For example, to download a particular JSON file of tomogram metadata into a fold
```
aws s3 cp --no-sign-request s3://cryoet-data-portal-public/10000/TS_026/Tomograms/VoxelSpacing13.48/CanonicalTomogram/tomogram_metadata.json ~/Downloads/
```

In the above example, the download happened very quickly because the file was only about 1 kB in size. However, typical tomograms are multiple GB, so expect downloading to take 30-60 minutes for a single tomogram for a given run; downloading could take as long as days depending on the number and sizes of the files. To speed up downloads, you can follow [these instructions to optimize download speed](#optimize-download-speed).

For more detailed instructions, please refer to the sections below.
@@ -77,32 +80,36 @@ Once AWS CLI is installed, you will be able to use it in terminal (MacOS) or com
1. Download the installer pkg file using this URL: [https://awscli.amazonaws.com/AWSCLIV2.pkg](https://awscli.amazonaws.com/AWSCLIV2.pkg)
2. Open the file and follow the instructions provided in the installer window.

To confirm successful installation, open terminal and type `aws --version` to list the version of the AWS CLI installed. If installation was successful, you should see an output like:

```
aws-cli/2.7.25 Python/3.10.6 Darwin/23.0.0 source/arm64 prompt/off
```

### Windows Installation

1. Download the installer msi file using this URL: [https://awscli.amazonaws.com/AWSCLIV2.msi](https://awscli.amazonaws.com/AWSCLIV2.msi)
2. Open the file and follow the instructions provided in the installer window.

To confirm successful installation, open a command prompt window (open the Start menu and search for cmd) and type `aws --version` to list the version of the AWS CLI installed. If installation was successful, you should see an output like:
```
aws-cli/2.10.0 Python/3.11.2 Windows/10 exe/AMD64 prompt/off
```

## Download Data

To download data, we'll run commands in terminal (MacOS) or command prompt (Windows). The basic structure of these commands is below:

```
aws <command> <subcommand> <flags> [options and parameters (often S3 URL)]
```

If you followed the above installation instructions, which did not include setting up credentials, use `--no-sign-request` as a `<flag>` in all of your AWS CLI commands to indicate that you are accessing the bucket without signing in.

The S3 bucket URL of the CryoET Data Portal is `s3://cryoet-data-portal-public`, and each dataset in the bucket has its own unique URL, such as `s3://cryoet-data-portal-public/10000/TS_026`.

To list all files in a directory, use `s3` and `ls` as the `<command>` and `<subcommand>`, respectively.

The basic structure of this command is `aws s3 ls --no-sign-request [s3 bucket URL]`. For example, to list all data in the portal use:

@@ -123,16 +130,19 @@ To download a file, we can use `s3` and `cp` as the `<command>` and `<subcom
```
aws s3 cp --no-sign-request s3://cryoet-data-portal-public/10000/TS_026/Tomograms/VoxelSpacing13.48/CanonicalTomogram/tomogram_metadata.json ~/Downloads/
```

The file should appear in your specified directory, and the output in terminal / command prompt should be something like:

```
download: s3://cryoet-data-portal-public/10000/TS_026/Tomograms/VoxelSpacing13.48/CanonicalTomogram/tomogram_metadata.json to ./tomogram_metadata.json
```

In the above example, the download happened very quickly because the file was only about 1 kB in size. However, typical tomograms are multiple GB, so expect downloading to take 30-60 minutes for a single tomogram for a given run; downloading could take as long as days depending on the number and sizes of the files.
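Those time estimates are easy to sanity-check. Below is a small, hypothetical helper (not part of any portal tooling) that converts a file size and a sustained transfer rate into minutes; the ~50 MB/s figure is the optimized rate quoted in the Optimize Download Speed section.

```python
def download_time_minutes(size_gb: float, speed_mb_per_s: float) -> float:
    """Estimate transfer time for a file of `size_gb` gigabytes
    downloaded at a sustained `speed_mb_per_s` megabytes per second."""
    size_mb = size_gb * 1024  # 1 GB = 1024 MB
    return size_mb / speed_mb_per_s / 60

# 100 GB at the optimized ~50 MB/s rate takes roughly half an hour
print(round(download_time_minutes(100, 50)))  # → 34
```

Real-world transfer rates fluctuate, so treat the result as a rough lower bound rather than a promise.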

## Optimize Download Speed

You can optimize your download speed by configuring your AWS CLI with the below command, which will increase your transfer rate to ~50 MB/s if your connection has sufficient bandwidth.

```
aws configure set default.s3.max_concurrent_requests 30
```
@@ -155,7 +165,7 @@ The CryoET Data Portal napari plugin can be used to visualize tomograms, annotat

<Accordion title="How do I download data using the Portal's API?">

- The `Dataset`, `Run`, and `TomogramVoxelSpacing` classes have `download_everything` methods which allow you to download all data associated with one of those objects.

- The `Tomogram` class has `download_mrcfile` and `download_omezarr` methods to download the tomogram as an MRC or OME-Zarr file, respectively.

@@ -169,7 +179,7 @@ from cryoet_data_portal import Client, Tomogram
# Instantiate a client, using the data portal GraphQL API by default
client = Client()

# Query the Tomogram class to find the tomogram named TS_026
tomo = Tomogram.find(client, query_filters=[Tomogram.name == "TS_026"])

# Download tomogram
@@ -182,7 +192,7 @@ For more examples of downloading data with the API, check out the [tutorial here

<Accordion title="How do I use the Portal's API to select data?">

Every class in the Data Portal API has a `find` method which can be used to select all objects that match criteria provided in a query. The `find` method uses the Python comparison operators `==`, `!=`, `>`, `>=`, `<`, `<=`, as well as the `like`, `ilike`, and `_in` methods, which search for strings matching a given pattern, to create queries.

- `like` is a partial match, with the % character being a wildcard
- `ilike` is similar to `like` but case-insensitive
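To make the pattern semantics concrete, here is a stdlib-only sketch of how `%` wildcards behave. This is purely illustrative: the portal client evaluates these filters server-side, and `like_to_regex` is a hypothetical helper, not part of the API.

```python
import re

def like_to_regex(pattern: str, case_insensitive: bool = False) -> re.Pattern:
    """Translate a SQL-style pattern, where % matches any run of
    characters, into an anchored regular expression."""
    parts = (re.escape(p) for p in pattern.split("%"))
    flags = re.IGNORECASE if case_insensitive else 0
    return re.compile("^" + ".*".join(parts) + "$", flags)

# `like` is case-sensitive: "TS_%" matches "TS_026" but "ts_%" does not
assert like_to_regex("TS_%").match("TS_026")
assert not like_to_regex("ts_%").match("TS_026")
# `ilike` ignores case, so "ts_%" now matches as well
assert like_to_regex("ts_%", case_insensitive=True).match("TS_026")
```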
@@ -211,13 +221,13 @@ For more examples of using the `find` operator, check out the [tutorial here](ht

The tilt series quality score/rating is a relative, subjective scale meant for comparing tilt series within a dataset. The contributor of the dataset assigns quality scores to each of the tilt series to communicate their quality estimate to users. Below is an example scale based mainly on alignability and usefulness for the intended analysis.

| Rating | Quality   | Description |
| ------ | --------- | ----------- |
| 5      | Excellent | Full Tilt Series/Reconstructions could be used in publication-ready figures. |
| 4      | Good      | Full Tilt Series/Reconstructions are useful for analysis (subtomogram averaging, segmentation). |
| 3      | Medium    | Minor parts of the tilt series (projection images) need to be or have been discarded prior to reconstruction and analysis. |
| 2      | Marginal  | Major parts of the tilt series (projection images) need to be or have been discarded prior to reconstruction and analysis. Useful for analysis only after heavy manual intervention. |
| 1      | Low       | Not useful for analysis with current tools (not alignable), useful as a test case for problematic data only. |

</Accordion>

@@ -231,11 +241,11 @@ Descriptions of all terminology and metadata used in the Portal is provided [her

<Accordion title="How do I contribute data to the Portal?">

Thank you for considering submitting data to the Portal!

Contributions can be raw data (tilt series and movie frames) + resulting tomograms, a new tomogram for existing raw data in the Portal generated using a different algorithm, and/or annotations of existing tomograms. We encourage all contributions, including those which may be of lower quality than existing datasets on the Portal, as these datasets are useful for developing better annotation and data processing algorithms.

We will work with you to upload the data to the Portal. Please fill out [this contribution form](https://airtable.com/apppmytRJXoXYTO9w/shr5UxgeQcUTSGyiY?prefill_Event=Contribution+from+portal&hide_Event=true), which is also found through the `Tell Us More` button on the bottom of the Portal homepage. We will then reach out to you to start the process of uploading your data. We have a ~6 month release cycle, so please allow time for the data to become available through the portal.

In the future, we plan to implement a self-upload process so that users can add their data to the Portal on their own.

