
For #1658 #650

Closed

Collaborator

@mugitty mugitty commented Oct 12, 2023

Updated qualifier labels

@dustine32
Collaborator

Linking to ticket geneontology/go-site#1658

@dustine32
Collaborator

@mugitty These are pretty simple code changes, but do we know the expected effects? I suspect that changing these lookups will cause many GAF 2.2 lines to be dropped if they don't use the comma. For example, I see this line in the current WormBase upstream at http://skyhook.berkeleybop.org/c_elegans.PRJNA13758.WS289.gene_association.wb.gz:

WB      WBGene00000039  acn-1   acts_upstream_of_or_within_negative_effect      GO:0045026      WB_REF:WBPaper00006295|PMID:14559923 IMP             P               C42D8.5 gene    taxon:6239      20180813        WB           happens_during(WBls:0000038),occurs_in(WBbt:0005753)

@mugitty I'm probably being paranoid but are you able to test parsing this c_elegans.PRJNA13758.WS289.gene_association.wb file to see if the acts_upstream_of_or_within_negative_effect lines are being dropped? If they are being dropped, @pgaudet is this expected? Are all upstreams going to align to using the "with comma" relation label in GAF 2.2?
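The check being asked about can be sketched roughly as below. This is a minimal illustration, not the actual GORULE:0000061 code: the function name, the two lookup sets, and the exact "with comma" spelling are assumptions made for the example, and the placement of the empty columns in the rebuilt WormBase line is a best reading of the line quoted above.

```python
# Column 4 of a GAF 2.2 line holds the qualifier (index 3 when zero-based).
QUALIFIER_INDEX = 3

# Hypothetical lookup sets, trimmed to the one relation under discussion.
ALLOWED_WITHOUT_COMMA = {"acts_upstream_of_or_within_negative_effect"}
ALLOWED_WITH_COMMA = {"acts_upstream_of_or_within,_negative_effect"}  # assumed spelling

# The WormBase line quoted above, rebuilt as 17 tab-separated columns.
WB_LINE = "\t".join([
    "WB", "WBGene00000039", "acn-1",
    "acts_upstream_of_or_within_negative_effect",
    "GO:0045026", "WB_REF:WBPaper00006295|PMID:14559923", "IMP",
    "",            # With/From (empty)
    "P",
    "",            # DB Object Name (empty)
    "C42D8.5", "gene", "taxon:6239", "20180813", "WB",
    "happens_during(WBls:0000038),occurs_in(WBbt:0005753)",
    "",            # Gene Product Form ID (empty)
])


def qualifier_passes(gaf_line, allowed):
    """Return True if the relation in the qualifier column is in `allowed`."""
    columns = gaf_line.rstrip("\n").split("\t")
    # A qualifier may carry a NOT| prefix; the relation is the last piece.
    relation = columns[QUALIFIER_INDEX].split("|")[-1]
    return relation in allowed


print(qualifier_passes(WB_LINE, ALLOWED_WITHOUT_COMMA))
print(qualifier_passes(WB_LINE, ALLOWED_WITH_COMMA))
```

Under these assumptions the line passes a lookup keyed on the comma-free token and fails one keyed on a comma-containing token, which is the dropped-line scenario described above.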

@mugitty
Collaborator Author

mugitty commented Oct 13, 2023

@dustine32, the WB line fails the qualifier test as expected.

@pgaudet how should we proceed?

@pgaudet

pgaudet commented Oct 16, 2023

@dustine32

What is the correct format for these, with or without the comma? We could have a REPAIR for these.
In RO and in GOREL there is a comma, but we can remove it, if that's not how groups spell out the relation in the GAF file.
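A REPAIR of this kind could look something like the sketch below. It is only an illustration of the idea: `repair_qualifier` is a hypothetical helper, not an existing function in the GO tooling, and it assumes the only differences to repair are commas and spaces versus underscores.

```python
import re


def repair_qualifier(value):
    """Normalize a comma-bearing relation label to an underscored token.

    Hypothetical helper: commas are dropped and any run of whitespace
    or underscores is collapsed into a single underscore.
    """
    without_commas = value.replace(",", " ").strip()
    return re.sub(r"[\s_]+", "_", without_commas)


print(repair_qualifier("acts upstream of or within, negative effect"))
print(repair_qualifier("acts_upstream_of_or_within_negative_effect"))
```

A value that already uses the comma-free token comes back unchanged, so the repair would be safe to apply to every qualifier before the lookup.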

@dustine32
Collaborator

@pgaudet @mugitty I'd lean on whichever format is in the GAF 2.2 spec as I assume this is what everyone is currently following to produce their GAFs. This spec lists out accepted qualifier values and none of them contain a comma.

Though, I'm still a bit uncertain who has authority over these GAF qualifier values. @kltm @vanaukenk @balhoff These underscored qualifier values (e.g., acts_upstream_of_or_within_negative_effect, is_active_in) are just shorthand we use at GO to translate from GAF to the proper RO, GOREF, etc. CURIE, which then will be tied to a proper label (e.g., "acts upstream of or within, positive effect", "is active in") maintained at their respective ontologies. Does this sound right?

In other words, there are three distinct Term fields here:

  • Term IRI - RO:0004032
  • Term label - acts upstream of or within, positive effect
  • Term GAF qualifier - acts_upstream_of_or_within_positive_effect

And we at GO can just demand (via the GAF spec) that GAF producers use whatever GAF qualifiers (i.e., w/ comma or w/o comma) we like?
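To make the three fields concrete, here is a small sketch of how a GAF qualifier token could be resolved to its ontology term. The `RelationTerm` class and the lookup table are hypothetical, and the two CURIEs shown are my understanding of the RO identifiers for these relations, so they should be checked against RO before being relied on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationTerm:
    curie: str          # ontology identifier, e.g. an RO CURIE
    label: str          # human-readable label maintained in the ontology
    gaf_qualifier: str  # underscored token written in GAF column 4


# Illustrative entries only; identifiers should be verified against RO.
TERMS = [
    RelationTerm(
        "RO:0004032",
        "acts upstream of or within, positive effect",
        "acts_upstream_of_or_within_positive_effect",
    ),
    RelationTerm(
        "RO:0004033",
        "acts upstream of or within, negative effect",
        "acts_upstream_of_or_within_negative_effect",
    ),
]

# GAF producers only ever write the token; everything else is looked up.
BY_GAF_QUALIFIER = {term.gaf_qualifier: term for term in TERMS}

term = BY_GAF_QUALIFIER["acts_upstream_of_or_within_negative_effect"]
print(term.curie, "|", term.label)
```

Keying the table on the GAF token keeps the comma confined to the label, so the spec can define the tokens independently of how the ontology spells its labels.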

@kltm
Member

kltm commented Oct 25, 2023

> Though, I'm still a bit uncertain who has authority over these GAF qualifier values. @kltm @vanaukenk @balhoff These underscored qualifier values (e.g., acts_upstream_of_or_within_negative_effect, is_active_in) are just shorthand we use at GO to translate from GAF to the proper RO, GOREF, etc. CURIE, which then will be tied to a proper label (e.g., "acts upstream of or within, positive effect", "is active in") maintained at their respective ontologies. Does this sound right?

While the format is the format, we have control over expectations and QC. Once upon a time, the labels were pretty much just that, but they became tied to ontology terms behind the scenes as time went on. The GAF still displays this history on its shirt; GPAD/GPI does away with it. Different consumers of the files treat these differently (or ignore them), but we are ideally tying everything back to the appropriate ontology term when we process.

> In other words, there are three distinct Term fields here:
>
> * Term IRI - RO:0004032

CURIE

> * Term label - acts upstream of or within, positive effect

Yes.

> * Term GAF qualifier - acts_upstream_of_or_within_positive_effect

Unfortunate token proxy for ontology term

> And we at GO can just demand (via the GAF spec) that GAF producers use whatever GAF qualifiers (i.e., w/ comma or w/o comma) we like?

Well, the ones defined in the spec. That's the rule. We map those to the appropriate term.

@mugitty
Collaborator Author

mugitty commented Oct 25, 2023

According to https://geneontology.org/docs/go-annotation-file-gaf-format-2.2/#qualifier-column-4, the qualifiers do not contain commas. Therefore, an update for commas is not required.

@pgaudet

pgaudet commented Nov 30, 2023

We don't want the comma anymore. Please close.

@mugitty mugitty closed this Dec 5, 2023
@mugitty mugitty deleted the go-site-1658-gorule-0000061-update-qualifiers branch December 5, 2023 16:16

4 participants