
Synchronize Readme specs and test json files #105

Open
jnguyenx opened this issue Apr 30, 2015 · 9 comments

@jnguyenx

I noticed three small inconsistencies:

  1. In this test file there's a field called hgvs, which is not defined in the specs. In my implementation it'll be ignored.
  2. In both test files, the Feature has a label field, whereas the specs don't include this field and have ageOfOnset instead.
  3. The patient wrapper is not present in the test files either (see the sketch below).
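
To make this concrete, here is a rough sketch of the record shape I'm describing. The values are invented and the exact key names are my reading of the spec, not a quote from the test files:

```json
{
  "id": "P0001",
  "features": [
    { "id": "HP:0000252", "label": "Microcephaly" }
  ],
  "genomicFeatures": [
    { "gene": { "id": "FGFR3" }, "hgvs": "NM_000000.1:c.100del" }
  ]
}
```

Point 1 is the extra hgvs key, point 2 is label appearing where the spec has ageOfOnset, and point 3 is the missing top-level "patient" wrapper around the whole object.
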
@buske
Member

buske commented Apr 30, 2015

@jnguyenx Thanks for your comments! For 1. and 2., it's true that hgvs and label are not in the specification, but that was intentional. We've been operating under the policy that, occasionally, extra data may be useful to include in the API for interpretation or debugging, and sites that don't support it should ignore these fields (this is also how we get backwards compatibility between minor versions). We should specify this more formally, and perhaps these fields should be removed from the test dataset anyway?

Regarding the patient wrapper, I'm unsure of whether this is helpful or not. The files serve two uses:

  • to provide example data that people should upload into their matchmaker
  • to provide example data for the query to make internally or externally to verify matchmaking
Since only the latter involves a patient wrapper, I left it off, but I'd be happy to add it if people think this would be more consistent.
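
As a sketch of the difference (values invented, key names assumed from the spec): an entry in the dataset file is the bare record, while a request to the match endpoint wraps the same record under a "patient" key, something like:

```json
{
  "patient": {
    "id": "P0001",
    "features": [
      { "id": "HP:0000252" }
    ]
  }
}
```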

@buske buske added the Testing label Apr 30, 2015
@buske buske added this to the Before v1.1 milestone Apr 30, 2015
@jnguyenx
Author

Hi Orion, thanks for the quick answer.

Well, I wanted to use these json files to test my implementation: I built the implementation from the Readme specs, then added unit tests which consume those files. I ran into a couple of errors due to the 3 points I mentioned above.

So in my opinion it would be very nice just to be able to consume those files as they are provided.

I think it's fine to leave extra fields which are not going to be parsed, but I'm a bit more worried about 2., because ageOfOnset is a required field, so the test files are not compliant. Same for patient.

@buske
Member

buske commented Apr 30, 2015

Hi Jeremy,

Regarding patient: that's a fair point. I'm happy to update the test files if someone else +1s this.

Regarding ageOfOnset, it is an optional field. The only mandatory field for a phenotypic feature is id. This is described in the spec, but perhaps either:

  1. It should be clarified, and/or
  2. The summary specification at the top should only include mandatory fields?
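
For concreteness, under that reading a minimal valid phenotypic feature carries just an id. This is a sketch, not an excerpt from the test data:

```json
{
  "features": [
    { "id": "HP:0000252" }
  ]
}
```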

Thoughts?

@jnguyenx
Author

Hi,

Oh yeah, my bad, I didn't see that it was optional.

I think that the id in Feature should come with a (mandatory) label, as is done for GenomicFeatures. That way it'll be clearer.
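
Something like this is what I have in mind (term and label chosen just for illustration):

```json
{ "id": "HP:0000252", "label": "Microcephaly" }
```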

The summary specification is fine as it is for me, in combination with the detailed specs.

Thanks for your help!

buske added a commit that referenced this issue Apr 30, 2015
@cmungall
Member

+1 to making the spec and test data consistent

@buske
Member

buske commented May 1, 2015

Since each request includes only a single patient, there's no easy way to make the 50-patient dataset spec-compliant per se. I could make the test data a list of requests (i.e., a list of objects, each with a "patient" field and the corresponding details), but this is confusing as well, since the result still isn't a valid request. I think it makes the most sense to leave the dataset the way it is (perhaps removing the informal hgvs and label fields), but provide an additional file with a sample request that completely conforms to the spec, including the "patient" wrapper. Would this be a good compromise?
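
For illustration (values invented), the "list of requests" shape would look something like the following, which, as noted, is still not itself a valid request body, since a request carries exactly one patient:

```json
[
  { "patient": { "id": "P0001", "features": [ { "id": "HP:0000252" } ] } },
  { "patient": { "id": "P0002", "features": [ { "id": "HP:0001250" } ] } }
]
```

Hence the suggestion of one separate sample request file that conforms to the spec end to end.
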

@jnguyenx
Author

jnguyenx commented May 1, 2015

Sounds good! And also rename the files so their content is clear from the name, e.g. a set of patients, a set of queries, etc.

Thanks!

@Relequestual
Member

I don't think the test data file was designed to be used for forming the test queries. Having said that, there's no reason we couldn't provide both an import file and individual files for individual requests. Going down that line of thought, though, once you split them into individual files I can't see why we would maintain both. For a one-time data import script, I see little difference between opening one file vs 50. Open to discussion though.

I'm running across issues converting some of the hgvs codes to ref and alt alleles. I've run into one variant specifically which is causing problems (NM_005559.3(LAMA1):c.2988_2989delA), as this doesn't appear to be valid notation. I'm currently discussing this with Orion and François via email. Will report back here with any updates.

@buske
Member

buske commented May 5, 2015

Update: we tracked it down to a typo in the original manuscript and committed a fix. I'm going through and verifying all the other variants now.

buske added a commit that referenced this issue Nov 26, 2016

Issue #109: added zygosity

Issue #105: removed non-spec fields from test data

buske added a commit that referenced this issue Sep 25, 2017

Issue #109: added zygosity

Issue #105: removed non-spec fields from test data