Update bioboxes image list #4

Open
pbelmann opened this issue Dec 7, 2016 · 4 comments

pbelmann commented Dec 7, 2016

I suggest the following changes:

Problem: One tool/image could implement multiple interfaces. With the current list, we would have to list an image once for each interface.

Solution: In PR #2 I suggest changing the list to the following format, where we state the corresponding interface for each task:

    title: velvet
    image:
      dockerhub: bioboxes/velvet
      repo: https://github.com/bioboxes/velvet
      source: https://github.com/dzerbino/velvet
    pmid: 18349386
    homepage: https://www.ebi.ac.uk/~zerbino/velvet/
    mailing_list: http://listserver.ebi.ac.uk/mailman/listinfo/velvet-users
    description:
      The velvet assembler was one of the first assemblers created for short read sequencing. Velvet was developed at the European Bioinformatics Institute.
    tasks:
      - name: default
        interface: assembler
      - name: careful
        interface: assembler

Interfaces are listed in a separate file: interfaces.yml
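
To illustrate how a consumer of the list could use the proposed `tasks` field, here is a minimal sketch that groups images by interface. It assumes the YAML has already been parsed into plain dictionaries. The second entry and both of its interface names are hypothetical and only show an image that implements more than one interface.

```python
from collections import defaultdict

# Hypothetical in-memory form of two list entries after YAML parsing.
# The velvet entry follows the example above; "example-evaluator" and its
# interface names are invented for illustration.
ENTRIES = [
    {
        "title": "velvet",
        "image": {"dockerhub": "bioboxes/velvet"},
        "tasks": [
            {"name": "default", "interface": "assembler"},
            {"name": "careful", "interface": "assembler"},
        ],
    },
    {
        "title": "example-evaluator",
        "image": {"dockerhub": "example/evaluator"},
        "tasks": [
            {"name": "default", "interface": "taxonomic_binning_evaluation"},
            {"name": "unsupervised", "interface": "binning_evaluation"},
        ],
    },
]


def index_by_interface(entries):
    """Map each interface name to the (image, task) pairs that implement it."""
    index = defaultdict(list)
    for entry in entries:
        image = entry["image"]["dockerhub"]
        for task in entry.get("tasks", []):
            index[task["interface"]].append((image, task["name"]))
    return dict(index)


if __name__ == "__main__":
    for interface, pairs in sorted(index_by_interface(ENTRIES).items()):
        print(interface, pairs)
```

With this layout each image appears once in the list, while the index can still show it under every interface it implements.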

  1. @michaelbarton suggested adding SHA IDs of the bioboxes images.

  2. I suggest adding a field called 'tags'. This field would let us assign categories to each biobox.

Example:

    title: ray
    image:
      dockerhub: bioboxes/ray
      repo: https://github.com/sebhtml/ray
      source:
    pmid: 20958248
    homepage: http://gatb.inria.fr/
    tags: [nucleotid.es, CAMI]
    mailing_list: https://www.biostars.org/t/gatb/
    description: 
      Ray is a parallel software that computes de novo genome assemblies with next-generation sequencing data.
    tasks:
      - name: default
        interface: assembler

  1. @fernandomeyer added the velour biobox, but our page on bioboxes.org has not been updated. Maybe we should implement this listing in JavaScript, which would allow us to fetch the list each time the webpage is opened.

michaelbarton commented Dec 7, 2016 via email

pbelmann commented Dec 8, 2016

Problem: One tool/image could implement multiple interfaces. By using the
current list we would have to list an image for each interface.

My preference is for each biobox to implement one type of interface to
simplify the management of biobox images, especially the image tasks. Having
one interface per biobox image means that each of the tasks is scoped by the
interface too. For example, the default task for an assembler should imply
what the author believes is the best possible assembly given a wide variety of
inputs, which is what the CLI runs if --task is not specified. A careful
task implies trading assembly size for accuracy.

I agree this would be difficult if we want to specify a default task for each interface.
In CAMI, for example, we have binning evaluation tools that can be used with both taxonomic and non-taxonomic binning files. That's why they implement both the taxonomic and the non-taxonomic binning evaluation interfaces.
But I guess we will have to build two different images that fetch the same library/GitHub repository.
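
If we go with one interface per image, a small check on the list could flag entries that mix interfaces and show how they would be split. This is only a sketch under assumptions: the entry, the interface names, and the `<title>-<interface>` naming are hypothetical and not part of any agreed format.

```python
# Hypothetical entry whose tasks span two interfaces (names invented here).
EXAMPLE = {
    "title": "example-evaluator",
    "tasks": [
        {"name": "default", "interface": "taxonomic_binning_evaluation"},
        {"name": "default", "interface": "binning_evaluation"},
    ],
}


def interfaces_of(entry):
    """Return the sorted, de-duplicated interface names used by an entry's tasks."""
    return sorted({task["interface"] for task in entry.get("tasks", [])})


def split_by_interface(entry):
    """Return one entry per interface, each keeping only its own tasks.

    Entries that already use a single interface are returned unchanged.
    The '<title>-<interface>' naming is an assumption made for this sketch.
    """
    names = interfaces_of(entry)
    if len(names) <= 1:
        return [entry]
    parts = []
    for name in names:
        part = dict(entry)  # shallow copy so the original entry is untouched
        part["title"] = f"{entry['title']}-{name}"
        part["tasks"] = [t for t in entry["tasks"] if t["interface"] == name]
        parts.append(part)
    return parts


if __name__ == "__main__":
    for part in split_by_interface(EXAMPLE):
        print(part["title"], interfaces_of(part))
```

Each resulting entry then has exactly one default task, scoped by its single interface.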

I have been using SHA256 digests so far because this is supported
by the docker client. For example, the command:

    docker run repo/image@sha256:digest

OK, so using or listing the digest does make sense if you are referencing a specific biobox from a different service, like nucleotid.es or CAMI.
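
For services that want to pin a biobox this way, a rough sketch of building the `repo/image@sha256:digest` reference from a stored digest could look like the following. The function name is hypothetical, and the digest below is a placeholder generated with `hashlib`, not the digest of any real image.

```python
import hashlib
import re

# A SHA256 digest is written as 64 lowercase hexadecimal characters.
_SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def pinned_reference(repository, digest):
    """Build a 'repository@sha256:<digest>' reference, rejecting malformed digests."""
    if not _SHA256_HEX.fullmatch(digest):
        raise ValueError(f"not a 64-character lowercase hex digest: {digest!r}")
    return f"{repository}@sha256:{digest}"


# Placeholder only: this hashes an arbitrary string to get a well-formed value.
PLACEHOLDER_DIGEST = hashlib.sha256(b"placeholder, not a real image").hexdigest()

if __name__ == "__main__":
    print("docker run", pinned_reference("bioboxes/velvet", PLACEHOLDER_DIGEST))
```

Validating the digest up front means a truncated or mistyped value in the list fails early instead of at `docker run` time.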

I think the tags you mentioned refer to a different use case though, is
that correct?

Yes, by tags I do not mean docker tags. I think it would be useful in our current bioboxes listing (http://bioboxes.org/available-bioboxes/) to have a field called 'tags' or 'metatags'. In this field we could categorize our containers. Tags could be, for example, 'CAMI' or 'nucleotid.es'.
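
As a sketch of how the listing page could use such a field, the snippet below selects entries by tag. It assumes the entries are already parsed into dictionaries and that a missing `tags` field means "no tags". Matching case-insensitively is a choice made for this example, not something we have agreed on.

```python
# Hypothetical parsed entries; only the fields needed here are shown.
ENTRIES = [
    {"title": "ray", "tags": ["nucleotid.es", "CAMI"]},
    {"title": "velvet"},  # no 'tags' field at all
]


def titles_with_tag(entries, tag):
    """Return the titles of all entries carrying the given tag (case-insensitive)."""
    wanted = tag.casefold()
    return [
        entry["title"]
        for entry in entries
        if wanted in (t.casefold() for t in entry.get("tags", []))
    ]


if __name__ == "__main__":
    print(titles_with_tag(ENTRIES, "CAMI"))
```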

  1. @fernandomeyer added the velour biobox but our
    page on bioboxes.org is not
    updated. Maybe we should implement this listing by using javascript which
    allows us to fetch the list each time the webpage is opened.

I believe we could update the circle.yml for the data repository to
automatically request a rebuild of the website every time a pull request is
merged into master.

Sounds great! Could you update the repo?

michaelbarton commented Dec 9, 2016 via email

@pbelmann

I think the tags you mentioned refer to a different use case though, is
that correct?

Yes, with tags I do not mean docker tags. I think it would be useful in our
current bioboxes listing (http://bioboxes.org/available-bioboxes/) to have a
field called 'tags' or 'metatags'. In this field we could categorize our
containers. Tags could be for example 'CAMI' or 'nucleotid.es'.

I'm not sure what the use case would be. If it would be to link to the
benchmarked data, then I think that would be a great idea. For example, if we
could link to all benchmarking data that's available for the specific image.

Yes, that is something I would like to implement in the future. But for now I think it would be enough to add a tag with a link to the benchmarking website. I could set up a PR.

To show benchmarking results for bioinformatics software, we would have to define something like a common REST API to be used by nucleotid.es, CAMI, and maybe other evaluation/benchmarking websites. Other websites that list bioinformatics software could use this API. But this is independent of bioboxes. If you are interested in working on such an API with me, we should discuss this somewhere else.

I believe we could update the circle.yml for the data repository to
automatically request a rebuild of the website every time a pull request
is merged into master.

Sounds great! Could you update the repo?

Yes, I'll look into setting this up.

Great. I will create a separate issue, so that we do not forget.
