Add checksum verification for ENSEMBL sequence and annotation wrappers #36

Open
wants to merge 7 commits into
base: master

mobilegenome
Contributor

  • I've added a routine that compares checksums for downloaded files.
  • Output files in the test Snakefiles now end in ".gz" to reflect the downloaded file format correctly.

Ideally, the checksum() function I've added would live in a shared module to avoid duplication, but I believe that would break each wrapper's ability to function stand-alone. If there is a way to import common libraries, please let me know.

Contributor Author

mobilegenome commented Dec 5, 2019

Before going further with this, I would like to discuss an issue I have discovered with the current version: Python's ftplib does not work behind a corporate proxy server, and I have not found a good workaround. I now have a working version of this set of wrappers that calls curl via subprocess.run. If there is interest in this change, I am happy to update this pull request.

Update: It is also possible to retrieve the data files in Python using the urllib module, in which case there is no need for a subprocess and curl.
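
A minimal sketch of the urllib approach mentioned in the update, not the wrapper's actual code. It assumes that urllib.request picks up proxy settings from the usual *_proxy environment variables in the given setup; the function name download and the URL in the usage comment are hypothetical.

```python
import shutil
import urllib.request


def download(url, dest, chunk_size=1 << 20):
    """Stream the resource at url into the local file dest (hypothetical helper)."""
    # urlopen accepts ftp://, http(s):// and file:// URLs and returns a
    # file-like response object that can be copied in chunks.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    return dest


# Usage (placeholder URL, for illustration only):
# download("ftp://ftp.example.org/pub/example.fa.gz", "example.fa.gz")
```

Streaming with shutil.copyfileobj keeps memory use flat for large sequence files, and no subprocess is involved.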

cksum = int(fields[0])
filename = fields[2]
if filename == basename(snakemake.output[0]):
    cksum_local = int(run(["sum", snakemake.output[0]], capture_output=True).stdout.strip().split()[0])
Contributor

Here, we should use Python's internal checksum implementation instead of calling the external sum command.
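
One way to avoid the external sum call is a small pure-Python version of the 16-bit BSD checksum. This is a sketch under the assumption that the values in the checksum file were produced by the BSD algorithm (the default of GNU sum); the function names are hypothetical. Note that hashlib and zlib implement other algorithms, so their values would not match the output of sum.

```python
def bsd_sum_bytes(data, checksum=0):
    """Update a 16-bit BSD checksum with the bytes in data."""
    for byte in data:
        # Rotate the 16-bit checksum right by one bit, then add the byte.
        checksum = (checksum >> 1) + ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


def bsd_sum_file(path, chunk_size=1 << 16):
    """Compute the BSD checksum of a file, reading it in chunks."""
    checksum = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            checksum = bsd_sum_bytes(chunk, checksum)
    return checksum
```

With such a helper, the comparison could become cksum_local = bsd_sum_file(snakemake.output[0]). A byte-wise Python loop is slow on large files, so this trades speed for dropping the dependency on an external tool.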

if filename == basename(snakemake.output[0]):
    cksum_local = int(run(["sum", snakemake.output[0]], capture_output=True).stdout.strip().split()[0])
    if cksum_local == cksum:
        print("CHECKSUM OK: %s" % snakemake.output[0])
Contributor

Suggested change
- print("CHECKSUM OK: %s" % snakemake.output[0])
+ print("CHECKSUM OK", file=sys.stderr)

    print("CHECKSUM OK: %s" % snakemake.output[0])
    exit(0)
else:
    print("CHECKSUM FAILED: %s" % snakemake.output[0])
Contributor

Suggested change
- print("CHECKSUM FAILED: %s" % snakemake.output[0])
+ print("CHECKSUM FAILED:", snakemake.output[0], file=sys.stderr)
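
The two suggestions above both move the status messages to stderr, which needs an import of sys. A possible way to combine them is sketched below; report_checksum is a hypothetical helper, and returning a non-zero code on failure is an assumption about the desired behavior rather than something settled in this review.

```python
import sys


def report_checksum(expected, actual, path, stream=sys.stderr):
    """Write the checksum outcome to stream and return a process exit code."""
    if expected == actual:
        print("CHECKSUM OK:", path, file=stream)
        return 0
    print(
        "CHECKSUM FAILED:", path,
        "(expected %d, got %d)" % (expected, actual),
        file=stream,
    )
    return 1
```

In the wrapper this could be used as sys.exit(report_checksum(cksum, cksum_local, snakemake.output[0])), so that a mismatch fails the rule instead of passing silently.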

@@ -2,21 +2,45 @@
__copyright__ = "Copyright 2019, Johannes Köster"
Contributor

Once this wrapper has been polished, changes should be transferred to the other wrapper in this PR:

johanneskoester added the "stale: contribution welcome" label ("Please feel free to finalize this work.") on Aug 16, 2022
2 participants