Bug: smaller BAM files #80
Comments
Please provide the command line you are using. |
I am using the following command:
The link is being blocked by FortiGuard for being in violation of company internet policy. |
I believe you are running an older version of the tool in Docker. Have you tried using the current version, 2.0.0? It is way faster. Could you also provide more information about why https://vntyper.org/ is blocked in your environment? What is the message you get? |
I have now pulled the current Docker image and tried to run it as described in your README, but it is failing completely now.
error:
```
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
```
Since I am running with sudo and it worked with v1.0.0, I cannot explain the sudden issues with permissions.
Regarding the website: |
I just updated the README file for the Docker instructions, which were out of date.
```bash
# pull the docker image
docker pull saei/vntyper:main

# run the pipeline using the docker image
docker run -w /opt/vntyper --rm \
    -v /local/input/folder/:/opt/vntyper/input \
    -v /local/output/folder/:/opt/vntyper/output \
    saei/vntyper:main \
    vntyper pipeline \
    --bam /opt/vntyper/input/filename.bam \
    -o /opt/vntyper/output/filename/
```
You should just have to replace "/local/input/folder/" with the path to the folder containing your BAM file and "/local/output/folder/" with the path where you want your results saved. |
Regarding FortiGuard: |
I needed to specify my user and group to be able to run the program in Docker, due to some permission problems that were not present in v1. However, it does run now and, as before, I do not see any marked difference between the samples.
2025-01-17 08:40:10,668 - root - INFO - Filter column 'is_frameshift' exists; 420 -> 0 rows remain after requiring True.
2025-01-16 13:40:59,337 - root - INFO - Filter column 'is_frameshift' exists; 788 -> 156 rows remain after requiring True.
whereas for my positive control, the data is filtered out at "alt_filter_pass":
2025-01-16 12:58:25,981 - root - INFO - Filter column 'is_frameshift' exists; 614 -> 39 rows remain after requiring True.
What is being removed in this filter? |
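For context, the log line reads like a simple table filter: rows are kept only where an is_frameshift column is True, so "420 -> 0" would mean none of the 420 candidate rows were frameshift calls. A purely illustrative sketch of that kind of filter (the file name, tab-separated layout, and column values below are assumptions, not VNtyper internals):
```bash
# Keep the header plus only those rows whose "is_frameshift" column equals "True".
# candidates.tsv is a hypothetical tab-separated variant table, not a VNtyper output file.
awk -F'\t' '
  NR == 1 { for (i = 1; i <= NF; i++) if ($i == "is_frameshift") col = i; print; next }
  $col == "True"
' candidates.tsv > frameshift_only.tsv
```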
The website can be accessed now and generates the same results as the pipeline version. |
Could you please send us a zipped version of the output folder for the positive sample, generated with the latest version? Please make sure the intermediate files are included so we can look into the issue more closely. |
Hi euweiss, Thank you for helping us debug. Glad that the webservice works for you. Could you please specify some things:
|
I had to specify the user:group in the docker command using the --user flag (see the sketch after these answers)
Yes, a typical dupC
I have used a modified version of HotCount
Since I am working with patient data, I will need to check whether I am allowed to do that. |
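A minimal sketch of the kind of command described above, combining the README example with the --user flag so files written to the mounted output folder are owned by the calling user (paths and file names are placeholders):
```bash
docker run -w /opt/vntyper --rm \
    --user "$(id -u):$(id -g)" \
    -v /local/input/folder/:/opt/vntyper/input \
    -v /local/output/folder/:/opt/vntyper/output \
    saei/vntyper:main \
    vntyper pipeline \
    --bam /opt/vntyper/input/filename.bam \
    -o /opt/vntyper/output/filename/
```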
Thank you @euweiss. VNtyper 2.0 sets a non-root user in the Docker container, which is recommended for security reasons. This might explain your problems with the new image. I will look into it and document it better in a separate issue. Regarding your core problem, I would like to unravel the case a little bit and summarize. Please correct me if I am wrong:
For debugging, it would be great if you could send us just the MUC1 subset of the BAM files from both of your NGS datasets. Because the MUC1 region is so small, it barely holds any identifiable genetic information. You can also remove the header information from the BAM (I have a script for that here: https://github.com/hassansaei/VNtyper/blob/main/reference/pseudonymize.py). |
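A minimal sketch of how such a subset could be produced with samtools (the GRCh38 coordinates for MUC1 below are approximate and an assumption; adjust the contig name and interval to the reference your BAMs were aligned to, and then apply the linked pseudonymize.py script to the result to strip the header):
```bash
# Region queries need a BAM index.
samtools index sample.bam
# Extract only reads overlapping the (approximate) MUC1 locus on chr1.
samtools view -b -o muc1_subset.bam sample.bam chr1:155184000-155195000
samtools index muc1_subset.bam
```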
That makes sense. When I tried changing some directory permissions, the container generated files as user Administrator in the Administrator group.
No, the SNaPshot-positive case has never produced a positive result with VNtyper for me. I apologise if that was unclear. In the old version it gave me the strange log that sparked this issue, as did any other (presumably) negative case whose BAM file was around 7 GB as opposed to >10 GB. I was able to confirm a random positive call (with a BAM >10 GB) from the other software, which has not been SNaPshot tested. Therefore, I suspect coverage to be the issue in some way.
I would gladly provide this, but I need to check with my manager whether we are legally allowed to share it. The laws regarding patient data are very strict. |
The user permission issue should be resolved in the latest update. Please pull the newest version and let us know the results.
Have you tested both methods (--extra-modules advntr) to check the output? It’s quite unusual for two independent methods to miss the dupC variant. Have you sequenced the SNaPshot product to rule out any SNVs or indels in the MwoI site? Could you confirm if you’re using the Twist capturing protocol? Thank you. |
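A hedged sketch of that suggestion, reusing the Docker command from the README above with the --extra-modules flag quoted in the comment (paths and file names are placeholders):
```bash
docker run -w /opt/vntyper --rm \
    -v /local/input/folder/:/opt/vntyper/input \
    -v /local/output/folder/:/opt/vntyper/output \
    saei/vntyper:main \
    vntyper pipeline \
    --extra-modules advntr \
    --bam /opt/vntyper/input/filename.bam \
    -o /opt/vntyper/output/filename/
```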
I have a BAM file for a positive control sample, which I am trying to confirm with VNtyper. I have previously run the program with other BAM files and it worked without problems, producing positive and negative results as expected. Those BAM files were >10 GB in size. The current file in question is only ~7 GB and does not produce all of the same output. One indication that something is not right appears at run time:
[bam_sort_core] merging from 0 files and 8 in-memory blocks...
Usually it would say
[bam_sort_core] merging from 8 files and 8 in-memory blocks...
I have tested the program with other BAM files from the same sequencing run (of similar size) with the same result.
When I used BAM files from another sequencing run, it ran normally for larger files (>10 GB) but produced the same issue with smaller ones (~7 GB), despite their coming from the same run and their quality parameters being fine.
Can you explain where this problem may stem from? Is there a minimum sequencing depth that may not be reached for these samples?
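One way to check the depth question directly would be to compare coverage over the MUC1 region between a >10 GB BAM and a ~7 GB BAM; a rough sketch with samtools (the interval is an assumed approximate GRCh38 MUC1 region, not something taken from VNtyper):
```bash
# Mean depth across the (approximate) MUC1 locus; run this for both BAMs and compare.
samtools depth -a -r chr1:155184000-155195000 filename.bam \
  | awk '{ sum += $3; n++ } END { if (n) printf "mean depth: %.1f\n", sum / n }'
```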