Dp24 vecscreen2 updates #41

Merged
merged 23 commits into from
Nov 20, 2023
ca6e3d6
added VecScreen related files
eeaunin Nov 16, 2023
568e963
Edited the main workflow and test yaml
eeaunin Nov 16, 2023
87cffd9
Edited the main workflow
eeaunin Nov 16, 2023
f045b69
Edited the test yaml file
eeaunin Nov 16, 2023
dcfb977
Edited the test yaml file
eeaunin Nov 16, 2023
320e071
Edited the test yaml file
eeaunin Nov 16, 2023
ddc2f30
Deleted trailing Ns scripts and reformatted the VecScreen Python scri…
eeaunin Nov 16, 2023
f39e4c5
Trying a change in github_testing/test.yaml
eeaunin Nov 16, 2023
d5d7838
Trying a change in github_testing/test.yaml
eeaunin Nov 16, 2023
234803a
add vecscreen db
yumisims Nov 17, 2023
1f62898
putting in blastdb tuple generation
yumisims Nov 17, 2023
a1cd1ff
refactor chunk fasta
yumisims Nov 17, 2023
6cc19bc
black
yumisims Nov 17, 2023
dea8d11
black
yumisims Nov 17, 2023
8bcae07
the default python is 3.5 so change the beloved f striiiiing
yumisims Nov 17, 2023
77a98a7
Formatting and minor changes to channels
DLBPointon Nov 17, 2023
7317fd7
removed remnant file
DLBPointon Nov 17, 2023
005c13e
Missed a couple of comments
DLBPointon Nov 17, 2023
4669b12
add wildcard to vecscreen output
yumisims Nov 17, 2023
946cf0e
remove standard error output to pass github test
yumisims Nov 17, 2023
05612dd
Reverting changes to pacbio_barcode_check and adding barcodes found i…
DLBPointon Nov 20, 2023
79a804a
Correct the other test.yaml for github testing
DLBPointon Nov 20, 2023
afbbde7
Fixed chunk assembly for vecscreen, added argparse for threshold vari…
DLBPointon Nov 20, 2023
33 changes: 22 additions & 11 deletions bin/pacbio_barcode_check.py
@@ -13,6 +13,8 @@
 Originally written by Eerik Aunin @eeaunin

 Adapted by Damon-Lee Pointon @DLBPointon
+
+Refactored by Yumi Sims
 """

 from pathlib import Path
@@ -45,19 +47,28 @@ def detect_barcodes_from_read_file_names(barcodes_fasta_path, pacbio_read_files)

 def check_if_barcodes_exist_in_barcodes_fasta(barcodes_list, barcodes_fasta_path):
     """
-    Checks if the specified barcodes exist in the barcode sequences FASTA file, exits with an error message if a barcode is not found
+    Checks if the specified barcodes exist in the barcode sequences FASTA file, prints a message for each missing barcode.
     """
     barcodes_fasta_data = gpf.l(barcodes_fasta_path)
-    barcode_names_in_fasta = [n.split(">")[1] for n in barcodes_fasta_data if n.startswith(">")]
-    for barcode in barcodes_list:
-        if barcode not in barcode_names_in_fasta:
-            sys.stderr.write(
-                f"The PacBio multiplexing barcode ({barcode}) was not found in the barcode sequences file ({barcodes_fasta_path})\n"
-            )
-            sys.exit(1)
-
-    # If this print statement is reached, all user-supplied codes are present.
-    print("The query barcodes exist in the barcodes database file")
+    barcode_names_in_fasta = [n.split(">")[1].strip() for n in barcodes_fasta_data if n.startswith(">")]
+
+    missing_barcodes = [barcode for barcode in barcodes_list if barcode not in barcode_names_in_fasta]
+
+    print(
+        "\n".join(
+            [
+                f"Warning: The PacBio multiplexing barcode ({barcode}) was not found in the barcode sequences file ({barcodes_fasta_path})"
+                for barcode in missing_barcodes
+            ]
+        )
+    )
+
+    if missing_barcodes:
+        print(
+            f"\nSummary: Some barcodes were not found in the barcode sequences file.\nMissing barcodes: {', '.join(missing_barcodes)}"
+        )
+    else:
+        print("\nAll query barcodes exist in the barcode sequences file.")


 def main(barcodes_fasta_path, pacbio_read_files, pacbio_multiplexing_barcode_names):
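For readers who want to try the warn-instead-of-exit behaviour outside the pipeline, the following is a minimal standalone sketch of the same check. It is not the code in this PR: the helper names `read_barcode_names`, `find_missing_barcodes` and `report_missing_barcodes` are hypothetical, and a plain file read stands in for the repository's `gpf.l` helper, which is assumed here to return the lines of a file. Unlike the merged version, this sketch prints nothing extra when no barcodes are missing, and it returns the list of missing barcodes so a caller can decide what to do with it.

```python
from pathlib import Path


def read_barcode_names(barcodes_fasta_path):
    """Return the FASTA header names (the text after '>') in file order.

    Assumption: a plain read replaces the repository's gpf.l helper.
    """
    names = []
    for line in Path(barcodes_fasta_path).read_text().splitlines():
        if line.startswith(">"):
            # Same parsing as the diff: take the text after the first '>'
            # and strip surrounding whitespace.
            names.append(line.split(">")[1].strip())
    return names


def find_missing_barcodes(barcodes_list, barcode_names_in_fasta):
    """Return the requested barcodes that are absent from the FASTA names."""
    known = set(barcode_names_in_fasta)  # set membership instead of list scans
    return [barcode for barcode in barcodes_list if barcode not in known]


def report_missing_barcodes(barcodes_list, barcodes_fasta_path):
    """Print one warning per missing barcode plus a summary; never exits."""
    names = read_barcode_names(barcodes_fasta_path)
    missing = find_missing_barcodes(barcodes_list, names)

    for barcode in missing:
        print(
            f"Warning: The PacBio multiplexing barcode ({barcode}) was not found "
            f"in the barcode sequences file ({barcodes_fasta_path})"
        )

    if missing:
        print(f"Summary: Missing barcodes: {', '.join(missing)}")
    else:
        print("All query barcodes exist in the barcode sequences file.")

    return missing
```

Returning the list rather than only printing keeps the decision about failing the run with the caller, which is the main behavioural difference from the pre-PR version that called `sys.exit(1)` on the first missing barcode.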