Releases: vgteam/vg
vg 1.54.0 - Parafada
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.54.0
Buildable Source Tarball: vg-v1.54.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg
build process needs.
This release includes:
- Integrated haplotype sampling in vg giraffe now does diploid sampling.
- GBWTGraph algorithm for parsing GFA now handles P-line names of the form sample#contig correctly.
vg 1.53.0 - Valmontone
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.53.0
Buildable Source Tarball: vg-v1.53.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg
build process needs.
This release includes:
- vg sim no longer crashes on graphs with 1-node cycles
- vg autoindex can identify haplotypes specified as P-lines in a GFA
- Set reference samples in GBWT or GBZ with vg gbwt option --set-reference.
- vg rna no longer projects transcripts twice onto a reference given by RS tag in a GFA
- vg rna assigns unique names to twice-projected transcripts on cyclic haplotypes
- GBWT construction automatically increases buffer size if the paths are too long.
- In vg haplotypes, the default number of candidates for diploid sampling is now 32.
- vg giraffe now explains that --named-coordinates works for GAF output
- libvgio now uses quoted includes internally
- vg's README now prominently lists some recommended papers to cite when using parts of vg in your work
- Updated dozeu submodule should no longer crash vg giraffe and vg surject.
Updated Submodules
- dozeu
- gbwt
- libhandlegraph
- libvgio
vg 1.52.0 - Bozen
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.52.0
Buildable Source Tarball: vg-v1.52.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg
build process needs.
This release includes:
- vg construct now has a -A, --alt-paths-plain option for storing IDs from the VCF instead of hash-based IDs for alt allele paths.
- vg call patched so that certain problem cases no longer take forever.
- Mac CI now actually installs Node
- GBZ files can now hold reference paths like GRCh38#chr1, with no haplotype phase number.
- vg is now compatible with jq 1.7.
- Mac build should no longer fail with complaints about a missing atomic library.
- Tests should no longer fail due to odd alignments from diff.
- Better error messages from vg haplotypes.
- vg build process should now always use exactly one libhandlegraph
- Add missing -O help for vg call
- vg Makefile can now take a CXX_STANDARD variable. You should be able to run e.g. make CXX_STANDARD=20 if you have a Protobuf/Abseil for C++20.
- GCSA2 construction in vg autoindex rewinds to pruning if memory is too high
Updated Submodules
- kff-cpp-api
- gcsa2
- libhandlegraph
- libvgio
- vcflib
- gbwtgraph
vg 1.51.0 - Quellenhof
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.51.0
Buildable Source Tarball: vg-v1.51.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg
build process needs.
This release includes:
- Giraffe can do haplotype sampling automatically if sufficient inputs are provided.
- Simplified vg giraffe command line help; full list of options is still available with -h.
- Diploid mode for haplotype sampling: first select N haplotypes, then choose the best pair.
- Add ref-path stubbification option -S to vg clip
- vg validate now complains about duplicate path names
- vg CI expects only the allocated cores on the Gitlab runners
- vg CI Buildkit docker builds use the local Docker Hub mirror
- vg convert option --no-translation for converting GBWTGraph to GFA directly without using the node-to-segment translation.
- vg rna will not crash when adding transcripts with an intron of length 0
- vg paths now supports -H for selecting haplotype paths and -R for selecting reference paths
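The two-step diploid mode noted in this release (select N haplotypes, then choose the best pair) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function name, the representation of haplotypes as sets of kmers, and the scoring rule are hypothetical and are not taken from vg's implementation.

```python
from itertools import combinations_with_replacement


def pick_diploid_pair(haplotypes, observed, n_candidates):
    """Toy two-step selection: shortlist N haplotypes, then pick the best pair.

    haplotypes: dict mapping a haplotype name to its set of kmers (hypothetical model).
    observed:   set of kmers seen in the sample's reads (hypothetical model).
    """
    def score(kmers):
        # Hypothetical score: reward kmers seen in the reads, penalize unseen ones.
        return len(kmers & observed) - len(kmers - observed)

    # Step 1: keep the N best-scoring individual haplotypes (ties broken by name).
    ranked = sorted(haplotypes, key=lambda name: (-score(haplotypes[name]), name))
    candidates = ranked[:n_candidates]

    # Step 2: score every pair of candidates, allowing the same haplotype twice,
    # and keep the pair whose combined kmers fit the reads best.
    best = max(
        combinations_with_replacement(candidates, 2),
        key=lambda pair: score(haplotypes[pair[0]] | haplotypes[pair[1]]),
    )
    return candidates, best
```

The point of the shortlist is cost: scoring all pairs grows quadratically with the number of haplotypes, so the pair search only runs over the N candidates.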
Updated Submodules
- backward-cpp
- gbwtgraph
- libbdsg
vg 1.50.1 - Monopoli
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.50.1
Buildable Source Tarball: vg-v1.50.1.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg
build process needs.
This release includes:
- vg autoindex --workflow map can index GFAs with many W lines
- vg autoindex -w map and -w mpmap won't enter an infinite loop when they can't write to disk
- GRCh38#chr1 style path names in GFA P lines should now be parseable again
Updated Submodules
- gcsa2
- libhandlegraph
vg 1.50.0 - Monopoli
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.50.0
Buildable Source Tarball: vg-v1.50.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg
build process needs.
This release includes:
- CI test jobs now cache pulled Docker images
- GAF output should now have more correct path end positions and block lengths
- Paths that look like PanSN but aren't, due to having a non-numeric haplotype number field, will no longer be parsed, and should thus no longer produce crashes due to parsing failures.
- Haplotype sampling now copies the vg node to GFA segment translation correctly from the original graph.
- vg minimizer requires a distance index for building a minimizer index.
- -S option added to vg call to select reference paths by sample name. This is more convenient as it allows, e.g., -S GRCh38 to be used in place of -p GRCh38#0#chr1 -p GRCh38#0#chr2 ... Such selection is necessary when the graph has more than one reference sample, and vg call will now refuse to handle graphs with multiple reference samples unless paths are selected with -S or -p.
- vg filter can filter to only mapped or only unmapped reads
- vg deconstruct changed back to writing the full sample/hap/contig name in the VCF contig field. To write just the contig name (as in the past few versions of vg), use the new -C option.
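As a rough illustration of the PanSN rule and the sample-based selection described above, the following sketch parses sample#haplotype#contig names and picks out the paths belonging to one sample. The function names are hypothetical and the logic is a simplified assumption, not vg's actual parser.

```python
def parse_pansn(name):
    """Return (sample, haplotype, contig) for a sample#haplotype#contig name.

    Names whose haplotype field is not a plain number are treated as ordinary
    (non-PanSN) names and yield None instead of raising an error.
    """
    parts = name.split("#")
    if len(parts) != 3:
        return None
    haplotype = parts[1]
    if not (haplotype.isascii() and haplotype.isdigit()):
        return None
    return parts[0], int(haplotype), parts[2]


def paths_for_sample(path_names, sample):
    """Select every path whose PanSN sample field matches `sample`.

    This mirrors the idea of naming a sample once instead of listing
    each of its paths individually.
    """
    selected = []
    for name in path_names:
        parsed = parse_pansn(name)
        if parsed is not None and parsed[0] == sample:
            selected.append(name)
    return selected
```

For example, given the paths GRCh38#0#chr1, GRCh38#0#chr2 and CHM13#0#chr1, selecting the sample GRCh38 returns the first two names, while a name such as GRCh38#x#chr1 is not treated as PanSN at all.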
Updated Submodules
- gbwtgraph
- gcsa2
- libhandlegraph
- libvgio
vg 1.49.0 - Peschici
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.49.0
Buildable Source Tarball: vg-v1.49.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg
build process needs.
This release includes:
- Giraffe can now use weighted minimizers, which try to avoid selecting frequent kmers as minimizers.
- vg inject can read BAM files with unmapped reads
- vg giraffe now has --match, --mismatch, --gap-open, --gap-extend, and --full-l-bonus options to control alignment scoring.
- Fix crash during assertion in vg deconstruct on PGGB graph that was introduced in v1.48.0
- The : character is now allowed in path names during contig:start-end range extraction from command line options (i.e. in vg chunk).
- vg now builds with C++17 on Mac, as required by the version of Protobuf packaged in Homebrew
- vg now deduplicates arguments from pkg-config, to limit command line length with Protobuf's 30-odd Abseil dependencies.
- Better default parameters for haplotype sampling.
- vg clip crash on PackedGraphs fixed.
- Mac CI now collects Homebrew debug info
- vg's CI can now run on local Gitlab runners
- CI no longer does extra Docker builds without proper caching
- vg giraffe -b fast preset now works again and is under test
- Serialized mutable graphs keep proper track of the number of edges they contain
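To show why a colon inside a path name complicates contig:start-end parsing, here is a minimal sketch that splits on the last colon only and falls back to a bare path name when no numeric range follows. The function name and fallback behavior are assumptions for illustration, not a description of how vg implements this.

```python
def parse_region(region):
    """Split 'name:start-end' into (name, start, end).

    Only the last ':' is treated as the separator, so path names that
    contain ':' themselves are preserved. If the text after the last ':'
    is not a numeric start-end range, the whole string is taken as a name.
    """
    name, sep, span = region.rpartition(":")
    if not sep:
        return region, None, None
    start_text, dash, end_text = span.partition("-")
    if not (dash and start_text.isdigit() and end_text.isdigit()):
        return region, None, None
    return name, int(start_text), int(end_text)
```

Splitting from the right keeps the common case (chr1:10-20) unchanged while letting a name such as a:b pass through intact when no range is given.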
Updated Submodules
- gbwtgraph
- libbdsg
- libvgio
vg 1.48.0 - Gallipoli
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.48.0
Buildable Source Tarball: vg-v1.48.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg
build process needs.
This release includes:
- vg chunk will now report an error if asked to chunk reads that do not go with the graph
- vg autoindex can construct linear reference indexes from a FASTA file
- vg gbwt will now refuse to add reference sample names with # in them, and will try and advise on what the tags are supposed to be like
- vg surject can project to paths that intersect themselves in the reverse orientation
- vg surject will now print warning messages when processing a read or pair takes a suspiciously long amount of time.
- vg giraffe should no longer try to put hypothetical sequencing errors in empty intervals, and should report errors in MAPQ cap computation in a more debuggable way.
- Crashes now include the stack trace by default; set VG_FULL_TRACEBACK=0 to suppress it to a file.
- vg surject and vg giraffe should now include relevant read name hints when crashing in many cases.
- Added crash_unless() as an alternative to assert() that reports these hints. We eventually want to use it everywhere.
- Crash reports now have cool hyperlinks.
- vg surject will limit itself to 200 anchors per target path segment by default; use the new -a/--max-anchors option to control this limit. Surjection against PGGB graphs may require --max-anchors 20 to complete.
- vg surject may be able to limit itself to considering only high-scoring surjections in some cases.
- vg construct now properly handles the case where it is looking for the end of an inversion from 1 base before it
- vg construct will no longer try and coalesce nodes at construction chunk boundaries when those nodes have alt paths that visit them or edges to their outside endpoints. This should fix some crashes and incorrect placement of structural variant breakpoints in the graph.
- Update vcflib to current version plus build and parser fixes
- vg construct should now be faster when variants are extremely long and overlap each other
- vg chunk now outputs PackedGraph instead of Protobuf by default (unless -T is used). Also, output files now get the .vg file extension for any non-GFA format (use vg stats -F to check the underlying format of any graph).
- Snarl clipping bug in vg clip fixed so that when there are multiple different reference traversals in a snarl (common in PGGB output), then none of them are chopped.
- Fixed build against Ubuntu 22.04's pybind11
- Docker containers now have /usr/bin/time for profiling
Updated Submodules
- htslib
- vcflib
- gbwtgraph
- libbdsg
- tabixpp
vg v1.47.0 - Ostuni
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.47.0
Buildable Source Tarball: vg-v1.47.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg
build process needs.
This release includes:
- vg sim and vg stats -a sped up for GBZ input
- Giraffe now uses the watchdog to detect slow reads
- vg construct should no longer fail assertions and will instead report errors.
- vg construct now handles IUPAC codes in the reference as Ns even if they are covered by symbolic structural variants
- Faster haplotype sampling with vg haplotypes.
- vg stats -a also outputs statistics on alignment scores and mapping quality.
- vg giraffe should no longer crash if the distance index is read-only.
- vg rna now supports the GBZ format for the input graph and haplotypes (new option --gbz-format).
- vg convert now defaults to PackedGraph instead of HashGraph if no output format selected.
- New option vg clip -s to remove stubs (dangling nodes not on ref path)
- vg call and vg deconstruct now only apply node ID translation from GBZ inputs if new -O is used.
- vg surject will now enforce that the reads it is surjecting actually were mapped against the graph you are surjecting against. Right now it checks node IDs and lengths. You can turn this off with -V/--no-validate.
- vg gbwt now accepts a -I/--gg-in option, which lets you load a .gg file and a .gbwt file and combine them into a .gbz graph.
- vg validate now accepts a -A/--gam-only option which will validate only the provided alignment's agreement with the graph, and not the graph itself.
- The vg surject/vg giraffe "error: couldn't identify a path corresponding to surjected read" error message has been improved to dump more information about the offending read and path.
- When selecting paths to surject to, a warning will now be printed if the user asks for a path with a []-enclosed subrange at the end. The base path name without the [] subrange coordinates should usually be used instead, because that is the space in which the SAM/BAM output will have its coordinates specified.
- The vg surject "Graph does not have a path named" error message should now no longer print pointer values, and is extended to explain a bit more about subpaths.
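The node ID and length check mentioned above for vg surject can be pictured with a small sketch. The data model here is an assumption chosen for illustration (a dict of node lengths and alignments as lists of node visits); it is not vg's data structure or its validation code.

```python
def validate_alignment(visits, node_lengths):
    """Check that an alignment is consistent with a graph.

    visits:       list of (node_id, offset, bases_used) tuples (hypothetical model).
    node_lengths: dict mapping node_id to node sequence length (hypothetical model).
    Returns a list of human-readable problems; an empty list means it looks consistent.
    """
    problems = []
    for node_id, offset, bases_used in visits:
        if node_id not in node_lengths:
            # The read mentions a node this graph does not have.
            problems.append(f"node {node_id} is not in the graph")
        elif offset + bases_used > node_lengths[node_id]:
            # The read runs past the end of the node as this graph defines it.
            problems.append(
                f"node {node_id} has length {node_lengths[node_id]}, "
                f"but the alignment reaches position {offset + bases_used}"
            )
    return problems
```

A check of this kind catches the common mistake of pairing reads with a different graph than the one they were mapped to, since node IDs or node lengths then tend to disagree.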
Updated Submodules
The kff-cpp-api and libbdsg submodules have been updated.
vg 1.46.0 - Altamura
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.46.0
Buildable Source Tarball: vg-v1.46.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg
build process needs.
This release includes:
- Long read Giraffe codepath now falls back to non-GBWT alignment for very long tails, which is slow but at least tends to finish
- Long read Giraffe codepath refuses to use Dozeu for tails, because the tails are very long and Dozeu will clobber the stack when given a long alignment
- More knobs have been added to long read Giraffe to tweak what inter-pre-cluster connections are sent to reseeding, and what chains are actually made into alignments
- vg stats now reports on time-usage information in GAM files if available
- Wiki tutorial on programming with libbdsg and gbwtgraph is now under CI test.
- Rescue alignment in vg giraffe paired-end mode should no longer decide it rescued off of the wrong alignments
- GAMP files no longer lose the "secondary" annotation when converted to GAM
- New benchmarking and read-simulating scripts for testing long-read Giraffe
- Fixed a crash in Giraffe correctness tracking in the long-read codepath due to out-of-bounds accesses into previous stages in the funnel
- Reading SAM/BAM/CRAM files into a graph (i.e. vg inject) will now bail out and complain if they are against haplotypes and not reference or generic paths (because positional lookup is likely to be too slow to be practical)
- vg inject now defaults to the normal default number of threads
- vg gamcompare now has a -n/--rename option for comparing GAM files annotated with position on the same contigs but with different names.
- vg annotate now uses a ReferencePathOverlayHelper to make sure it has fast access to the positions of graph nodes along paths.
- vg CI now tests against sequenceTubeMap using its recommended Node version
- vg rna will no longer in certain cases skip the first line when the annotation input has a header
- vg rna no longer crashes when adding splice-junction from a BED file with introns
- scripts/make_pbsim_reads.sh now works with local graph files in addition to S3 URLs
- scripts/lr_benchmark.sh now downloads and uses CHM13 graphs
- Chaining lookback now stops at 15 total items max
- Tail alignment with GSSW now refuses to fill more than 16 mibi-cells
- Fix off-by-1 array size bug in vg clip edge clipping
- Faster vg chunk on GBZ input
- Highly experimental vg haplotypes subcommand for sampling haplotypes based on kmer counts.
- vg giraffe now preloads the distance index into memory before mapping any reads
- Makefile should deal better with protoc not being installed
- Surjecting now works for interleaved GAFs, even if one or both reads are unmapped.
Updated Submodules
The libbdsg and gbwtgraph submodules have been updated.
New Submodules
The kff-cpp-api submodule has been added.