0.213 Critical locking overhaul resolves vrpipe-server not functioning
correctly.
Fixed vrpipe-output so that it works again.
Fixed all vrpipe-* scripts and processes to not spawn extraneous redis
servers; now only vrpipe-server will spawn one as needed.
bcftools_cnv step fixed to get true sample name from vcf.
bcf_to_vcf step improved so that $bcftools can be replaced by
bcftools_exe in post_calling_vcftools option.
bam_import_from_irods_and_index step fixed so that it can be installed.
0.212 Critical speed fix for steps that take datasource input with multiple
types and many files.
cram_index and bam_index steps now die if they can't delete an out-of-
date index file.
Overhauled internal locking mechanism should result in a slightly faster
system with fewer, shorter-lived stalls.
(Sanger specific:) irods warehouse datasource and npg_cram_stats_parser
step now default to getting F0xB00 stats files instead of F0x900. Also,
the correct studies according to warehouse are related to the input
file, ignoring study information stored in irods metadata. irods
datasource no longer falls over if given a query file with blank lines.
GATK steps can now use a gatk_key to turn off phoning home, enabled by
default if environment variable GATK_KEY is set.
New gatk_genotype_gvcfs pipeline which can do the calling to chrom vcf
in one setup (using both datasource and step chunking).
New pipelines: bam_calling_with_gatk_haplotype_caller,
bam_calling_with_mpileup_via_bcf, bam_import_from_irods_and_index,
bam_indel_realignment_and_bqsr, bam_merge_and_mark_duplicates,
bam_merge_and_streaming_mark_duplicates,
gvcf_calling_one_combine_with_gatk_haplotype_caller,
gvcf_calling_two_combine_with_gatk_haplotype_caller,
gvcf_calling_with_gatk_haplotype_caller,
sample_bam_remapping_with_bwa_mem
Deleted pipelines (some of them renamed to new ones above):
sample_gvcf_via_bcf_with_mpileup,
sample_gvcf_with_gatk_haplotype_caller,
sample_indel_realignment_and_bqsr,
sample_irods_import_and_mark_duplicates,
sample_irods_import_and_streaming_mark_duplicates
0.211 bam_improvement_gatk_v2 pipeline fixed to work again.
irods_sample_import pipeline renamed as
sample_irods_import_and_mark_duplicates.
bam_reheader step fixed to work when reads metadata on input file is
wrong.
bcftools_cnv step fixed to always remove invalid plot<-sample
relationships.
0.210 Various bam-outputting steps are now more tolerant of input file
reads metadata being wrong.
bcftools_cnv step fixed to properly relate all plots to the sample node.
New sample_indel_realignment_and_bqsr pipeline.
Critical speed fix for vrtrack_update_mapstats step.
Fixed polysomy step so the output files have the correct sample name
metadata.
0.209 Minor fix for vcf_merge_different_samples_control_aware step when the
samples have no control.
Fix for the polysomy step when the vcf has duplicate samples.
Fixed step post processing when an output file needs permission changes
and wasn't output by anything.
0.208 Critical speed fix for graph_filter queries in datasources (allowing
those datasources to work now).
Further critical speed fixes for processing all pipelines, allowing them
to proceed.
0.207 Critical speed fix for processing all pipelines (especially evident when
unstalling a stalled/started-from-scratch setup).
vrpipe-setup fixed to allow creating new setups again.
bwa_mem_to_bam fixed to solve conflict issues with samtools sort.
Various steps in the polysomy_cnv_caller pipeline have been fixed to
work as intended.
0.206 Proper fix for v0.205.
0.205 Fix for hipsci QC to handle _CTRL appended to sample names in files.
0.204 CNV and Pluritest-related steps now store their results properly in the
graph database against sample and donor nodes.
vrpipe-setup now allows reactivate/deactivate of multiple setups.
New sample_gvcf_via_bcf_with_mpileup pipeline.
New sample_gvcf_with_gatk_haplotype_caller pipeline.
vrtrack qc website donor view now allows setting donor sample qc status.
0.203 Fixed vrpipe-setup to handle new default options in the irods
datasource correctly.
0.202 npg_cram_stats_parser step now stores md5_of_ref_seq_md5s
0.201 irods datasource methods fixed to all work with the new irods protocol
system.
0.200 delete_inputs step behaviour now checks that input files are not needed
by another setup before deleting.
New pipeline for calling with fermikit.
Bwa mem steps can now take sequence dict file as input.
Steps can get chunks across the genome with the new
StepGenomeChunkingRole.
npg_cram_stats_parser step now compares cram header to expectations,
with result stored in new Header_Mistakes graph node.
0.199 This is a big change from the previous version. Re-read README.md for
installation changes and study IMPORTANT_NOTES if upgrading.
Newer steps and pipelines, and some vrpipe internal and website
features, use the Neo4J graph database to store information. Neo4J may
eventually replace the use of MySQL/postgres.
Issues with running out of database connections have been (mostly)
resolved, so you can increase or get rid of --max_submissions arg to
vrpipe-server.
Now multiple setup handlers are run on the farm, each handling a few
setups, to spread the load and help avoid some setups never being
updated because a single setup handler never got around to dealing with
it before being killed.
The location of output files has been improved to better avoid different
PipelineSetups with the same datasource and pipeline writing to the
same paths.
New supported filetypes hts, aln and var allow steps to work with
bams, crams, vcf and bcf files easily. num_records() now only considers
primary alignments.
Copying files using VRPipe API now preserves mode and ownership.
New file "protocol" system, allowing things like open() to work on
files stored on remote filesystems such as irods without downloading
them to local disc first.
The irods datasource no longer requires the local_root_dir option,
making use of the protocol system to represent the files as their
location in irods instead of a non-existent local file. The
*_with_warehouse_metadata methods also now look for qc files associated
with queried bam/cram files.
The vrpipe_with_genome_chunking datasource now works with all its
methods.
The irods and vrpipe datasources can now use a graph_filter, to filter
based on properties of nodes related to the file of interest.
New web front-end (vastly improved status) and beginnings of a QC
sub-site. All pages are now served securely with https.
vrpipe-rm can take a fofn.
vrpipe-permissions can now change permissions on input files by
selecting step 0.
vrpipe-fileinfo can now work from a fofn, and has options to suppress
the header and get info on setup input files by selecting step 0.
Various minor bug fixes and speed improvements, especially for steps
that deal with thousands of files. Unsticking stalled setups should now
also be many times faster on multi-processor systems.
New pipelines: sequencing_qc_from_irods*, bam_2015_vrtrack_processing*,
irods_sample_import, gatk_combine_gvcfs, gatk_genotype_gvcfs,
chipseq_qc_and_peak_calling.
Cram-related steps now use samtools (1.2+) instead of cramtools.jar.
Added support for picard 1.128+ to picard steps, which has a new command
line format.
New irods_convert_cram_to_bam option for irods_get_files_by_basename
step.
bamtofastq now done with biobambam bamtofastq.
0.198 Fixes the vcf_merge_different_samples* steps to cope with hundreds of
input VCFs (requires latest bcftools).
vrpipe-submissions has new --partial_reset option which is useful when
a step produces many submissions per data element, some of which worked,
but the failures need a new command line; --partial_reset will only
delete the failed submissions instead of also restarting the ones that
worked (as --full_reset does). If upgrading, read IMPORTANT_NOTES.
0.197 Fixed inability to cope with file paths with commas in them.
New polysomy_cnv_caller pipeline and related steps.
0.196 New STAR pipeline and steps.
verify_bamid step now actually works.
0.195 vrtrack_populate_from_vrpipe_metadata step now sets npg_qc_status;
sample name now always comes from the sample metadata; public_name
metadata is now stored in individual alias. The irods datasource also
no longer ignores it when manual_qc metadata changes.
bcf_to_vcf step has new vcf_sample_from_metadata option which reheaders
the output vcf; bcftools_gtcheck step expected_sample_from_metadata_key
option now takes multiple values
0.194 mpileup_vcf/bcf_to_vcf step has new vcf_sample_from_metadata option
which reheaders the output vcf; bcftools_gtcheck step
expected_sample_from_metadata_key option now takes multiple values
0.193 sequenom_csv_to_vcf step now handles fluidigm csvs as well; now produces
calculated_gender metadata instead of sequenom_gender, and takes
plex_storage_dir option instead of sequenom_plex_storage_dir.
vrpipe-fileinfo and vrpipe-output --step option now takes the format
<step_name_or_number>:<kind_of_output> - a colon separates them instead
of a pipe, making it easier to type on the command line and matching the
syntax used elsewhere. vrpipe-submissions --step option can now take
a step number instead of just a step name.
Fixed bug in vcf_merge_different_samples_control_aware step to correctly
handle single-sample merges.
The vrpipe datasource methods gain a new include_in_all_elements option
that allows you to include the output of one or more setups in every
element of a normally configured vrpipe datasource. This can be useful
for eg. a genotype checking pipeline where the normal input is VCF files
from setup X, each element being a single VCF file. But then you also
want to use a genotypes bcf file that is generated by setup Y, and that
single bcf should be included with each different VCF. In this case your
datasource source option would be X, and the include_in_all_elements
option would be Y.
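The effect of include_in_all_elements described above can be pictured with a
small model. This is only an illustrative sketch in Python, not VRPipe code:
the function name build_elements and the file names are hypothetical, and it
assumes one element per file from setup X with setup Y's files appended to
each.

```python
def build_elements(source_files, include_in_all):
    """Return one element per source file, each extended with the shared files.

    Hypothetical model: source_files stands for the outputs of setup X and
    include_in_all stands for the outputs of setup Y.
    """
    return [[path] + list(include_in_all) for path in source_files]


# Two VCFs from "setup X", one genotypes bcf from "setup Y" (made-up names).
elements = build_elements(["sample1.vcf", "sample2.vcf"], ["genotypes.bcf"])
for element in elements:
    print(element)
```

Each printed element pairs a different VCF with the same bcf, which is the
shape a genotype checking pipeline of the kind described would expect.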
0.192 Fixed split_genome_studio_genotype_files step to work when there are
multiple analysis files.
Better parsing of the imeta qu output in the irods datasource now allows
grep -v pipes to work reliably in the source.
0.191 Critical fix to restore functionality to the setups handler, lost in
0.190 - it now monitors setups again to keep them unstalled and look out
for new data elements.
0.190 sequenom_import_from_irods_and_covert_to_vcf pipeline now indexes the
vcf file it produces.
irods datasource improvements: all_with_warehouse_metadata method now
has a 'required_metadata' option to ignore files lacking those keys. It
also now retries irods commands multiple times on failure. All analysis
files (Sanger-specific) are now noted.
An issue exists where if multiple setups claim to have made the same
file, VRPipe will no longer automatically restart a step that produced
that file. This most often comes up for VCF index files. Some recent
VCF-producing and consuming pipelines have been altered to remove
the VCF-indexing step from the consuming pipelines, and the producing
pipeline now always indexes. See IMPORTANT_NOTES if upgrading. VRPipe
itself also now checks if a file is in a setup's unique directory and
allows automatic step restarts in that case.
VRPipe::File has a new merge_metadata() method for combining metadata
from multiple files.
Critical fix for the bam_improvement_no_recal pipeline, which was
deleting a required file if the cleanup option was on.
When VRPipe datasources automatically start over dataelements because
file metadata changed, the details of the metadata change are now logged
(see vrpipe-logs).
vrtrack_populate_from_vrpipe_metadata step was improved to store bam
lane metadata in the way expected by most bam pipelines and to store
control metadata for stem cell data.
The pluritest-related pipelines and steps received improvements and
fixes.
vrpipe-setup --based_on now lets you unset a value by answering ''.
The setups handler now catches and emails about datasource errors in new
setups.
0.189 vrtrack_auto_qc_with_genotype_checking pipeline fixed and improved to
now take genotypes bcf as an input.
0.188 Updates to various bcftools-using steps to cope with both v0 and v1 of
bcftools automatically, and vcf merging is better at estimating memory
required.
vrpipe-rm is now no longer silent about failed or skipped deletes.
vcf_merge_different_samples_control_aware step now combines all
metadata from its inputs instead of only keeping the intersection.
Tweaks and improvements to some Sanger-specific hipsci-related
pipelines.
New vrtrack_auto_qc_with_genotype_checking pipeline that uses latest
bcftools for genotype checking.
0.187 vrpipe-disk_usage now provides a complete report on all usage across all
disks.
Critical fixes and improvements for the htscmd_genotype_analysis step,
which now takes different options and calculates gt_status correctly.
pluritest_plot_gene_expression step fixed to not break due to an
undefined environment variable.
Critical fix for ARRAY-ref-related errors when trying to add metadata to
files.
Overhaul of genome-studio and loh-calling pipelines to work with the
irods datasource.
0.186 Fix to stop restarting steps that by design do not create their output
files, when the output files do not have all required metadata.
0.185 Fixed bug in VRPipe::File->openw to prevent deep recursion when lacking
permission to write.
VRTrack datasource analysis_genome_studio method now allows grouping on
analysis_uuid.
irods datasource now sets the type of files to their type in irods.
irods_analysis_files_download step now outputs analysis and input files
under different output keys. It also adds irods_local_storage_dir
metadata to say where the irods_analysis_files were downloaded to.
vrtrack_populate_from_vrpipe_metadata step now sets library ssid and
tag sequence.
pluritest_annotation_profile_files step now (only) works with idat files
from the irods datasource.
0.184 Sanger-specific improvements to the irods datasource, and a new step and
pipeline to update a VRTrack database and import the files based on the
irods datasource.
0.183 Fixed the new bam_improvement_gatk_v2 pipeline.
StepCmdSummary version strings can now be up to 64 characters long.
Full support for setting and getting multiple values per metadata key
has been added to Files.
The irods datasource now stores multiple values for the same metadata
key, and the all_with_warehouse_metadata method now gets all the
Sanger-specific metadata we'd need for complete tracking.
0.182 New vcf_merge_different_samples_control_aware step.
New vcf_merge_different_samples_to_indexed_bcf and hipsci_loh_caller
steps and pipelines.
New bam_improvement_gatk_v2 pipeline for doing the recalibration part
of the improvement pipeline with GATK v2 or higher (v3 needed for
working with bams produced by bwa mem).
iRODs datasource now has the Sanger-specific warehouse code split out
in to a new all_with_warehouse_metadata method.
htscmd_genotype_analysis now confirms the genotype when the expected
sample has a score with a ratio of 1.000 to the highest score.
bwa_mem_fastq and bam_merge_lane_splits steps now pass through sample_id
metadata.
0.181 Critical fix for infinite restart bug affecting setups for pipelines
with temp files where the user set a unix_group.
pluritest_annotation_profile_files step now does case insensitive
matching on its regex options and only considers annotation files with
9 columns.
0.180 Fixed resolve bug in vrpipe-fileinfo.
htscmd_gtcheck step now has an expected_sample_from_metadata_key option.
split_genome_studio_genotype_files step fixed to get correct samples.
penncnv_detect_cnv step now has a perl_for_penncnv_exe option.
0.179 The unix_group option during vrpipe-setup no longer results in file
permissions changing for files produced by other setups.
Added r_libs option to pluritest_plot_gene_expression step to allow it
to be used with a specific R install.
vcf_merge_different_samples step now works with the latest bcftools.
genome_studio_import_from_irods pipeline replaced with
genome_studio_import_from_irods_and_convert_to_vcf pipeline featuring
new illumina_coreexome_manifest_to_map step and updated
genome_studio_fcr_to_vcf step that both use fcr-to-vcf script from the
vr-codebase git repository.
Steps in VRPipe can now specify that they take as input arbitrary file
types, useful for when your step takes multiple different text formats
and you need to distinguish between them or take these from the
datasource.
The irods step got a search_by_metadata() method.
The sequenom_csv_to_vcf step now takes a sequenom_plex_storage_dir and
reference_name option and gets the correct plex manifest from irods.
The irods datasource now queries the Sanger-specific 'warehouse'
database and adds public_name and donor_id metadata to files.
0.178 Front-end vrpipe-* scripts now strip control codes the user may enter
while typing.
Fix for the vcf_merge step so that it works when the post_merge_vcftools
option includes a .gz path.
0.177 vrpipe-handler for setups can sometimes need more memory, so its
requirements have been increased to 2900MB.
vrpipe-permissions now has a --filter option to pick files to alter
based on their metadata.
vrtrack datasource now checks individual name and alias, and prefers to
use alias as the individual.
New Sanger-specific sequenom import pipeline.
0.176 Hotfix so that the vrtrack datasource ignores when a lane changes to
0 reads.
0.175 Hotfix so that the vrtrack datasource notices and updates when a lane
changes its number of reads.
0.174 Hotfix so that the LSF scheduler can correctly choose queues that have
no time limit defined.
0.173 LSF scheduler now has an improved method of picking the queue to submit
jobs to.
Jobs with cmd lines that have multiple things piped together will now
be considered failed if any cmd in the set of pipes exits non-0, instead
of if just the last cmd exits non-0.
Fix for the bam parser, allowing tests to pass on modern systems.
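The 0.173 change to piped command lines can be sketched as follows. This is a
minimal Python illustration of the rule, not VRPipe's own implementation: the
helper names run_piped and pipe_failed are hypothetical, and it assumes a
POSIX system where /bin/sh and cat are available.

```python
import subprocess


def run_piped(commands):
    """Run shell commands connected by pipes and return every exit code."""
    procs = []
    upstream = None
    for cmd in commands:
        proc = subprocess.Popen(cmd, shell=True, stdin=upstream,
                                stdout=subprocess.PIPE)
        if upstream is not None:
            # Close our copy so only the downstream process reads the pipe.
            upstream.close()
        upstream = proc.stdout
        procs.append(proc)
    procs[-1].communicate()  # drain the final command's output
    return [p.wait() for p in procs]


def pipe_failed(exit_codes):
    """New rule: fail if any command in the pipe exited non-0."""
    return any(code != 0 for code in exit_codes)


codes = run_piped(["exit 3", "cat"])
print(codes, pipe_failed(codes))
```

Checking only the last exit code would report this pipe as successful, because
cat exits 0 even though the first command failed; checking every exit code
catches the failure.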
0.172 Fixed setup stalling introduced in previous version.
Submission failure emails now include the setup name.
vrpipe-server now always spawns a setup handler in production, so that
new work for setups can be discovered when all setups were complete
when the server was last stopped.
0.171 vrpipe-fileinfo --path mode now always shows details on that path, even
if it has been deleted (no need to specify --include_removed).
Some bam processing steps now transfer over individual, project and
species metadata from their input to their output files.
Triggering setups (working out what work they have to do next) no longer
occurs while that setup's datasource is being updated, avoiding some
wasted effort.
vrpipe-setup now allows changing the datasource in some (safe)
circumstances.
vrpipe-elements --start_from_scratch now works more reliably.
vrtrack datasource now notices changes to sample names.
0.170 Fix for "Could not continue submission management for farm" issue.
0.169 Critical fix for the "Magic number checking on storable string failed"
error affecting DataSource updates and vrpipe-server.
0.168 Critical fix to stop VRPipe overloading the system's network
connections; associated large speed increase.
Critical fix for ec2 scheduler so that it works again.
Avoidance of possible DataElement "corruption".
Possible fix for strange DataSource update behaviour - they should now
update properly in a single attempt.
vrpipe-create_step now asks how many CPUs a cmd uses.
vrpipe-file_info works properly on VRPipe-created symlinks again, and
--dataelement option works on --paths again.
PipelineSetup names can now be 128 characters long instead of just 64.
0.167 Critical fix for correctly retrying submissions that failed due to
using too much memory.
0.166 Critical fix for adding file metadata correctly without losing
existing metadata.
Critical fix for steps that accept multiple file types.
Improved reliability of vrpipe-handler processes.
0.165 Critical fix for updating DataSources.
0.164 archive_files step improved so that it no longer generates submissions
that are guaranteed to fail.
Massive improvement to speed and reliability for steps that deal with
1000s of input files.
vrpipe-rm --filter option renamed --search_by_metadata for consistency
with vrpipe-fileinfo; this option now also works properly and more
quickly.
Critical fix for sge_ec2 scheduler so that everything does not stall out
when SGE has a dead host.
bwa_mem_fastq step fixed to record correct reads metadata.
0.163 Improvements and bugfixes for the ec2 and sge_ec2 schedulers, resulting
in more reliable instance launching and termination. EC2 spot requests
are now supported.
Increased the likelihood of generating single large job arrays instead
of multiple smaller ones, for greater scheduling efficiency.
When vrpipe-server encounters problems, the admin is no longer emailed
the same error more than once per hour.
The default user for new setups is now the admin, not the fake user
'vrpipe'.
The fastq_metadata step no longer claims to output a file, avoiding
possible problems when resetting the step.
GATK steps that take bam files now correctly specify bai files as
required inputs to avoid restart-related failures.
Possible fix for vrpipe-handlers getting stuck after temporary loss of
db connection.
Fixed copy and move operations to discs that were nearly empty.
vrpipe-setup now allows specifying which unix group output files will be
chowned as, and if vrpipe-setup was run as root the files will also be
owned by the user that created the setup. --extra_options can now also
be used when creating a new setup, allowing the overriding of memory and
time for certain steps.
vrpipe-fileinfo --path, --setup and --search_by_metadata options now
work additively instead of independently.
The vrpipe datasource filter option now allows multiple comma separated
filters to be specified.
vrpipe-status has a new --show_steps option to show what steps a
pipeline has, letting you work out which step outputs your files of
interest.
0.162 Fixes for the ec2-related schedulers. README file reformatted as
README.md. The VRPipe wiki on github now contains detailed installation
and usage guides.
0.161 bam_to_fastq step now supports merged bam files containing multiple
lanes.
0.160 Large speed improvement and fixes for steps when they are fed thousands
of input files.
0.159 Fixes for compatibility with MySQL 5.5, which is now the recommended
version to use; see updated README/IMPORTANT_NOTES for recommended
configuration settings.
0.158 Critical fix to ensure Setups do not get reset when the version of your
installed Storable module changes. See IMPORTANT_NOTES if upgrading.
Various fixes related to moving VRPipe-tracked files from one disc to
another.
Fixed filetype checking for bams, vcfs and bcfs.
Fix for the LSF scheduler so that if your LSF administrator has set
LSF_UNIT_FOR_LIMITS to something other than KB, VRPipe no longer fails
to submit jobs.
Fix for the bam_index step so that it will reindex bams if they have a
newer timestamp than their .bai files.
Fixed the setups handler so that it correctly watches the source file
of file-based datasources, updating them when the file is altered.
Emails from "VRPipe Server" now have an address of the configured admin
instead of a fake address.
New sge_ec2 scheduler, providing SunGridEngine load balancing on
Amazon EC2.
New genome_studio_import_genotype_files pipeline.
vrpipe-fileinfo can now be used to search for files that have certain
metadata, and is also capable of reporting on files that were inputs
to setups, not just setup output files.
vrpipe-setup can now be used to change the output root of an existing
setup (this does not move any files; it just determines where new
output files would go).
0.157 Hotfix for the bamcheck step, so that it can cope with bams that have
no mapped reads.
0.156 Critical fix for steps that generate hundreds of output files - now
setups that use these steps will not stall with no submissions created.
0.155 Critical fixes and improvements for vrpipe-mv: in --setup mode it now
finds all the files for a setup and warns you if it can't move them
(because they are outside of the output_root). It also gains a new mode
of operation that lets you easily move all VRPipe output files in one
area to another, useful when moving the entire contents of a disc.
vrpipe-permissions fixed so that it can change permissions on files that
were moved or became apparently non-existent because a user without
permission to see the files used VRPipe to try and look at them.
vrpipe-fileinfo fixed so that it can more reliably show information such
as --setup_info on all files, instead of just giving up and saying
'unknown'.
0.154 New SGE scheduler for use with clusters running SunGridEngine.
vrpipe-fileinfo --setup_info now outputs a vrpipe datasource compatible
string of id[step_num:output_key] instead of duplicating the output of
vrpipe-status. --display tab is also now the default display mode. It
can now report what made a given --path even when multiple setups
created it.
0.153 The local scheduler was overhauled and is now much faster and more
reliable - it should actually work now. Using SQLite also works now,
even in combination with the local scheduler. There is an experimental
new ec2 scheduler for using VRPipe in Amazon's cloud (see README_EC2).
Fixed edge case where Submissions with the exact same time requirements
as the maximum amount of time allowed in the queue they were submitted
to would never run.
New bam_htscmd_genotype_checking and bam_lanelet_gt_check pipelines.
0.152 Critical fix for database connection exhaustion; critical fix for
logging.
Improved stall reduction. Improved database update consistency: fixed
remaining reliability issues.
vrpipe-fileinfo gains new options --dataelement and --input_file.
vrpipe-elements --element option now takes a dataelement id, and gains a
--elementstate option that takes a dataelementstate id.
0.151 Critical fix for Submission failures due to mysterious SIGINTs.
Critical fix for database update inconsistencies: the pipeline system
itself should no longer be the cause of pipelines failing or entering
into strange states.
Fixed vrpipe-status --deactivated option to work again.
PipelineSetups will no longer remain stalled (not moving to the next
step) for too long (or forever), and fewer stalls should happen.
New vrpipe-logs script to investigate what happened to a setup.
The trigger process was sped up, in some cases by orders of magnitude.
archive_files step no longer re-archives files that are already in the
archive pool.
Further reduction to database connection exhaustion (though the problem
remains).
0.150 Further fixes for database connection exhaustion. Fixed step limit
system to work again.
0.149 Possible fix for database connection exhaustion.
0.148 Further improved database update consistency. Improved speed.
vrpipe-status gains new --global_summary option.
0.147 Critical fix to avoid using up all database connections.
0.146 Improvement to database update consistency (should now avoid strange
failures where input files got deleted).
New bam_mapping_with_bwa_via_fastq_no_namesort pipeline to supersede
the other bam_mapping_with_bwa* pipelines.
Improved speed and efficiency for getting jobs that need to be run
submitted to the scheduler.
0.145 Critical speed fix for slow DataSource updates. Fix for some reset
situations. Improved database consistency. LocalScheduler now functions
correctly (though the server itself is suffering from a crash issue on
certain low-cpu machines). Be sure to read IMPORTANT_NOTES.
0.144 Corrected database schema version number.
0.143 Critical (partial) fix for database inconsistency issues that have led
to random issues such as missing output files, jobs pending forever,
submissions restarting forever etc.
Scheduler job arrays are now used for better scheduling efficiency.
sequence_index datasource gains new method sample_fastqs.
vrpipe-file_info now shows resolved file paths by default, and can now
show files that have been deleted.
New PipelineSetupLog class that records all major events that occur for
each PipelineSetup.
vrpipe-elements can now be used to temporarily withdraw elements, and can
also now show input paths and the output root.
vrpipe-setup gains a 'touch' option to force the refresh of a
datasource.
vrpipe-output now describes all output files, not just those that still
exist. You can also now --force_overwrite existing symlinks.
vrpipe-submissions now reports the host that the jobs are running on.
bam_to_fastq step reimplemented to use the bam2fastq exe.
0.142 Critical fix for File copy() and move() methods so that interrupted
attempts are no longer considered successful on a retry. bam_metadata
step now has a store_original_pg_chain option, which when turned off
allows it to be used on bams produced by VRPipe in a single-step
pipeline.
0.141 Critical fix for bug in PipelineSetup introduced in 0.140. Setup handler
fixed to avoid unnecessary full triggers. Submission handlers can
sometimes get stuck doing nothing when they are no longer needed; the
server now kills them off where possible.
0.140 Worked around an issue that could result in only a few of a setup's
incomplete dataelements being worked on. bamcheck parser updated for
compatibility with latest version of bamcheck.
The archive_files pipeline no longer advertises that it outputs any
files, to avoid rare risk of file loss due to resets at the wrong
moment.
Altered many steps and pipelines so that the requirement of bam index
files is properly known by the system, allowing automatic resets to
work properly.
0.139 Fixes error introduced in 0.138. bam_index step now checks if the index
already exists and skips actually indexing if it does.
0.138 Another fix for submissions that could pend forever.
0.137 Fixed some areas where database connections were left open for a long
time idling, wasting them. Improved (and reduced extraneous) error
messages when post-processing steps. Further fix for submissions that
could pend forever.
0.136 Reduces database contention whilst selecting submissions to run, which
should increase speed and help avoid running out of database
connections.
0.135 Critical fix to stop 0.133's fix from locking up the database for too
long (and to unstick even more pending submissions).
0.134 Fixed bug that meant submissions were queued to run and reported on in
vrpipe-status even if they were for dataelements that were withdrawn,
which was very wasteful and confusing.
0.133 Critical fix for submissions that pended forever.
0.132 Emergency change in behaviour when the log file can not be locked:
instead of emailing the admin, the log file is simply deleted. This is
to prevent thousands of email messages being sent and also should allow
subsequent log messages to be written to disc successfully.
0.131 Fixed VRPipe's internal confirmation of the existence of step output
files so that it no longer restarts jobs if the output file had
previously been moved elsewhere and deleted.
0.130 Critical fix for vrpipe datasources to prevent invalid dataelement
generation.
0.129 VRPipe can now cope better if job stdout/err files are deleted by some
external process. A linux-specific work-around has been implemented to
get correct memory usage stats on systems where Proc::ProcessTable gives
invalid results.
0.128 Critical speed improvement for pipelines with one or more "block and
skip" steps. Improved step parsing error messages.
0.127 In concert with the latest version of bamcheck, the bam import and
improvement pipelines can now cope with the use of bamcheck -f/F.
0.126 Fixed spurious error messages when a setup changes to having 0 elements.
Improved error messages for missing input/output files. vrtrack_auto_qc
step updated to work with latest bamcheck.
0.125 Fixed step limit handling that got broken in 0.124.
0.124 Critical fixes to try and lower database load and minimise wasted jobs
submitted to the job scheduler.
0.123 New elements for a setup now get triggered soon after the datasource
changes, instead of the next time the setup has 0 unfinished
submissions, resulting in a large speed-up potential.
Some critical database queries have potentially been optimised to
hopefully avoid overloading the database when many jobs are running at
once.
Fixes for the bam_spatial_filter and rna_seq_map_gsnap pipelines.
0.122 Fixed repeated emailing of the same setup problem.
0.121 Potential fix for complex vrpipe datasources that fail to update as
soon as they should. bam_spatial_filter pipeline fixes. Reduced chances
of a setup not doing work when it should be doing work. Submission
handlers that are no longer needed are now killed as soon as possible,
providing a large efficiency improvement.
0.120 Potential fix to stop vrpipe-server silently stalling. Errors
encountered during a Step post_process are now emailed out. New
bam_spatial_filter pipeline.
0.119 Critical fix for changes in previous version. Critical fix for
bam_split_by_region step, so it works with bams with arbitrary
split_sequence metadata. vrpipe-status stall messaging improved
slightly.
0.118 Critical fix to prevent steps that fail due to not creating any output
files from being considered completed OK.
0.117 Fixes to gmap/gsnap step/pipeline.
0.116 Improved indication of possible stalled setups, and general fixes to
vrpipe-server.
0.115 Critical fix for new datasources.
0.114 Critical fix for the setups handler: now setups actually get updated as
their datasources change.
0.113 Improved memory reservation. vrtrack datasource now simplified by
removing extraneous confusing options.
0.112 Critical bug fixes to 0.111, allowing it to work in production.
0.111 Radical overhaul to how VRPipe actually gets command lines run - should
result in much greater speed and efficiency, with no more sporadic
periods where nothing seems to be happening when there is work to be
done. Be sure to read through IMPORTANT_NOTES if upgrading.
New frontend scripts: vrpipe-create_step, vrpipe-create_pipeline and
vrpipe-disk_usage.
New VRPipe module for easy VRPipe perl 1-liners.
Users are now emailed when their setups complete or run into problems.
0.110 Fixes to vrpipe-output to show input paths properly and allow # in
output basenames.
0.109 Additional critical fixes for convex_plots and fastq_split steps.
0.108 Critical fixes for convex_plots and fastq_split steps.
0.107 New vrpipe-mv tool, useful for moving the output root directory of a
setup to a new location. vrpipe-fileinfo now reports on files produced
by dataelements that have not yet finished the pipeline.
GATK2-specific steps now default to finding the GATK2 jar file in a
GATK2 environment variable.
New convex plot generation pipeline.
Fixes for the grouping methods of some datasources, so that they do not
create empty, useless dataelements.
0.106 Improved handling of paths entered on the command-line: leading and
trailing whitespace is now stripped. New pipelines and steps for
carrying out tasks with GATK v2. bwa_index step updated with support for
the latest version of BWA. fastq_merge_and_index step speeded up.
vrpipe-status gains a --defunct option to report on bad setups that
are candidates for deletion.
vrpipe-setup gains a --cleanup option to delete output files made for
now withdrawn dataelements, recovering wasted disk space for completed
projects.
0.105 Fix for vrpipe-status so that it no longer produces wild submission
state numbers while reporting on a setup with a large number of
currently changing submissions. Fix for the fastq_merge_and_index step
so that it no longer uses up all database connections.
0.104 No changes to the code were made in this version. This release only
corrects the upgrading instructions in IMPORTANT_NOTES for v0.103. If
you already upgraded to 0.103, IMPORTANT_NOTES also contains advice on
how to fix issues that may have arisen.
0.103 WARNING: if upgrading, follow the instructions in IMPORTANT_NOTES before
installing this version.
Changes were made to how DataElements store input file paths in the
database, now allowing a virtually unlimited number of paths, needed
for some kinds of Step. This also fixes cases of extraneous DataElement
withdrawal and needless repetition of work.
A fix was made to make VRPipe fully functional on recent stock Ubuntu
installs.
A fix was made so that Jobs get properly killed when necessary.
vrpipe-fileinfo and vrpipe-output gained a --include_withdrawn option.
General improvements were made to SNP calling pipelines.
0.102 SGA-related improvements, including the ability to create and call on
bam chunks.
0.101 vrtrack datasource fixed so that it does not create elements in
group_by_metadata mode if any member of the group has no files.
0.100 New auto_qc_min_ins_to_del_ratio option for AutoQC pipeline. When Job
stdout/err is archived, it is now limited to first and last 500 lines
to avoid storing massive files. See IMPORTANT_NOTES if you want to clean
up any large files you've already created.
0.99 Critical fix for memory leaks and slow-down problems when dealing with
pipeline steps that have 10s of input files and produce 100s of
duplicate submissions - like the new SNP calling pipeline(s). See
IMPORTANT_NOTES if upgrading.
0.98 Fix for fastqc_quality_report and cufflinks steps to make them work in
production.
0.97 Fix for vrpipe group_all method to wait until all elements are complete.
Fix for the chunking DataSource variants, so their methods are shown
when running vrpipe-setup. Fix for the retroseq_call step, for
compatibility with the latest version of retroseq.
0.96 Fix for Requirements, allowing reservation of over 999 hrs. See
IMPORTANT_NOTES if upgrading. The use of environment variables has
been cleaned up and clarified; see the updated README.
0.95 This version overhauls a number of SNP-calling-related pipelines and
steps. If upgrading, be sure to read IMPORTANT_NOTES. The changes are
primarily concerned with moving the choice to do 'chunked' calling (as
opposed to calling across the whole genome at once) to the DataSource,
instead of having 2 separate pipelines, one for whole-genome, one for
genome chunks. The change improves efficiency and makes it easier to
restart/trouble-shoot failed chunks.
Deleted pipelines:
gatk_genotype (renamed snp_calling_gatk_unified_genotyper)
gatk_variant_calling_and_filter_vcf (renamed snp_calling_gatk_unified_genotyper_and_filter_vcf)
mpileup_with_leftaln
snp_calling_chunked_mpileup_bcf (replaced by snp_calling_mpileup_via_bcf.pm + genome chunking)
snp_calling_chunked_mpileup_vcf (replaced by snp_calling_mpileup.pm + genome chunking)
snp_calling_gatk_vcf (replaced by snp_calling_gatk_unified_genotyper_and_filter_vcf + vqsr_for_snps)
snp_calling_mpileup_vcf (renamed snp_calling_mpileup)
snp_calling_mpileup_bcf (renamed snp_calling_mpileup_via_bcf)
vcf_chunked_vep_annotate (renamed vcf_split_and_vep_annotate)
Modified pipelines:
vcf_filter_merge_and_vep_annotate (vcf_index step added throughout)
vcf_vep_annotate (vcf_index step added at the end)
New pipelines:
vcf_concat.pm
merge_vcfs_to_site_list_and_recall_from_bcf
snp_calling_mpileup
snp_calling_gatk_unified_genotyper
snp_calling_mpileup_from_bcf
snp_calling_mpileup_via_bcf
vcf_split_and_vep_annotate
vqsr_for_snps
snp_calling_gatk_unified_genotyper_and_filter_vcf
0.94 Critical fix for getting interface_port when it is set as an environment
variable. See also the recent 0.93 changes.
0.93 This version features the beginning of a more radical overhaul,
introducing vrpipe-server, which is a trivial-memory, trivial-cpu
daemon process that will eventually run the whole system, discovering
and dispatching jobs. In this version it only serves the frontends (both
cmd-line and web, and currently only for vrpipe-status) and runs the
local scheduler (if you have that configured). vrpipe-server will be
started automatically when needed. When it starts it gives you the
website address you can visit. This is actually faster than using the
equivalent cmd-line tool. Please see IMPORTANT_NOTES if upgrading.
The fofn, fofn_with_metadata and vrpipe DataSources now have a new
method 'group_all', which is useful for 'merge' type pipelines.
0.92 vrpipe-setup script can now --reset or --delete an existing
PipelineSetup, to completely wipe out all progress on it. Fixed
critical bug in the sequence_index DataSource.
0.91 When VRPipe counts the number of records in a bam, it now uses the
samtools executable in the $SAMTOOLS directory, not the first one in the
$PATH.
0.90 Upgraded irods step to make it compatible with latest version of
ichksum (critical fix for any irods-related pipeline).
0.89 Fixed critical bug in Submission reserved memory calculation that could
prevent certain pipelines from proceeding when a step ran out of memory.
0.88 bam_to_fastq step can now take an option to allow it to not care about
keeping the forward and reverse fastq files "in-sync", which helps with
the SGA-related steps.
0.87 vrpipe-db_upgrade no longer takes --from and --to options, but instead
will correctly upgrade the database from its current version to the
latest, avoiding user-error. Bug-fix for vrpipe-elements so that -f
works again.
0.86 Minor bug fix for getting file md5s. When the system has created
symlinks (eg. vrpipe-output was used) and the source file of a symlink
is moved, all symlinks are automatically corrected. The LSF scheduler
now implements the cpus requirement. SGA steps bug fixed and now have
support for gzipped fastq files.
0.85 bam_add_readgroup step now allows you to choose what metadata key SM is
set from, eg. you could have it come from individual instead of sample
0.84 Critical fix for the VRTrack datasource, so that it copes with metadata
changes better.
0.83 Further critical fixes to the new PerlTidy module. All Perl code has now
been tidied.
0.82 Critical fix to the new PerlTidy module so that it does not break
classes when tidying. Critical fix for VRPipe::File to revert to
previous behaviour of keeping file metadata even after file deletion.
0.81 This version features major changes to how the underlying system
interacts with the database, which results in greatly improved speed
(orders of magnitude in certain critical areas) and bounded memory
usage. End-users of the front-end scripts like vrpipe-status are not
really affected by the changes, but developers who have written their
own VRPipe scripts, Steps or Pipelines should be aware of the following:
Persistent methods (see updated POD of VRPipe::Persistent for details):
create() replaces what get() used to do: get an instance of a Persistent
object from the database, creating or updating it if necessary. get()
has been changed to only retrieve and update - it no longer creates but
throws if the row isn't already in the database. get() should still be
used whenever possible, especially in end-user scripts.
Persistent instances now no longer access the database every time you
call one of its methods to retrieve a column value. This means if you
get an instance, then change column values in a different process, your
instance will have out-of-date values. You can use new method
reselect_values_from_db() to update your instance.
There are new methods search(), get_column_values(), (also with
*_paged() variants) and search_rs() for fast retrieval of many rows.
New method bulk_create_or_update() lets you create many rows quickly.
New method dump() can be used when debugging, letting you Dumper a
Persistent instance without outputting tons of irrelevant information to
the screen.
New method do_transaction() can be used for doing a series of operations
in a transaction.
DataSource authors:
source methods are now called safely, guaranteed single simultaneous
process only. They no longer directly create or return DataElements
themselves, but should call _create_elements() method instead.
If you wish to contribute code with a pull request, please read the new
DEVELOPERS file. It explains details of how to use our custom perltidy
setup.
0.80 New bam_improvement_no_recal pipeline. New SGA-related pipelines. The
vrtrack_auto_qc step now stores test results in the new VRTrack AutoQC
table (instead of in a text file), and so requires VRTrack schema 20,
which is found in the vr-codebase git repository version 0.04 or higher.
0.79 Another critical bug fix for dcc_metadata step. Bug fix for bam_reheader
step. vrtrack_auto_qc step now has an extra metric.
0.78 Critical bug fix for dcc_metadata step. Bug fix for when a step input
is a symlink and the pointed-to file has been deleted without VRPipe's
knowledge.
0.77 When using SQLite as the database it may now lock up less, though the
local scheduler remains incompatible with it. Fixes to metadata stored
when merging bams, important for pipelines using bam_reheader step.
0.76 Another schema upgrade to add a missing index. See IMPORTANT_NOTES if
upgrading.
0.75 Schema upgrade to provide better database indexes. See IMPORTANT_NOTES
if upgrading.
0.74 Schema upgrade to allow the stats of multi-week-long running steps to
be stored.
0.73 (re-)Added support for sqlite, though it is only really suitable for
parsing as it may lock up if used for running pipelines. Fixed bug in
LSF stdout parser.
0.72 Small fixes/improvements to steps fastq_split, bin2hapmap_sites and
bam_name_sort.
0.71 New bam_improvement_and_update_vrtrack_no_recal pipeline.
0.70 New improvement pipeline that works with older versions of GATK. When
QC step updates VRTrack database, now no longer overwrites manually
applied qc_status.
0.69 New vrtrack_qc_graphs_and_auto_qc pipeline, suitable for rerunning QC
on already imported or improved bams. Copyright and license information
is now present on all source code files.
0.68 Critical fix for sequence_index datasource, so that it does not reset
elements just because their center_name changed case.
0.67 New single-step bam indexing pipeline. New Conifer pipeline. New
retroseq pipeline. Improved breakdancer pipeline. The vrpipe datasource
now has an option to filter after grouping. The vrtrack_auto_qc pipeline
now always fails a lane if the NPG status was failed. When a submission
fails and is retried, the stdout/err of previous attempts is now
accessible, eg. with vrpipe-submissions.
0.66 vrpipe-setup can now be used to change pipeline behaviours.
0.65 New breakdancer pipeline, single-step bam splitting pipeline, and the
vrpipe datasource now applies the filter after grouping, requiring only
1 file in the group to match the filter.
0.64 Fix for bam_reheader, affecting 1000 genomes pipelines.
0.63 Fix for rare bug in fastq_split which prevented it from working with
certain input.
0.62 Critical fix for new queue switching code.
0.61 Now, if a job is running in a time-limited queue, and the limit is
approaching, the job will be switched to a queue with a longer time
limit.
0.60 Fix for vrpipe-setup to make it compatible with the new vrtrack_auto_qc
pipeline.
0.59 New vrtrack_auto_qc pipeline. New (alternate) SNP pipeline. New
vrpipe-permissions script.
0.58 Various fixes to enable initial install and testing for new users using
latest CPAN modules.
0.57 vrpipe-fileinfo can now tell you how a file was generated.
0.56 New gatk_variant_calling_and_filter_vcf pipeline.
0.55 Further merge pipeline fixes. New bam realignment around discovered
indels pipeline.
0.54 Further fix for new merge pipeline.
0.53 Fixed issues with bam merging pipelines, and renamed them all.
0.52 New fofn_with_metadata DataSource - useful for inputting external bams
into pipelines. VRTrack-related steps now have
deadlock-failure-avoidance.
0.51 VRTrack DataSource now has an option to group_by_metadata.
0.50 New merge_bams pipelines, to do "merge across". VRTrack datasource now
allows filtering on more status types, and can get VRPipe improved bams.
0.49 Critical bug fix in bam_to_fastq step.
0.48 Tweaks and fixes to finalise new bam_genotype_checking pipeline.
0.47 Minor tweaks to finalise yet-unused pipelines.
0.46 New versions of merge lanes and stampy mapping pipelines with extra
features.
0.45 Critical speed fix for vrtrack datasource. Library merge pipelines now
index the resulting bams.
0.44 Fix for plot_bamcheck step, letting it work when there is no insert size.
0.43 Efficiency fix for vrtrack datasource.
0.42 Critical fix for vrtrack datasource, so that it now updates file
metadata when vrtrack metadata changes.
0.41 vrtrack_update_improved step now sets lane reads and bases.
0.40 Critical fix for vrtrack_update_mapstats step, letting it work without
exome_targets_file.
0.39 vrpipe DataSource behaviour changed, so that a child pipeline that
deletes inputs won't mess up a parent that still needs those files.
Overhauled the genotype checking pipeline and steps.
0.38 Fix for gatk_target_interval_creator step, increasing its default memory
reservation.
0.37 Overhaul of qc graphs & stats-related steps and pipelines so that now
wgs and exome projects all use the same pipeline, with a single bamcheck
call. bam_to_fastq step fixed so that it runs in constant <500MB and
copes with bams that miss reads.
0.36 Critical fixes to the underlying system to ensure job submission doesn't
stall out forever, to handle limits on steps better, and to avoid issues
when there are multiple submissions for the same job. Also a fix for
java to increase the likelihood of the jvm starting up.
0.35 vrpipe-status script improved to give a better overview of what the
pipeline is doing, with warnings about pipeline stalls. bam_to_fastq
step reimplemented, should now be much better.
0.34 Critical speed fix for the VRTrack datasource. Fixes for the
archive_files pipeline and the vrtrack_update_mapstats step.
0.33 Optimised bam_import_from_irods_and_vrtrack_qc_wgs pipeline. Memory and
time reserved for jobs is now less likely to be insufficient.
0.32 Fixes for bam_mapping_with_bwa_via_fastq and bam_reheader step.
Efficiency improvement in how step max_simultaneous is handled.
0.31 Database independence now properly implemented. New separate bam
improvement pipeline, remapping bams via fastq pipeline, and some
Sanger-specific pipelines added.
0.30 Fixes related to archive_files pipeline.
0.29 New archive_files pipeline.
0.28 Really fix java-using steps so they get the memory they need.
0.27 Outputs of near-identical PipelineSetups will now never risk overwriting
themselves. Java-using steps get better recommended memory. New
IMPORTANT_NOTES file - you must read this!
0.26 Critical performance fix for StepStats.
0.25 New StepStats system for quick access to memory/time used stats.
0.24 Critical fix for mapping pipeline.
0.23 New Stampy mapping pipeline. Fixes for SNP and DCC pipelines.
0.22 Critical fix for input files that are relative symlinks.
0.21 SNP discovery pipeline(s) now firming up; fixes for merging pipelines
0.20 Improved handling of limits, so that a good amount of jobs are always
running.
0.19 Various fixes to 1000 genomes-related pipelines.
0.18 Fix to allow sqlite to be used in production.
0.17 Install process for new external users should now work/be easy.
0.16 New merging pipelines and associated vrpipe datasource (for chaining
different pipelines together). Critical bug fixes that allow changes in
datasources to trigger restarts for the changed elements.
0.15 Front-end for creating PipelineSetups; improvements to smalt mapping so
we can map 454 data in 1000 genomes.
0.14 More front-end scripts added. Sequence index datasource now starts
changed elements over from scratch, so we can now change the source file
safely.
0.13 Various fixes for pipelines. Memory leak issues fixed. Various front-end
scripts added.
0.12 Fixes for bam_mapping_with_bwa. New VCF annotation-related steps and
pipelines. Triggering pipelines in Manager has been optimised slightly.
0.11 Fixes for bam_mapping_with_bwa. New smalt mapping pipeline for handling
454 sequence data.
0.10 Bam Improvement steps now fully implemented. New bam_mapping_with_bwa
pipeline.
0.09 Scheduler independence: local can now be used for testing.
0.08 Submission retries now add time where necessary.
0.07 Fixed critical bug in mapping pipeline; should now work properly.
0.06 Myriad performance and stability improvements necessary to get the
mapping pipeline running smoothly.
0.05 Critical performance fix for dealing with large datasources.
0.04 Critical performance fix for checking bam file type.
0.03 0.02 only worked on test dataset; this should be the first version to
work on real data, following important schema changes and Step fixes.
0.02 Most interesting features not yet implemented, but this is the first
working version, needed to do the 1000genomes phase2 (re)mapping.
0.01 No real files; just starting up repository.