forked from gustaveroussy/EaCoN
-
Notifications
You must be signed in to change notification settings - Fork 0
/
NEWS
348 lines (302 loc) · 23.6 KB
/
NEWS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
EaCoN
-----
v0.3.3-1 (20181002) *LittleWomanNoCry*
-----------------
* BUG : Segment.SEQUENZA() : added imputation of NA values in L2R object that made copynumber::aspcf() unable to work (happened with microarrays for flagged probes, not WES).
* BUG : Segment.SEQUENZA() : BAF filtering wasn't working properly, resulting in wrong BAF segmentation, for all microarrays.
* BUG : CS.Process.Batch() : wrong variable name in header check.
* CORR : OS.Process() : corrected wrong handling of sex.chr output was forced as c("X", "Y") instead of variable, default c("chrX", "chrY").
* CORR : Segment.FACETS() Segment.SEQUENZA() : Added missing meta 'BAF.filter' in the object.
* CORR : README.md : fixed few links to dependencies, corrected default regex.
* MOD : Segment.* : Changed the structure of the profile PNG filename to "[samplename].SEG.[segmenter].png" (to ease the use of regex for further steps in batch mode).
* MOD : ASCN.ff.Batch() Annotate.ff.Batch() : corrected the default regex.
v0.3.3 (20180911) *Trinity*
-----------------
* NEW : SEQUENZA segmentation plainly implemented, for both L2R+BAF bivariate segmentation Segment.SEQUENZA() AND copy number estimation ASCN.SEQUENZA().
* BUG : Segment.ff() : Corrected wrong do.call() call (parameters not given as a list).
* CORR : ASCN.ASCAT() : CN output file was badly formatted.
* MOD : ASCN.ff() : Suppressed the "segmenter" parameter, which is read from the RDS meta$eacon$segmenter value.
v0.3.2-1 (20180906) *Bak2Skool*
-------------------
* BUG : WES.Bin() : bin order was totally messed up when capture BED is not properly ordered (desynch due to bin name converted to factor, then levels replaced by mis-ordered rowname). This made WES.Normalize() stop at the GC/wave normalization step due to non-synched bin orders in data when compared to BINpack [Thx to Alexander Lefranc for reporting it] [fixed].
* BUG : ASCN.ff() : fixed a bug in the do.call() call (arguments of ellipsis not converted to a list) [fixed].
* CORR : ASCN.ff() : fixed the handling of out.dir sub() call.
* CORR : README.md : Missing aroma.light dependency [Thx to Alexander Lefranc for reporting it, again].
* ADD : ASCN.ff() : Added a control to check if provided segmentation RDS file is compatible with requested ASCN method.
* MOD : Changed license to MIT.
v0.3.2 (20180808) *PapeMamiePichine*
-----------------
* ADD : FACETS segmentation plainly implemented, for both L2R+BAF bivariate segmentation AND copy number estimation. Coincidentaly, some modifications were performed :
* . Segment.ASCAT() and ASCN.ASCAT() wrappers have been renamed to generic Segment() and ASCN(), with a new "segmenter" parameter, to support both of ASCAT and FACETS.
* . Help functions for all segmentation / ASCN functions have been updated.
v0.3.1 (20180731) *Recalcitrante*
-----------------
* ADD : New feature : preparing the inclusion of FACETS segmentation methods.
* . This required to include some additional data in the normalized RDS.
v0.3.0-1 (20180725) *BzzzzSnap!*
-------------------
* BUG : BINpack.maker() : subfunction failed to load the foreach package [fixed].
* BUG : WES.Bin() : BAM indexation failed [fixed].
* BUG : ASCN.ASCAT.ff() : Typo in dirname replacement made it non functional [fixed].
* BUG : OS.Process() : chromosomes cutting with mingap failed due to calling an inexisting column [fixed].
* ADD : Added a control on list type for functions that require a 'data' object as first parameter.
* DEL : Removed some commented code chunks.
v0.3.0 (20180724) *PapoQueen*
-----------------
* WES : Revamped the WES.bin() function to limit RAM consumption and maximize BAM files access when using subthreads.
* WES : Almost completely rewrote the WES.Normalize() function :
* . Better L2R normalization using simpler and finer filters.
* . Incorporated the segregation of homo / hetero SNPs, using matched normal BAF.
* . Added logOR heterozygosity expression for futher compatibility with the FACETS CBS-based bivariate segmenter.
* Discarded the germline prediction from Segment.ASCAT(), now included in each ofthe *.Process() functions (to be more source-specific, mainly for WES).
* All : Removed timestamped dir that made things tricky on some systems (I'm looking at you, Snakemake!)
* All : Removed "EaCoN." prefix from most functions (less self-centric...)
* All : Took care of vectors and columns that could be converted to factor or integer (to free some RAM up).
* All : Added missing support for manual PELT penalty (only asymptotic mode was considered when SER.value was numeric).
* SNP6 : Revamped BAF homozygous calling and rescaling.
* Defined the novel sets of default parameters for all supported technologies.
* Redacted the README.md
v0.2.13 (20180531) *SunIsBack*
------------------
* WES : Added more data to the BIN RDS output (counts with the reference genome nucleotide for both test and ref BAMs). This is in order to 1) filter out on a minimum alternative allele count 2) allow the use of other segmenters that do not rely on BAF but rather on AD (like PSCBS) or logOR (like FACETS).
* Modified the subthreading scheme for EaCoN.WES.Bin() : now each subthread has its own connection to the BAM files. This allows each thread to work fully (but increases simultaneous IO).
* Now each SNP variant has its corresponding bin index, which will allow to perform density-based selection like in FACETS.
v0.2.12-3 (20180409) *VaricellaEnding*
--------------------
* Corrected BAF values error introduced with wave-effect normalization for WES data.
* Now the WES normalization plots indicate good MAD / SSAD values !
* Removed the EaCoN.WES.Process() wrapper that is not interesting anymore and problematic to update.
v0.2.12-2 (20180322) *Origins*
--------------------
* Added the "raw" (ie, pre-re-normalized) l2r values in the processed.RDS object, to ease the computation of the wave data for WES.
* Cleaned a bit of code.
* Added support for the wave-effect normalization for WES data.
v0.2.12-1 (20180319) *Decompress!*
--------------------
* Added support for gz, bz2 or zip -compressed CEL files (decompressed to TMPDIR).
v0.2.12 (20180313) *FirstWave*
------------------
* Added support for wave-effect renormalization (should work with all Affy arrays, but pre-computed data only available for hg19 build as of now).
* Exported re-normalization procedure to a generic function (easier to maintain code across different array designs).
* Consequently renamed the fit_functions.R script to renorm_functions.R.
v0.2.11 (20180228) *Encore!*
------------------
* Corrected interpretation/conversion of gender prediction by Affymetrix on SNP6 data.
* Changed default gender for WES data from "XX" to "NA".
* Removed call of EaCoN.ASCN from EaCoN.Segment. Now the two operations are strictly considered as different steps.
* Added nsubthread to EaCoN.ASCN (to test multiple gammas in parallel).
* Corrected variable typo in EaCoN.WES.Bin that made it fail at the coverage statistics computation.
* Now the coverage plot is written at its expected position.
v0.2.10 (20180220) *PapaPapi*
------------------
* Added a BAF scaling routine (based on density of homozygous probes) to germline prediction function, in order to rescue unscaled BAF from some male rawcopy-normalized CSHD profiles.
* Removed the "prior" parameter from Eacon.Segment (for germline prediction) : now forced to "G" (genomic).
* Added missing "raw" plot (1st panel) for CytoScan and SNP6 arrays.
* Hid the "solo" and "ldb" options from EaCoN::Annotate() help page (but still present for GR MP work).
* Added an "autor.name" option to EaCoN::Annotate().
* Fixed chr display in some tables in the HTML report.
* Fixed meta name typo that made homozygous BAF segments not detected as such in HTML report plots.
* Moved genomic length formatting in HTML report tables from pure R to DT::formatCurrency (should allow to restore sorting possibility lost with R formatting ?)
v0.2.9 (20180213) *TumorBoost*
-----------------
* Added nsubthread parameter to slow WES functions, to allow multithreading at the chromosome level.
* Added optional (recommended, active by default) BAF normalization using TumorBoost for WES data(as implemented in aroma.light).
* Added coverage of bins and SNP positions for test and ref as QC for WES data. A new plot is generated for several minimum depth values.
* Implementing universal chromosome names support for WES data (to avoid problems with non-canonical chr names like used in the TCGA data, by example) using BSgenome, and for microarray data too. Now, "chromosomes" is only required by EaCoN::Annotate() for plotting karyotypic view and per-chr plots.
* Accordingly changed bits of code in plots to display 'new' chromosome names and discard reference chr not covered in the source design. Now chr names appear in tom-bottom flip-flop on genomic plots.
* Again, modified the reporting parts of EaCoN::Annotate() to handle new chr structure.
* Introduced a functionality of the ASCAT package to detect gaps based on a minimum size, implemented as the 'mingap' parameter un EaCoN.Segment(). This allows to avoid coverage of gaps like some human chr centromeres by segmentation. Default is 5 Mb.
* Removed startup messages when running EaCoN.*.Process() functions.
* Fixed PELT segmentation minimum segment length to 5 (to make it independant from the minsegLen parameter dedicated to germline prediction, which may have to be truly variable depending on the data source).
* Corrected error that generated an ugly (though harmless) warning in EaCoN::EaCoN.ASCN() (under-dimensioned the NA-filled returned vector when ASCN failed, since adding more ploidy measurement methods).
v0.2.8 (20180124) *Krank*
-----------------
* Corrected a damn error in WES binning routine, which changed bin order (bad sort from numerical to alphanumerical ...) !
* Converted some messages in EaCoN.WES.Bin.Batch() that still did not use tmsg().
* Added the {.errorhandling = "remove"} option to *.Batch functions.
* Reformed the handling of the genome version for WES, now completely compatible with BSgenome.
* Corrected a bug from ASCAT where ploidy was wrong in the savec ASCN object (despite being OK on plots !).
* Proposed alternative measures for ploidy over the one computed by ASCAT (A) : median CN value (M), most-width CN value (MW) and width-weighted CN (WW).
* Modified CN plots, and activated the replot of the RAW CN profile (earlier versions kept default ASCAT plot).
* Edited the gammaEval plot to include the ploidy metrics as third panel.
* Renamed the SER.PELT.pen option to SER.pen.
* Added missing 'outfile = ""' parameter to parallel::makeCluster in all WES *.Batch() functions, that prevented forwarding messages to the node.
* Added BSgenome.Hsapiens.1000genomes.hs37d5 to the list of valid genomes.
* Added corresponding refGene table rdata (which is a simple spoof of hg19 with different chrom names).
* Recompressed all refGene table rdata to xz.
* Set a fixed seed (123456) for mclust section.
* Fixed case in germline prediction where execution halted if no variant (thus, no BAF) exist for a chromosome.
* Changed compression of all RDS files to xz (from bzip2) as on our data types we gain 30 to 50% at a cost of few seconds.
v0.2.7 (20180109) *HappyNewYear*
-----------------
* Added a BAF level cutter to germline prediction, to deal with noisy (smeary/blurry) profiles. May help WES data too !
* Added a new function to change the global option "bitmapType" (to allow plotting on stations without X, changing from default "Xlib" to "cairo")
* Modified all foreach/doParallel calls to set the changed bitmapType value to any forked/child thread.
v0.2.6 (20171222) *RoseSweety*
-----------------
* Re-introduced rawcopy's SNP selection to reduce BAF noise (WES in mind...).
* Coincidently added a new 'BAF.filter' parameter to the germline prediction function (passed from Eacon.Segment()) to modify the level of BAF noise filtration (default is .90, ie 90% of probes kept).
* Added few missing conversions of print() to message() for microarray processing functions.
* Added a message giving the number of samples to process for microarray processing functions.
v0.2.5 (20171214) *FeverStopped*
-----------------
* Added the intermediate_files parameter to the report rendering, to avoid (definitively ?) collisions in multithreaded mode with EaCoN.Annotate.ff.Batch().
* Corrected a bug in the HTML QC report when solo = TRUE, where asked solo HTML page was not mandatorily directed to a chromosome with aberrations (rather chr1 by default).
v0.2.4 (20171120) *BackFromRoscoff*
-----------------
* Docs, docs docs. And docs !
* Corrected a bug (appeared in v0.2.3) where chromosomal plots in HTML report were not in good order (by filename rather than chromosome name, despite panel names were good).
* Modified the EaCoN.WES.Bin.Batch() function to mimic the equivalent for microarrays processing (ie, using a target file rather than lists as arguments).
* Rearranged the output BAF segmented table (informative columns selection, reordering and renaming).
v0.2.3 (20171026) *ElevenMonths*
-----------------
* Several minor bug corrections thanks to Thibault.
* Set a try() on text() calls (for chr names on plots) that fail on our cigogne server due to missing font we can't install.
* Modified functions structure to allow easier piping of steps.
* Set a custom tmpdir() in the report RMD to avoid collision of different report in generation when using EaCoN::Annotate.ff.batch().
v0.2.2 (20171004) *SAFIRsBFucked*
-----------------
* Bug correction : undrawn gains/losses on weird (noisy) L2R distributions are now corrected. g.cut and l.cut now have the same absolute value, even for density mode.
* Introduced a new filter for WES (in addition to the low-depth one), based on extreme GC% (<20% or >80%). Corresponding bins are imputed. It represents ~2.5% of 50b bins in a human WES.
* Fragmented code for ASCN which can now be run independantly from ASCPF.
* Fragmented code for WES normalization which can now be run independantly from binning.
v0.2.1 (20170927) *TenMonths*
-----------------
* Bug correction : myGR.ex unknown variable in BED2BIN
* Bug correction : deprecated option "silent=TRUE" removed in suppressWarnings() calls.
v0.2.0 *Second*
------
* IDK (dedicace to affymetrix power tools help writer(s)!)
v0.1.0 *First*
------
* First release
* Renamed A2p (ASCAT 2 "plus") to EaCoN (Easy Copy Number !).
* Integrated code from affy.CN.process and WES.CN.process packages.
* Added a second-step centralization (based on normal segments post-calling).
(logs from A2p)
v0.0.4-1 (201709xx) *ThreeGoldenYears*
-----------------
* Exported all functions related to WES to a new package 'WES.CN.process'.
* Modified report to handle WES results.
* Completely rewrote methods for the germline prediction. It's now faster, more robust and compatible with both microarrays and WES data using the same parameters. Reduced the number of parameters, and defaults should fit most cases (I tested so far).
v0.0.3-1 (20170629) *BBQ*
-----------------
* Debugging problems appeared with a newer version of the mclust package.
* Reverting some changes made to the BAF segmentation using PELT.
* Rewrote A2p.BedCheck() using GenomicRanges (faster and more accurate).
v0.0.3-0 (20170511) *PupilsPair*
-----------------
* Added new functions/methods to prepare WES data :
* . A2p.BAMConv : to generate RD (read depth) and SNP (single nucleotide polymorphisms) data from a pair (normal + tumor) of WES BAM files. This uses samtools (v1.4 mpileup), now included in the package as a linux x86_64 binary, and a in-house perl script included too.
* . A2p.ExomeRun : to perform L2R GC-content normalization, BAF selection and generate files compatible with ASCAT.
* . A2p.BAMConv.ExomeRun : as a wrapper to chained A2p.BAMConv > A2p.ExomeRun in a single run.
* . A2p.BedGC and A2p.BedGC.chr : to compute GC-content tracks from a (binned) bed, using BioStrings. Both functions perform the same task, but the A2p.BedGC.chr is recommended as it works by chromosome (instead of at the whole genome level), thus consumes much less RAM than A2p.BedGC.
* . A2p.BedCheck : To (control and) clean a be file, removing duplicates, merging overlaps and sorting by chr > start > end.
* Renamed functions for the sake of homogeneity.
* Added the generation of a CBS-like (with extended columns for allelic CN, ploidy and GoF) for ASCN results.
v0.0.2-6 (20170426) *FirstByMyself*
-----------------
* Renamed some options of A2prun() concerning the germline prediction, to avoid confusion with other parameters (all in all, 3 different options refer to penalties...).
* Tweaked germline prediction, mostly to adapt it to WES data. Rough BAF segmentation by PELT is now much better looking to the eye. Also changed the mclust parameters (clusters computed on bsm, max clusters = 3, model = "E"). Also changed the unizomy rescue routine, now computed AFTER calling the undecided probes.
* Modified the ascat.plotGenotypes2() function (simpler, faster code).
* Reworked the HomoCut plot : now at the same size as other plots, includes BAF clusters as well.
* Pulled out old versions of the germline prediction main function ascat.predictGermlineGenotypes.auto().
v0.0.2-5 (20170303) *KidBookz*
-----------------
* Other slight modifications to the report.
* Now BAF is rounded in report segmentation table.
* Added a pc.outliers entry to the returned list of celstruc(), to display outliers as a fraction of all cels in the report
* Set penalty evaluation method to "BIC0" (was "BIC") in the PELT (cpt.var) segmentation of BAF signals, for a better sensitivity (in particular for WES data).
v0.0.2-4 (20170302) *Plagio*
-----------------
* Filled several requests concerning the HTML report (submitted by Etienne Rouleau).
* Corrected a bug where BAF calling was not properly made for some homozygous regions, as their value (numeric) was compared to homocut as read in the META file (character).
* Changed again nrf to 0.5 by default.
v0.0.2-3 (20170220) *WolfBurner*
-----------------
* Increased sensitivity for PELT segmentation in smallevents rescue mode (set to AIC0).
* Deactivated ascn for A2p.run() by default as non-required for MP results.
* Added cool export buttons to the report.
* Finaly solved the import warnings at package loading.
* Synched meta handling for OncoScan when tags were present in a CEL file but not in the other one.
v0.0.2-2 (20170217) *SleepingBeauty*
-----------------
* Removed the over-complicated routine for BAF calling, replaced by a simple cut-off as a new parameter : homoCut, defaults to 0.05.
* Debugged a problem with mclust on small populations, the try() method was not properly implemented.
v0.0.2-1 (20170215) *LostBread*
-----------------
* Merged BAF and L2R segmentation results in the report for the targets table.
* Several ameliorations in the HTML report :
* . Page width/margins hacks by Yannick BOURSIN.
* . Better readability of scan date in array info table.
* . More readable plots, especially for BAF (slightly increased ylim and line width, nice colors, ...).
* . Added germline prediction plot.
* . Fixed Symbol column in the target table (usefull for small-res 4:3 screens).
* . Added missing parameters in the last last pane.
v0.0.2-0 (20170214) *CoulFaZ Edition*
-----------------
* Tuned the "auto" mode for germline prediction, adjusting PELT (BIC on bsm with values below 0 set to 0) and mclust (tbsam with 5 classes max on direct values), plus handling of special cases (small populations).
* Changed default value for "proportionOpen" from 0.03 to 0.05 (tested for CytoScanHD, to evaluate for OncoScan).
* Added BAF calling with corresponding colors for segments (homo / hetero / unbalanced status).
* Added BAF support in the report (table, plots).
* All plots are now generated using Cairo.
v0.0.1-4 (20170210)
-----------------
* Modified the "auto" mode for germline prediction, using PELT with BIC and mclust with 3 classes max on bsm with values below 0 set to 0.
v0.0.1-3 (20170208)
-----------------
* Changed the 'auto' mode for the germline prediction : now homozygous probes are directly taken from the mclust results.
* Added a routine to "rescue" unizomy BAF regions from the PELT/Mclust segmentation/clustering (when fraction of homozygous probes is above 80%). This should allow detection of unizomy regions.
* Changed rectangles border for the karyotypic l2r plot from transparent to same color as internal (keeping the alpha), to prevent invisible small gains/losses on the plot.
* Homogeneized sampe parameter names (CELs.list changed to CEL.list.file in A2p.annotate).
* Rounded (ceiling) the median width in the genomic instability table.
v0.0.1-2 (20170203)
-----------------
* Added a first "waveeffect" normalization track (from P25_AUAU2) for OncoScan arrays.
* Added the automatic HTML report to A2p.annotate()
* Corrected a rare bug in the centralization (method "l2r.centeredpeak") when a peak is found on an edge of the distribution.
* Corrected a bug (loss of dimensionality) when the gcdata or wavedata objects had a single track.
* Removed parameter "genome" from A2p.annotate() as its value can be obtained from the META file.
* Added intensity plot(s) from the CEL files (needed to get the median intensity for CytoScan arrays).
* Corrected some bugs considering meta tags.
v0.0.1-1 (20170120)
-----------------
* Added a routine based on PELT to rescue small events ignored by ASCAT.
* Restored detection of gender for data normalized with APT 1.16.1.
* Intermediate version to release A2p.annotate() for automatic SOLO.
v0.0.1-0 (20170119)
-----------------
* Added the cnchp2ascat() function to support results produced by package apt.cytoscan.1.16.1. This donction is built from oschp2ascat().
* Added controls in oschp2ascat() to check correspondance between ProbesetNames in GCdata and analyzed sample.
* Renamed first argument of A2p.batchrun() for clarity.
* Rebuilt the two GC-related Rdata packs, which are now ordered by numerical chromosomes (instead of alphanumerical).
* Corrected some errors/typos/inconsistences in doc.
v0.0.0-4 (20170117)
-----------------
* Added a function to generate a table for target genes.
* Patched to support the 'OncoScan_CNV' design.
v0.0.0-3 (20161221)
-----------------
* Modified oschp2ascat() to work in batch mode using a directory (but not recommended yet).
* oschp2ascat() is now compatible with OSCHP files generated for CytoScanHD arrays (maybe with CytoScan750K too but untested.
* Consequently, added GC-content tracks for CytoScanHD to the package.
* Now GC-content data are embedded as datasets rather than TSV files.
* oschp2ascat() is now faster in multihreaded mode when GC.renorm is TRUE (by setting .inorder = FALSE in foreach call).
* Re-centering is now possible but limited to a basis from the log2(ratio) profile, with 4 possible methods (default is 'l2r.centeredpeak) :
* . "recenter = [value]" then this l2r value is used to shift the profile.
* . "recenter = 'l2r.median'" then the median of the l2r profile is used.
* . "recenter = 'l2r.mainpeak'" then the most populated peak from the l2r density is used.
* . "recenter = 'l2r.centeredpeak'" then the most centered peak from the l2r density is used (like in GC5).
* Added a wrapper to run A2p.run in batch.
* Resolved some bugs in this new function.
* Reshaped oschp2ascat() to reduce memory imprint with numerous threads.
v0.0.0-2 (20161027)
-----------------
* Now the GC tracks file for OS format is included in the package (included in the 'inst' folder).
* Bug solved : the workidir is set back to its initial value when returning the main function run.ascat()
(in v0.0-1, results were written in [infinite] recursive folders when processing multiple samples...).
* Better handling of sample name, now set to default NULL in all functions.
* Added the rawcopy2ascat() function as a start for the implementation of CytoScan HD support.
v0.0.0-1 (20161017)
----------------
* First early-beta-wip-ongoing-under-construction-use-at-your-own-risk-i-decline-all-responsibility version.