[DO NOT MERGE] hpcSPAdes rebase, cleanup & implementation #1380

Open
wants to merge 102 commits into
base: main
Commits
f285a3c
CMAKE: Hook in MPI
eodus Oct 19, 2018
673dd41
Add MPI runtime detection
asl Dec 17, 2018
0219be9
CMAKE: Install spades-hpc
eodus Aug 14, 2018
0682203
Add mpi_console_writer log_writer interface impl
eodus Dec 27, 2018
f4be241
Factor out mpi log writers as well
asl Sep 25, 2020
c98a426
Add mpi partask core
asl Sep 25, 2020
b4b9b1f
Add MPI test
asl May 26, 2020
10396a4
User-friendly rank reporting
asl Sep 28, 2020
049271a
Rudimentrary MPI stage manager & MPI stage
asl Sep 25, 2020
22fd17f
Add TestMPI stage
eodus Nov 7, 2018
b1485ed
Separate out SPAdes and hpcSPAdes binaries
asl Sep 28, 2020
68e89b4
Even better SPAdes / hpcSPAdes separation
asl Sep 28, 2020
5b8f287
read_converter: changable chunk_num
olga24912 Oct 7, 2020
903dff4
MPI: sequence mapper notifier
olga24912 Nov 2, 2020
688fdcd
MPI mismatch_corrector
olga24912 Oct 8, 2020
b6f8f22
MPI: Gap closer
olga24912 Nov 2, 2020
0490ca2
MPI: pair_info_count
olga24912 Nov 5, 2020
288cca5
Move implementation to .cpp. Small cleanup while there
asl Feb 26, 2021
c8d6fb3
Allow name ot be overriden
asl Feb 26, 2021
f3cb336
Factor out implementation into a separate class
asl Feb 26, 2021
adea3d5
Simplify parallel processing
asl Feb 26, 2021
1fba456
Simplify
asl Feb 26, 2021
ed7fe74
Ensure the streams are fresh
asl Feb 26, 2021
335c06f
Add MPI construction stage
asl Feb 26, 2021
dfd6f07
First real MPI construction step: collect k-mer coverage in parallel
asl Feb 27, 2021
7739944
add comment to DistanceEstimation
olga24912 Aug 4, 2021
60b8dd1
BinRead/Write for PairedIndices
olga24912 Aug 5, 2021
e00a2a8
Distance Estimator MPI
olga24912 Aug 5, 2021
7d193d9
Distance Estimator MPI stage
olga24912 Aug 5, 2021
a08fd05
DistanceEstimator MPI wrapper
olga24912 Aug 5, 2021
8ea28ca
MPI GraphCondensing
olga24912 Mar 7, 2021
abf09c4
remove Seq from extansion index
olga24912 May 14, 2021
cc1577f
remove friend from UnbranchingPathExtractor
olga24912 May 14, 2021
2daac3f
style fix
olga24912 Jul 21, 2021
441b189
MPI EarlyTipClipper
olga24912 Jun 12, 2021
94ae6f7
MPI Build Extension Index
olga24912 Jul 22, 2021
3334aa7
comments for MergeKmerFileTask
olga24912 Jul 22, 2021
cdb191d
verification that after split the #buckets the same in all storages
olga24912 Jul 22, 2021
8bc46ff
make kpostorage const
olga24912 Jul 22, 2021
96c2414
comment for release_all
olga24912 Jul 22, 2021
dde2892
add comment for Splitter
olga24912 Jul 22, 2021
a6f3691
resize one time in MergeKMers
olga24912 Jul 22, 2021
d2ad40b
VERIFY kmers in kmer storage are sorted and unique
olga24912 Jul 22, 2021
9b2a16b
fix warnings
olga24912 Jul 22, 2021
f148be3
MPI KMerCounter
olga24912 Jul 26, 2021
f4fa3b6
close stream after use
olga24912 Jul 27, 2021
1955316
constat for contig output stage
olga24912 Jul 27, 2021
ff707c0
add sync for read conversion
olga24912 Jul 27, 2021
b737093
MPI ATtipClipper
olga24912 Aug 31, 2021
556666d
reduce code dublication
olga24912 Sep 2, 2021
d689e1f
fix pair_info_counter
olga24912 Oct 14, 2021
fe27f20
detach edge index on load
olga24912 Oct 27, 2021
ed7c152
detach edge index at the end of pair_info_counter
olga24912 Oct 28, 2021
61b8318
create lib Spades-MPI
olga24912 Oct 5, 2021
39b5ece
separate distEst
olga24912 Oct 12, 2021
6849808
separate construction_mpi
olga24912 Oct 14, 2021
3591808
separate test_mpi
olga24912 Oct 14, 2021
3765393
update distEst Arhitecture
olga24912 Nov 19, 2021
df328d8
make SeqMapperNot in GCMPI as in GC
olga24912 Nov 23, 2021
9ccb1df
GapCloserBase
olga24912 Nov 23, 2021
d101a7b
Separate MPI gap closer
olga24912 Nov 23, 2021
47c28e0
mismatch_corrector with functor
olga24912 Nov 23, 2021
b5817c4
declarate MismatchShallNotPass to hpp
olga24912 Nov 23, 2021
6c541d1
separate MPI mismatch_correction
olga24912 Nov 23, 2021
be0eeee
make pair_info_count consistent with master version
olga24912 Nov 24, 2021
18acaa2
functor for FillEdgePairFilter
olga24912 Nov 24, 2021
ccf2e80
separate PairInfoCount MPI
olga24912 Nov 25, 2021
520f6f1
MapLibFabric
olga24912 Nov 27, 2021
0ebac00
Separate SeqMapperNotifier
olga24912 Nov 29, 2021
28927de
PerfectHashMapperBuilder MPI
olga24912 Nov 30, 2021
28da1f1
move mpi_kmer_index_builder to mpi dir
olga24912 Nov 30, 2021
9735497
separate perfect_hash_map_builder_mpi
olga24912 Nov 30, 2021
26da67b
move kmer_extension_index_builder_mpi to mpi dir
olga24912 Nov 30, 2021
932636d
Move partask and mpi stage to hpcSPAdes
asl Sep 12, 2024
790d295
run_on_load stage type
olga24912 Dec 2, 2021
db9db37
namespace spaces
olga24912 Dec 2, 2021
ddfa32c
rename function by code style
olga24912 Dec 2, 2021
3262e35
delete mpi from local pipeline
olga24912 Dec 6, 2021
f3772ec
separate logger mpi
olga24912 Dec 6, 2021
751b782
Add time tracer annotations for MPI stage manager. Some cleanup here …
asl Jan 11, 2022
b4bf0e5
Do not overwrite time traces from different nodes
asl Jan 11, 2022
0a0f75b
Cleanup
asl Jan 11, 2022
b833f1c
Time tracing for partask
asl Jan 11, 2022
eb09ca0
Better time tracing
asl Jan 11, 2022
864332a
More information
asl Jan 11, 2022
a20187f
Annotate PathExtend
asl Jan 11, 2022
d4d8a8d
A bit more verbosity
asl Jan 11, 2022
70d957c
More events + some cleanups
asl Jan 11, 2022
7a01e58
call process lib func in Mismatch Corrector
olga24912 Jan 13, 2022
ea29617
fix: allreduce only in sync
olga24912 Apr 1, 2022
dbe03b4
fix: block size/NNodes
olga24912 Apr 4, 2022
ae78af7
sequence mapper notifier MPI in paired info counter
olga24912 Apr 6, 2022
5dbc8fc
PairedInfoCounter SeqMapNot MPI
olga24912 Apr 6, 2022
b51eb6c
split streams on allthreads cnt
olga24912 Apr 11, 2022
2d31adf
Move MPI detection down to project
asl Sep 12, 2024
155b203
Normalize include paths
asl Sep 12, 2024
220ddad
Add spades.py bits
asl Sep 18, 2024
f9ed062
Add hpcSPAdes to list of known projects
asl Sep 18, 2024
fcaca11
Add SLURM executor
asl Sep 21, 2024
fcf0c5c
Better job names
asl Sep 21, 2024
9573ebf
Fix some defaults
asl Sep 21, 2024
a7d7ecf
Run stuff via srun by default
asl Sep 21, 2024
2 changes: 2 additions & 0 deletions src/CMakeListsInternal.txt
@@ -11,9 +11,11 @@ if (SPADES_BUILD_INTERNAL)
add_subdirectory(test/debruijn)
add_subdirectory(test/examples)
add_subdirectory(test/adt)
add_subdirectory(test/mpi)
else()
add_subdirectory(test/include_test EXCLUDE_FROM_ALL)
add_subdirectory(test/debruijn EXCLUDE_FROM_ALL)
add_subdirectory(test/mpi EXCLUDE_FROM_ALL)
add_subdirectory(test/adt EXCLUDE_FROM_ALL)
add_subdirectory(test/examples EXCLUDE_FROM_ALL)
endif()
1 change: 1 addition & 0 deletions src/cmake/includes.cmake
@@ -9,6 +9,7 @@ include_directories(SYSTEM "${Boost_INCLUDE_DIRS}")
if (SPADES_USE_TCMALLOC)
include_directories("${GOOGLE_PERFTOOLS_INCLUDE_DIR}")
endif()

if (SPADES_USE_JEMALLOC)
include_directories("$<TARGET_FILE_DIR:jemalloc-static>/../include")
endif()
2 changes: 1 addition & 1 deletion src/cmake/proj.cmake
@@ -8,7 +8,7 @@

# Side-by-side subprojects layout: automatically set the
# SPADES_EXTERNAL_${project}_SOURCE_DIR using SPADES_ALL_PROJECTS
set(SPADES_ALL_PROJECTS "spades;hammer;ionhammer;corrector;spaligner;spades_tools;binspreader;pathracer")
set(SPADES_ALL_PROJECTS "spades;hammer;ionhammer;corrector;spaligner;spades_tools;binspreader;pathracer;hpcspades")
set(SPADES_EXTRA_PROJECTS "mts;online_vis;cds_subgraphs")
set(SPADES_KNOWN_PROJECTS "${SPADES_ALL_PROJECTS};${SPADES_EXTRA_PROJECTS}")
set(SPADES_ENABLE_PROJECTS "" CACHE STRING
14 changes: 14 additions & 0 deletions src/common/alignment/long_read_mapper.hpp
@@ -55,6 +55,20 @@ class LongReadMapper: public SequenceMapperListener {
return g_;
}

void Serialize(std::ostream &os) const override {
storage_.BinWrite(os);
}

void Deserialize(std::istream &is) override {
storage_.BinRead(is);
}

void MergeFromStream(std::istream &is) override {
PathStorage<Graph> remote(g_);
remote.BinRead(is);
storage_.AddStorage(remote);
}

private:

void ProcessSingleRead(size_t thread_index, const omnigraph::MappingPath<EdgeId>& mapping, const io::SingleRead& r);
18 changes: 18 additions & 0 deletions src/common/alignment/rna/ss_coverage_filler.hpp
@@ -13,6 +13,8 @@
#include "alignment/sequence_mapper_notifier.hpp"
#include "assembly_graph/paths/mapping_path.hpp"

#include "io/binary/binary.hpp"

namespace debruijn_graph {

class SSCoverageFiller: public SequenceMapperListener {
@@ -65,6 +67,22 @@ class SSCoverageFiller: public SequenceMapperListener {
storage_.IncreaseKmerCount(it.first, size_t(it.second));
tmp_storages_[thread_index].Clear();
}

void Serialize(std::ostream &os) const override {
io::binary::BinWrite(os, storage_);
}

void Deserialize(std::istream &is) override {
io::binary::BinRead(is, storage_);
}

void MergeFromStream(std::istream &is) override {
SSCoverageStorage remote(g_);
io::binary::BinRead(is, remote);
for (const auto& it : remote) {
storage_.IncreaseKmerCount(it.first, size_t(it.second));
}
}
};
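
Review note: the `Serialize` / `Deserialize` / `MergeFromStream` overrides above follow a "write on one side, fold in on the other" pattern. Below is a minimal standalone sketch of that pattern, not the PR's code. `CountListener`, its `std::map` storage, and the hand-rolled binary layout are all hypothetical stand-ins for `SSCoverageStorage` and `io::binary`; the sketch also assumes both sides share the same endianness.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical listener that keeps per-key counters (stand-in for a
// coverage storage; not a class from the PR).
class CountListener {
public:
    void Add(const std::string &key, uint64_t n = 1) { counts_[key] += n; }

    uint64_t Get(const std::string &key) const {
        auto it = counts_.find(key);
        return it == counts_.end() ? 0 : it->second;
    }

    // Layout (an assumption of this sketch): record count, then
    // (key length, key bytes, counter) for every record.
    void Serialize(std::ostream &os) const {
        WriteU64(os, uint64_t(counts_.size()));
        for (const auto &kv : counts_) {
            WriteU64(os, uint64_t(kv.first.size()));
            os.write(kv.first.data(), std::streamsize(kv.first.size()));
            WriteU64(os, kv.second);
        }
    }

    // Replace the local state with the state read from the stream.
    void Deserialize(std::istream &is) {
        counts_.clear();
        MergeFromStream(is);
    }

    // Add the counters read from the stream to the local ones.
    void MergeFromStream(std::istream &is) {
        uint64_t n = ReadU64(is);
        for (uint64_t i = 0; i < n; ++i) {
            std::string key(ReadU64(is), '\0');
            is.read(&key[0], std::streamsize(key.size()));
            counts_[key] += ReadU64(is);
        }
    }

private:
    static void WriteU64(std::ostream &os, uint64_t v) {
        os.write(reinterpret_cast<const char *>(&v), sizeof(v));
    }

    static uint64_t ReadU64(std::istream &is) {
        uint64_t v = 0;
        is.read(reinterpret_cast<char *>(&v), sizeof(v));
        return v;
    }

    std::map<std::string, uint64_t> counts_;
};
```

In this sketch a sender would call `Serialize` into a buffer and the receiver would call `MergeFromStream` on the received bytes; `Deserialize` differs only in that it discards the local state first.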


3 changes: 1 addition & 2 deletions src/common/alignment/sequence_mapper_notifier.cpp
@@ -13,9 +13,8 @@
#include "io/reads/read_stream_vector.hpp"

namespace debruijn_graph {

SequenceMapperNotifier::SequenceMapperNotifier(size_t lib_count)
: listeners_(lib_count)
: listeners_(lib_count)
{}

void SequenceMapperNotifier::Subscribe(SequenceMapperListener* listener, size_t lib_index) {
69 changes: 63 additions & 6 deletions src/common/alignment/sequence_mapper_notifier.hpp
@@ -16,6 +16,7 @@
#include "io/reads/paired_read.hpp"
#include "io/reads/read_stream_vector.hpp"
#include "utils/perf/timetracer.hpp"
#include "utils/stl_utils.hpp"

#include <string>
#include <vector>
@@ -36,7 +37,23 @@ class SequenceMapperListener {
virtual void ProcessSingleRead(size_t /* thread_index */, const io::SingleReadSeq& /* r */, const omnigraph::MappingPath<EdgeId>& /* read */) {}

virtual void MergeBuffer(size_t /* thread_index */) {}


virtual void Serialize(std::ostream&) const {
VERIFY_MSG(false, "Serialize() is not implemented");
}

virtual void Deserialize(std::istream&) {
VERIFY_MSG(false, "Deserialize() is not implemented");
}

virtual void MergeFromStream(std::istream&) {
VERIFY_MSG(false, "MergeFromStream() is not implemented");
}

virtual const std::string name() const {
return utils::type_name(typeid(*this).name());
}

virtual ~SequenceMapperListener() {}
};

@@ -56,8 +73,7 @@ class SequenceMapperNotifier {
const SequenceMapperT& mapper, size_t threads_count = 0) {
return ProcessLibrary(streams, 0, mapper, threads_count);
}

private:

template<class ReadType>
void ProcessLibrary(io::ReadStreamList<ReadType>& streams,
size_t lib_index, const SequenceMapperT& mapper, size_t threads_count = 0) {
@@ -68,7 +84,7 @@ class SequenceMapperNotifier {
threads_count = streams.size();

streams.reset();
NotifyStartProcessLibrary(lib_index, threads_count);
NotifyStartProcessLibrary(lib_index, streams.size());
size_t counter = 0, n = 15;

#pragma omp parallel for num_threads(threads_count) shared(counter)
@@ -97,9 +113,10 @@ class SequenceMapperNotifier {
counter += size;
}

for (size_t i = 0; i < threads_count; ++i)
for (size_t i = 0; i < streams.size(); ++i)
NotifyMergeBuffer(lib_index, i);

streams.close();
INFO("Total " << counter << " reads processed");
NotifyStopProcessLibrary(lib_index);
}
@@ -114,7 +131,47 @@ class SequenceMapperNotifier {

void NotifyMergeBuffer(size_t ilib, size_t ithread) const;

std::vector<std::vector<SequenceMapperListener*> > listeners_; //first vector's size = count libs
protected:
std::vector<ListenersContainer> listeners_; //first vector's size = count libs
};

class MapLibBase {
public:
virtual void operator() (const std::vector<SequenceMapperListener*>& listeners, const SequenceMapper<Graph>& mapper, io::ReadStreamList<io::PairedRead>& streams) const = 0;
virtual void operator() (const std::vector<SequenceMapperListener*>& listeners, const SequenceMapper<Graph>& mapper, io::ReadStreamList<io::SingleRead>& streams) const = 0;
virtual void operator() (const std::vector<SequenceMapperListener*>& listeners, const SequenceMapper<Graph>& mapper, io::ReadStreamList<io::SingleReadSeq>& streams) const = 0;
virtual void operator() (const std::vector<SequenceMapperListener*>& listeners, const SequenceMapper<Graph>& mapper, io::ReadStreamList<io::PairedReadSeq>& streams) const = 0;

template<class Streams>
void operator() (SequenceMapperListener* listener, const SequenceMapper<Graph>& mapper, Streams& streams) const {
this->operator() (std::vector<SequenceMapperListener*>(1, listener), mapper, streams);
}
};

class MapLibFunc : public MapLibBase {
public:
void operator() (const std::vector<SequenceMapperListener*>& listeners, const SequenceMapper<Graph>& mapper, io::ReadStreamList<io::PairedRead>& streams) const override {
MapLib(listeners, mapper, streams);
}
void operator() (const std::vector<SequenceMapperListener*>& listeners, const SequenceMapper<Graph>& mapper, io::ReadStreamList<io::SingleRead>& streams) const override {
MapLib(listeners, mapper, streams);
}
void operator() (const std::vector<SequenceMapperListener*>& listeners, const SequenceMapper<Graph>& mapper, io::ReadStreamList<io::SingleReadSeq>& streams) const override {
MapLib(listeners, mapper, streams);
}
void operator() (const std::vector<SequenceMapperListener*>& listeners, const SequenceMapper<Graph>& mapper, io::ReadStreamList<io::PairedReadSeq>& streams) const override {
MapLib(listeners, mapper, streams);
}

private:
template<class ReadType>
void MapLib(const std::vector<SequenceMapperListener*>& listeners, const SequenceMapper<Graph>& mapper, io::ReadStreamList<ReadType>& streams) const {
SequenceMapperNotifier notifier;
for (auto listener: listeners) {
notifier.Subscribe(listener);
}
notifier.ProcessLibrary(streams, mapper);
}
};

} // namespace debruijn_graph
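
Review note: `MapLibBase` declares one pure virtual `operator()` per stream type because virtual member functions cannot be templates, while `MapLibFunc` forwards each override to a single private template. Below is a reduced sketch of that dispatch shape, assuming hypothetical `Listener`, `StreamList`, `SingleRead`, `PairedRead`, and `LocalMapLib` types that only mimic the ones in the diff.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the read and stream types in the diff.
struct SingleRead {};
struct PairedRead {};
template<class Read>
struct StreamList { std::vector<Read> reads; };

// Hypothetical listener that only counts the reads it was shown.
struct Listener { std::size_t seen = 0; };

class MapLibBase {
public:
    virtual ~MapLibBase() = default;

    // One virtual overload per stream type: virtuals cannot be templates.
    virtual void operator()(const std::vector<Listener *> &listeners,
                            StreamList<SingleRead> &streams) const = 0;
    virtual void operator()(const std::vector<Listener *> &listeners,
                            StreamList<PairedRead> &streams) const = 0;

    // Convenience overload: wrap a single listener into a vector.
    template<class Streams>
    void operator()(Listener *listener, Streams &streams) const {
        (*this)(std::vector<Listener *>(1, listener), streams);
    }
};

class LocalMapLib : public MapLibBase {
public:
    // Declaring operator() here hides the base-class template overload,
    // so the single-listener form is only reachable through MapLibBase
    // (or after adding `using MapLibBase::operator();`).
    void operator()(const std::vector<Listener *> &listeners,
                    StreamList<SingleRead> &streams) const override {
        Map(listeners, streams);
    }
    void operator()(const std::vector<Listener *> &listeners,
                    StreamList<PairedRead> &streams) const override {
        Map(listeners, streams);
    }

private:
    // Shared implementation for all stream types.
    template<class Read>
    void Map(const std::vector<Listener *> &listeners,
             StreamList<Read> &streams) const {
        for (Listener *listener : listeners)
            listener->seen += streams.reads.size();
    }
};

// Callers depend only on the base interface, so an MPI-aware functor
// could be swapped in without touching the call site.
inline std::size_t CountReads(const MapLibBase &map_lib,
                              StreamList<PairedRead> &streams) {
    Listener listener;
    map_lib(&listener, streams);
    return listener.seen;
}
```

The hiding behaviour noted in the comment is standard C++ name lookup, so it may be worth checking how the single-listener overload is called on `MapLibFunc` objects in the PR itself.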