Skip to content

Commit

Permalink
CkIO Reading Capabilities (#3788)
Browse files Browse the repository at this point in the history
* Added the read_stripe field to the Options struct in ckio.h

* Added the read_stripe parameter in Options to the be puped

* Added startReadSession function to the ckio.h file. Will be called analagously to the startSession function that is used for writes.

* Added the message type ReadReadyMsg, as well as the read function that the user will interact with. read() was declared in the ckio.h file, while I defined the ReadReadyMsg class in the .h file as well.

* Added the prepareReadSession function to the director, and laid out some scaffolding for the ReadSession chare array. These changes are in ckio.ciI

* Wrote the prepareReadSessionHelper function, which sets up all of the reader chares.

* Wrote the ReadSession constructor in ckio.C

* implemented the ReadSession constructor that initializes the offset and bytes, as well as the ReadSession::readData function which gets the data from memory into chare array to serve to the user

* Got the reads to work in parallel. Must fix the memory leaks and implement a method of ending a read session.

* Removed the read_id field of the director as it wasn't being used.

* Included a (delete this) call at the of ReadAssembler::shareData so that the chare will clean itself up upon invoking the user provided callback after receiving all the necessary data for the read.

* New test suite written in io_read under tests. Giving a linker error with �[200~https://github.com/elamdf/

* add missing def include

* Added the new test suite and the test file readtest.txt

* Added a clearBuffer() method to the chare array ReadSession in order to clean up that buffer memory. This is a stand-in for an actual freeing of the chare array, as broadcasting ckDestroy seems to break the code.

* Commented out the printing code that tracks the where the CkIO library is in the line of execution.

* io_read: Use relative paths in Makefile

* Added a function that gets the data at offset and num bytes sequentially. Then in testRead will cal that seq read function and verifies that parallel read and sequential read get the same data.

* Made the gathering of the information point to point and not a broadcast in order to reduce amount of messages sent.

* Added the ckDestroy broadcast once closeReadSession is invoked.

* Added some comments to give better insight as to what the test for parallel reads is doing.

* Tests: Readd deleted io test

* Fixed a comment relating to creating the read assembler

Changed comment reading "create read assemble" to "create read assembler" on line 156

Co-authored-by: Ronak Buch <[email protected]>

* Switched buffer=data.data() to _data_buffer.data() because _data_buffer is what the full read request, while data simply contains the data from a specific IO chare.

* Simply memcpy into the vector::data directly instead of allocating a whole new buffer and then copying into the vector

* Removed the clearBuffer broadcast because ckDestroy is being broadcasted instead, which will already free all of the memory

* Removed the commented ckDestroy broadcast, as it is already being called.

* Removed the empty headers directory and the 'touch headers' from Makefile. Also restored the LUSTRE command in the Makefile

* Removed ReadAssembler constructor comments:

* Removed debugging comments from ReadAssembler::shareData
;

* Removed debugging comments from ReadSession::sendData

* Removed comments from ReadSession constructor

* Removed comment that wasn't helpful from impl::Director::prepareReadSessionHelper

* removed debugging statements from impl::Director::::read(), which is called by Ck::IO::read

* Removed the comments from ReadSession::readData that were superfluous:

* Cleaned up white space in the ckio.ci file

* Removed the -verbose from the lb_test Makefile

* Removed the periodic_lb_broadcast_test binary ping

* Removed the period_selection binary from the meta_lb_test

* Removed the extra lines from comment in the ckio.h file

* Changed the shareData method from taking vector<char> to char[] and size_t num_bytes for performance boost.

* Removed the ckouts from debugging

* Changed the std::cerr to CkAbort if opening the file fails in ReadSession::readData

* Removed the unistd.h include that wasn't wrapped in the windows checker

* Removed the comment touching headeres in the Makfile on line 21

* Instead of push_back with vector in sendData, set the size of vector directly and then insert

* Changed the loop to copy data to a memcpy call. Resulted in huge speedup from 3.1sec to 1.7secs when reading 4GB file using benchmark program.

* supported tagging reads in class ReadCompleteMsg, Ck::IO::read, ReadAssembler

* Changed Ck::IO::read functions to be handled by the manager on each PE rather than the Director chare that was on PE 0.

* Edits to the reading tests

* Changed ifstream to fread

* Added fseek when converting the ifstream to the FILE pointers. Also added a dd command in the tests/charm++/io_read/Makefile to automatically create the file large_test.txt when make is ran

* Avoided the extra memcpy in the ReadAssembler::shareData function. Also inserted some logging statements in order to determine when the disk reads are finished.

* Start of implementing the zerocopy API by assigning 0-copy post and assigning tags to the buffers.

* Fixed bug with the start_idx in Manager::read. By doing (offset - session -> offset) instead of offset, it will find correct ReadSession chare because new value is wrt the actual readsession instead of just the file.

* Introduced the CkPostBuffer by deferring a lot of the heavy lifting of read-serving to read assembler. Added tracking of buffer tags on the manager group with getTag and _curr_tag. Adjusted the sendData method to take the buffer tag so that it can correctly call shareData and do the CkMatch. Still have to adjust the shareData method to tag the buffer tag.

* EXPERIMENTAL: trying to find why ckarray:1266 is segfaulting when trying to access pointer locMgr.

* Implemented a reduction on managers after adding the session and readassembler key-value pair to the map. Used SDAG to correctly invoke the callback after the read session has been setup correctly in startReadSession call.

* when the read is done, remove the ReadInfo struct from the map on the ReadAssembler

* Potential bug: was using CkIO library read-tags per ReadAssembler rather than from the manager. Potentially meant if there were multiple sessions open, the Zero Copy would have multiple tags of the same value because 2 separate ReadAssemblers would've assigned the same tag on the PE. By using the manager, even if they're part of different sessions, every read within the CkIO library will have different tags.

* Changed the API of reads; instead of specifying the number of bytes each IO chare will own, instead specify the number of IO chares that you would like. CkIO will calculate the read stripe from there. Any extra bytes that don't divide evenly will be owned by the last IO chare.:

* Changed the default behavior if Options::num_readers is 0

* Moving debugging statements to comments or the DEBUG macro

* Laid out the BufferMap class that will be used to layout how the IO chares will be laid out

* Added a startReadSession that allows a vector of PEs to be passed in for mapping of the IO Buffer chares.

* Added asynchronous reads by spawning a thread to do the fread instead of doing it in the main thread. Use shared future to manage race conditions.

* BUG FIX: If the IO buffers didn't evenly divide the session size and the last buffer stored more data, reads that required that extra bit of data wouldn't work properly because the posting of buffers for the Zero Copy wouldn't include that because they'd only do (read_stripe), when there's extra data on the end. Added a conditional testing if the current buffer being explored is the last one, and then simply set the length of ZC posting to be the read_offset + read_size - (read_stripe * i + session.offset).

* Removed random string creation in the sendData function. Also made sendData a threaded entry method

* Added a polling of the shared_future to see if the data is ready. If the I/O isn't done, suspend the Charm++ UTL and do other work while the I/O, letting the Charm RTS schedule the thread. Might want to add some sort of limit on number of I/O threads allowed per PE

* Fixed edge case in the reads where you read into extra bytes in your read request

* Renamed ReadSession to BufferChares

* Removed random print statements + code that wasn't being used anymore

* Comments for future developers, as well as some ideas for improvements/future work

* Wrote a basic unit test with 3 Buffer Chares, 10 readers, and a 100 byte test file. Ensure that some reads cross buffer chare boundaries

* Added the test recipe to the Makefile - hopefully that will fix all of the failed PRs

* Added CkIO to the conditional that links -pthread iff the platform needs it

* removed the binary that was accidentally check in; hopefully that makes smp tests pass

* Added  flag to the end of the tests for io_Read

* Fix potential mistake in the  for io_read tests

* remove commands not involved in building tests from the iotest rule

* Improve the formatting of CkIO Reading tests, as well as get rid of the rrors

Co-authored-by: Evan Ramos <[email protected]>

* Removed both deprecated and unused functions.

* Removed commented out file read

* Switched the setting of num_readers to numReaders after changing the variable name

* Changed the default value for numReaders if not set

* Switched to opening the file in binary mode so that windows doesn't mangle the newline character

* Trying to combat slow disk times

* got rid of MPI version

* fixed the friend read functino to have the new signature

* memory leak fixed with buffer chare destructor

* debug print statements and things

* debug print statements

* adding debug things but crashing on round 2

* user events debugging, yield macro

* adding io updates with yield and call after implementations

* got rid of almost all the debu statements. Now we must see if we can clean up serveRead and also if we can add an example to the examples folder.

* Added a working test in the tests/ folder and also cleaned up the ckio source code'

* renamed read_stripe to read_stride to reflect it measure how much data buffer chares own

* fixing bug in test

* cleanup test code and add smp

* spacing fix

* adding sdag threading

* switch from open64 (largefile) to plain open

* removing unistd.h

* changing prints to aborts

* updating open permissions to support windows

* adding windows read

* adding O_BINARY to windows call

* adding touch headers to makefile

* merged the conflicts - finally mostly eng ready

---------

Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Zane Fink <[email protected]>
Co-authored-by: Ronak Buch <[email protected]>
Co-authored-by: Mathew Jacob <[email protected]>
Co-authored-by: Mathew Jacob <[email protected]>
Co-authored-by: Mathew Jacob <[email protected]>
Co-authored-by: Mathew Jacob <[email protected]>
Co-authored-by: Mathew Jacob <[email protected]>
Co-authored-by: Evan Ramos <[email protected]>
Co-authored-by: Mathew Jacob <[email protected]>
Co-authored-by: mtaylo12 <[email protected]>
Co-authored-by: Maya Taylor <[email protected]>
Co-authored-by: Maya Taylor <[email protected]>
Co-authored-by: Mathew Jacob <[email protected]>
Co-authored-by: Maya Taylor <[email protected]>
Co-authored-by: mtaylo12 <[email protected]>
Co-authored-by: Eric Bohm <[email protected]>
  • Loading branch information
18 people authored Mar 26, 2024
1 parent 4f3eed1 commit 4ad4095
Show file tree
Hide file tree
Showing 9 changed files with 1,396 additions and 411 deletions.
1,229 changes: 894 additions & 335 deletions src/libs/ck-libs/io/ckio.C

Large diffs are not rendered by default.

241 changes: 171 additions & 70 deletions src/libs/ck-libs/io/ckio.ci
Original file line number Diff line number Diff line change
@@ -1,76 +1,151 @@
module CkIO {

namespace Ck { namespace IO {
message FileReadyMsg;
message SessionReadyMsg;
message SessionCommitMsg;
}
}
module CkIO
{
namespace Ck
{
namespace IO
{
message FileReadyMsg;
message SessionReadyMsg;
message SessionCommitMsg;
message ReadCompleteMsg;
// used by the read() CkCallback
} // namespace IO
} // namespace Ck

initnode _registerCkIO_impl();
};

module CkIO_impl {
module CkIO_impl
{
include "ckio.h";

namespace Ck { namespace IO {
namespace impl {
readonly CProxy_Director director;
namespace Ck
{
namespace IO
{
namespace impl
{
readonly CProxy_Director director;

mainchare [migratable] Director
{
entry Director(CkArgMsg *);

/// Serialize setting up each file through this chare, so that all PEs
/// have the same sequence
entry void openFile(std::string name, CkCallback opened, Options opts);
entry [reductiontarget] void fileOpened(FileToken file);
entry [reductiontarget] void sessionComplete(FileToken file);

entry void prepareWriteSession(FileToken file, size_t bytes, size_t offset,
CkCallback ready, CkCallback complete) {
serial {
prepareWriteSession_helper(file, bytes, offset, ready, complete);
}
when sessionReady[files[file].sessionID](CkReductionMsg *m) serial {
delete m;
ready.send(new SessionReadyMsg(Session(file, bytes, offset,
files[file].session)));
}
};
entry void prepareWriteSession(FileToken file, size_t bytes, size_t offset,
CkCallback ready, const char commitData[commitBytes],
size_t commitBytes, size_t commitOffset,
CkCallback complete) {
serial {
CkCallback committed(CkIndex_Director::sessionDone(NULL), thisProxy);
committed.setRefnum(++sessionID);
prepareWriteSession_helper(file, bytes, offset, ready, committed);
}
when sessionReady[files[file].sessionID](CkReductionMsg *m) serial {
delete m;
ready.send(new SessionReadyMsg(Session(file, bytes, offset,
files[file].session)));
}
when sessionDone[files[file].sessionID](CkReductionMsg *m) serial {
delete m;
impl::FileInfo* info = CkpvAccess(manager)->get(file);
CmiInt8 ret = CmiPwrite(info->fd, commitData, commitBytes, commitOffset);
if (ret != commitBytes)
fatalError("Commit write failed", info->name);
complete.send(CkReductionMsg::buildNew(0, NULL, CkReduction::nop));
}
};
entry void sessionReady(CkReductionMsg *);
entry void sessionDone(CkReductionMsg *);
entry void close(FileToken token, CkCallback closed);
}
mainchare[migratable] Director
{
entry Director(CkArgMsg*);

group [migratable] Manager
{
entry Manager();
entry void run() {
while (true) {
/// Serialize setting up each file through this chare, so that all PEs
/// have the same sequence

entry void openFile(std::string name, CkCallback opened, Options opts);
entry[reductiontarget] void fileOpened(FileToken file);
entry[reductiontarget] void sessionComplete(FileToken file);

// the method used by the director which is used to close a readsession
entry void closeReadSession(Session, CkCallback);

entry void prepareReadSession(FileToken file, size_t bytes, size_t offset,
CkCallback ready){
// the director sets up a read session
serial{prepareReadSessionHelper(file, bytes, offset, ready, std::vector<int>());
}
when sessionReady[files[file].sessionID](CkReductionMsg* m) serial
{ // after the ReadSession chare array is set up, invoke the ready callback
delete m;
Session s(file, bytes, offset, files[file].read_session);
CProxy_ReadAssembler ra = CProxy_ReadAssembler::ckNew(s);
managers.addSessionReadAssemblerMapping(s, ra, ready);
}

when addSessionReadAssemblerFinished(CkReductionMsg* msg) serial
{
ready.send(
new SessionReadyMsg(Session(file, bytes, offset, files[file].read_session)));
delete msg;
}

}; // namespace impl

entry void prepareReadSession(FileToken file, size_t bytes, size_t offset,
CkCallback ready, std::vector<int> pes_to_map){
serial{prepareReadSessionHelper(file, bytes, offset, ready, pes_to_map);
} // namespace IO

when sessionReady[files[file].sessionID](CkReductionMsg* m) serial
{ // after the ReadSession chare array is set up, invoke the ready callback
delete m;
Session s(file, bytes, offset, files[file].read_session);
CProxy_ReadAssembler ra = CProxy_ReadAssembler::ckNew(s);
managers.addSessionReadAssemblerMapping(s, ra, ready);
}

when addSessionReadAssemblerFinished(CkReductionMsg* msg) serial
{
ready.send(
new SessionReadyMsg(Session(file, bytes, offset, files[file].read_session)));
delete msg;
}

}; // namespace Ck

entry void prepareWriteSession(FileToken file, size_t bytes, size_t offset,
CkCallback ready, CkCallback complete){
serial{prepareWriteSession_helper(file, bytes, offset, ready, complete);
}
when sessionReady[files[file].sessionID](CkReductionMsg* m) serial
{
delete m;
ready.send(new SessionReadyMsg(Session(file, bytes, offset, files[file].session)));
}
}
;
entry void prepareWriteSession(FileToken file, size_t bytes, size_t offset,
CkCallback ready, const char commitData[commitBytes],
size_t commitBytes, size_t commitOffset,
CkCallback complete){
serial{CkCallback committed(CkIndex_Director::sessionDone(NULL), thisProxy);
committed.setRefnum(++sessionID);
prepareWriteSession_helper(file, bytes, offset, ready, committed);
}
when sessionReady[files[file].sessionID](CkReductionMsg* m) serial
{
delete m;
ready.send(new SessionReadyMsg(Session(file, bytes, offset, files[file].session)));
}
when sessionDone[files[file].sessionID](CkReductionMsg* m) serial
{
delete m;
impl::FileInfo* info = CkpvAccess(manager)->get(file);
CmiInt8 ret = CmiPwrite(info->fd, commitData, commitBytes, commitOffset);
if (ret != commitBytes)
fatalError("Commit write failed", info->name);
complete.send(CkReductionMsg::buildNew(0, NULL, CkReduction::nop));
}
}
;

entry void sessionReady(CkReductionMsg*);
entry void sessionDone(CkReductionMsg*);
entry void close(FileToken token, CkCallback closed);
entry void addSessionReadAssemblerFinished(CkReductionMsg* msg);
}

// class tht will be used to assemble a specific read call
group ReadAssembler
{
// stores the parameters of the read call it is tasked with building
entry ReadAssembler(Session session);

// the method by which ReadSession objects can send their data for the read
// when invoked, avoid a sender-side copy
entry void shareData(int read_tag, int buffer_tag, size_t read_chare_offset,
size_t num_bytes, nocopypost char data[num_bytes]);
};

group[migratable] Manager
{
entry Manager();
entry void run()
{
while (true)
{
case {
when openFile[opnum](unsigned int opnum_,
FileToken token, std::string name, Options opts)
Expand All @@ -84,6 +159,8 @@ module CkIO_impl {
entry void openFile(unsigned int opnum,
FileToken token, std::string name, Options opts);
entry void close(unsigned int opnum, FileToken token, CkCallback closed);

entry void addSessionReadAssemblerMapping(Session session, CProxy_ReadAssembler ra, CkCallback ready);
};

array [1D] WriteSession
Expand All @@ -92,12 +169,36 @@ module CkIO_impl {
entry void forwardData(const char data[bytes], size_t bytes, size_t offset);
entry void syncData();
};


group Map : CkArrayMap
{
entry Map();
};
}
array [1D] BufferChares{
entry BufferChares(FileToken file, size_t offset, size_t bytes, size_t num_readers);
// way for the BufferChares object to send bytes over to the ReadAssembler object ra when serving a specific read

entry void sendData(int read_tag, int buffer_tag, size_t offset, size_t bytes, CProxy_ReadAssembler ra, int pe);

entry void sendDataHandler(int read_tag, int buffer_tag, size_t offset, size_t bytes, CProxy_ReadAssembler ra, int pe);
entry [threaded] void monitorRead();

entry void bufferReady() {
while (true) {
when sendData(int read_tag, int buffer_tag, size_t offset, size_t bytes, CProxy_ReadAssembler ra, int pe) {
serial{thisProxy[thisIndex].sendDataHandler(read_tag, buffer_tag, offset, bytes, ra, pe);}
}
}
}

entry [reductiontarget] void printTime(double time_taken);
};

group Map :
CkArrayMap { entry Map(); };
}
group BufferNodeMap : CkArrayMap
{
entry BufferNodeMap();
entry BufferNodeMap(std::vector<int> processors);
};
}
}
}
Loading

0 comments on commit 4ad4095

Please sign in to comment.