-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enable cf uses separete write buffer manager #343
Conversation
Signed-off-by: Spade A <[email protected]>
Signed-off-by: Spade A <[email protected]>
Signed-off-by: Spade A <[email protected]>
Signed-off-by: Spade A <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: Spade A <[email protected]>
Signed-off-by: Spade A <[email protected]>
Signed-off-by: Spade A <[email protected]>
Signed-off-by: Spade A <[email protected]>
Signed-off-by: Spade A <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
cffafd6
to
004be9b
Compare
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
/run-test |
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
/run-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems way too complex for such a simple functionality. Why would be the problem if we just put a manager in ColumnFamilyOptions
? You will need an additional spinlock to protect the vector holding all managers so that we could access them in PreprocessWrite
. But I think in total it should be done in less than 50LOC.
PTAL /cc @tabokie @Connor1996 |
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
057e182
to
21b9ae5
Compare
db/column_family.cc
Outdated
@@ -1589,7 +1590,7 @@ ColumnFamilySet::ColumnFamilySet(const std::string& dbname, | |||
const ImmutableDBOptions* db_options, | |||
const FileOptions& file_options, | |||
Cache* table_cache, | |||
WriteBufferManager* _write_buffer_manager, | |||
WriteBufferManager*, /*non-used parameter*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid changing the tests, we should make it fallback to old behavior if cf_options.wbm is null. In specific, ColumnFamilySet
should still accept and store a wbm, but ColumnFamilyData
ctor might take a different wbm depending on the specific cf_options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion and done.
db/db_impl/db_impl.cc
Outdated
} | ||
} | ||
if (!already_exists) { | ||
write_buffer_manager_.push_back(cf_options.write_buffer_manager); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need a spinlock to modify a vector.
db/db_impl/db_impl.h
Outdated
@@ -2282,7 +2281,7 @@ class DBImpl : public DB { | |||
|
|||
Directories directories_; | |||
|
|||
WriteBufferManager* write_buffer_manager_; | |||
std::vector<std::shared_ptr<WriteBufferManager>> write_buffer_manager_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use autovector<Manager*> and put DB level manager separately from CF level managers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion and done.
db/db_impl/db_impl_write.cc
Outdated
status = Status::Incomplete("Write stall"); | ||
} else { | ||
InstrumentedMutexLock l(&mutex_); | ||
WriteBufferManagerStallWrites(i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For simplicity, we don't need to allow stalling from cf level wbm. For default and write cf, we will still use DB level wbm and reuse the old code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion and done.
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
// in DBImpl. | ||
// | ||
// Default: null | ||
std::shared_ptr<WriteBufferManager> cf_write_buffer_manager = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::shared_ptr<WriteBufferManager> cf_write_buffer_manager = nullptr; | |
std::shared_ptr<WriteBufferManager> write_buffer_manager = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Option inherited from DBOptions and ColumnFamilyOptions, so I use different name for it to avoid trouble.
db/db_test_util.h
Outdated
const Options& options); | ||
void CreateColumnFamilies( | ||
const std::vector<std::string>& cfs, const Options& options, | ||
std::unordered_map<std::string, std::shared_ptr<WriteBufferManager>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the intended use case of this function. If you have different options for each cf, use ReopenWithColumnFamilies(const std::vector<std::string>& cfs, const std::vector<Options>& options);
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
db/db_impl/db_impl_write.cc
Outdated
@@ -2282,6 +2287,12 @@ size_t DBImpl::GetWalPreallocateBlockSize(uint64_t write_buffer_size) const { | |||
bsize = std::min<size_t>(bsize, buffer_size); | |||
} | |||
} | |||
for (auto manager : cf_based_write_buffer_manager_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be min(db_write_buffer_size, sum(cf_wbm_size) + db_wbm_size).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
db/db_impl/db_impl.h
Outdated
// Note: std::shared_ptr<WriteBufferManager> is store in ColumnfamilyOptions | ||
// which is destroyed before DBimpl, so we use | ||
// std::shared_ptr<WriteBufferManager> here in the vector. | ||
autovector<std::shared_ptr<WriteBufferManager>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment that this is only modified in Open, and lock is not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
db/db_impl/db_impl.h
Outdated
// which is destroyed before DBimpl, so we use | ||
// std::shared_ptr<WriteBufferManager> here in the vector. | ||
autovector<std::shared_ptr<WriteBufferManager>> | ||
cf_based_write_buffer_manager_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If CreateColumnFamily passed in an option not in the list, return error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
db/db_impl/db_impl.h
Outdated
// For simplicity, CF based write buffer manager does not support stall the | ||
// write. | ||
// Note: std::shared_ptr<WriteBufferManager> is store in ColumnfamilyOptions | ||
// which is destroyed before DBimpl, so we use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you had any actual error caused by not using shared_ptr
? All code using WriteBufferManager
should be destroyed before VersionSet
, there shouldn't be any problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, UnregisterDB is used after VersionSet is destroyed.
Signed-off-by: SpadeA-Tang <[email protected]>
Signed-off-by: SpadeA-Tang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LG
db/column_family.cc
Outdated
options.cf_write_buffer_manager != nullptr | ||
? options.cf_write_buffer_manager.get() | ||
: write_buffer_manager_, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this block out looks better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -685,6 +685,9 @@ Status DBImpl::CloseHelper() { | |||
delete txn_entry.second; | |||
} | |||
|
|||
for (auto m : cf_based_write_buffer_manager_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment the ordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
db/db_impl/db_impl.h
Outdated
@@ -419,8 +419,7 @@ class DBImpl : public DB { | |||
virtual Status GetSortedWalFiles(VectorLogPtr& files) override; | |||
virtual Status GetCurrentWalFile( | |||
std::unique_ptr<LogFile>* current_log_file) override; | |||
virtual Status GetCreationTimeOfOldestFile( | |||
uint64_t* creation_time) override; | |||
virtual Status GetCreationTimeOfOldestFile(uint64_t* creation_time) override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert this line, need to commit without running clang-format locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: SpadeA-Tang <[email protected]>
@@ -1625,6 +1625,22 @@ Status DBImpl::Open(const DBOptions& db_options, const std::string& dbname, | |||
} | |||
|
|||
DBImpl* impl = new DBImpl(db_options, dbname, seq_per_batch, batch_per_txn); | |||
for (auto cf : column_families) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic of this block of code is quite isolated. I think it's better to wrap a private function for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer not, adding a member function has both mental overhead and compilation overhead. Plus this block can't be exactly reused elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the major benefit of a private function is better readability: people can skip the detail when they don't care the very detail of the block. Other benefits include better testability, and more modular code.
Of course adding a comment for the block serves the same purpose in terms of readability, but old-school programmers typically prefer a function and even chat-gpt also recommends a wrap function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is always a trade-off to extract a sub-routine. Reading function has a significant context switch overhead, and for non-pure function that modifies states it is mostly needed to read the implementation of the function. In RocksDB codebase it is generally preferred to use inlined code rather than many small functions. Some believe it helps with code quality.
@@ -2278,6 +2283,9 @@ size_t DBImpl::GetWalPreallocateBlockSize(uint64_t write_buffer_size) const { | |||
if (immutable_db_options_.write_buffer_manager) { | |||
size_t buffer_size = | |||
immutable_db_options_.write_buffer_manager->flush_size(); | |||
for (auto manager : cf_based_write_buffer_manager_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to update buffer_size for WAL here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the flush size of write buffer manager is small compared with write buffer size, we will switch memtable before write bytes up to write buffer size. Use min here can avoid wasty pre allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but here you're using += instead of min<> of cf_based_write_buffer_manager_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because all CF will write in the same WAL file. Imaging we have 1M write-buffer-limit for CF1 but 1GB write-buffer-limit for CF2.
/merge |
Signed-off-by: Spade A <[email protected]>
Signed-off-by: Spade A <[email protected]> Signed-off-by: Yang Zhang <[email protected]>
Signed-off-by: Spade A <[email protected]> Signed-off-by: Yang Zhang <[email protected]>
This PR enables different write buffer managers to manager different cfs.