-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enable cf uses separete write buffer manager #343
Changes from all commits
5079707
e3ada2d
fadfef3
6ae46ff
cd5a294
af33283
ab43653
b25567e
033d847
199d704
c47e7d8
004be9b
30b7862
41775e2
0095aa8
964cf81
64fbf24
e373067
4ce4cef
e67df36
755c10f
79ac364
62cbf09
6f6d326
6b5aabd
d621893
e29df90
37d455c
9d24a9f
a1ca8a9
c38c417
9924742
8fbaf5c
85d43bf
21b9ae5
4f509cf
8ecc649
947c94b
ebb40a1
d79241f
5e404af
35caf14
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1625,6 +1625,22 @@ Status DBImpl::Open(const DBOptions& db_options, const std::string& dbname, | |
} | ||
|
||
DBImpl* impl = new DBImpl(db_options, dbname, seq_per_batch, batch_per_txn); | ||
for (auto cf : column_families) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The logic of this block of code is quite isolated. I think it's better to wrap a private function for it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I prefer not, adding a member function has both mental overhead and compilation overhead. Plus this block can't be exactly reused elsewhere. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the major benefit of a private function is better readability: people can skip the detail when they don't care the very detail of the block. Other benefits include better testability, and more modular code. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, it is always a trade-off to extract a sub-routine. Reading function has a significant context switch overhead, and for non-pure function that modifies states it is mostly needed to read the implementation of the function. In RocksDB codebase it is generally preferred to use inlined code rather than many small functions. Some believe it helps with code quality. |
||
if (cf.options.cf_write_buffer_manager != nullptr) { | ||
auto* write_buffer_manager = cf.options.cf_write_buffer_manager.get(); | ||
bool already_exist = false; | ||
for (auto m : impl->cf_based_write_buffer_manager_) { | ||
if (m == write_buffer_manager) { | ||
already_exist = true; | ||
break; | ||
} | ||
} | ||
if (!already_exist) { | ||
impl->cf_based_write_buffer_manager_.push_back(write_buffer_manager); | ||
} | ||
} | ||
} | ||
|
||
s = impl->env_->CreateDirIfMissing(impl->immutable_db_options_.GetWalDir()); | ||
if (s.ok()) { | ||
std::vector<std::string> paths; | ||
|
@@ -1896,20 +1912,34 @@ Status DBImpl::Open(const DBOptions& db_options, const std::string& dbname, | |
} | ||
if (s.ok()) { | ||
impl->StartPeriodicWorkScheduler(); | ||
|
||
// Newly created handles are already registered during | ||
// `CreateColumnFamily`. We must clear them all to avoid duplicate | ||
// registration. | ||
if (impl->write_buffer_manager_) { | ||
// Newly created handles are already registered during | ||
// `CreateColumnFamily`. We must clear them all to avoid duplicate | ||
// registration. | ||
impl->write_buffer_manager_->UnregisterDB(impl); | ||
for (auto* cf : *handles) { | ||
} | ||
for (auto m : impl->cf_based_write_buffer_manager_) { | ||
m->UnregisterDB(impl); | ||
} | ||
|
||
for (size_t i = 0; i < (*handles).size(); ++i) { | ||
auto cf_opt = column_families[i].options; | ||
|
||
auto* cf = (*handles)[i]; | ||
std::string cf_name = cf->GetName(); | ||
auto* write_buffer_manager = cf_opt.cf_write_buffer_manager != nullptr | ||
? cf_opt.cf_write_buffer_manager.get() | ||
: impl->write_buffer_manager_; | ||
if (write_buffer_manager) { | ||
if (cf->GetName() == kDefaultColumnFamilyName) { | ||
impl->write_buffer_manager_->RegisterColumnFamily( | ||
impl, impl->default_cf_handle_); | ||
write_buffer_manager->RegisterColumnFamily(impl, | ||
impl->default_cf_handle_); | ||
} else if (cf->GetName() == kPersistentStatsColumnFamilyName) { | ||
impl->write_buffer_manager_->RegisterColumnFamily( | ||
write_buffer_manager->RegisterColumnFamily( | ||
impl, impl->persist_stats_cf_handle_); | ||
} else { | ||
impl->write_buffer_manager_->RegisterColumnFamily(impl, cf); | ||
write_buffer_manager->RegisterColumnFamily(impl, cf); | ||
} | ||
} | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1304,6 +1304,11 @@ Status DBImpl::PreprocessWrite(const WriteOptions& write_options, | |
if (UNLIKELY(status.ok() && write_buffer_manager_->ShouldFlush())) { | ||
write_buffer_manager_->MaybeFlush(this); | ||
} | ||
for (auto write_buffer_manager : cf_based_write_buffer_manager_) { | ||
if (UNLIKELY(status.ok() && write_buffer_manager->ShouldFlush())) { | ||
write_buffer_manager->MaybeFlush(this); | ||
} | ||
} | ||
|
||
if (UNLIKELY(status.ok() && !trim_history_scheduler_.Empty())) { | ||
InstrumentedMutexLock l(&mutex_); | ||
|
@@ -2278,6 +2283,9 @@ size_t DBImpl::GetWalPreallocateBlockSize(uint64_t write_buffer_size) const { | |
if (immutable_db_options_.write_buffer_manager) { | ||
size_t buffer_size = | ||
immutable_db_options_.write_buffer_manager->flush_size(); | ||
for (auto manager : cf_based_write_buffer_manager_) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do we need to update buffer_size for WAL here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the flush size of write buffer manager is small compared with write buffer size, we will switch memtable before write bytes up to write buffer size. Use min here can avoid wasty pre allocation. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. but here you're using += instead of min<> of cf_based_write_buffer_manager_ There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is because all CF will write in the same WAL file. Imaging we have 1M write-buffer-limit for CF1 but 1GB write-buffer-limit for CF2. |
||
buffer_size += manager->flush_size(); | ||
} | ||
if (buffer_size > 0) { | ||
bsize = std::min<size_t>(bsize, buffer_size); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment the ordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done