Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(wal): optimize the start-up and shutdown of WAL #1625

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Lock;
import java.util.function.Function;
Expand Down Expand Up @@ -115,7 +116,11 @@ public class BlockWALService implements WriteAheadLog {
public static final int WAL_HEADER_CAPACITY = WALUtil.BLOCK_SIZE;
public static final int WAL_HEADER_TOTAL_CAPACITY = WAL_HEADER_CAPACITY * WAL_HEADER_COUNT;
private static final Logger LOGGER = LoggerFactory.getLogger(BlockWALService.class);
private final AtomicBoolean started = new AtomicBoolean(false);
private static final int WAL_STATE_INIT = 1;
private static final int WAL_STATE_STARTED = 2;
private static final int WAL_STATE_SHUTING_DOWN = 3;
private static final int WAL_STATE_SHUTDOWN = 4;
Copy link
Contributor

@Chillax-0v0 Chillax-0v0 Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use AtomicReference to maintain the state of WAL (as I mentioned above), then these int constants can also be replaced with a separate enum class.

private static final AtomicIntegerFieldUpdater<BlockWALService> WAL_STATE = AtomicIntegerFieldUpdater.newUpdater(BlockWALService.class, "state");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is no need to use AtomicIntegerFieldUpdater here; an AtomicReference<State> should suffice. Here are a few reasons:

  • In our scenario, there won't be many WAL instances simultaneously (in fact, in most cases, there is only one). So a global static updater is unnecessary.
  • Compared to AtomicIntegerFieldUpdater, AtomicReference can provide better code readability and maintainability.
  • I conducted a simple benchmark, and there is no significant performance difference between them (in fact, for the most frequently called get method in our use case, AtomicReference performs slightly better).

private final AtomicBoolean resetFinished = new AtomicBoolean(false);
private final AtomicLong writeHeaderRoundTimes = new AtomicLong(0);
private final ExecutorService walHeaderFlusher = Threads.newFixedThreadPool(1, ThreadUtils.createThreadFactory("flush-wal-header-thread-%d", true), LOGGER);
Expand All @@ -132,7 +137,7 @@ public class BlockWALService implements WriteAheadLog {
* It is always aligned to the {@link WALUtil#BLOCK_SIZE}.
*/
private long recoveryCompleteOffset = -1;

private volatile int state = WAL_STATE_INIT;
private BlockWALService() {
}

Expand Down Expand Up @@ -273,10 +278,34 @@ private void parseRecordBody(long recoverStartOffset, RecordHeader readRecordHea

@Override
public WriteAheadLog start() throws IOException {
if (started.get()) {
LOGGER.warn("block WAL service already started");
return this;
switch (WAL_STATE.get(this)) {
case WAL_STATE_INIT:
if (WAL_STATE.compareAndSet(this, WAL_STATE_INIT, WAL_STATE_STARTED)) {
boolean success = false;
try {
doStart();
success = true;
} finally {
if (!success) {
WAL_STATE.compareAndSet(this, WAL_STATE_STARTED, WAL_STATE_INIT);
LOGGER.warn("block WAL service started fail");
}
}
}
break;
case WAL_STATE_STARTED:
LOGGER.warn("block WAL service already started");
break;
Copy link
Contributor

@Chillax-0v0 Chillax-0v0 Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the "started" state is primarily used to check whether the metadata required for WAL operation (e.g., WAL Header) is ready (see callers of checkStarted), in order to avoid data corruption caused by premature writes.

Therefore, the state can only be set to "started" after the doStart method executes successfully.

Adding an intermediate "starting" state might resolve the issue.

case WAL_STATE_SHUTING_DOWN:
case WAL_STATE_SHUTDOWN:
throw new IllegalStateException("block WAL service already shutdown");
default:
throw new IllegalStateException("invalid WAL state");
}
return this;
}

public void doStart() throws IOException {
StopWatch stopWatch = StopWatch.createStarted();

walChannel.open(channel -> Optional.ofNullable(tryReadWALHeader(walChannel))
Expand All @@ -295,10 +324,7 @@ public WriteAheadLog start() throws IOException {

header.setShutdownType(ShutdownType.UNGRACEFULLY);
walHeaderReady(header);

started.set(true);
LOGGER.info("block WAL service started, cost: {} ms", stopWatch.getTime(TimeUnit.MILLISECONDS));
return this;
}

private void registerMetrics() {
Expand Down Expand Up @@ -370,12 +396,33 @@ private void walHeaderReady(BlockWALHeader header) {

@Override
public void shutdownGracefully() {
StopWatch stopWatch = StopWatch.createStarted();

if (!started.getAndSet(false)) {
LOGGER.warn("block WAL service already shutdown or not started yet");
return;
for (; ; ) {
int state = WAL_STATE.get(this);
if (state == WAL_STATE_SHUTDOWN) {
LOGGER.warn("block WAL service already shutdown");
return;
}
if (state == WAL_STATE_SHUTING_DOWN
|| WAL_STATE.compareAndSet(this, state, WAL_STATE_SHUTING_DOWN)) {
break;
}
}
int state = WAL_STATE.get(this);
if (WAL_STATE.compareAndSet(this, state, WAL_STATE_SHUTDOWN)) {
boolean success = false;
try {
doShutdown();
success = true;
} finally {
if (!success) {
WAL_STATE.compareAndSet(this, state, WAL_STATE_SHUTING_DOWN);
}
}
}
}

private void doShutdown() {
StopWatch stopWatch = StopWatch.createStarted();
walHeaderFlusher.shutdown();
try {
if (!walHeaderFlusher.awaitTermination(5, TimeUnit.SECONDS)) {
Expand All @@ -393,6 +440,7 @@ public void shutdownGracefully() {
walChannel.close();

LOGGER.info("block WAL service shutdown gracefully: {}, cost: {} ms", gracefulShutdown, stopWatch.getTime(TimeUnit.MILLISECONDS));

}

@Override
Expand Down Expand Up @@ -522,7 +570,7 @@ private CompletableFuture<Void> trim(long offset, boolean internal) {
}

private void checkStarted() {
if (!started.get()) {
if (WAL_STATE.get(this) != WAL_STATE_STARTED) {
throw new IllegalStateException("WriteAheadLog has not been started yet");
}
}
Expand Down
Loading