Emit failover events into the job autoscaler events to allow autoscaling decisions #676

Merged
hmitnflx merged 10 commits into master from failover
Jul 10, 2024

Conversation

hmitnflx
Collaborator

@hmitnflx hmitnflx commented Jun 8, 2024

Context

Explain context and other details for this pull request.

Checklist

  • ./gradlew build compiles code correctly
  • Added new tests where applicable
  • ./gradlew test passes all tests
  • Extended README or added javadocs where applicable

github-actions bot commented Jun 8, 2024

Test Results

534 tests  ±0   528 ✅ ±0   7m 55s ⏱️ -2s
139 suites ±0     6 💤 ±0 
139 files   ±0     0 ❌ ±0 

Results for commit 6ad867a. ± Comparison against base commit 985c5e9.

♻️ This comment has been updated with latest results.

github-actions bot commented Jun 8, 2024

Uploaded Artifacts

To use these artifacts in your Gradle project, paste the following lines in your build.gradle.

resolutionStrategy {
    force "io.mantisrx:mantis-client:0.1.0-20240710.230651-548"
    force "io.mantisrx:mantis-common:0.1.0-20240710.230651-547"
    force "io.mantisrx:mantis-common-serde:0.1.0-20240710.230651-547"
    force "io.mantisrx:mantis-discovery-proto:0.1.0-20240710.230651-547"
    force "io.mantisrx:mantis-network:0.1.0-20240710.230651-547"
    force "io.mantisrx:mantis-runtime:0.1.0-20240710.230651-548"
    force "io.mantisrx:mantis-remote-observable:0.1.0-20240710.230651-548"
    force "io.mantisrx:mantis-runtime-loader:0.1.0-20240710.230651-548"
    force "io.mantisrx:mantis-rxcontrol:0.1.0-20240710.230651-21"
    force "io.mantisrx:mantis-runtime-executor:0.1.0-20240710.230651-83"
    force "io.mantisrx:mantis-shaded:0.1.0-20240710.230651-546"
    force "io.mantisrx:mantis-testcontainers:0.1.0-20240710.230651-217"
    force "io.mantisrx:mantis-connector-iceberg:0.1.0-20240710.230651-546"
    force "io.mantisrx:mantis-connector-kafka:0.1.0-20240710.230651-548"
    force "io.mantisrx:mantis-connector-publish:0.1.0-20240710.230651-547"
    force "io.mantisrx:mantis-control-plane-client:0.1.0-20240710.230651-547"
    force "io.mantisrx:mantis-connector-job:0.1.0-20240710.230651-548"
    force "io.mantisrx:mantis-control-plane-dynamodb:0.1.0-20240710.230651-8"
    force "io.mantisrx:mantis-control-plane-core:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-examples-core:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-control-plane-server:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-examples-jobconnector-sample:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-examples-mantis-publish-sample:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-examples-synthetic-sourcejob:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-examples-twitter-sample:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-examples-groupby-sample:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-examples-wordcount:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-publish-core:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-publish-netty:0.1.0-20240710.230651-540"
    force "io.mantisrx:mantis-publish-netty-guice:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-server-agent:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-server-worker-client:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-source-job-kafka:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-source-job-publish:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-examples-sine-function:0.1.0-20240710.230651-541"
    force "io.mantisrx:mantis-control-plane-store-dynamodb:0.1.0-20240710.230651-35"
}

@hmitnflx hmitnflx had a problem deploying to Integrate Pull Request June 11, 2024 11:34 — with GitHub Actions Failure
@hmitnflx hmitnflx changed the base branch from master to scaling June 11, 2024 23:43
@hmitnflx hmitnflx changed the title from "[Not Ready For Review] Failover" to "Emit failover events into the job autoscaler events to allow autoscaling decisions" Jun 11, 2024
@@ -341,7 +341,12 @@ public int getDesiredWorkersForScaleUp(final int increment, final int numCurrent
return numCurrentWorkers;
} else {
final int maxWorkersForStage = stageSchedulingInfo.getScalingPolicy().getMax();
desiredWorkers = Math.min(numCurrentWorkers + increment, maxWorkersForStage);
if (reason == ScalingReason.AutoscalerManager) {
Collaborator Author

todo: this condition is no longer correct.

@@ -86,7 +86,7 @@ public enum ScalingReason {
RPS,
JVMMemory,
SourceJobDrop,
-FailoverAware
+AutoscalerManager
Collaborator

AutoscalerManager vs RPS/JVMMemory seems a bit strange?

Collaborator

maybe "AutoscalerManagerEvent"

@@ -39,6 +39,10 @@ default boolean isScaleDownEnabled() {
return true;
}

default double getCurrentValue() {
Collaborator

Can you add documentation here explaining why it's -1?
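
One way the requested documentation could look, as a minimal sketch rather than the code that was merged: it assumes the default `getCurrentValue()` returns -1 as a "no value available" sentinel, which is only implied by the comment above. The names `AutoscalerManagerSketch`, `NO_VALUE`, `FixedValueManager`, and `SentinelDemo` are hypothetical and do not come from the PR.

```java
// Hypothetical sketch: only getCurrentValue() and isScaleDownEnabled()
// appear in the diff; every other name here is illustrative.
interface AutoscalerManagerSketch {
    /** Assumed sentinel meaning "this manager has no value to report". */
    double NO_VALUE = -1.0;

    default boolean isScaleDownEnabled() {
        return true;
    }

    /**
     * Returns the manager's current value, or {@link #NO_VALUE} (-1) when the
     * implementation does not supply one. A negative sentinel is used here on
     * the assumption that real values are never negative, so callers can tell
     * "no value" apart from a legitimate reading of 0.
     */
    default double getCurrentValue() {
        return NO_VALUE;
    }
}

/** Illustrative implementation that reports a fixed value. */
final class FixedValueManager implements AutoscalerManagerSketch {
    private final double value;

    FixedValueManager(double value) {
        this.value = value;
    }

    @Override
    public double getCurrentValue() {
        return value;
    }
}

final class SentinelDemo {
    /** True when the manager reports something other than the sentinel. */
    static boolean hasValue(AutoscalerManagerSketch manager) {
        return manager.getCurrentValue() >= 0.0;
    }

    public static void main(String[] args) {
        AutoscalerManagerSketch noop = new AutoscalerManagerSketch() {};
        System.out.println(hasValue(noop));
        System.out.println(hasValue(new FixedValueManager(42.0)));
    }
}
```

With this shape, implementations that do not care about the value inherit the sentinel and callers skip them with a single non-negative check.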

Base automatically changed from scaling to master July 4, 2024 18:11
@hmitnflx hmitnflx temporarily deployed to Integrate Pull Request July 9, 2024 06:57 — with GitHub Actions Inactive
@@ -416,6 +417,19 @@ public void onNext(MetricData metricData) {
};
}

private void maybeEmitAutoscalerManagerEvent(int numWorkers) {
final double currentValue = jobAutoscalerManager.getCurrentValue();
if (currentValue >= 0.0 && currentValue <= 100.0) {
Collaborator

Can you add an inline doc about this logic? Why is it 0 to 100?
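
A sketch of what such an inline doc might say, under stated assumptions: the PR does not explain the 0 to 100 range, so the reading below (the value is percentage-like and anything outside the range means "nothing to emit") is a guess. `RangeGuardSketch`, `maybeEmit`, and the recorded event strings are hypothetical stand-ins for the real event emission.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the guard in maybeEmitAutoscalerManagerEvent;
// it records strings instead of emitting real autoscaler events.
final class RangeGuardSketch {
    private final List<String> emitted = new ArrayList<>();

    void maybeEmit(double currentValue, int numWorkers) {
        // Assumption: a meaningful value is percentage-like and lies in [0, 100].
        // Anything outside that range is treated as "no value" and ignored.
        // That covers a negative sentinel such as -1, and also NaN, because
        // every comparison against NaN evaluates to false.
        if (currentValue >= 0.0 && currentValue <= 100.0) {
            emitted.add("value=" + currentValue + " workers=" + numWorkers);
        }
    }

    List<String> emitted() {
        return emitted;
    }
}
```

Both bounds are inclusive in this sketch, so 0.0 and 100.0 still produce an event.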

@hmitnflx hmitnflx temporarily deployed to Integrate Pull Request July 10, 2024 23:01 — with GitHub Actions Inactive
@hmitnflx hmitnflx temporarily deployed to Integrate Pull Request July 10, 2024 23:06 — with GitHub Actions Inactive
@hmitnflx hmitnflx merged commit c1c7f34 into master Jul 10, 2024
7 checks passed
@hmitnflx hmitnflx deleted the failover branch July 10, 2024 23:19
2 participants