Better visibility into test failures over time #11217

andrross · 2023-11-15T17:09:26Z

I've created a script that crawls the OpenSearch Jenkins builds to find test failures, but only for the Gradle checks that run on code after it is pushed to the main branch. This filters out failures that are due to unmerged code in work-in-progress PRs.

I've included below the output after crawling 2000 recent builds (approx. Oct 16 - Nov 14). This data is very hard to follow, but one thing in particular stands out: SearchQueryIT.testCommonTermsQuery is a frequently failing test, but only since build 29184 (Oct 28). There are no failures before that, which strongly suggests something was changed around Oct 28 that introduced the flakiness. ~~I haven't started to look but I suspect we'll be able to find the cause pretty quickly given that there is a point in time to start looking at.~~ Update Nov 16: the root cause was an unrelated change for concurrent search randomly increased the number of deleted documents and exposed some underlying brittleness in this test: #11233 Diagnosing the root cause was a bit tricky and required diving into the specifics of how the common terms query works, but it was indeed much simpler once the flakiness was correlated to a small date range and then a specific commit.

Surely there are better tools for visualizing test reports over time, perhaps already built into Jenkins? Also, we don't push that many commits so the sample size on builds after pushes to main isn't that large. Something like a nightly job to run the test suite 10 or 50 or 100 times and create a report on failures would help to quickly surface newly introduced flakiness.

$ ruby ~/flaky-test-finder-push-trigger-main.rb -s 27990 -e 29990

24 org.opensearch.indices.replication.SegmentReplicationIT.testSendCorruptBytesToReplica (28239,28239,28239,28239,28645,28645,28645,28645,28702,28702,28702,28702,28875,28875,28875,28875,28894,28894,28894,28894,28897,28897,28897,28897)
17 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats with response filtering} (28276,28276,28276,28276,28278,28278,28278,28278,28765,28962,28962,28962,28962,28989,28989,28989,28989)
16 org.opensearch.repositories.s3.S3BlobStoreRepositoryTests.testRequestStats (28259,28259,28259,28259,28276,28276,28276,28276,28316,28316,28316,28316,28368,28368,28368,28368)
12 org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.testRequestBreaker {p0={"search.concurrent_segment_search.enabled":"true"}} (28051,28184,28251,28481,28502,28576,28727,28765,28766,28797,28841,28894)
9 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock (28051,28576,28702,28713,28875,28897,29428,29666,29846)
9 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.nodes/10_basic/Test cat nodes output} (28276,28276,28276,28276,28278,28278,28278,28278,28765)
9 org.opensearch.index.shard.RemoteIndexShardTests.classMethod (28716,28716,28897,28897,28966,28966,29666,29666,29666)
8 org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.testRequestBreaker {p0={"search.concurrent_segment_search.enabled":"false"}} (28051,28481,28576,28765,28766,28797,28841,28894)
7 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.nodes/10_basic/Additional disk information} (28276,28276,28276,28276,28278,28278,28765)
7 org.opensearch.search.query.SearchQueryIT.testCommonTermsQuery {p0={"search.concurrent_segment_search.enabled":"true"}} (29184,29324,29343,29378,29506,29846,29954)
7 org.opensearch.search.query.SearchQueryIT.testCommonTermsQuery {p0={"search.concurrent_segment_search.enabled":"false"}} (29184,29324,29343,29378,29506,29846,29954)
6 org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.classMethod (28797,28797,28797,28841,28841,28841)
6 org.opensearch.cluster.service.MasterServiceTests.testClusterStateBatchedUpdates (28899,28905,28966,28989,28994,29003)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - _all} (28765,28989,28989,28989,28989)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats filtered using both includes and excludes filters} (28278,28278,28278,28278,28989)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/30_discovery/Discovery stats} (28765,28962,28966,28989,28989)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.allocation/10_basic/Node ID} (28276,28276,28276,28276,28278)
4 org.opensearch.cluster.MinimumClusterManagerNodesIT.classMethod (28897,28897,28897,28897)
4 org.opensearch.action.admin.cluster.node.tasks.ResourceAwareTasksTests.testTaskResourceTrackingDuringTaskCancellation (28765,28766,29432,29508)
3 org.opensearch.index.shard.RemoteIndexShardTests.testSegRepSucceedsOnPreviousCopiedFiles (28716,28897,28966)
3 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreDisruptionIT.testCancelReplicationWhileFetchingMetadata (29070,29132,29274)
3 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreDisruptionIT.classMethod (29070,29132,29378)
3 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - blank} (28278,28765,28962)
3 org.opensearch.remotestore.RemoteIndexRecoveryIT.testSnapshotRecovery (28481,29432,29655)
3 org.opensearch.search.SearchWeightedRoutingIT.testMultiGetWithNetworkDisruption_FailOpenEnabled (28502,29561,29666)
3 org.opensearch.indices.replication.SegmentReplicationSuiteIT.testFullRestartDuringReplication (28671,28716,29561)
3 org.opensearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT.test {yaml=pit/10_basic/Delete all} (28702,28875,29132)
3 org.opensearch.search.aggregations.bucket.DiversifiedSamplerIT.testNestedDiversity {p0={"search.concurrent_segment_search.enabled":"true"}} (28706,28727,29343)
3 org.opensearch.search.aggregations.bucket.DiversifiedSamplerIT.testSimpleDiversity {p0={"search.concurrent_segment_search.enabled":"true"}} (28706,28727,29343)
2 org.opensearch.remotestore.RemoteStoreClusterStateRestoreIT.testFullClusterRestoreGlobalMetadata (29595,29655)
2 org.opensearch.index.shard.RemoteIndexShardTests.testRepicaCleansUpOldCommitsWhenReceivingNew (28239,29293)
2 org.opensearch.indices.replication.SegmentReplicationSuiteIT.classMethod (28716,29561)
2 org.opensearch.search.nested.SimpleNestedIT.testSimpleNestedSortingWithNestedFilterMissing {p0={"search.concurrent_segment_search.enabled":"true"}} (28682,29508)
1 org.opensearch.search.profile.query.QueryProfilerTests.testBasic {p0=5} (29044)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadBlobWithRetries (29132)
1 org.opensearch.remotestore.RemoteStoreStatsIT.testDownloadStatsCorrectnessSinglePrimaryMultipleReplicaShards (29132)
1 org.opensearch.remotestore.RemoteStoreStatsIT.testNonZeroPrimaryStatsOnNewlyCreatedIndexWithZeroDocs (29132)
1 org.opensearch.index.reindex.ReindexBasicTests.testMultipleSources (29177)
1 org.opensearch.index.reindex.ReindexBasicTests.testFiltering (29177)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadNonexistentBlobThrowsNoSuchFileException (29184)
1 org.opensearch.action.admin.indices.create.RemoteShrinkIndexIT.testCreateShrinkIndex (29279)
1 org.opensearch.action.admin.indices.create.RemoteShrinkIndexIT.classMethod (29279)
1 org.opensearch.discovery.ClusterDisruptionIT.classMethod (29293)
1 org.opensearch.search.SearchWeightedRoutingIT.testSearchAggregationWithNetworkDisruption_FailOpenEnabled (29293)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadRangeBlobWithRetries (29324)
1 org.opensearch.monitor.fs.FsHealthServiceTests.testFailsHealthOnHungIOBeyondHealthyTimeout (29324)
1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreDisruptionIT.testCancelReplicationWhileSyncingSegments (29378)
1 org.opensearch.search.query.QueryProfilePhaseTests.testTerminateAfterEarlyTermination {p0=5 p1=org.opensearch.search.query.ConcurrentQueryPhaseSearcher@521ba38f} (29417)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndexToN (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testSplitFromOneToN (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testSplitIndexPrimaryTerm (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.classMethod (29536)
1 org.opensearch.search.SearchWeightedRoutingIT.testShardRoutingWithNetworkDisruption_FailOpenEnabled (29595)
1 org.opensearch.index.shard.RemoteIndexShardTests.testSegmentReplication_With_EngineClosedConcurrently (29666)
1 org.opensearch.index.shard.IndexShardTests.testCommitLevelRestoreShardFromRemoteStore (29729)
1 org.opensearch.index.translog.RemoteFsTranslogTests.testMetadataFileDeletion (28027)
1 org.opensearch.search.query.QueryProfilePhaseTests.testTerminateAfterEarlyTermination {p0=5 p1=org.opensearch.search.query.ConcurrentQueryPhaseSearcher@1d1c37d5} (29821)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testWriteLargeBlob (28051)
1 org.opensearch.search.query.QueryProfilePhaseTests.testTerminateAfterEarlyTermination {p0=5 p1=org.opensearch.search.query.ConcurrentQueryPhaseSearcher@c83ed77} (28521)
1 org.opensearch.search.SearchTimeoutIT.testSimpleTimeout {p0={"search.concurrent_segment_search.enabled":"false"}} (28576)
1 org.opensearch.remotestore.RemoteStoreStatsIT.testDownloadStatsCorrectnessSinglePrimarySingleReplica (28671)
1 org.opensearch.remotestore.multipart.RemoteStoreMultipartIT.testRestoreSnapshotToIndexWithSameNameDifferentUUID (28706)
1 org.opensearch.search.basic.SearchWithRandomIOExceptionsIT.testRandomDirectoryIOExceptions {p0={"search.concurrent_segment_search.enabled":"true"}} (28706)
1 org.opensearch.search.basic.SearchWithRandomIOExceptionsIT.classMethod (28706)
1 org.opensearch.indices.replication.SegmentReplicationSuiteIT.testBasicReplication (28716)
1 org.opensearch.indices.replication.SegmentReplicationSuiteIT.testDeleteIndexWhileReplicating (28716)
1 org.opensearch.remotestore.RemoteStoreClusterStateRestoreIT.testFullClusterStateRestore (28727)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - indexing doc_status} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/50_indexing_pressure/Indexing pressure stats} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - recovery} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/10_basic/Nodes stats level} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/50_indexing_pressure/Indexing pressure memory limit} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - _all include_segment_file_sizes} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - multi} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - indices _all} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - one} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/40_store_stats/Store stats} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.fielddata/10_basic/Test cat fielddata output} (28765)
1 org.opensearch.test.rest.ClientYamlTestSuiteIT.test {p0=search.aggregation/20_terms/string profiler via global ordinals} (28765)
1 org.opensearch.action.bulk.BulkIntegrationIT.testDeleteIndexWhileIndexing (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testBulkWithWriteIndexAndRouting (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testDocIdTooLong (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testBulkIndexCreatesMapping (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testBulkWithGlobalDefaults (28797)
1 org.opensearch.search.functionscore.DecayFunctionScoreIT.classMethod (28813)
1 org.opensearch.snapshots.DedicatedClusterSnapshotRestoreIT.testIndexDeletionDuringSnapshotCreationInQueue (28841)
1 org.opensearch.repositories.azure.AzureBlobStoreRepositoryTests.testMultipleSnapshotAndRollback (28875)
1 org.opensearch.client.PitIT.testDeleteAllAndListAllPits (28899)
1 org.opensearch.repositories.azure.AzureBlobStoreRepositoryTests.testContainerCreationAndDeletion (29044)

The text was updated successfully, but these errors were encountered:

andrross · 2024-01-30T19:07:53Z

The request here is very similar to this older issue: #3713

peternied · 2024-03-08T23:14:17Z

@andrross I created a repo [1] that collects project health information and published reports to its repo ever day - see the latest reports at https://github.com/peternied/contribution-rate?tab=readme-ov-file#reports

One such report is a last 30 days top failing test - here is the March 8 report. It will keep updating every day. Feel free to contribute any kind of reports you'd like to see.

[1] https://github.com/peternied/contribution-rate

prudhvigodithi · 2024-05-10T22:40:13Z

Hey @andrross @peternied we now have the gradle metrics published to OpenSearch Gradle Check Metrics dashboard, this is part of surfacing opensearch-metrics to community. Please check the current supported metrics. How about we expand this and add more metrics as required? and also we use this data for creating triggers like GitHub issues, comments etc.
@bbarani @Pallavi-AWS @dblock

andrross · 2024-05-17T20:16:51Z

@prudhvigodithi Do you think it would make sense to add some details about using the gradle check metrics dashboard to help investigate and fix flaky failures in either TESTING.md or DEVELOPER_GUIDE.md?

@dreamer-89 created a great list in #3713:

Identify top hitter for prioritization.
Identify commit introduced a flaky test or increase freq of existing test failure.
Build failure trend to identify health of software.
Developers impacted due to flaky tests.
Test history.

I think if we document somewhere how to use the new dashboard to solve those problems then we can close both of these issues as completed.

prudhvigodithi · 2024-05-30T20:30:14Z

Thanks @andrross and @dreamer-89, based on the list you have I have modified the gradle check workflow with new fields and created some new visualizations based on the indexed data (Thanks to @rishabh6788 for setting up the initial flow), please check the link OpenSearch Gradle Check Metrics.

Identify top hitter for prioritization

For this I have created a pie chart with the top test_class that has the majority of the failures, this chart should also have the top failing tests within this test_class, we can further slice and dice the data for getting the list of PR's and owners or with post merge, that has top failing tests upon filter.

Identify commit introduced a flaky test or increase freq of existing test failure

The following data tables should have the git commit, the associated PR and the PR owner with all the failing test details, we can filter per PR or commit to get the details of failed tests. The new visualization Gradle Check - Top test class failures with Post Merge also has the flaky test information, its associated commitID and PR (with owner) that was merged with this commitID with post merge (gradle check that ran after the PR is merged) action. We should be able to further drill down with test name or the test class name for more details.

Build failure trend to identify health of software

For this the dashboard has a TSVB and line chart with the trend for the failure tests, this can be again further filtered with test name, test class, commitID, PR and with executions with Post merge.

Developers impacted due to flaky tests

The entire visualizations can be filtered with PR owner, PR number or commitID. The results has the hyperlinks for the GitHub PR or commit where one can see the comments and other users. The dashboards also has the PR owner attached to see impacted user. The visualizations also has the hyperlinks with the jenkins build data where one can see all the stack trace details for the failed tests (example 39487).

Test history

All the visualizations in dashboard can be filtered by date range, using OpenSearch we get this out of the box :)
With this we can go back and see the trends and infer results based on it.

Adding @peternied @getsaurabh02 @dblock @Pallavi-AWS @reta

reta · 2024-05-31T12:54:49Z

@prudhvigodithi @rishabh6788 it looks great, thank you so much folks for putting it all together

andrross · 2024-05-31T16:33:42Z

@prudhvigodithi @rishabh6788 it looks great, thank you so much folks for putting it all together

Agreed, this is awesome!

prudhvigodithi · 2024-05-31T21:28:17Z

Thanks @reta and @andrross, I have a PR created with some details added to the DEVELOPER_GUIDE.md regarding this dashboard #13919, please check.

prudhvigodithi · 2024-05-31T22:19:19Z

Next step moving forward for surfacing the test failures as GitHub Issues instead of creating a very generic issue like #13893 (coming from https://github.com/opensearch-project/OpenSearch/blob/main/.github/workflows/gradle-check.yml#L161-L168) which sometimes fails to execute https://github.com/opensearch-project/OpenSearch/actions/runs/9320653340/job/25657907035, how about we use the following data table information to create a GitHub issue.

Here is the example: After finding the failed tests from Post Merge Actions

We should start by creating an issue at a test class level NestedQueryBuilderTests, link and keep updating all the commits and PR information to the issue created for NestedQueryBuilderTests.

1st to the issue created for NestedQueryBuilderTests, we can link all the post merge failures and commits.

2nd on the same issue for NestedQueryBuilderTests, we can add the failed tests which are part of NestedQueryBuilderTests and Jenkins build information for stacktrace.

3rd on the same issue, we can add other PR's information where this has or has been failing.

I'm open for ideas on whom to assign this created issue? Should we just keep it open without any assignee as each issue will have multiple PR and commits information. later during triaging the maintainer should be able to identify the right team/user and add as assignee.

Moving forward we can have a logic to auto close the created issue if in last 30 days there is no failure for the test class (NestedQueryBuilderTests in above example) found in post merge Gradle Check build and reopen as required.

@andrross @reta @dblock @getsaurabh02 @peternied let me know your thoughts on this.

Thank you

andrross · 2024-06-01T00:19:32Z

Now that we have the metrics and the updated developer guide, I'm going to close this and issue #3713. If anyone thinks there is more to do here please reopen or open a new issue. Thanks!

dblock · 2024-06-03T12:22:19Z

This is great!

Next step moving forward for surfacing the test failures as GitHub Issues instead of creating a very generic issue

I can't wait for this. It's something developers spend a lot of time on.

I'm open for ideas on whom to assign this created issue? Should we just keep it open without any assignee as each issue will have multiple PR and commits info

My 0.02c:

Open an issue if it doesn't exist (saves everyone a ton of time).
Comment on both the PR where gradle check failed and the issue. This way when we look at a flaky test we know how often it fails.
No need to auto assign anything to anyone. I find this super annoying today because when I merge someone's PR I am auto-assigned a gradle build failure that I cannot do anything about.

andrross · 2024-06-03T16:05:10Z

@prudhvigodithi I agree with @dblock, no need to auto assign the created issues.

prudhvigodithi · 2024-06-03T17:22:00Z

Thanks @dblock,

Open an issue if it doesn't exist (saves everyone a ton of time).

The idea is to open an issue (update if already exists) for each test class failure which failed on Post Merge Actions. The Post Merge Action failures are for sure the flaky ones. Dont want to keep creating the issues (at least initially) on every failed test on PR creation as the failures can be legit for a PR.

2. Comment on both the PR where gradle check failed and the issue. This way when we look at a flaky test we know how often it fails.

Since the Post Merge Action Gradle check is executed after the PR is merged, the suggestion here is to comment on the closed PR and link that back to the Issue created, with this we can have datapoints of the PR's added to the issue, is my understanding correct here @dblock @andrross ?

3. No need to auto assign anything to anyone. I find this super annoying today because when I merge someone's PR I am auto-assigned a gradle build failure that I cannot do anything about.

Make sense.

Thank you

dblock · 2024-06-03T17:37:21Z

The idea is to open an issue (update if already exists) for each test class failure which failed on Post Merge Actions. The Post Merge Action failures are for sure the flaky ones. Dont want to keep creating the issues (at least initially) on every failed test on PR creation as the failures can be legit for a PR.

Makes sense. What would be massively useful is to link existing flaky test issues when they fail in PRs. So maybe this could be the action run for every PR gradle check failure:

For any failed test, lookup an existing flaky test issue, if it exists, comment.
For any new failure, highlight it in comments with something like "new flaky test? please check and open one manually".

Since the Post Merge Action Gradle check is executed after the PR is merged, the suggestion here is to comment on the closed PR and link that back to the Issue created, with this we can have datapoints of the PR's added to the issue, is my understanding correct here @dblock @andrross ?

I think this is unnecessary because the PR most definitely didn't cause the flaky test and once merged nobody is going to be looking at it. I recommend doing (1) and (2) above.

prudhvigodithi · 2024-06-03T19:00:10Z

Since this issue is closed I have created a new issue #13950 (comment) for this topic of surfacing the flaky tests as github issues and we continue our discussion there.
Thanks
@getsaurabh02 @dblock @andrross @reta

msfroh · 2024-06-03T22:44:48Z

@prudhvigodithi, I just checked out the dashboard for the first time. It is amazing!

There is a little noise from cases where the open PR introduced failures. For example, looking at the last 7 days, it looks like ClientYamlTestSuiteIT is buggy, but most of that is coming from one build (https://build.ci.opensearch.org/job/gradle-check/39413/testReport/).

I added a new test that was failing due to a type mismatch, so I tried modifying the test framework. I fixed my test, but broke 1000+ other tests. (That fix obviously didn't get merged, but I accidentally skewed the statistics.)

andrross added enhancement Enhancement or improvement to existing feature or request discuss Issues intended to help drive brainstorming and decision making untriaged labels Nov 15, 2023

peterzhuamazon mentioned this issue Nov 27, 2023

Adding lucene snapshot to the ci staging repository #11241

Merged

1 task

peternied added :test Adding or fixing a test and removed untriaged labels Nov 30, 2023

drunken-monkey mentioned this issue Jan 29, 2024

Fix the "highlight.max_analyzer_offset" request parameter with "plain" highlighter #10919

Merged

8 tasks

reta mentioned this issue Feb 21, 2024

[BUG] Gradle check is unreliable #12410

Open

prudhvigodithi mentioned this issue May 29, 2024

Update the publish gradle check library mapping opensearch-project/opensearch-build-libraries#431

Merged

This was referenced May 31, 2024

Update the gradle-check.jenkinsfile library version to 6.4.8 opensearch-project/opensearch-build#4741

Merged

Update the README with newly added Gradle Check Metrics opensearch-project/opensearch-metrics#38

Merged

prudhvigodithi self-assigned this May 31, 2024

prudhvigodithi mentioned this issue May 31, 2024

Update the DEVELOPER_GUIDE.md to add Gradle Check Metrics Dashboard details #13919

Merged

8 tasks

andrross closed this as completed Jun 1, 2024

prudhvigodithi mentioned this issue Jun 3, 2024

Add additional details on Gradle Check failures autocut issues #13950

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Better visibility into test failures over time #11217

Better visibility into test failures over time #11217

andrross commented Nov 15, 2023 •

edited

Loading

andrross commented Jan 30, 2024

peternied commented Mar 8, 2024

prudhvigodithi commented May 10, 2024 •

edited

Loading

andrross commented May 17, 2024

prudhvigodithi commented May 30, 2024 •

edited

Loading

reta commented May 31, 2024

andrross commented May 31, 2024

prudhvigodithi commented May 31, 2024 •

edited

Loading

prudhvigodithi commented May 31, 2024

andrross commented Jun 1, 2024

dblock commented Jun 3, 2024

andrross commented Jun 3, 2024

prudhvigodithi commented Jun 3, 2024

dblock commented Jun 3, 2024

prudhvigodithi commented Jun 3, 2024

msfroh commented Jun 3, 2024

Better visibility into test failures over time #11217

Better visibility into test failures over time #11217

Comments

andrross commented Nov 15, 2023 • edited Loading

andrross commented Jan 30, 2024

peternied commented Mar 8, 2024

prudhvigodithi commented May 10, 2024 • edited Loading

andrross commented May 17, 2024

prudhvigodithi commented May 30, 2024 • edited Loading

Identify top hitter for prioritization

Identify commit introduced a flaky test or increase freq of existing test failure

Build failure trend to identify health of software

Developers impacted due to flaky tests

Test history

reta commented May 31, 2024

andrross commented May 31, 2024

prudhvigodithi commented May 31, 2024 • edited Loading

prudhvigodithi commented May 31, 2024

Here is the example: After finding the failed tests from Post Merge Actions

andrross commented Jun 1, 2024

dblock commented Jun 3, 2024

andrross commented Jun 3, 2024

prudhvigodithi commented Jun 3, 2024

dblock commented Jun 3, 2024

prudhvigodithi commented Jun 3, 2024

msfroh commented Jun 3, 2024

andrross commented Nov 15, 2023 •

edited

Loading

prudhvigodithi commented May 10, 2024 •

edited

Loading

prudhvigodithi commented May 30, 2024 •

edited

Loading

prudhvigodithi commented May 31, 2024 •

edited

Loading