[Reporting/CSV Export] exported search data is randomly missing rows #112186

Closed
tsullivan opened this issue Sep 14, 2021 · 4 comments · Fixed by #113675
Labels
bug Fixes for quality problems that affect the customer experience (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead Feature:Reporting:CSV Reporting issues pertaining to CSV file export impact:critical This issue should be addressed immediately due to a critical level of impact on the product. loe:medium Medium Level of Effort needs-team Issues missing a team label v7.16.0

Comments

@tsullivan
Member

tsullivan commented Sep 14, 2021

EDIT: removed the outdated description.

Steps to reproduce:

  1. Use the latest snapshot of Elasticsearch 8.0.0
  2. Load the test data and test saved objects:
    export TEST_KIBANA_URL=http://elastic:changeme@localhost:5601
    export TEST_ES_URL=http://elastic:changeme@localhost:9200
    node scripts/kbn_archiver.js --config x-pack/test/functional/config.js load x-pack/test/functional/fixtures/kbn_archiver/reporting/ecommerce
    node scripts/es_archiver.js --config x-pack/test/functional/config.js load x-pack/test/functional/es_archives/reporting/ecommerce
    
  3. Open Discover and view the ecommerce index
  4. Set the timepicker range to Jun 2, 2019 @ 18:54:23.161 / Jul 22, 2019 @ 07:03:07.850
    • search covers 4,675 hits
  5. From the menu at the top, click Share > CSV Reports > Generate CSV
  6. The exported CSV file contains fewer than 4,675 rows
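
One way to check step 6 without opening the file by hand is to count the data records in the downloaded CSV and compare against the 4,675 hits from step 4. The helper below is only a sketch: the name `countCsvDataRows` is hypothetical, and it assumes the export has exactly one header row and uses standard double-quote escaping, so a newline inside a quoted field is not counted as a new row.

```typescript
// Hypothetical helper: counts data records in CSV text, excluding one header row.
// Newlines inside double-quoted fields are treated as part of the field.
function countCsvDataRows(csv: string): number {
  let records = 0;
  let inQuotes = false;
  let sawContent = false;

  for (let i = 0; i < csv.length; i++) {
    const ch = csv[i];
    if (ch === '"') {
      // An escaped quote ("") toggles twice, so the state is unchanged afterwards.
      inQuotes = !inQuotes;
      sawContent = true;
    } else if (ch === '\n' && !inQuotes) {
      // End of a record; blank lines are not counted.
      if (sawContent) records++;
      sawContent = false;
    } else if (ch !== '\r') {
      sawContent = true;
    }
  }
  // Count a final record that has no trailing newline.
  if (sawContent) records++;

  // Assumption: the first record is the header row.
  return Math.max(0, records - 1);
}
```

With the file contents loaded into a string `csvText`, the bug reproduces when `countCsvDataRows(csvText) < 4675`.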

Update: view #113675 (comment) for steps on running the skipped test.

@tsullivan tsullivan added the bug Fixes for quality problems that affect the customer experience label Sep 14, 2021
@botelastic botelastic bot added the needs-team Issues missing a team label label Sep 14, 2021
@tsullivan tsullivan added v7.15.0 and removed needs-team Issues missing a team label labels Sep 14, 2021
@botelastic botelastic bot added the needs-team Issues missing a team label label Sep 14, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-reporting-services (Team:Reporting Services)

@elasticmachine
Contributor

Pinging @elastic/kibana-app-services (Team:AppServices)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Sep 14, 2021
@tsullivan tsullivan added impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. needs-team Issues missing a team label labels Sep 14, 2021
@botelastic botelastic bot removed the needs-team Issues missing a team label label Sep 14, 2021
@tsullivan tsullivan added v7.16.0 and removed v7.15.0 labels Sep 14, 2021
@tsullivan tsullivan changed the title [Reporting/CSV Export] Result data chunks are saved out of order and sometimes getting dropped [Reporting/CSV Export] Rows of CSV could be missing or have incorrect null values Sep 20, 2021
@exalate-issue-sync exalate-issue-sync bot added loe:small Small Level of Effort loe:large Large Level of Effort and removed loe:small Small Level of Effort labels Sep 20, 2021
@tsullivan
Member Author

tsullivan commented Sep 20, 2021

Thanks to research done by @jloleysens, we now understand that these test failures are caused by Elasticsearch returning shard failures. That explanation lines up with the evidence: random rows of the CSV are missing, but only some of the time. I have also seen CSV rows that weren't missing but had null values in random cells where there should have been 0.

We have a few fixes to aim for:

  • Stabilize the tests: adjust them to perform fewer scrolls (use a smaller date range and return fewer total records)
  • Handle the shard failures when they happen by capturing them in the generate_csv module and adding them as warning text to the completed reports.
    • Elasticsearch provides the shard failures in a search response, separately from the hits information.
  • Use the point-in-time API to page through sets of data for CSV export.
    • Scan-and-Scroll may still be necessary as a backup strategy when exporting non-timebased data.
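
The second and third fixes can be sketched together. This is not the code in the generate_csv module, only an illustration under stated assumptions: `SearchFn`, `shardFailureWarnings`, and `exportWithPit` are hypothetical names, the search call is injected so the sketch stays self-contained, and the point-in-time is assumed to be opened and closed by the caller. The parts taken from Elasticsearch itself are the `_shards.failed` / `_shards.failures` section of a search response and the `pit`, `sort` (with the `_shard_doc` tiebreaker), and `search_after` request fields.

```typescript
// Hypothetical minimal shapes: only the fields this sketch reads are modeled.
interface ShardFailure {
  shard: number;
  index?: string;
  reason: { type: string; reason: string };
}

interface PageResponse {
  pit_id?: string;
  _shards: { total: number; successful: number; failed: number; failures?: ShardFailure[] };
  hits: { hits: { _source: Record<string, unknown>; sort?: unknown[] }[] };
}

interface PageRequest {
  size: number;
  pit: { id: string; keep_alive: string };
  sort: Record<string, string>[];
  search_after?: unknown[];
}

// Hypothetical transport: a real export would call the Elasticsearch client here.
type SearchFn = (req: PageRequest) => Promise<PageResponse>;

// Turn the `_shards` section of one response into warning text for the report.
function shardFailureWarnings(res: PageResponse): string[] {
  if (res._shards.failed === 0) return [];
  const failures = res._shards.failures ?? [];
  if (failures.length === 0) {
    return [`${res._shards.failed} of ${res._shards.total} shards failed`];
  }
  return failures.map(
    (f) => `shard ${f.shard} of ${f.index ?? 'unknown index'} failed: ${f.reason.type}: ${f.reason.reason}`
  );
}

// Page through a point-in-time with search_after, collecting rows and warnings.
async function exportWithPit(search: SearchFn, pitId: string, pageSize = 500) {
  const rows: Record<string, unknown>[] = [];
  const warnings: string[] = [];
  let searchAfter: unknown[] | undefined;

  while (true) {
    const res = await search({
      size: pageSize,
      pit: { id: pitId, keep_alive: '1m' },
      sort: [{ _shard_doc: 'asc' }],
      ...(searchAfter ? { search_after: searchAfter } : {}),
    });

    // Shard failures arrive separately from the hits, so check them on every page.
    warnings.push(...shardFailureWarnings(res));

    // The response may carry an updated PIT id; use it for the next page.
    if (res.pit_id) pitId = res.pit_id;

    const hits = res.hits.hits;
    if (hits.length === 0) break;
    for (const hit of hits) rows.push(hit._source);

    // The sort values of the last hit are the cursor for the next page.
    searchAfter = hits[hits.length - 1].sort;
    if (!searchAfter) break;
  }
  return { rows, warnings };
}
```

The sketch keeps exporting after a shard failure and only records the warning. Whether to retry or abort the report instead is a separate design decision that this example does not settle.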

@exalate-issue-sync exalate-issue-sync bot changed the title [Reporting/CSV Export] Rows of CSV could be missing or have incorrect null values [Reporting/CSV Export] Shard Failures are Sep 20, 2021
@exalate-issue-sync exalate-issue-sync bot changed the title [Reporting/CSV Export] Shard Failures are [Reporting/CSV Export] Shard failures are silently ignored Sep 20, 2021
@tsullivan tsullivan changed the title [Reporting/CSV Export] Shard failures are silently ignored [Reporting/CSV Export] exported search data is randomly missing rows Sep 22, 2021
@tsullivan
Member Author

In stabilizing the CSV tests, we will keep only a single test that covers a "large" export. That test will be skipped until this issue is resolved: 428c0d9

@exalate-issue-sync exalate-issue-sync bot added impact:critical This issue should be addressed immediately due to a critical level of impact on the product. and removed impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. loe:large Large Level of Effort labels Sep 27, 2021
@exalate-issue-sync exalate-issue-sync bot added the loe:medium Medium Level of Effort label Sep 29, 2021
@sophiec20 sophiec20 added the Feature:Reporting:CSV Reporting issues pertaining to CSV file export label Aug 21, 2024
@botelastic botelastic bot added the needs-team Issues missing a team label label Aug 21, 2024
@sophiec20 sophiec20 added (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead and removed (Deprecated) Team:Reporting Services labels Aug 21, 2024

3 participants