Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Automatic Import] Fixes the CSV header bug #212513

Merged

Conversation

ilyannn
Copy link
Contributor

@ilyannn ilyannn commented Feb 26, 2025

Release Note

Automatic Import now produces correct pipeline to parse CSV files with special characters in column names.

Summary

Fixes #211911

Now the pipeline works as expected:

  - csv:
      tag: parse_csv
      field: message
      target_fields:
        - ai_202502211453.logs._timestamp
        - ai_202502211453.logs.message
      description: Parse CSV input
  - drop:
      ignore_failure: true
      if: >-
        ctx.ai_202502211453?.logs?._timestamp == '@timestamp' &&
        ctx.ai_202502211453?.logs?.message == 'message'
      tag: remove_csv_header
      description: Remove the CSV header line by comparing the values

Checklist

Check the PR satisfies following conditions.

Reviewers should verify this PR satisfies this list as well.

  • Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
  • Documentation was added for features that require explanation or tutorials
  • Unit or functional tests were updated or added to match the most common scenarios
  • If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the docker list
  • This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The release_note:breaking label should be applied in these situations.
  • Flaky Test Runner was used on any tests changed
  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines

Details

The CSV processing is now a three-stage process:

  1. Parse the samples with the temporary column names of the form column1.
  2. Test parsing with the actual pipeline that parses into package.dataStream.columnName.
  3. Convert the samples into JSON form {"columnName": "value", ...} for further processing.

Tests

There are unit tests tests for the CSV functionality that include a mock CSV processing pipeline.

@ilyannn ilyannn added release_note:fix backport:prev-major Backport to (8.x, 8.18, 8.17, 8.16) the previous major branch and other branches in development labels Feb 26, 2025
@ilyannn ilyannn self-assigned this Feb 26, 2025
@ilyannn ilyannn requested a review from a team as a code owner February 26, 2025 13:23
@elasticmachine
Copy link
Contributor

elasticmachine commented Feb 26, 2025

💔 Build Failed

Failed CI Steps

History

cc @ilyannn

Copy link
Member

@P1llus P1llus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Good one

@ilyannn ilyannn enabled auto-merge (squash) February 28, 2025 12:32
@ilyannn ilyannn merged commit ab46dde into elastic:main Feb 28, 2025
9 checks passed
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.16, 8.17, 8.18, 8.x

https://github.com/elastic/kibana/actions/runs/13590247953

@kibanamachine
Copy link
Contributor

💔 All backports failed

Status Branch Result
8.16 Backport failed because of merge conflicts
8.17 Backport failed because of merge conflicts

You might need to backport the following PRs to 8.17:
- [ResponseOps][Rules] Do not show connector not registered in action connectors modal (#212660)
- [Search] Don't error out on missing connectors permissions (#212622)
8.18 Backport failed because of merge conflicts

You might need to backport the following PRs to 8.18:
- [ResponseOps][Rules] Do not show connector not registered in action connectors modal (#212660)
- [Obs AI Assistant] Show loader to fix the flicker in KB settings tab (#212678)
- [Search] Don't error out on missing connectors permissions (#212622)
- Fix panel to remove query filter for .NET OTel runtime metrics on curated dashboard (#212529)
- [ Siem Migrations ] Upsell Siem Migrations Start (#212607)
- [Cloud Security] Add upgrade agentless deployment background task (#207143)
8.x Backport failed because of merge conflicts

You might need to backport the following PRs to 8.x:
- [ResponseOps][Rules] Do not show connector not registered in action connectors modal (#212660)
- [Obs AI Assistant] Update delete confirmation modal (#212695)
- [Obs AI Assistant] Show loader to fix the flicker in KB settings tab (#212678)
- [Obs AI Assistant] Enable syntax highlighting for ES|QL (#212669)
- [Search] Don't error out on missing connectors permissions (#212622)
- [Onboarding] Hide the semantic_text banner if there exists a semantic_text field (#210676)
- [ Siem Migrations ] Upsell Siem Migrations Start (#212607)
- FTR - optimize service initialization (#212421)
- [Streams 🌊] Remove enablement check in PUT /api/streams/{id} for classic streams (#212289)

Manual backport

To create the backport manually run:

node scripts/backport --pr 212513

Questions ?

Please refer to the Backport tool documentation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-major Backport to (8.x, 8.18, 8.17, 8.16) the previous major branch and other branches in development release_note:fix v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Automatic Import] Problems generating CSV mapping with @timestamp field
4 participants