[microsoft_exchange_online_message_trace] Remove event.start/end #12446

@chrisberkhout chrisberkhout commented Jan 23, 2025

Proposed commit message

[microsoft_exchange_online_message_trace] Remove event.start/end

Don't set `event.start` and `event.end` to `StartDate` and `EndDate`,
because those are query parameters, not properties of the event itself.

Also, extend the description of the `additional_look_back` setting to
note that up to 24h may be necessary (according to the API
documentation[1]).

[1]: https://learn.microsoft.com/en-us/previous-versions/office/developer/o365-enterprise-developers/jj984335%28v%3doffice.15%29

Discussion

Elsewhere I suggested standard solutions for pagination when there is:

  1. a delay between data creation and data availability, and/or
  2. an unstable period after initial creation of the data.

I suggested switching from the look-back strategy (introduced in #8717) to the standard solutions, in order to reduce fetching of the same data multiple times (and reduce indexing failures due to duplicate _id values).

However, the standard solution for delayed data availability isn't possible, because there is no creation or update time associated with the log entries we receive. StartDate and EndDate are query parameters. In the API documentation these fields are marked "[In]" (input parameters only) rather than "[In/Out]" (input parameters and report output columns). We just see them echoed back in the response.
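A minimal sketch of why echoed query parameters say nothing about an individual event. The `fetch` function, the record shape, and the `MessageId` value are hypothetical stand-ins for illustration, not the integration's code or real API output:

```python
def fetch(messages, start, end):
    """Simulate the API: return each message with the query window echoed back."""
    return [{**m, "StartDate": start, "EndDate": end} for m in messages]


# Hypothetical log entry; only the query window differs between the two fetches.
message = {"MessageId": "<abc@example.com>"}

first = fetch([message], "2025-01-23T00:00:00Z", "2025-01-23T01:00:00Z")[0]
second = fetch([message], "2025-01-23T00:30:00Z", "2025-01-23T01:30:00Z")[0]

# Same event, but StartDate/EndDate changed with the query that returned it.
same_event = first["MessageId"] == second["MessageId"]
same_window = (first["StartDate"], first["EndDate"]) == (
    second["StartDate"],
    second["EndDate"],
)
```

Under these assumptions, mapping StartDate and EndDate to `event.start` and `event.end` would give the same message different values depending on which query returned it.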

This PR cleans up incorrect population of ECS fields with StartDate and EndDate.

Shall we drop microsoft.online_message_trace.StartDate and microsoft.online_message_trace.EndDate as well?

The standard solution for an unstable period is complicated by the fact that this period can be quite long. In #8717 it was reported to usually be less than an hour, but some users on the Internet report delays of multiple hours and the API documentation says up to 24 hours.

So, regarding pagination and duplicate _id values, the choice is between:

  • a long (up to 24h) delay for all data (by only fetching data once it's known to be there), or
  • indexing rejections due to duplicate _id values (using the look-back strategy).

In this case I think the look-back strategy is not a bad one.
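The trade-off between the two options can be sketched in Python. This is an illustration only: the window arithmetic, the `interval` parameter, and the function names are assumptions, not how the integration actually computes its queries.

```python
from datetime import datetime, timedelta, timezone


def look_back_window(now, interval, additional_look_back):
    """Window for one poll under look-back: the last interval plus extra history."""
    return now - interval - additional_look_back, now


def delayed_window(now, interval, delay):
    """Window for one poll when fetching only data older than `delay`."""
    return now - delay - interval, now - delay


def overlap(a, b):
    """Length of time covered by both windows (zero if they do not intersect)."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return max(end - start, timedelta(0))


t0 = datetime(2025, 1, 23, 12, 0, tzinfo=timezone.utc)
hour = timedelta(hours=1)
day = timedelta(hours=24)

# Two consecutive hourly polls with a 24h look-back re-fetch 24h of data,
# so records in that span are returned again and can collide on _id.
lb = overlap(look_back_window(t0, hour, day), look_back_window(t0 + hour, hour, day))

# The same polls with a 24h delay never overlap, but all data arrives 24h late.
dl = overlap(delayed_window(t0, hour, day), delayed_window(t0 + hour, hour, day))
```

With these example values, `lb` is 24 hours and `dl` is zero: look-back trades repeated fetching for timely data, while delayed fetching trades latency for no repeats.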

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

Related issues

@chrisberkhout chrisberkhout added Integration:microsoft_exchange_online_message_trac Microsoft Exchange Online Message Trace bugfix Pull request that fixes a bug issue Team:Security-Service Integrations Security Service Integrations Team [elastic/security-service-integrations] labels Jan 23, 2025
@chrisberkhout chrisberkhout self-assigned this Jan 23, 2025
@chrisberkhout chrisberkhout requested a review from a team as a code owner January 23, 2025 19:29
@elasticmachine

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

@elastic-vault-github-plugin-prod

🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

@elasticmachine

💚 Build Succeeded

cc @chrisberkhout

@ShourieG ShourieG left a comment

LGTM from my end, but I think we should wait for @efd6 to review, since he has more context on the PR that introduced these fields.

@efd6 efd6 left a comment

Thanks

@chrisberkhout chrisberkhout merged commit 1c3e25d into elastic:main Jan 27, 2025
5 checks passed
@elastic-vault-github-plugin-prod

Package microsoft_exchange_online_message_trace - 1.25.3 containing this change is available at https://epr.elastic.co/package/microsoft_exchange_online_message_trace/1.25.3/

harnish-elastic pushed a commit to harnish-elastic/integrations that referenced this pull request Feb 4, 2025
…stic#12446)
