Rename onebox telemetry metadata fields #5770

thenamankumar · 2024-10-01T16:06:41Z

This PR renames a bunch of OneBox telemetry events metadata fields and make them camelcase.

This PR also removes the check isDotCom from the promptText private metadata field passed with the chat-question:executed event. None of the private metadata fields are sent to SG through the telemetery pipeline, so that data is still private, similar to the other present privateMetadata field. The only change is that the promptText will be recorded in the event_logs on that sourcegraph instance.

We need to remove the check so that we can selectively collect that field for the S2 instance, configured by a env var here: https://github.com/sourcegraph/cloud/pull/15143

Test plan

Make sure all the updated events are still recorded correctly.

Changelog

akalia25 · 2024-10-01T16:30:54Z

Hey @thenamankumar I thought we were only interested in collecting query (or now promptText) for the event onebox.intentCorrection, curious why the scope has changed to now include cody.chat-question?

akalia25 · 2024-10-01T16:33:03Z

vscode/src/chat/chat-view/ChatController.ts

-                // 🚨 SECURITY: chat transcripts are to be included only for DotCom users AND for V2 telemetry
-                // V2 telemetry exports privateMetadata only for DotCom users
-                // the condition below is an additional safeguard measure
-                promptText:
-                    isDotCom(authStatus) &&
-                    (await truncatePromptString(inputText, CHAT_INPUT_TOKEN_BUDGET)),
+                promptText: await truncatePromptString(inputText, CHAT_INPUT_TOKEN_BUDGET),


We have been really strict on how we collect sensitive data like this, to ensure, it is never collected from customers. I do understand that privateMetadata is only collected for dotcom instances, but to relax this additional dotcom check for transcripts, I will need to confirm with the team we are okay with this.

Yes @akalia25 , I have tagged the @sourcegraph/security-code-review to get approvals. We can do according to the suggestions.

To confirm again, @chenkc805 do we really need promptText for the chat-question:executed events on S2? Along with the detectedIntent, userSpecifiedIntent and intentScores?

context: https://linear.app/sourcegraph/issue/SRCH-1028/log-telemetry-of-the-output-of-the-intent-detection-model-with-each

Hey @thenamankumar could we add an isS2() clause (similar to isDotCom()), that would allow for transcripts promptText/responseText to be stored for both dotcom instance and S2.

This would help avoid collecting this data on customer's local PG instance and keeping it extra secure.

something like this: isDotCom(authStatus) || isS2(authStatus))

I have removed this change for now from this PR.

RXminuS · 2024-10-02T18:15:49Z

@thenamankumar we should maybe sync on this tomorrow because I just removed the field all together after was told it was unused. #5558 Also there's a new way of invoking and testing these now, happy to walk you through it or have a look in lib/shared/src/telemetry-v2/events/at-mention.ts and vscode/e2e/telemetry/at-mention.test.ts

github-actions · 2024-10-03T08:31:00Z

‼️ Hey @sourcegraph/cody-security, please review this PR carefully as it introduces the usage of an unsafe_ function or abuses PromptString.

Removed then requested changes, unblocking to merge the rest renames.

thenamankumar requested review from camdencheek, akalia25 and a team October 1, 2024 16:06

camdencheek approved these changes Oct 1, 2024

View reviewed changes

akalia25 previously requested changes Oct 1, 2024

View reviewed changes

Rename onbox telemetry metadata fields

82494e2

thenamankumar force-pushed the naman/onebox-telemetry-renames branch 2 times, most recently from 07c50ad to 82494e2 Compare October 3, 2024 08:30

thenamankumar merged commit 81fb2e9 into main Oct 3, 2024
20 checks passed

thenamankumar deleted the naman/onebox-telemetry-renames branch October 3, 2024 08:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rename onebox telemetry metadata fields #5770

Rename onebox telemetry metadata fields #5770

thenamankumar commented Oct 1, 2024 •

edited

Loading

akalia25 commented Oct 1, 2024

akalia25 Oct 1, 2024 •

edited

Loading

thenamankumar Oct 1, 2024

akalia25 Oct 1, 2024

akalia25 Oct 1, 2024

thenamankumar Oct 3, 2024

RXminuS commented Oct 2, 2024 •

edited

Loading

github-actions bot commented Oct 3, 2024

Rename onebox telemetry metadata fields #5770

Rename onebox telemetry metadata fields #5770

Conversation

thenamankumar commented Oct 1, 2024 • edited Loading

Test plan

Changelog

akalia25 commented Oct 1, 2024

akalia25 Oct 1, 2024 • edited Loading

Choose a reason for hiding this comment

thenamankumar Oct 1, 2024

Choose a reason for hiding this comment

akalia25 Oct 1, 2024

Choose a reason for hiding this comment

akalia25 Oct 1, 2024

Choose a reason for hiding this comment

thenamankumar Oct 3, 2024

Choose a reason for hiding this comment

RXminuS commented Oct 2, 2024 • edited Loading

github-actions bot commented Oct 3, 2024

thenamankumar commented Oct 1, 2024 •

edited

Loading

akalia25 Oct 1, 2024 •

edited

Loading

RXminuS commented Oct 2, 2024 •

edited

Loading