Add Simple TEE VID Labeling App #1991

stevenwarejones · 2025-01-03T20:23:02Z

No description provided.

…labeling_tee_app

wfa-reviewable · 2025-01-03T20:23:10Z

This change is

…labeling_tee_app

SanjayVas

Reviewable status: 0 of 19 files reviewed, 1 unresolved discussion (waiting on @kungfucraig and @stevenwarejones)

a discussion (no related file):
For dependent PRs in the same repo, you need to base the PR branch on the parent PR branch and set the base branch of the PR in GitHub. This way the PR doesn't include diffs from the parent.

Additionally, the "depends on" should not be in the main PR description as that is copied to the commit message. The statement will no longer be relevant after the base PR is merged. Therefore, it should be in a separate PR comment.

…labeling_tee_app

stevenwarejones

Reviewable status: 0 of 19 files reviewed, 1 unresolved discussion (waiting on @kungfucraig, @Marco-Premier, and @SanjayVas)

a discussion (no related file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

For dependent PRs in the same repo, you need to base the PR branch on the parent PR branch and set the base branch of the PR in GitHub. This way the PR doesn't include diffs from the parent.

Additionally, the "depends on" should not be in the main PR description as that is copied to the commit message. The statement will no longer be relevant after the base PR is merged. Therefore, it should be in a separate PR comment.

Done.

SanjayVas

Reviewed 3 of 19 files at r1, 9 of 13 files at r3, 4 of 5 files at r4, all commit messages.
Reviewable status: 16 of 19 files reviewed, 16 unresolved discussions (waiting on @kungfucraig, @Marco-Premier, and @stevenwarejones)

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/BUILD.bazel line 4 at r4 (raw file):

package(
    default_testonly = True,

I assume this isn't intended to only be used in tests. Also public visibility is for generally usable Bazel library code. Just leave out the package declaration for now and add it in when you know what Bazel packages would use this.

src/test/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/BUILD.bazel line 3 at r4 (raw file):

load("@wfa_rules_kotlin_jvm//kotlin:defs.bzl", "kt_jvm_test")

package(

Same note re: package declaration in the test tree.

src/main/proto/wfa/measurement/securecomputation/teeapps/v1alpha/tee_app_config.proto line 17 at r4 (raw file):

syntax = "proto3";

package wfa.measurement.securecomputation.teeapps.v1alpha;

Is this part of the versioned gRPC API? If not, you can put it in a separate config package instead of a versioned package.

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 58 at r4 (raw file):

    storageClient: MesosRecordIoStorageClient
  ) {
    val inputBlob =

Specify types where not obvious, i.e. where the type isn't named on the right side of the assignment via a constructor or well-known factory.

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 79 at r4 (raw file):

  override suspend fun runWork(message: TriggeredApp) {
    val teeAppConfig = message.config.unpack(TeeAppConfig::class.java)
    assert(teeAppConfig.workTypeCase == TeeAppConfig.WorkTypeCase.VID_LABELING_CONFIG)

We don't use assertions outside of tests. Use require or check instead to enforce a precondition/invariant.

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 82 at r4 (raw file):

    val vidLabelingConfig = teeAppConfig.vidLabelingConfig
    val compiledNode: CompiledNode =
      @Suppress("WHEN_ENUM_CAN_BE_NULL_IN_JAVA")

Document suppression reason.

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 88 at r4 (raw file):

          val modelData =
            vidModelBlob.read().reduce { acc, byteString -> acc.concat(byteString) }.toStringUtf8()
          CompiledNode.getDefaultInstance()

Use parseTextProto from common-jvm

src/test/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerAppTest.kt line 61 at r4 (raw file):

class VidLabelerAppTest() {

  @Rule @JvmField val pubSubEmulatorProvider = GooglePubSubEmulatorProvider()

This is meant to be used as a @ClassRule so you share the emulator instance across test methods.

src/test/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerAppTest.kt line 71 at r4 (raw file):

  @Before
  fun setup() {

Use descriptive names for @Before and @After methods

Code quote:

setup

src/test/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerAppTest.kt line 93 at r4 (raw file):

  @Test
  fun vidLabelWithTextProtoModel() {

Same note re: test method naming.

src/test/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerAppTest.kt line 94 at r4 (raw file):

  @Test
  fun vidLabelWithTextProtoModel() {
    runBlocking {

If the body of the whole function is contained inside runBlocking, the convention is to use a function expression. IntelliJ has a context action to perform this refactor for you.

fun `SUT does foo when bar`() = runBlocking {
  // ...
}

src/test/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerAppTest.kt line 120 at r4 (raw file):

          val indexString = "%02d".format(index)
          val inputFileName = "labeler_input_$indexString.textproto"
          parseTextProto(

Use the overload of parseTextProto that takes a File, passing the default instance.

src/test/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerAppTest.kt line 121 at r4 (raw file):

          val inputFileName = "labeler_input_$indexString.textproto"
          parseTextProto(
            Paths.get("$TEXTPROTO_PATH/$inputFileName").toFile().bufferedReader(),

If TEXTPROTO_PATH is a Path instance, you can use resolve()

Suggestion:

TEXTPROTO_PATH.resolve(inputFileName)

src/test/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerAppTest.kt line 146 at r4 (raw file):

      publisher.publishMessage(topicId, work)

      withTimeout(10000) {

This is the wrong way to handle test timeouts. The final timeout is configured by the Bazel runner via the size and/or timeout attributes on the Bazel target. More specific timeouts can be specified using JUnit timeout facilities (pass timeout to the @Test annotation or use the Timeout rule)

src/test/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerAppTest.kt line 147 at r4 (raw file):

      withTimeout(10000) {
        while (true) {

Is there some mechanism you can introduce to avoid having to poll here? e.g. having an in-memory QueueSubscriber impl so you can track when messages are acked?

We avoid polling in tests unless it's absolutely necessary, such as when dealing with isolated processes.

src/test/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerAppTest.kt line 174 at r4 (raw file):

                  labelerOutput {},
                )
              ProtoTruth.assertThat(output).isEqualTo(expectedOutput)

nit: import this symbol

Code quote:

ProtoTruth.assertThat

kungfucraig

Reviewed 2 of 5 files at r4.
Reviewable status: 16 of 19 files reviewed, 24 unresolved discussions (waiting on @Marco-Premier and @stevenwarejones)

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 34 at r4 (raw file):

/*
 * TEE VID Labeling App.

Can you describe what it does?

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 36 at r4 (raw file):

 * TEE VID Labeling App.
 */
class VidLabelerApp(

We've already spelled out Application in other places. We should be consistent wrt App vs. Application.

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 65 at r4 (raw file):

    val outputFlow =
      inputRecords.map { byteString ->
        val labelerInput =

Do you really expect the input blob to have LabelerInput protos? If these are the files that EDPs supply I'd rather there be a more official interface format and keep LabelerInput as an implementation detail.

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 71 at r4 (raw file):

            .build() as LabelerInput
        val labelerOutput: LabelerOutput = labeler.label(input = labelerInput)
        labelerOutput.toString().toByteStringUtf8()

If you're going to build a labeled impression table down stream you'll want to write both the labelerInput and the labelerOutput to the next file. I'd actually recommend creating a new proto specific to the app that holds both of these.

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 74 at r4 (raw file):

      }

    storageClient.writeBlob(outputBlobKey, outputFlow)

You should use the opportunity of having read the records to write them into multiple files so that they can be processed more efficiently downstream. For example, write a file per day per campaign. A second step can then read those files sort by VID and write down the files you have for your partitioning scheme.

src/main/proto/wfa/measurement/securecomputation/teeapps/v1alpha/tee_app_config.proto line 27 at r4 (raw file):

message TeeAppConfig {
  message VidLabelingConfig {

We should assume that it's necessary to create labels for more than one model line.

Also, why not just provide a model_line name and allow the application to get the rest of what it needs from the Model Repo?

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/BUILD.bazel line 1 at r4 (raw file):

load("@wfa_rules_kotlin_jvm//kotlin:defs.bzl", "kt_jvm_library")

Can you write just "teeapp" instead of "teeapps" in the directory name?

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/BUILD.bazel line 9 at r4 (raw file):

kt_jvm_library(
    name = "vid_labeler_app",

I'd expect something with an "app" suffix to include a main if not be an actual image.

stevenwarejones

Reviewable status: 16 of 19 files reviewed, 24 unresolved discussions (waiting on @kungfucraig and @Marco-Premier)

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 65 at r4 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Do you really expect the input blob to have LabelerInput protos? If these are the files that EDPs supply I'd rather there be a more official interface format and keep LabelerInput as an implementation detail.

This is just an initial version. Updates in the future will be based on the forthcoming storage schema design doc.

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 71 at r4 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

If you're going to build a labeled impression table down stream you'll want to write both the labelerInput and the labelerOutput to the next file. I'd actually recommend creating a new proto specific to the app that holds both of these.

yeah - I'd prefer that be a todo until we land on an agreed storage schema design doc.

src/main/kotlin/org/wfanet/measurement/securecomputation/teeapps/vidlabeling/VidLabelerApp.kt line 74 at r4 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

You should use the opportunity of having read the records to write them into multiple files so that they can be processed more efficiently downstream. For example, write a file per day per campaign. A second step can then read those files sort by VID and write down the files you have for your partitioning scheme.

I was thinking the underlying storage client would be a sharded storage client but that will come out in the design doc.

kungfucraig

I'm going to wait on the design document before reviewing further.

Reviewable status: 16 of 19 files reviewed, 24 unresolved discussions (waiting on @Marco-Premier and @stevenwarejones)

stevenwarejones added 5 commits January 3, 2025 10:32

commit

ef15606

add Vid Labeler Tee App

e19b2a2

add case for different model formats

9479b91

Merge branch 'stevenwarejones_data_watcher' into stevenwarejones_vid_…

c7b2e5c

…labeling_tee_app

rebase

8c2a7e6

restore MODULE.bazel.lock

83aad6f

stevenwarejones mentioned this pull request Jan 3, 2025

Add Control plane integration test #1992

Open

stevenwarejones added 2 commits January 3, 2025 15:36

Merge branch 'stevenwarejones_data_watcher' into stevenwarejones_vid_…

52020a1

…labeling_tee_app

rebase

a14f362

stevenwarejones requested review from kungfucraig and SanjayVas January 3, 2025 20:42

SanjayVas requested changes Jan 3, 2025

View reviewed changes

SanjayVas mentioned this pull request Jan 3, 2025

Add DataWatcher #1990

Open

Merge branch 'stevenwarejones_data_watcher' into stevenwarejones_vid_…

7b622b9

…labeling_tee_app

stevenwarejones changed the base branch from main to stevenwarejones_data_watcher January 7, 2025 18:48

stevenwarejones requested review from SanjayVas and Marco-Premier January 7, 2025 18:49

stevenwarejones commented Jan 7, 2025

View reviewed changes

SanjayVas requested changes Jan 7, 2025

View reviewed changes

kungfucraig requested changes Jan 10, 2025

View reviewed changes

stevenwarejones commented Jan 13, 2025

View reviewed changes

kungfucraig reviewed Jan 14, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Simple TEE VID Labeling App #1991

Add Simple TEE VID Labeling App #1991

stevenwarejones commented Jan 3, 2025 •

edited

Loading

wfa-reviewable commented Jan 3, 2025

SanjayVas left a comment

stevenwarejones left a comment

SanjayVas left a comment

kungfucraig left a comment

stevenwarejones left a comment

kungfucraig left a comment

Add Simple TEE VID Labeling App #1991

Are you sure you want to change the base?

Add Simple TEE VID Labeling App #1991

Conversation

stevenwarejones commented Jan 3, 2025 • edited Loading

wfa-reviewable commented Jan 3, 2025

SanjayVas left a comment

Choose a reason for hiding this comment

stevenwarejones left a comment

Choose a reason for hiding this comment

SanjayVas left a comment

Choose a reason for hiding this comment

kungfucraig left a comment

Choose a reason for hiding this comment

stevenwarejones left a comment

Choose a reason for hiding this comment

kungfucraig left a comment

Choose a reason for hiding this comment

stevenwarejones commented Jan 3, 2025 •

edited

Loading