Merge pull request #34 from ZKStats/tmp/recheck_func

Tmp/recheck func
ZKStats · May 14, 2024 · 54972b7
2 parents 09db78e + c849e60
Showing 48 changed files with 2,631 additions and 2,166 deletions.
diff --git a/README.md b/README.md
@@ -56,10 +56,10 @@ def user_computation(s: State, data: list[torch.Tensor]) -> torch.Tensor:
     # Compute the median of the second column
     median2 = s.median(data[1])
     # Compute the mean of the medians
-    return s.mean(torch.Tensor([median1, median2]).reshape(1, -1, 1))
+    return s.mean(torch.cat((median1.unsqueeze(0), median2.unsqueeze(0))).reshape(1,-1,1))
 ```
 
-> NOTE: `reshape` is required for now since input must be in shape `[1, data_size, 1]` for now. It should be addressed in the future
+> NOTE: `reshape` is currently required because the input must be in shape `[1, data_size, 1]`. This should be addressed in the future. The same applies to `torch.cat()` and `unsqueeze()`; we will write wrappers for them in the future.
 
 #### Torch Operations
 
@@ -88,7 +88,7 @@ def user_computation(s: State, data: list[torch.Tensor]) -> torch.Tensor:
 ### Proof Generation and Verification
 
 The flow between data providers and users is as follows:
-![zkstats-lib-flow](./assets/zkstats-lib.png)
+![zkstats-lib-flow](./assets/zkstats-flow.png)
 
 #### Data Provider: generate data commitments
 
@@ -109,12 +109,16 @@ When generating a proof, since dataset might contain floating points, data provi
 
 #### Both: derive PyTorch model from the computation
 
-When a user wants to request a data provider to generate a proof for their defined computation, the user must send the data provider first. Then, both the data provider and the user transform the model to necessary settings, respectively.
+When a user wants to request a data provider to generate a proof for their defined computation, the user must let the data provider know what the computation is. The data provider, who holds the real dataset, then generates a model from the computation using the `computation_to_model()` method. Since we use the witness approach (described in the Note section below), the data provider must send the pre-calculated witness back to the verifier. The verifier then uses the pre-calculated witness to generate a model from the computation that is exactly the same as the prover's.
+
+Note that we could also let the prover generate the model and send it to the verifier directly. However, to make sure that the prover's model actually comes from the verifier's computation, it is better to have the verifier generate the model itself from its own computation, with the help of the pre-calculated witness.
 
 ```python
 from zkstats.core import computation_to_model
-
-_, model = computation_to_model(user_computation)
+# For prover: generate prover_model, and write to precal_witness file
+_, prover_model = computation_to_model(user_computation, precal_witness_path, True, error)
+# For verifier, generate verifier model (which is same as prover_model) by reading precal_witness file
+_, verifier_model = computation_to_model(user_computation, precal_witness_path, False, error)
 ```
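
The prover/verifier hand-off described above can be sketched with a toy stand-in. This is not the zkstats implementation: `toy_mean_model`, the JSON witness file layout, and the mean check are all assumptions made for illustration, and only the argument order mirrors the `computation_to_model` calls shown in the README. The prover does the division once with the real data and writes the result to the witness file. The verifier reads that file and ends up with a model carrying the same witness.

```python
import json
import os
import tempfile


def toy_mean_model(data, precal_witness_path, is_prover, error):
    """Toy stand-in for illustration only; NOT zkstats' computation_to_model."""
    if is_prover:
        # The prover holds the real data, so it does the division once,
        # outside the "circuit", and records the result as the witness.
        witness = sum(data) / len(data)
        with open(precal_witness_path, "w") as f:
            json.dump({"mean": witness}, f)
    else:
        # The verifier has no real data; it reads the prover's witness instead.
        with open(precal_witness_path) as f:
            witness = json.load(f)["mean"]

    def model(x):
        # Check the witness with addition, multiplication and comparison only:
        # |sum(x) - n * witness| <= |error * sum(x)|
        total = sum(x)
        is_valid = abs(total - len(x) * witness) <= abs(error * total)
        return is_valid, witness

    model.witness = witness  # exposed only so the two models can be compared
    return model


# "x" column of examples/1.only_torch/data.json
real_data = [0.5, 1, 2, 3, 4, 5, 6, 7]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "precal_witness.json")
    prover_model = toy_mean_model(real_data, path, True, 0.01)
    verifier_model = toy_mean_model(None, path, False, 0.01)

print(prover_model(real_data))                         # (True, 3.5625)
print(verifier_model.witness == prover_model.witness)  # True
```

In this sketch the verifier never sees `real_data`; it rebuilds the model from the computation and the witness file alone, which is the property the paragraph above argues for.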
 
 #### Data Provider: generate settings
@@ -204,14 +208,14 @@ See our jupyter notebook for [examples](./examples/).
 
 ## Benchmarks
 
+TOFIX: Update the benchmark method. See more in issues.
 See our jupyter notebook for [benchmarks](./benchmark/).
-TODO: clean benchmark
 
 ## Note
 
 - We use the witness approach instead of directly calculating values in the circuit. This sometimes lets us avoid calculating operations such as division or exponentiation in the circuit, which require a larger scale in the settings. (Without a larger scale in those cases, the accuracy would be very poor.)
 - Dummy data fed into the verifier's onnx file needs to have the same shape as the private dataset, but can be filled with any values (we just randomize them uniformly in 1-10 with 1 decimal).
-- For Mode function, if there are more than 1 value possible, we just output one of them (the one that first encountered), conforming to the spec of statistics.mode in python lib (https://docs.python.org/3.9/library/statistics.html#statistics.mode)
+- For the Mode function, if more than one value is possible, we just output the one first encountered, conforming to the spec of `statistics.mode` in the Python standard library (https://docs.python.org/3.9/library/statistics.html#statistics.mode)
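
The last two notes can be illustrated with the standard library alone. This is a minimal sketch, not code from zkstats: `dummy_like` is a hypothetical helper, and it handles only a one-dimensional column, whereas the library works with tensors. The mode behaviour shown is that of `statistics.mode` on Python 3.8 and later.

```python
import random
import statistics


def dummy_like(private_column, seed=None):
    """Hypothetical helper: same length as the private column, random contents."""
    rng = random.Random(seed)
    # Uniform in [1, 10], rounded to 1 decimal, as described in the note.
    return [round(rng.uniform(1, 10), 1) for _ in private_column]


# First values of examples/mode/data.json
private_column = [46.2, 40.4, 44.8, 48.1, 51.2]
dummy_column = dummy_like(private_column, seed=0)
print(len(dummy_column) == len(private_column))  # True

# 2.2 and 1.1 both occur twice; statistics.mode returns the one seen first.
tied = [2.2, 1.1, 1.1, 2.2]
print(statistics.mode(tied))  # 2.2
```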
 
 ## Legacy
 

diff --git a/assets/zkstats-flow.png b/assets/zkstats-flow.png
diff --git a/assets/zkstats-lib.png b/assets/zkstats-lib.png
diff --git a/examples/1.only_torch/data.json b/examples/1.only_torch/data.json
@@ -0,0 +1,4 @@
+{
+  "x": [0.5, 1, 2, 3, 4, 5, 6, 7],
+  "y": [2.7, 3.3, 1.1, 2.2, 3.8, 8.2, 4.4, 3.8]
+}
diff --git a/examples/1.only_torch/only_torch.ipynb b/examples/1.only_torch/only_torch.ipynb
diff --git a/examples/2.torch+state/data.json b/examples/2.torch+state/data.json
@@ -0,0 +1,4 @@
+{
+  "x": [0.5, 1, 2, 3, 4, 5, 6],
+  "y": [2.7, 3.3, 1.1, 2.2, 3.8, 8.2, 4.4, 3.8]
+}
diff --git a/examples/2.torch+state/torch+state.ipynb b/examples/2.torch+state/torch+state.ipynb
diff --git a/examples/3.state/data.json b/examples/3.state/data.json
@@ -0,0 +1,4 @@
+{
+  "x": [0.5, 1, 2, 3, 4, 5, 6],
+  "y": [2.7, 3.3, 1.1, 2.2, 3.8, 8.2, 4.4, 3.8]
+}
diff --git a/examples/3.state/state.ipynb b/examples/3.state/state.ipynb
diff --git a/examples/computation/computation.ipynb b/examples/computation/computation.ipynb
diff --git a/examples/computation/data.json b/examples/computation/data.json
diff --git a/examples/correlation/correlation.ipynb b/examples/correlation/correlation.ipynb
diff --git a/examples/covariance/covariance.ipynb b/examples/covariance/covariance.ipynb
diff --git a/examples/geomean/geomean.ipynb b/examples/geomean/geomean.ipynb
diff --git a/examples/harmomean/harmomean.ipynb b/examples/harmomean/harmomean.ipynb
diff --git a/examples/mean+median/data.json b/examples/mean+median/data.json
diff --git a/examples/mean+median/mean+median.ipynb b/examples/mean+median/mean+median.ipynb
diff --git a/examples/mean/mean.ipynb b/examples/mean/mean.ipynb
diff --git a/examples/median/median.ipynb b/examples/median/median.ipynb
diff --git a/examples/mode/data.json b/examples/mode/data.json
@@ -1,28 +1,8 @@
 {
   "col_name": [
-    23.2, 92.8, 91.0, 37.2, 82.0, 15.5, 79.3, 46.6, 98.1, 75.5, 78.9, 77.6,
-    33.8, 75.7, 96.8, 12.3, 18.4, 13.4, 6.0, 8.2, 25.8, 41.3, 68.5, 15.2, 74.7,
-    72.7, 18.0, 42.2, 36.1, 76.7, 1.2, 96.4, 4.9, 92.0, 12.8, 28.2, 61.8, 56.9,
-    44.3, 50.4, 81.6, 72.5, 12.9, 40.3, 12.8, 28.8, 36.3, 16.1, 68.4, 35.3,
-    79.2, 48.4, 97.1, 93.7, 77.0, 48.7, 93.7, 54.1, 65.4, 30.8, 34.4, 31.4,
-    78.7, 12.7, 90.7, 39.4, 86.0, 55.9, 6.8, 22.2, 65.3, 18.8, 7.1, 55.9, 38.6,
-    15.6, 59.2, 77.3, 76.9, 11.9, 19.9, 19.4, 54.3, 39.4, 4.0, 61.1, 16.8, 81.9,
-    49.3, 76.9, 19.2, 68.2, 54.4, 70.2, 89.8, 23.4, 67.5, 18.7, 10.8, 80.7,
-    80.3, 96.2, 62.3, 17.2, 23.0, 98.0, 19.1, 8.1, 36.2, 7.5, 55.9, 1.2, 56.8,
-    85.1, 18.9, 23.0, 13.5, 64.3, 9.1, 14.1, 14.1, 23.1, 73.2, 86.6, 39.1, 45.5,
-    85.0, 79.0, 15.8, 5.2, 81.5, 34.3, 24.3, 14.2, 84.6, 33.7, 86.3, 83.3, 62.8,
-    72.7, 14.7, 36.8, 92.5, 4.7, 30.0, 59.4, 57.6, 37.4, 22.0, 20.9, 61.6, 26.8,
-    47.1, 63.6, 6.0, 96.6, 61.2, 80.2, 59.3, 23.1, 29.3, 46.3, 89.2, 77.6, 83.2,
-    87.2, 63.2, 81.8, 55.0, 59.7, 57.8, 43.4, 92.4, 66.9, 82.1, 51.0, 22.1,
-    29.9, 41.0, 85.2, 61.5, 14.6, 48.0, 52.7, 31.4, 83.9, 35.5, 77.3, 35.8,
-    32.6, 22.2, 19.3, 49.1, 70.9, 43.9, 88.8, 56.3, 41.8, 90.3, 20.4, 80.4,
-    36.4, 91.5, 69.6, 75.3, 92.4, 84.8, 17.7, 2.3, 41.3, 91.3, 68.6, 73.3, 62.5,
-    60.5, 73.5, 70.7, 77.5, 76.8, 98.1, 40.9, 66.3, 8.6, 48.9, 75.4, 14.7, 35.9,
-    89.6, 15.1, 45.0, 77.6, 30.5, 76.1, 46.9, 34.3, 65.1, 43.9, 91.6, 88.8, 8.9,
-    42.9, 11.8, 32.1, 20.1, 48.9, 79.7, 15.3, 45.4, 80.1, 73.1, 76.5, 52.4, 9.6,
-    41.9, 52.7, 55.1, 30.9, 83.7, 46.7, 39.3, 40.5, 52.4, 19.2, 25.8, 52.7,
-    81.0, 38.0, 54.5, 15.3, 64.3, 88.3, 49.8, 90.5, 90.4, 79.7, 87.3, 32.3,
-    11.9, 5.7, 33.6, 75.1, 65.9, 29.1, 39.4, 87.5, 3.3, 66.3, 79.0, 97.9, 69.6,
-    22.0, 62.8, 97.1, 90.4, 39.5, 11.7, 30.3, 18.9, 34.6, 6.6
+    46.2, 40.4, 44.8, 48.1, 51.2, 91.9, 38.2, 36.3, 22.2, 11.5, 17.9, 20.2,
+    99.9, 75.2, 29.8, 19.4, 46.1, 94.8, 6.6, 94.5, 99.7, 1.6, 4.0, 86.7, 28.7,
+    63.0, 66.7, 2.5, 41.4, 35.6, 45.0, 44.8, 9.6, 16.6, 9.8, 20.3, 25.9, 71.9,
+    27.5, 30.9, 62.9, 44.8, 45.7, 2.4, 91.4, 16.2, 61.5, 41.4, 77.1, 44.8
   ]
 }