[luci/pass] introduce quantize weight with GPTQ pass #14285

BLee-bot · 2024-10-30T01:50:36Z

This introduces a pass that quantizes weights with GPTQ.

ONE-DCO-1.0-Signed-off-by: Banseok Lee [email protected]

Related Issue : #13480
Draft PR: #13585

compiler/luci/pass/include/luci/Pass/QuantizeWeightsWithGPTQPass.h

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

seanshpark · 2024-10-30T03:51:50Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+                                 int channel_dim_index, std::vector<float> &scaling_factor,
+                                 std::vector<float> &max, std::vector<float> &min)
+{
+  int channel_idx = indices[channel_dim_index];


Please add assertion checks for the input arguments.

indices[channel_dim_index] may read out of bounds and cause a segmentation fault.
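
A minimal sketch of the kind of guard requested here. The full signature of the reviewed function is not visible in the diff, so `channel_index_checked` is a hypothetical helper and the `std::vector<uint32_t>` type of `indices` is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the PR): validates channel_dim_index
// before it is used to index into indices.
inline uint32_t channel_index_checked(const std::vector<uint32_t> &indices, int channel_dim_index)
{
  assert(channel_dim_index >= 0);
  assert(static_cast<std::size_t>(channel_dim_index) < indices.size());
  return indices[static_cast<std::size_t>(channel_dim_index)];
}
```

Note that `assert` is compiled out when `NDEBUG` is defined, so this only protects debug builds.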

seanshpark · 2024-10-30T03:59:11Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+void compute_asym_scale_zp(float min, float max, float &scaling_factor, int64_t &zp,
+                           float &nudged_min, float &nudged_max, int32_t k_max_scale)


It looks like some arguments are inputs and some are outputs.
It would be nice to add a comment about this.
k_max_scale would be better grouped with the other inputs.

seanshpark · 2024-10-30T04:03:03Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+    cholesky_inverse(hessian, size_hessian);
+    cholesky_decomposition(hessian, size_hessian);
+
+    // transpose hessian to make upper trangular


Typo: trangular -> triangular?

seanshpark · 2024-10-30T04:04:18Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+      error[cal_offset(dimension, indices)] =
+        (data - (quantized_values[cal_offset(dimension, indices)] - zp[channel_idx]) *
+                  scaling_factor[channel_idx]) /
+        hessian[cal_offset_2d(dimension_hessian, indices_diag_hessian)];


Let's split hessian[cal_offset_2d(dimension_hessian, indices_diag_hessian)] into

auto offset = cal_offset_2d(dimension_hessian, indices_diag_hessian);
hessian[offset];

seanshpark · 2024-10-30T04:06:18Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+private:
+  void fake_quantize_cwq(luci::CircleConst *weights, std::vector<float> &hessian) const
+  {
+    // assert(output_type == loco::DataType::U8); // FIX_CALLER_UNLESS


Suggested change (remove the commented-out assertion)

-    // assert(output_type == loco::DataType::U8); // FIX_CALLER_UNLESS

seanshpark · 2024-10-30T04:08:15Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+    auto new_weights = luci::clone(weights);
+    node->filter(new_weights);
+
+    auto hessian = (*hessian_map)[node];


Where is this hessian_map declared?

It would be better to use a class and give the member variables a _ prefix.
This type is not simple enough to declare as a struct.

I changed it to a class and added _ prefix to member variables.

seanshpark · 2024-10-30T04:12:04Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.test.cpp

+{
+struct QuantizeWeightsWithGPTQPassTest : public ::testing::Test
+{


Suggested change

-{
-struct QuantizeWeightsWithGPTQPassTest : public ::testing::Test
-{
+{
+struct QuantizeWeightsWithGPTQPassTest : public ::testing::Test
+{

seanshpark · 2024-10-30T04:12:30Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.test.cpp

+struct QuantizeWeightsWithGPTQPassTest : public ::testing::Test
+{
+  /**
+   *  nconv graph


What is nconv?

I copied the MakeGraph code blocks from "compiler/luci/pass/src/QuantizeWeightsPass.test.cpp".
It seems this should be changed to conv2d.

seanshpark · 2024-10-30T04:38:12Z

compiler/luci/pass/include/luci/Pass/QuantizeWeightsWithGPTQPass.h

+#include <loco.h>
+
+#include <logo/Pass.h>
+
+#include <luci/Pass/QuantizationParameters.h>
+#include <luci/IR/CircleNode.h>
+#include <unordered_map>


Suggested change

-#include <loco.h>
-
-#include <logo/Pass.h>
-
-#include <luci/Pass/QuantizationParameters.h>
-#include <luci/IR/CircleNode.h>
-#include <unordered_map>
+#include <luci/Pass/QuantizationParameters.h>
+#include <luci/IR/CircleNode.h>
+#include <logo/Pass.h>
+#include <loco.h>
+#include <unordered_map>

seanshpark · 2024-10-30T04:49:19Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+#include <iostream>
+#include <cmath>
+#include <functional>
+#include <limits>


Is <limits> necessary?

It doesn't seem necessary. I'll remove it.

This patch includes minor refactoring to improve overall code quality. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>

seanshpark · 2024-10-31T21:30:01Z

compiler/luci/pass/include/luci/Pass/QuantizeWeightsWithGPTQPass.h

+using HessianMap = std::unordered_map<const luci::CircleNode *, std::vector<float>>;
+
+/**
+ * @brief Pass to quantize weights


This brief is not helpful. Please remove it or give a more useful description.

seanshpark · 2024-10-31T21:30:19Z

compiler/luci/pass/include/luci/Pass/QuantizeWeightsWithGPTQPass.h

+
+private:
+  std::unique_ptr<Context> _ctx;
+  HessianMap *_hessian_map;


Suggested change

-  HessianMap *_hessian_map;
+  HessianMap *_hessian_map = nullptr;

seanshpark · 2024-10-31T21:31:32Z

compiler/luci/pass/src/QuantizationUtils.cpp

@@ -292,6 +292,11 @@ uint32_t cal_offset(loco::TensorShape &dimension, uint32_t *indices)
         indices[2] * dimension.dim(3).value() + indices[3];
 }

+uint32_t cal_offset_2d(loco::TensorShape &dimension, uint32_t *indices)
+{
+  return indices[0] * dimension.dim(1).value() + indices[1];


I can't tell whether the caller always passes safe indices.
It would be better to add assertion checks before dereferencing the pointer and indexing the array.
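
One way the requested checks could look, as a sketch only. To keep it self-contained, `Shape2D` stands in for a rank-2 `loco::TensorShape`, and the `_checked` suffix marks the function as hypothetical rather than the PR's actual code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Stand-in for a rank-2 loco::TensorShape (assumption, not the real type).
using Shape2D = std::array<uint32_t, 2>;

// Row-major offset into a 2D buffer, guarded by the checks asked for above.
inline uint32_t cal_offset_2d_checked(const Shape2D &dimension, const uint32_t *indices)
{
  assert(indices != nullptr);        // pointer must be valid before it is read
  assert(indices[0] < dimension[0]); // row index within bounds
  assert(indices[1] < dimension[1]); // column index within bounds
  return indices[0] * dimension[1] + indices[1];
}
```

With a real `loco::TensorShape`, a rank check and `dim(i).known()` checks would presumably come before the index comparisons.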

seanshpark · 2024-10-31T21:36:27Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+      }
+    }
+  }
+  return;


Suggested change

-  return;

Do you really need a return here?

seanshpark · 2024-10-31T21:36:42Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+          std::cout << "Error: Matrix is not positive definite." << std::endl;
+          return;


how does the caller know about this error?
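
A sketch of one possible answer: report the failure through the return value instead of printing it. The name `cholesky_decomposition_checked`, the row-major `std::vector<float>` layout, and the choice of `bool` over an exception are all assumptions, not the PR's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// In-place Cholesky factorization A = L * L^T of a row-major n x n matrix.
// Only the lower triangle of A is read. Returns false if A is not positive
// definite; the matrix is then left partially modified.
inline bool cholesky_decomposition_checked(std::vector<float> &a, std::size_t n)
{
  assert(a.size() == n * n);
  for (std::size_t i = 0; i < n; ++i)
  {
    for (std::size_t j = 0; j <= i; ++j)
    {
      float sum = a[i * n + j];
      for (std::size_t k = 0; k < j; ++k)
        sum -= a[i * n + k] * a[j * n + k];
      if (i == j)
      {
        if (sum <= 0.0f)
          return false; // the caller sees the error and decides what to do
        a[i * n + i] = std::sqrt(sum);
      }
      else
      {
        a[i * n + j] = sum / a[j * n + j];
      }
    }
  }
  // Clear the strict upper triangle so the result is exactly L.
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = i + 1; j < n; ++j)
      a[i * n + j] = 0.0f;
  return true;
}
```

A caller could then skip quantizing that node or throw, rather than continuing with a half-factored matrix.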

seanshpark · 2024-10-31T21:42:33Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+    {
+      damp += hessian[i * size_hessian + i];
+    }
+    damp /= size_hessian;


check size_hessian != 0
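
A minimal sketch of the guard, assuming the damping term is the mean of the Hessian diagonal as the quoted lines suggest. `mean_diagonal` is a hypothetical helper name, and throwing `std::invalid_argument` is only one possible way to reject an empty matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Mean of the diagonal of a row-major size_hessian x size_hessian matrix.
inline float mean_diagonal(const std::vector<float> &hessian, std::size_t size_hessian)
{
  if (size_hessian == 0)
    throw std::invalid_argument("size_hessian must be non-zero"); // avoids division by zero
  assert(hessian.size() == size_hessian * size_hessian);

  float damp = 0.0f;
  for (std::size_t i = 0; i < size_hessian; ++i)
    damp += hessian[i * size_hessian + i];
  return damp / static_cast<float>(size_hessian);
}
```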

seanshpark · 2024-10-31T21:46:04Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+  zp = nudged_zero_point;
+}
+
+void asymmetric_wquant_per_channel(CircleConst *node, std::vector<float> &min,


This method is quite long and deeply nested, which increases complexity.
Please split it into several methods.

I tried to divide the method into smaller parts, but if there are still parts that should be further divided, please let me know.

Try to split it into small methods that each stay under a fixed length, say 20 or 30 lines.

seanshpark · 2024-10-31T22:19:51Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+      error[cal_offset(dimension, indices)] =
+        (data - (quantized_values[cal_offset(dimension, indices)] - zp[channel_idx]) *
+                  scaling_factor[channel_idx]) /


It would be better to extract cal_offset(dimension, indices) into a variable and use it like error[index].
The same goes for quantized_values[index] - zp[channel_idx].
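
The extraction could look roughly like the sketch below. `store_scaled_error` is a hypothetical helper: `index` and `h_index` stand for the results of `cal_offset` and `cal_offset_2d` computed once by the caller, and the `float` element types are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: writes the scaled quantization error of one element.
// index   : offset from cal_offset(dimension, indices), computed once
// h_index : offset from cal_offset_2d(...) of the Hessian diagonal entry
inline void store_scaled_error(std::vector<float> &error,
                               const std::vector<float> &quantized_values,
                               const std::vector<float> &hessian, std::size_t index,
                               std::size_t h_index, float data, float zp, float scale)
{
  assert(index < error.size() && index < quantized_values.size());
  assert(h_index < hessian.size());

  // Named intermediate instead of repeating the nested expression.
  const float centered = quantized_values[index] - zp;
  error[index] = (data - centered * scale) / hessian[h_index];
}
```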

seanshpark · 2024-10-31T22:20:16Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+                auto _h_offset = cal_offset_2d(dimension_hessian, indices_hessain);
+
+                node->at<loco::DataType::FLOAT32>(cal_offset(dimension, indices_channel_first)) -=
+                  error[cal_offset(dimension, indices_error)] * hessian[_h_offset];


ditto to cal_offset(dimension, indices_error)

This commit adds a description to the QuantizeWeightsWithGPTQPass class. Additionally, some variables have been extracted from the function parameters to improve readability and maintainability. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>

This commit adds support for the GPTQ algorithm and hessian map to the CircleQuantizer class in LUCI. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>

seanshpark · 2024-11-01T08:00:54Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

+/**
+ * @brief Compute the scale and zero point for the given range of values
+ * @param min: The minimum value in the range of values to be quantized.
+ * @param max: The maximum value in the range of values to be quantized.
+ * @param k_max_scale: The maximum value of the quantization scale.
+ * @param scaling_factor: The computed scaling factor for the quantization.
+ * @param zp: The computed zero point for the quantization.
+ * @param nudged_min: The nudged minimum value after applying the scaling factor and zero point.
+ * @param nudged_max: The nudged maximum value after applying the scaling factor and zero point.
+ */


It's OK to add explanations for each argument, but since the other functions don't have them, we don't need help text like this here.
If you really want this many comments, it would be better to add them to the other functions too, but I don't recommend that.

As I wrote, some arguments are inputs and some are outputs, and this couldn't be seen at a glance when k_max_scale was at the end of the list even though it is an input.

Please read my comments carefully.

I recommend removing all these comments unless they are really necessary for understanding this function.

seanshpark · 2024-11-01T08:02:07Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.cpp

@@ -0,0 +1,697 @@
+/*
+ * Copyright (c) 2024 Samsung Electronics Co., Ltd. All Rights Reserved
+ * Copyright 2019 The TensorFlow Authors. All Rights Reserved.


what TF code did you reference?

jinevening · 2024-11-01T08:46:23Z

compiler/luci/pass/src/QuantizeWeightsWithGPTQPass.test.cpp

+  }
+
+  luci::QuantizeWeightsWithGPTQPass pass(std::move(ctx), &hessian_map);
+  EXPECT_NO_THROW(pass.run(&_g));


Can you add some value tests? It is difficult for me to ensure that those implementations (cholesky things) are correct.
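
One shape such a value test could take, sketched under assumptions: rather than hard-coding expected factors, check that the computed lower-triangular factor L reproduces the input, i.e. L * L^T == A within a tolerance. `reconstructs` is a hypothetical helper, and how the pass's `cholesky_decomposition` would be exposed to the test is left open.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true if L * L^T reproduces A within tol (both row-major, n x n).
inline bool reconstructs(const std::vector<float> &l, const std::vector<float> &a,
                         std::size_t n, float tol)
{
  assert(tol >= 0.0f);
  if (l.size() != n * n || a.size() != n * n)
    return false;
  for (std::size_t i = 0; i < n; ++i)
  {
    for (std::size_t j = 0; j < n; ++j)
    {
      float sum = 0.0f;
      for (std::size_t k = 0; k < n; ++k)
        sum += l[i * n + k] * l[j * n + k]; // (L * L^T)[i][j]
      if (std::fabs(sum - a[i * n + j]) > tol)
        return false;
    }
  }
  return true;
}
```

In a GoogleTest case this would be used as `EXPECT_TRUE(reconstructs(factor, original, n, 1e-4f));` after running the decomposition on a copy of `original`.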

This commit refactors the QuantizeWeightsWithGPTQPass.cpp file to improve its readability and maintainability. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>

seanshpark · 2024-11-03T21:34:11Z

@01000-you, @BLee-bot, there are lots of changes across mixed files,
and they drew so many comments that even I cannot remember what I've left.

Please split this PR into several PRs, each under about 50 lines, that you and I can manage.

seanshpark · 2024-11-04T03:35:40Z

You can split the PRs by

CircleQuantizer.h
QuantizationUtils.h and cpp
QuantizeWeightsWithGPTQPass.h and cpp

so that each one doesn't exceed 50 lines and has a single context of change.

Context means introducing something, removing something, or changing existing code for a single purpose.

01000-you · 2024-11-04T07:18:31Z

@seanshpark, @jinevening

Thank you for your kind reviews. I'll follow your guide and upload it accordingly.

seanshpark mentioned this pull request Oct 30, 2024

[record-hessian] Introduce HessianComputer #14265

Merged

01000-you force-pushed the gp branch 5 times, most recently from 5fc8609 to f56171f Compare October 31, 2024 11:52

Refactoring and clean-up

bce1b10

This patch includes minor refactoring to improve overall code quality. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>

01000-you force-pushed the gp branch from f56171f to bce1b10 Compare October 31, 2024 12:04

y01000.you added 2 commits November 1, 2024 10:23

Add GPTQ algorithm and hessian map support for circle quantizer

9387b96

This commit adds support for the GPTQ algorithm and hessian map to the CircleQuantizer class in LUCI. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>

01000-you force-pushed the gp branch from 7623fbd to ee758f1 Compare November 1, 2024 10:54

Refactor GPTQPass to improve readability and maintainability

3574826

This commit refactors the QuantizeWeightsWithGPTQPass.cpp file to improve its readability and maintainability. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>

01000-you force-pushed the gp branch from ee758f1 to 3574826 Compare November 1, 2024 10:56

seanshpark mentioned this pull request Nov 4, 2024

[record-hessian] Introduce RecordHessian. #14291

Closed

seanshpark closed this Nov 4, 2024
