-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[luci/pass] introduce quantize weight with GPTQ pass #14285
Conversation
compiler/luci/pass/include/luci/Pass/QuantizeWeightsWithGPTQPass.h
Outdated
Show resolved
Hide resolved
compiler/luci/pass/include/luci/Pass/QuantizeWeightsWithGPTQPass.h
Outdated
Show resolved
Hide resolved
int channel_dim_index, std::vector<float> &scaling_factor, | ||
std::vector<float> &max, std::vector<float> &min) | ||
{ | ||
int channel_idx = indices[channel_dim_index]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add some assertion checks for input arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indices[channel_dim_index];
may produce out of bounds problem and produce segmentation fault.
void compute_asym_scale_zp(float min, float max, float &scaling_factor, int64_t &zp, | ||
float &nudged_min, float &nudged_max, int32_t k_max_scale) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like some arguments are input and some are outputs.
it would be nice to add some comments about this.
k_max_scale
would be better to group with other inputs.
cholesky_inverse(hessian, size_hessian); | ||
cholesky_decomposition(hessian, size_hessian); | ||
|
||
// transpose hessian to make upper trangular |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trangular
?
error[cal_offset(dimension, indices)] = | ||
(data - (quantized_values[cal_offset(dimension, indices)] - zp[channel_idx]) * | ||
scaling_factor[channel_idx]) / | ||
hessian[cal_offset_2d(dimension_hessian, indices_diag_hessian)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets split hessian[cal_offset_2d(dimension_hessian, indices_diag_hessian)]
to
auto offset = cal_offset_2d(dimension_hessian, indices_diag_hessian)
hessian[offset];
private: | ||
void fake_quantize_cwq(luci::CircleConst *weights, std::vector<float> &hessian) const | ||
{ | ||
// assert(output_type == loco::DataType::U8); // FIX_CALLER_UNLESS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// assert(output_type == loco::DataType::U8); // FIX_CALLER_UNLESS |
auto new_weights = luci::clone(weights); | ||
node->filter(new_weights); | ||
|
||
auto hessian = (*hessian_map)[node]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this hessian_map
declared?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be better to use class
and use _
prefix to member variables.
this class is not simple to declare as struct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to a class
and added _
prefix to member variables.
{ | ||
struct QuantizeWeightsWithGPTQPassTest : public ::testing::Test | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{ | |
struct QuantizeWeightsWithGPTQPassTest : public ::testing::Test | |
{ | |
{ | |
struct QuantizeWeightsWithGPTQPassTest : public ::testing::Test | |
{ |
struct QuantizeWeightsWithGPTQPassTest : public ::testing::Test | ||
{ | ||
/** | ||
* nconv graph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is nconv
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied MakeGraph code blocks from "compiler/luci/pass/src/QuantizeWeightsPass.test.cpp"
It seems that this should change to conv2d
#include <loco.h> | ||
|
||
#include <logo/Pass.h> | ||
|
||
#include <luci/Pass/QuantizationParameters.h> | ||
#include <luci/IR/CircleNode.h> | ||
#include <unordered_map> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#include <loco.h> | |
#include <logo/Pass.h> | |
#include <luci/Pass/QuantizationParameters.h> | |
#include <luci/IR/CircleNode.h> | |
#include <unordered_map> | |
#include <luci/Pass/QuantizationParameters.h> | |
#include <luci/IR/CircleNode.h> | |
#include <logo/Pass.h> | |
#include <loco.h> | |
#include <unordered_map> |
#include <iostream> | ||
#include <cmath> | ||
#include <functional> | ||
#include <limits> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is limits
necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem necessary. I'll remove it
5fc8609
to
f56171f
Compare
This patch includes minor refactoring to improve overall code quality. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>
using HessianMap = std::unordered_map<const luci::CircleNode *, std::vector<float>>; | ||
|
||
/** | ||
* @brief Pass to quantize weights |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this brief doesn't give any help. plz remove or give more useful description.
|
||
private: | ||
std::unique_ptr<Context> _ctx; | ||
HessianMap *_hessian_map; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HessianMap *_hessian_map; | |
HessianMap *_hessian_map = nullptr; |
@@ -292,6 +292,11 @@ uint32_t cal_offset(loco::TensorShape &dimension, uint32_t *indices) | |||
indices[2] * dimension.dim(3).value() + indices[3]; | |||
} | |||
|
|||
uint32_t cal_offset_2d(loco::TensorShape &dimension, uint32_t *indices) | |||
{ | |||
return indices[0] * dimension.dim(1).value() + indices[1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know caller is calling with safe index.
It would be better to add some assertion check before accessing ptr and array.
} | ||
} | ||
} | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you really need to add return
here?
std::cout << "Error: Matrix is not positive definite." << std::endl; | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does the caller know about this error?
{ | ||
damp += hessian[i * size_hessian + i]; | ||
} | ||
damp /= size_hessian; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check size_hessian != 0
zp = nudged_zero_point; | ||
} | ||
|
||
void asymmetric_wquant_per_channel(CircleConst *node, std::vector<float> &min, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method is quite long and have many depths that may increase complexity.
plz split this to several methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to divide the method into smaller parts, but if there are still parts that should be further divided, please let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try to split to small methods under some certain lines, line 20 or 30.
error[cal_offset(dimension, indices)] = | ||
(data - (quantized_values[cal_offset(dimension, indices)] - zp[channel_idx]) * | ||
scaling_factor[channel_idx]) / |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be better to extract cal_offset(dimension, indices)
to some variable and use it like error[index]
.
and to quantized_values[index] - zp[channel_idx]
too
auto _h_offset = cal_offset_2d(dimension_hessian, indices_hessain); | ||
|
||
node->at<loco::DataType::FLOAT32>(cal_offset(dimension, indices_channel_first)) -= | ||
error[cal_offset(dimension, indices_error)] * hessian[_h_offset]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto to cal_offset(dimension, indices_error)
This commit adds a description to the QuantizeWeightsWithGPTQPass class. Additionally, some variables have been extracted from the function parameters to improve readability and maintainability. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>
This commit adds support for the GPTQ algorithm and hessian map to the CircleQuantizer class in LUCI. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>
/** | ||
* @brief Compute the scale and zero point for the given range of values | ||
* @param min: The minimum value in the range of values to be quantized. | ||
* @param max: The maximum value in the range of values to be quantized. | ||
* @param k_max_scale: The maximum value of the quantization scale. | ||
* @param scaling_factor: The computed scaling factor for the quantization. | ||
* @param zp: The computed zero point for the quantization. | ||
* @param nudged_min: The nudged minimum value after applying the scaling factor and zero point. | ||
* @param nudged_max: The nudged maximum value after applying the scaling factor and zero point. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, it's ok to add explanations for each arguments but as other functions don't provide this, so we don't need to add helps like this.
If you really want to add this much of comments, it would be better add to others too... but I don't recommend that.
As I wrote, some arguments are input and some are outputs and this can't be identified at sight when you previously had k_max_scale
at the end but it was used as input.
Please read my comments carefully.
I recommend to remove all these comments if not really necessary understanding this function.
@@ -0,0 +1,697 @@ | |||
/* | |||
* Copyright (c) 2024 Samsung Electronics Co., Ltd. All Rights Reserved | |||
* Copyright 2019 The TensorFlow Authors. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what TF code did you reference?
} | ||
|
||
luci::QuantizeWeightsWithGPTQPass pass(std::move(ctx), &hessian_map); | ||
EXPECT_NO_THROW(pass.run(&_g)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some value tests? It is difficult for me to ensure that those implementations (cholesky things) are correct.
This commit refactors the QuantizeWeightsWithGPTQPass.cpp file to improve its readability and maintainability. ONE-DCO-1.0-Signed-off-by: y01000.you <[email protected]>
@01000-you , @BLee-bot , there are lots of changes and mixed files Please split this PR to several PRs, like under 50 lines, that I and you guys can manage. |
you can split PRs with
that doesn't exceed 50 lines with single context of change. context means, introducing something, removing something, changing existing codes for single purpose. |
Thank you for your kind reviews. I'll follow your guide and upload it accordingly. |
This introduces quantize weight with GPTQ pass
ONE-DCO-1.0-Signed-off-by: Banseok Lee [email protected]
Related Issue : #13480
Draft PR: #13585