Merge pull request #237 from huawei-noah/zjj_release_1.8.4
release 1.8.4
zhangjiajin authored Jun 2, 2022
2 parents 09b74a7 + 8cd5254 commit ab8b723
Showing 27 changed files with 241 additions and 116 deletions.
11 changes: 7 additions & 4 deletions README.cn.md
@@ -9,13 +9,16 @@

---

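**Vega ver1.8.2 released**
**Vega ver1.8.4 released**

- Bug fixes

- Fixed broken links in the documentation.
- The evaluation service supports multiple inputs.
- Fixed an error when using Apex on the NPU.
- Fixed a failure when the ASHA algorithm updates data.
- Fixed the loss not being updated under HCCL+Apex.
- Added dictionary-type metrics.
- Updated the security configuration document.
- Removed support for Horovod and TensorFlow in security mode.
- Added the requirement of Python 3.9 or later in security mode.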

---

11 changes: 7 additions & 4 deletions README.md
@@ -8,13 +8,16 @@

---

**Vega ver1.8.2 released**
**Vega ver1.8.4 released**

- Bug Fixed:

- Fixed bad document links.
- The model to be evaluated supports multiple inputs.
- Fixed using Apex on the NPU.
- Fixed bug that ASHA failed to update data.
- Fixed bug that loss is not updated on HCCL+Apex.
- Add dictionary metrics.
- Update the security configuration document.
- Horovod and TensorFlow are not allowed in security mode.
- Python 3.9 or later is required in security mode.

---

4 changes: 2 additions & 2 deletions RELEASE.md
@@ -1,10 +1,10 @@
**Vega ver1.8.2 released:**
**Vega ver1.8.4 released:**

**Introduction**

Vega is an AutoML algorithm tool chain developed by Noah's Ark Laboratory, the main features are as follows:

1. Full pipeline capailities: The AutoML capabilities cover key functions such as Hyperparameter Optimization, Data Augmentation, Network Architecture Search (NAS), Model Compression, and Fully Train. These functions are highly decoupled and can be configured as required, construct a complete pipeline.
1. Full pipeline capabilities: The AutoML capabilities cover key functions such as Hyperparameter Optimization, Data Augmentation, Network Architecture Search (NAS), Model Compression, and Fully Train. These functions are highly decoupled and can be configured as required to construct a complete pipeline.
2. Industry-leading AutoML algorithms: Provides Noah's Ark Laboratory's self-developed **industry-leading algorithms (Benchmark)** and **Model Zoo** to download the State-of-the-art (SOTA) models.
3. Fine-grained network search space: The network search space can be freely defined, and rich network architecture parameters are provided for use in the search space. The network architecture parameters and model training hyperparameters can be searched at the same time, and the search space can be applied to Pytorch, TensorFlow and MindSpore.
4. High-concurrency neural network training capability: Provides high-performance trainers to accelerate model training and evaluation.
2 changes: 1 addition & 1 deletion docs/cn/user/README.md
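@@ -32,6 +32,6 @@ Vega provides examples for each of the algorithms above; by trying the examples, you can quickly

## 3. On-Device Model Evaluation

Vega also provides on-device model evaluation. The supported device hardware includes Davinci inference chips (ATLAS 200 DK, ATLAS 300 products, and the Evb development board environment) and mobile phones. Deployment and evaluation on [Bolt](https://github.com/huawei-noah/bolt) are supported.
Vega also provides on-device model evaluation. The supported device hardware includes Davinci inference chips (ATLAS 200 DK, ATLAS 300I products, and the Evb development board environment) and mobile phones. Deployment and evaluation on [Bolt](https://github.com/huawei-noah/bolt) are supported.

Refer to the [Evaluation Service Installation and Configuration Guide](./evaluate_service.md) to install and configure the evaluation service. It evaluates the models found during architecture search in real time, so that you obtain a model suited to the target device.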
84 changes: 55 additions & 29 deletions docs/cn/user/security_configure.md
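@@ -9,9 +9,8 @@ Vega security configuration includes the following steps:
5. Encrypt the private key passphrase
6. Configure the security-related configuration files
7. Configure the evaluation service daemon
8. Install dask and distributed
9. Configure the HCCL whitelist
10. Notes
8. Configure the HCCL whitelist
9. Notes

## 1. Install OpenSSL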

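@@ -36,6 +35,7 @@ openssl req -new -x509 -key ca.key -out ca.crt -subj "/C=<country>/ST=<province>

1. Fill in `<country>`, `<province>`, `<city>`, `<organization>`, `<group>`, and `<cn>` above according to your actual situation and remove the `<>` symbols. The same applies to the configurations later in this document. The CA configuration must differ from the others.
2. An RSA key length of 3072 bits or more is recommended; this example uses 4096 bits.
3. The default certificate validity period is 30 days. Use the `-days` parameter to adjust it; for example, `-days 365` sets the validity period to 365 days.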
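
To make the validity note concrete, here is a minimal Python sketch of the arithmetic behind `-days`. The helper name `expiry_date` is hypothetical and is not part of Vega or OpenSSL; it only shows how the default of 30 days and an explicit value such as 365 translate into an expiry date.

```python
from datetime import datetime, timedelta, timezone


def expiry_date(issued_at: datetime, days: int = 30) -> datetime:
    """Return the expiry time of a certificate issued at `issued_at`.

    `days` plays the role of the `-days` option; 30 is the default
    validity period mentioned in the note above.
    """
    return issued_at + timedelta(days=days)


issued = datetime(2022, 6, 2, tzinfo=timezone.utc)
print(expiry_date(issued))            # default validity of 30 days
print(expiry_date(issued, days=365))  # equivalent of `-days 365`
```

A check like this can help schedule certificate renewal before the evaluation service starts rejecting connections.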

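## 3. Generate the Certificates Used by the Evaluation Service

@@ -46,12 +46,29 @@ openssl req -new -x509 -key ca.key -out ca.crt -subj "/C=<country>/ST=<province>

### 3.1 Generate Encrypted Certificates

Run the following commands to obtain the certificate configuration file:

1. Query the path of the openssl configuration file:

`openssl version -d`

In the output, find a line similar to `OPENSSLDIR: "/etc/pki/tls"`, where "/etc/pki/tls" is the directory that contains the configuration file.

2. Copy the configuration file to the current directory:

`cp /etc/pki/tls/openssl.cnf .`

3. In the configuration file openssl.cnf, add the following configuration item:

`req_extensions = v3_req # The extensions to add to a certificate request`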
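
The lookup in step 1 can also be scripted. The following is a minimal Python sketch, assuming the output contains a line of the form `OPENSSLDIR: "<dir>"` as described above. The function name `parse_openssl_dir` is hypothetical and not part of Vega.

```python
import re


def parse_openssl_dir(version_output: str) -> str:
    """Extract the directory from a line such as: OPENSSLDIR: "/etc/pki/tls"."""
    match = re.search(r'OPENSSLDIR:\s*"([^"]+)"', version_output)
    if match is None:
        raise ValueError("OPENSSLDIR not found in the given output")
    return match.group(1)


# Sample text in the format printed by `openssl version -d`.
sample_output = 'OPENSSLDIR: "/etc/pki/tls"'
config_path = parse_openssl_dir(sample_output) + "/openssl.cnf"
print(config_path)
```

In practice the input string could come from `subprocess.run(["openssl", "version", "-d"], capture_output=True, text=True).stdout`, and the resulting path is the file to copy in step 2.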

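Run the following script to generate the encrypted private key for the certificate used by the evaluation server. When you run the command, you are prompted for an encryption password. The password strength requirements are as follows:

1. The password must be at least 8 characters long
2. It must contain at least 1 uppercase letter
3. It must contain at least 1 lowercase letter
4. It must contain at least 1 digit
5. It must contain at least 1 special character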
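
The five password rules above can be checked before running the command. This is a minimal Python sketch, not Vega's own validation: the function name `check_password_strength` is hypothetical, and treating "special character" as ASCII punctuation (`string.punctuation`) is an assumption, since the source does not define the set.

```python
import string


def check_password_strength(password: str) -> list:
    """Return the unmet requirements; an empty list means the password passes."""
    problems = []
    if len(password) < 8:
        problems.append("length must be at least 8")
    if not any(c.isupper() for c in password):
        problems.append("must contain an uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("must contain a lowercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("must contain a digit")
    # Assumption: "special character" means ASCII punctuation.
    if not any(c in string.punctuation for c in password):
        problems.append("must contain a special character")
    return problems


print(check_password_strength("Abcdef1!"))  # passes all five rules
print(check_password_strength("abc"))       # fails four of the five rules
```

Returning the list of failed rules rather than a single boolean makes it easy to tell the user what to change.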

```shell
openssl genrsa -aes-256-ofb -out server.key 4096
@@ -60,8 +77,8 @@ openssl genrsa -aes-256-ofb -out server.key 4096
Then run the following commands to generate the certificate and delete the temporary file:

```shell
openssl req -new -key server.key -out server.csr -extensions v3_ca -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>"
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt
openssl req -new -key server.key -out server.csr -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>" -config ./openssl.cnf -extensions v3_req
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt -extfile ./openssl.cnf -extensions v3_req
rm server.csr
```

@@ -74,8 +91,8 @@ openssl genrsa -aes-256-ofb -out client.key 4096
Then run the following commands to generate the certificate and delete the temporary file:

```shell
openssl req -new -key client.key -out client.csr -extensions v3_ca -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>"
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt
openssl req -new -key client.key -out client.csr -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>" -config ./openssl.cnf -extensions v3_req
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt -extfile ./openssl.cnf -extensions v3_req
rm client.csr
```

@@ -85,13 +102,13 @@ rm client.csr

```shell
openssl genrsa -out server.key 4096
openssl req -new -key server.key -out server.csr -extensions v3_ca -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>"
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt
openssl req -new -key server.key -out server.csr -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>" -config ./openssl.cnf -extensions v3_req
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt -extfile ./openssl.cnf -extensions v3_req
rm server.csr

openssl genrsa -out client.key 4096
openssl req -new -key client.key -out client.csr -extensions v3_ca -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>"
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt
openssl req -new -key client.key -out client.csr -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>" -config ./openssl.cnf -extensions v3_req
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt -extfile ./openssl.cnf -extensions v3_req
rm client.csr
```

@@ -101,13 +118,13 @@ rm client.csr

```shell
openssl genrsa -out server_dask.key 4096
openssl req -new -key server_dask.key -out server_dask.csr -extensions v3_ca -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>"
openssl x509 -req -in server_dask.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server_dask.crt
openssl req -new -key server_dask.key -out server_dask.csr -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>" -config ./openssl.cnf -extensions v3_req
openssl x509 -req -in server_dask.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server_dask.crt -extfile ./openssl.cnf -extensions v3_req
rm server_dask.csr

openssl genrsa -out client_dask.key 4096
openssl req -new -key client_dask.key -out client_dask.csr -extensions v3_ca -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>"
openssl x509 -req -in client_dask.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client_dask.crt
openssl req -new -key client_dask.key -out client_dask.csr -subj "/C=<country>/ST=<province>/L=<city>/O=<organization>/OU=<group>/CN=<cn>" -config ./openssl.cnf -extensions v3_req
openssl x509 -req -in client_dask.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client_dask.crt -extfile ./openssl.cnf -extensions v3_req
rm client_dask.csr
```

@@ -222,31 +239,22 @@ sudo systemctl daemon-reload
sudo systemctl start evaluate-service
```

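## 8. Install Dask and distributed

When Vega is installed, the latest versions of Dask and Distributed are installed automatically. We found that the current version of Distributed has a bug when the dashboard is disabled, so run the following commands to install these versions of the two components: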

```shell
pip3 install --user dask==2.11.0
pip3 install --user distributed==2.11.0
```

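## 9. Configure the HCCL Whitelist
## 8. Configure the HCCL Whitelist

Refer to the [configuration guide](https://support.huawei.com/enterprise/zh/doc/EDOC1100206668/8e964064) provided by Ascend.

## 10. Notes
## 9. Notes

### 10.1 Model Risks
### 9.1 Model Risks

For an AI framework, a model is a program: it may read and write files or send data over the network. For example, TensorFlow provides the local-operation APIs tf.read_file and tf.write_file, whose return value is an operation that TensorFlow can execute directly.
Therefore, use models from unknown sources with caution. Before use, check whether the model contains malicious operations to eliminate security risks.

### 10.2 Script Execution Risks
### 9.2 Script Execution Risks

The script_runner feature provided by Vega can call external scripts for hyperparameter optimization. Confirm the source of each script, make sure it contains no malicious operations, and be cautious about running scripts from unknown sources.

### 10.3 The KMC Component Does Not Support Simultaneous Use by Multiple Users
### 9.3 The KMC Component Does Not Support Simultaneous Use by Multiple Users

If you use the KMC component to encrypt the private key password, note that the KMC component cannot be used by different users at the same time. To switch users, run the following command as the root user to query the current semaphores:

@@ -259,3 +267,21 @@ ipcs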
```bash
ipcrm -S '<semaphore>'
```

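### 9.4 Delete Unused Private Key Files from Open-Source Software

When Vega is installed, the open-source software it depends on is installed automatically. See the [list](https://github.com/huawei-noah/vega/blob/master/setup.py).

The installation packages of some open-source software may include private key files used for testing. Vega does not use these private key files, and deleting them does not affect the normal operation of Vega.

Run the following command to find all private key files: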

```bash
find ~/.local/ -name "*.pem"
```

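Among the files listed by the command above, find the private key files of the open-source software that Vega depends on. Private key file names usually contain the word `key`. When opened, these files begin with `-----BEGIN PRIVATE KEY-----` and end with `-----END PRIVATE KEY-----`. All of these files can be deleted.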
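
The manual inspection described above can be approximated in Python. This is a minimal sketch under stated assumptions: the function name `find_private_key_files` is hypothetical, it only lists files and never deletes them, and it matches only the `-----BEGIN PRIVATE KEY-----` header quoted above, so keys with other headers are not reported.

```python
from pathlib import Path

PRIVATE_KEY_HEADER = "-----BEGIN PRIVATE KEY-----"


def find_private_key_files(root):
    """Return the .pem files under `root` whose content starts with the key header."""
    found = []
    for path in sorted(Path(root).expanduser().rglob("*.pem")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        if text.lstrip().startswith(PRIVATE_KEY_HEADER):
            found.append(path)
    return found
```

Calling `find_private_key_files("~/.local")` returns the candidate files. Review the list by hand before deleting anything, because the scan cannot tell whether a key belongs to a dependency of Vega.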

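### 9.5 Horovod and TensorFlow

In security mode, Vega supports neither Horovod data parallelism nor the TensorFlow framework. Before running, Vega checks whether the program uses Horovod data parallelism or the TensorFlow framework and, if so, exits automatically.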
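
The pre-run check described above can be sketched as follows. This is an illustration only and not Vega's actual implementation: the function name `check_security_mode` and its parameters `backend` and `use_horovod` are hypothetical.

```python
import sys


def check_security_mode(backend: str, use_horovod: bool) -> None:
    """Exit if the job uses a feature that security mode does not allow.

    Hypothetical helper: `backend` is the framework name and `use_horovod`
    tells whether the job requests Horovod data parallelism.
    """
    if backend.lower() == "tensorflow":
        sys.exit("TensorFlow is not supported in security mode.")
    if use_horovod:
        sys.exit("Horovod data parallelism is not supported in security mode.")


check_security_mode("pytorch", use_horovod=False)  # allowed: returns normally
```

Failing before any work starts keeps an unsupported configuration from running partway in security mode.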
2 changes: 1 addition & 1 deletion docs/en/user/README.md
@@ -31,6 +31,6 @@ For details about the pipeline configuration, see **[Configuration Guide](./conf

## 3. Model Evaluation On Device

Vega also provides the model evaluation capability. The device hardware supported by Vega includes Davinci inference chips (Atlas 200 DK, Atlas 300, and development board environment Evb) and mobile phones. The [Bolt](https://github.com/huawei-noah/bolt) can be deployed for evaluation.
Vega also provides the model evaluation capability. The device hardware supported by Vega includes Davinci inference chips (Atlas 200 DK, Atlas 300I, and development board environment Evb) and mobile phones. The [Bolt](https://github.com/huawei-noah/bolt) can be deployed for evaluation.

You can install and configure the evaluation service by referring to [Evaluation Service Installation and Configuration Guide](./evaluate_service.md) to evaluate the model found during architecture search in real time and obtain the model applicable to the device.