update c550 base operation data and h2d data
Hodoryu committed Aug 16, 2024
1 parent 10c7afd commit 013fc93
Showing 22 changed files with 144 additions and 50 deletions.
8 changes: 4 additions & 4 deletions base/benchmarks/computation-BF16/metax/C550/README.md
@@ -6,7 +6,7 @@

- Product name: C550
- Product model: 曦云®C550 64G
- TDP: 350W
- TDP: 450W

# Server configuration used

@@ -29,16 +29,16 @@

| Evaluation item | Measured BF16 compute (8-card average) | Rated BF16 compute (8-card average) | Measured-to-rated ratio (8-card average) |
| ---- | ---------------- | ---------------- | ------------- |
| Result | | | 94.2% |
| Result | | | 82.8% |

## Power monitoring results

| Monitored item | System average power | System max power | System power std. dev. | Server TDP | Per-card average power (8-card average) | Per-card max power (max over 8 cards) | Per-card power std. dev. (8-card average) | Per-card TDP |
| ---- | ------------ | ------------ | ------------- | ----- | ------------- | ------------- | -------------- | ----- |
| Result | 4284.0 | 4386.0 | 102.0 | / | 196.0W | 292.0W | 96.0W | 350W |
| Result | 4207.5 | 4233.0 | 25.5 | / | 125.5W | 150.0W | 24.5W | 450W |

## Other important monitoring results

| Monitored item | System average CPU usage | System average memory usage | Per-card average temperature (8-card average) | Per-card average device memory usage (8-card average) |
| ---- | --------------- | -------------- | ------------- | --------------- |
| Result | 1.01% | 0.634% | 36.5°C | 3.244% |
| Result | 0.784% | 0.55% | 35.5°C | 5.003% |
6 changes: 3 additions & 3 deletions base/benchmarks/computation-BF16/metax/C550/case_config.yaml
@@ -1,4 +1,4 @@
M: 8192
N: 8192
K: 8192
M: 6656
N: 2048
K: 4096
DIST_BACKEND: "nccl"
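
For context on what this shape change means for the workload: a dense GEMM of shape (M, K) × (K, N) is conventionally counted as 2·M·N·K floating-point operations. The sketch below applies that convention to the old and new shapes from this file. The function name `gemm_flops` is illustrative and not taken from FlagPerf, and the commit itself does not state how FlagPerf counts operations.

```python
def gemm_flops(m: int, n: int, k: int) -> int:
    """FLOPs for one (m x k) @ (k x n) matmul, counting each multiply-add as 2 operations."""
    return 2 * m * n * k


old_shape = {"M": 8192, "N": 8192, "K": 8192}  # values removed by this commit
new_shape = {"M": 6656, "N": 2048, "K": 4096}  # values added by this commit

old_flops = gemm_flops(old_shape["M"], old_shape["N"], old_shape["K"])
new_flops = gemm_flops(new_shape["M"], new_shape["N"], new_shape["K"])

print(f"old: {old_flops / 1e12:.3f} TFLOP per matmul")
print(f"new: {new_flops / 1e12:.3f} TFLOP per matmul")
print(f"new / old = {new_flops / old_flops:.4f}")
```

Under this counting convention the new shape does roughly one tenth of the work per matmul, so ratios measured before and after the change come from different workloads.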
8 changes: 4 additions & 4 deletions base/benchmarks/computation-FP16/metax/C550/README.md
@@ -6,7 +6,7 @@

- Product name: C550
- Product model: 曦云®C550 64G
- TDP: 350W
- TDP: 450W

# Server configuration used

@@ -29,16 +29,16 @@

| Evaluation item | Measured FP16 compute (8-card average) | Rated FP16 compute (8-card average) | Measured-to-rated ratio (8-card average) |
| ---- | ---------------- | ---------------- | ------------- |
| Result | | | 90.9% |
| Result | | | 83.5% |

## Power monitoring results

| Monitored item | System average power | System max power | System power std. dev. | Server TDP | Per-card average power (8-card average) | Per-card max power (max over 8 cards) | Per-card power std. dev. (8-card average) | Per-card TDP |
| ---- | ------------ | ------------ | ------------- | ----- | ------------- | ------------- | -------------- | ----- |
| Result | 4284.0W | 4386.0W | 102.0W | / | 226.0W | 352.0W | 126.0W | 350W |
| Result | 4182.0W | 4182.0W | 0.0W | / | 112.5W | 124.0W | 11.5W | 450W |

## Other important monitoring results

| Monitored item | System average CPU usage | System average memory usage | Per-card average temperature (8-card average) | Per-card average device memory usage (8-card average) |
| ---- | --------------- | -------------- | ------------- | --------------- |
| Result | 0.855% | 0.638% | 37.0°C | 4.538% |
| Result | 0.872% | 0.55% | 34.5°C | 4.71% |
6 changes: 3 additions & 3 deletions base/benchmarks/computation-FP16/metax/C550/case_config.yaml
@@ -1,4 +1,4 @@
M: 8192
N: 8192
K: 8192
M: 6656
N: 2048
K: 4096
DIST_BACKEND: "nccl"
8 changes: 4 additions & 4 deletions base/benchmarks/computation-FP32/metax/C550/README.md
@@ -6,7 +6,7 @@

- Product name: C550
- Product model: 曦云®C550 64G
- TDP: 350W
- TDP: 450W

# Server configuration used

@@ -29,16 +29,16 @@

| Evaluation item | Measured FP32 compute (8-card average) | Rated FP32 compute (8-card average) | Measured-to-rated ratio (8-card average) |
| ---- | ---------------- | ---------------- | ------------- |
| Result | | | 95.6% |
| Result | | | 87.4% |

## Power monitoring results

| Monitored item | System average power | System max power | System power std. dev. | Server TDP | Per-card average power (8-card average) | Per-card max power (max over 8 cards) | Per-card power std. dev. (8-card average) | Per-card TDP |
| ---- | ------------ | ------------ | ------------- | ----- | ------------- | ------------- | -------------- | ----- |
| Result | 4207.5W | 4233.0W | 25.5W | / | 463.0W | 463.0W | 0W | 350W |
| Result | 4896.0W | 6069.0W | 1173W | / | 253.0W | 405.0W | 152.0W | 450W |

## Other important monitoring results

| Monitored item | System average CPU usage | System average memory usage | Per-card average temperature (8-card average) | Per-card average device memory usage (8-card average) |
| ---- | --------------- | -------------- | ------------- | --------------- |
| Result | 0.995% | 4.771% | 49.5°C | 6.371% |
| Result | 0.793% | 1.096% | 60.0°C | 6.371% |
6 changes: 3 additions & 3 deletions base/benchmarks/computation-TF32/metax/C550/README.md
@@ -2,7 +2,7 @@

- Product name: C550
- Product model: 曦云®C550 64G
- TDP: 350W
- TDP: 450W

# Server configuration used

@@ -25,13 +25,13 @@

| Evaluation item | Measured TF32 compute (8-card average) | Rated TF32 compute (8-card average) | Measured-to-rated ratio (8-card average) |
| ---- | ---------------- | ---------------- | ------------- |
| Result | | | 88.55% |
| Result | | | 82.8% |

## Power monitoring results

| Monitored item | System average power | System max power | System power std. dev. | Server TDP | Per-card average power (8-card average) | Per-card max power (max over 8 cards) | Per-card power std. dev. (8-card average) | Per-card TDP |
| ---- | ------------ | ------------ | ------------- | ----- | ------------- | ------------- | -------------- | ----- |
| Result | 4207.5W | 4233.0W | 25.5W | / | 292.0W | 484.0W | 192.0W | 350W |
| Result | 4207.5W | 4233.0W | 25.5W | / | 292.0W | 484.0W | 192.0W | 450W |

## Other important monitoring results

@@ -5,7 +5,7 @@

* Product name: C550
* Product model: 曦云®C550 64G
* TDP: 350W
* TDP: 450W

# Server configuration used

@@ -44,7 +44,7 @@ The second metric, busbw, is chosen for the following reasons:

| Monitored item | System average power | System max power | System power std. dev. | Server TDP | Per-card average power (16-card average) | Per-card max power (max over 16 cards) | Per-card power std. dev. (max over 16 cards) | Per-card TDP |
| ---- | ------- | ------- | ------- | ----- | ------------ | ------------ | ------------- | ----- |
| Result | 4335.0W | 4488.0W | 153.0W | / | 136.5W | 173.0W | 36.5W | 350W |
| Result | 4335.0W | 4488.0W | 153.0W | / | 136.5W | 173.0W | 36.5W | 450W |

## Other important monitoring results

@@ -5,7 +5,7 @@

* Product name: C550
* Product model: 曦云®C550 64G
* TDP: 350W
* TDP: 450W

# Server configuration used

@@ -34,7 +34,7 @@

| Monitored item | System average power | System max power | System power std. dev. | Server TDP | Per-card average power (8-card average) | Per-card max power (max over 8 cards) | Per-card power std. dev. (max over 8 cards) | Per-card TDP |
| ---- | ------- | ------- | ------- | ----- | ------------ | ------------ | ------------- | ----- |
| Result | 4437.0W | 4692.0W | 255.0W | / | 160.0W | 219.0W | 59.0W | 350W |
| Result | 4437.0W | 4692.0W | 255.0W | / | 160.0W | 219.0W | 59.0W | 450W |

## Other important monitoring results

@@ -5,7 +5,7 @@

* Product name: C550
* Product model: 曦云®C550 64G
* TDP: 350W
* TDP: 450W

# Server configuration used

@@ -33,7 +33,7 @@

| Monitored item | System average power | System max power | System power std. dev. | Server TDP | Per-card average power (2-card average) | Per-card max power (max over 2 cards) | Per-card power std. dev. (max over 2 cards) | Per-card TDP |
| ---- | ------- | ------- | ------- | ----- | ------------ | ------------ | ------------- | ----- |
| Result | 4207.5W | 4233.0W | 25.5W | / | 114.0W | 128.0W | 14.0W | 350W |
| Result | 4207.5W | 4233.0W | 25.5W | / | 114.0W | 128.0W | 14.0W | 450W |

## Other important monitoring results

@@ -5,7 +5,7 @@

* Product name: C550
* Product model: 曦云®C550 64G
* TDP: 350W
* TDP: 450W

# Server configuration used

@@ -32,7 +32,7 @@

| Monitored item | System average power | System max power | System power std. dev. | Server TDP | Per-card average power (2-card average) | Per-card max power (max over 2 cards) | Per-card power std. dev. (max over 2 cards) | Per-card TDP |
| ---- | ------- | ------- | ------- | ----- | ------------ | ------------ | ------------- | ----- |
| Result | 4490.22W | 4539.0W | 74.41W | / | 149.3W | 153.0W | 12.3W | 350W |
| Result | 4490.22W | 4539.0W | 74.41W | / | 149.3W | 153.0W | 12.3W | 450W |

## Other important monitoring results


This file was deleted.

1 change: 0 additions & 1 deletion base/benchmarks/interconnect-h2d/metax/C550/env.sh

This file was deleted.

This file was deleted.

4 changes: 2 additions & 2 deletions base/benchmarks/main_memory-bandwidth/metax/C550/README.md
@@ -5,7 +5,7 @@

* Product name: C550
* Product model: 曦云®C550 64G
* TDP: 350W
* TDP: 450W

# Server configuration used

@@ -33,7 +33,7 @@

| Monitored item | System average power | System max power | System power std. dev. | Server TDP | Per-card average power (8-card average) | Per-card max power (max over 8 cards) | Per-card power std. dev. (max over 8 cards) | Per-card TDP |
| ---- | ------- | ------- | ------- | ----- | ------------ | ------------ | ------------- | ----- |
| Result | 4233.0W | 4284.0W | 51.0W | / | 164.5W | 229.0W | 64.5W | 350W |
| Result | 4233.0W | 4284.0W | 51.0W | / | 164.5W | 229.0W | 64.5W | 450W |

## Other important monitoring results

2 changes: 1 addition & 1 deletion base/benchmarks/main_memory-capacity/metax/C550/README.md
@@ -5,7 +5,7 @@

* Product name: C550
* Product model: 曦云®C550 64G
* TDP: 350W
* TDP: 450W

# Server configuration used

31 changes: 22 additions & 9 deletions base/run.py
100644 → 100755
@@ -258,15 +258,28 @@ def prepare_containers_env_cluster(dp_path, case_log_dir, container_name,
                                   image_name, nnodes, config):
    '''Prepare containers environments in the cluster. It will start
       containers, setup environments, start monitors, and clear caches.'''
    container_start_args = " --rm --init --detach --net=host --uts=host" \
                           + " --ipc=host --security-opt=seccomp=unconfined" \
                           + " --privileged=true --ulimit=stack=67108864" \
                           + " --ulimit=memlock=-1" \
                           + " -w " + config.FLAGPERF_PATH \
                           + " --shm-size=" + config.SHM_SIZE \
                           + " -v " + dp_path + ":" \
                           + config.FLAGPERF_PATH


    container_start_args=None
    if config.VENDOR == "metax" and any("TF32" in key for key in config.CASES.keys()):

        container_start_args = " --rm --init --detach --net=host --uts=host" \
                               + " --ipc=host --security-opt=seccomp=unconfined" \
                               + " --privileged=true --ulimit=stack=67108864" \
                               + " --ulimit=memlock=-1" \
                               + " -e TORCH_ALLOW_TF32_CUBLAS_OVERRIDE=1" \
                               + " -w " + config.FLAGPERF_PATH \
                               + " --shm-size=" + config.SHM_SIZE \
                               + " -v " + dp_path + ":" \
                               + config.FLAGPERF_PATH
    else:
        container_start_args = " --rm --init --detach --net=host --uts=host" \
                               + " --ipc=host --security-opt=seccomp=unconfined" \
                               + " --privileged=true --ulimit=stack=67108864" \
                               + " --ulimit=memlock=-1" \
                               + " -w " + config.FLAGPERF_PATH \
                               + " --shm-size=" + config.SHM_SIZE \
                               + " -v " + dp_path + ":" \
                               + config.FLAGPERF_PATH
    if config.ACCE_CONTAINER_OPT is not None:
        container_start_args += " " + config.ACCE_CONTAINER_OPT
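
The two branches in this hunk differ only in the `-e TORCH_ALLOW_TF32_CUBLAS_OVERRIDE=1` flag. As a rough sketch of an equivalent formulation (this is not what the commit does), the shared arguments could be assembled once and the flag added conditionally. The function name `build_container_start_args`, its plain-value parameters, and the case name and paths in the example call are all hypothetical stand-ins for the `config` attributes used in `run.py`.

```python
def build_container_start_args(vendor, cases, flagperf_path, shm_size,
                               dp_path, acce_container_opt=None):
    """Hypothetical helper: build the container start arguments in one place.

    The parameters stand in for config.VENDOR, config.CASES,
    config.FLAGPERF_PATH, config.SHM_SIZE and config.ACCE_CONTAINER_OPT.
    """
    args = [
        "--rm", "--init", "--detach", "--net=host", "--uts=host",
        "--ipc=host", "--security-opt=seccomp=unconfined",
        "--privileged=true", "--ulimit=stack=67108864",
        "--ulimit=memlock=-1",
    ]
    # Same condition as in the diff: metax vendor and at least one TF32 case.
    if vendor == "metax" and any("TF32" in key for key in cases):
        args += ["-e", "TORCH_ALLOW_TF32_CUBLAS_OVERRIDE=1"]
    args += [
        "-w", flagperf_path,
        "--shm-size=" + shm_size,
        "-v", dp_path + ":" + flagperf_path,
    ]
    if acce_container_opt is not None:
        args.append(acce_container_opt)
    # Leading space kept so the result matches the string built in run.py.
    return " " + " ".join(args)


# Example call with made-up values (case name, paths and size are assumptions).
tf32_args = build_container_start_args(
    "metax", {"computation-TF32": "example"}, "/workspace", "32G", "/data")
other_args = build_container_start_args(
    "metax", {"computation-FP16": "example"}, "/workspace", "32G", "/data")
print(tf32_args)
print(other_args)
```

Keeping the shared flags in one list means a later change to, say, the `--ulimit` values only has to be made once.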

6 changes: 3 additions & 3 deletions base/toolkits/computation-INT8/metax/C550/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@

* 产品名称:C550
* 产品型号:曦云®C550 64G
* TDP:350W
* TDP:450W

# 所用服务器配置

Expand All @@ -27,13 +27,13 @@

| 评测项 | BF16算力测试值(8卡平均) | BF16算力标定值(8卡平均) | 测试标定比例(8卡平均) |
| ---- | ---------------- | ---------------- | ------------- |
| 评测结果 | | | 80.83% |
| 评测结果 | | | 79.3% |

## 能耗监控结果

| 监控项 | 系统平均功耗 | 系统最大功耗 | 系统功耗标准差 | 单机TDP | 单卡平均功耗(8卡平均) | 单卡最大功耗(8卡最大) | 单卡功耗标准差(8卡平均) | 单卡TDP |
| ---- | ------------ | ------------ | ------------- | ----- | ------------- | ------------- | -------------- | ----- |
| 监控结果 | 1820.77W | 1950.0W | 54.52W | / | 56.5W | 57.0W | 0.5W | 350W |
| 监控结果 | 1820.77W | 1950.0W | 54.52W | / | 56.5W | 57.0W | 0.5W | 450W |

## 其他重要监控结果

Expand Down
@@ -5,7 +5,7 @@

* Product name: C550
* Product model: 曦云®C550 64G
* TDP: 350W
* TDP: 450W

# Server configuration used

@@ -35,10 +35,14 @@

| Monitored item | System average power | System max power | System power std. dev. | Server TDP | Per-card average power (8-card average) | Per-card max power (max over 8 cards) | Per-card power std. dev. (max over 8 cards) | Per-card TDP |
| ---- | ------- | ------- | ------- | ----- | ------------ | ------------ | ------------- | ----- |
| Result | 4386.0W | 4641.0W | 190.820W | / | 117.5W | 130.0W | 12.5W | 350W |
| Result | 4238.1W | 4284.0W | 121.65W | / | 99.2W | 100.0W | 4.58W | 450W |

## Other important monitoring results

| Monitored item | System average CPU usage | System average memory usage | Per-card average temperature (8-card average) | Per-card average device memory usage (8-card average) |
| ---- | --------- | -------- | ------------ | -------------- |
| Result | 2.756% | 2.37% | 35.5°C | 14.5% |
| Result | 0.089% | 1.151% | 34.32°C | 1.285% |

# How the vendor test tool works

The tool uses cudaMemcpy to perform host-to-device transfers between the CPU and the AI chip, and from these computes the CPU-to-AI-chip interconnect bandwidth.
56 changes: 56 additions & 0 deletions base/toolkits/interconnect-h2d/metax/C550/bandwidth.cu
@@ -0,0 +1,56 @@
// Copyright (c) 2024 BAAI. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License")
#include <stdio.h>
#include <cuda_runtime.h>

#define GB (1024ULL * 1024ULL * 1024ULL)
#define SIZE (16ULL * GB)
#define WARMUP_ITERATIONS 100
#define ITERATIONS 1000

void checkCudaError(cudaError_t err, const char *msg) {
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA Error: %s: %s\n", msg, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

int main() {
    float *d_src, *d_dst;
    cudaEvent_t start, end;
    float elapsed_time;

    checkCudaError(cudaMallocHost(&d_src, SIZE), "cudaMallocHost");
    checkCudaError(cudaMalloc(&d_dst, SIZE), "cudaMalloc");

    checkCudaError(cudaEventCreate(&start), "cudaEventCreate");
    checkCudaError(cudaEventCreate(&end), "cudaEventCreate");

    for (int i = 0; i < WARMUP_ITERATIONS; ++i) {
        checkCudaError(cudaMemcpy(d_dst, d_src, SIZE, cudaMemcpyHostToDevice), "cudaMemcpy");
    }

    checkCudaError(cudaEventRecord(start), "cudaEventRecord");

    for (int i = 0; i < ITERATIONS; ++i) {
        checkCudaError(cudaMemcpy(d_dst, d_src, SIZE, cudaMemcpyHostToDevice), "cudaMemcpy");
    }

    checkCudaError(cudaEventRecord(end), "cudaEventRecord");
    checkCudaError(cudaEventSynchronize(end), "cudaEventSynchronize");

    checkCudaError(cudaEventElapsedTime(&elapsed_time, start, end), "cudaEventElapsedTime");

    double bandwidth = SIZE * ITERATIONS / (elapsed_time / 1000.0);

    printf("[FlagPerf Result]transfer-bandwidth=%.2fGiB/s\n", bandwidth / (1024.0 * 1024.0 * 1024.0));
    printf("[FlagPerf Result]transfer-bandwidth=%.2fGB/s\n", bandwidth / (1000.0 * 1000.0 * 1000.0));

    checkCudaError(cudaFreeHost(d_src), "cudaFreeHost");
    checkCudaError(cudaFree(d_dst), "cudaFree");
    checkCudaError(cudaEventDestroy(start), "cudaEventDestroy");
    checkCudaError(cudaEventDestroy(end), "cudaEventDestroy");

    return 0;
}
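
The bandwidth arithmetic at the end of `main` can be checked in isolation. The Python sketch below mirrors it using the same constants (16 GiB per copy, 1000 timed iterations). The function name `h2d_bandwidth` and the elapsed time passed to it are made up for illustration; they are not measured results from this commit.

```python
GIB = 1024 ** 3      # bytes per GiB, same value as the GB macro in bandwidth.cu
SIZE = 16 * GIB      # bytes moved per cudaMemcpy call
ITERATIONS = 1000    # number of timed copies


def h2d_bandwidth(elapsed_ms):
    """Return (GiB/s, GB/s) for ITERATIONS copies of SIZE bytes taking elapsed_ms."""
    # cudaEventElapsedTime reports milliseconds, hence the division by 1000.
    bytes_per_second = SIZE * ITERATIONS / (elapsed_ms / 1000.0)
    return bytes_per_second / 1024 ** 3, bytes_per_second / 1000 ** 3


# Illustrative input only: pretend the timed loop took 1000 seconds.
gib_s, gb_s = h2d_bandwidth(1_000_000.0)
print(f"[FlagPerf Result]transfer-bandwidth={gib_s:.2f}GiB/s")
print(f"[FlagPerf Result]transfer-bandwidth={gb_s:.2f}GB/s")
```

The two printed figures describe the same transfer rate in binary (GiB) and decimal (GB) units, which is why the GB/s value is always about 7.4% larger.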
8 changes: 8 additions & 0 deletions base/toolkits/interconnect-h2d/metax/C550/main.sh
@@ -0,0 +1,8 @@
export MACA_PATH=/opt/maca
export CUDA_PATH=$MACA_PATH/tools/cu-bridge
export MACA_CLANG_PATH=$MACA_PATH/mxgpu_llvm/bin
export LD_LIBRARY_PATH=./:$MACA_PATH/lib:$LD_LIBRARY_PATH
export PATH=$CUDA_PATH/bin:$MACA_CLANG_PATH:$PATH
export MACA_VISIBLE_DEVICES=0
cucc bandwidth.cu -lcublas -o bdtest
./bdtest