9.4.1. Foreword
Horizon’s J5 AI Benchmark Sample Package (hereafter the ABP) contains the most frequently used performance and accuracy evaluation samples for classification, detection, segmentation, optical flow, tracking, lidar multitask, BEV and depth estimation models. With the model performance evaluation samples, developers can evaluate not only single-frame latency but also dual-core latency using multi-thread scheduling. The prebuilt source code, executable programs and evaluation scripts in the ABP allow developers to try the samples and develop their own applications, which makes development easier.
9.4.2. About the Deliverables
The AI Benchmark sample package is located in the ddk/samples/ai_benchmark/ path of the horizon_j5_open_explorer release package and consists of the following main contents:
| NO. | NAME | DESCRIPTION |
|---|---|---|
| 1 | code | Sample source code and compilation scripts. |
| 2 | j5 | Dev board operating environment of the ABP. |
9.4.2.1. Sample Code Package
Note
The on-board models must be obtained first by executing resolve_ai_benchmark_ptq.sh and resolve_ai_benchmark_qat.sh in the ddk/samples/ai_toolchain/model_zoo/runtime/ai_benchmark directory of the OE package.
Directory of the sample code package is shown as below:
ai_benchmark/code/ # sample source code
├── build_ptq_j5.sh
├── build_qat_j5.sh
├── CMakeLists.txt
├── deps_gcc9.3 # third party dependencies
│ └── aarch64
├── include # source code header files
│ ├── base
│ ├── input
│ ├── method
│ ├── output
│ ├── plugin
│ └── utils
├── README.md
└── src # sample source code
├── input
├── method
├── output
├── plugin
├── simple_example.cc # sample main program
└── utils
ai_benchmark/j5 # Sample package runtime environment
├── ptq # PTQ (post-training quantization) model samples
│ ├── data # accuracy evaluation dataset
│ ├── mini_data # performance evaluation dataset
│ │ ├── cityscapes
│ │ ├── coco
│ │ ├── imagenet
│ │ └── voc
│ ├── model # PTQ (post-training quantization) solution nv12 model
│ │ ├── README.md
│ │ └── runtime -> ../../../../model_zoo/runtime/ai_benchmark/ptq # symlink to the models in the OE package; on the board you need to specify the model path
│ ├── README.md
│ ├── script # execution script
│ │ ├── aarch64 # executable files generated by the compilation and dependencies
│ │ ├── classification # classification model samples
│ │ │ ├── efficientnasnet_m
│ │ │ ├── efficientnasnet_s
│ │ │ ├── efficientnet_lite0
│ │ │ ├── efficientnet_lite1
│ │ │ ├── efficientnet_lite2
│ │ │ ├── efficientnet_lite3
│ │ │ ├── efficientnet_lite4
│ │ │ ├── googlenet
│ │ │ ├── mobilenetv1
│ │ │ ├── mobilenetv2
│ │ │ ├── resnet18
│ │ │ └── vargconvnet
│ │ ├── config # configuration files of model inference
│ │ │ └── model
│ │ ├── detection # detection model samples
│ │ │ ├── centernet_resnet101
│ │ │ ├── preq_qat_fcos_efficientnetb0
│ │ │ ├── preq_qat_fcos_efficientnetb2
│ │ │ ├── preq_qat_fcos_efficientnetb3
│ │ │ ├── ssd_mobilenetv1
│ │ │ ├── yolov2_darknet19
│ │ │ ├── yolov3_darknet53
│ │ │ ├── yolov3_vargdarknet
│ │ │ └── yolov5x
│ │ ├── segmentation # segmentation model samples
│ │ │ ├── deeplabv3plus_efficientnetb0
│ │ │ ├── deeplabv3plus_efficientnetm1
│ │ │ ├── deeplabv3plus_efficientnetm2
│ │ │ └── fastscnn_efficientnetb0
│ │ ├── env.sh # basic environment script
│ │ └── README.md
│ └── tools # accuracy evaluation tools
│ ├── python_tools
│ └── README.md
└── qat # QAT training model samples
├── data # model accuracy evaluation dataset
├── mini_data # model performance evaluation dataset
├── model # QAT scheme nv12 model
│ ├── README.md
│ └── runtime -> ../../../../model_zoo/runtime/ai_benchmark/qat # symlink to the models in the OE package; on the board you need to specify the model path
├── README.md
├── script # execution script
│ ├── aarch64 # executable files generated by the compilation and dependencies
│ ├── bev # bev model samples
│ │ ├── bev_mt_gkt
│ │ ├── bev_mt_ipm
│ │ ├── bev_mt_ipm_temporal
│ │ ├── bev_mt_lss
│ │ ├── detr3d_efficientnetb3
│ │ └── petr_efficientnetb3
│ ├── classification # classification model samples
│ │ ├── mixvargenet
│ │ ├── mobilenetv1
│ │ ├── mobilenetv2
│ │ ├── resnet18
│ │ ├── resnet50
│ │ ├── swint
│ │ └── vargnetv2
│ ├── config # model inference profile
│ │ ├── model
│ │ ├── preprocess
│ │ ├── reference_points
│ │ └── visible
│ ├── detection # detection model samples
│ │ ├── centerpoint_pointpillar
│ │ ├── detr_efficientnetb3
│ │ ├── detr_resnet50
│ │ ├── fcos_efficientnetb0
│ │ ├── fcos3d_efficientnetb0
│ │ ├── ganet
│ │ ├── keypoint_efficientnetb0
│ │ ├── retinanet
│ │ ├── pointpillars_kitti_car
│ │ └── yolov3_mobilenetv1
│ ├── disparity_pred # disparity pred model samples
│ │ └── stereonet_plus
│ ├── multitask # multi task model samples
│ │ └── lidar_multitask
│ ├── opticalflow # optical flow model samples
│ │ └── pwcnet_opticalflow
│ ├── segmentation # segmentation model samples
│ │ └── mobilenet_unet
│ ├── tracking # tracking model samples
│ │ └── motr
│ ├── traj_pred # traj pred model samples
│ │ └── densetnt_argoverse1
│ ├── env.sh # basic environment scripts
│ └── README.md
└── tools # pre-processing and accuracy evaluation tools
├── eval_preprocess
├── python_tools
└── README.md
The code directory contains the source code of the evaluation program, used to evaluate model performance and accuracy.
The j5 directory contains various pre-compiled application programs and evaluation scripts, used to evaluate the accuracy and performance of different models on Horizon’s BPU (Brain Processing Unit).
The build_ptq_j5.sh script is the one-click compilation script for the PTQ on-board programs.
The build_qat_j5.sh script is the one-click compilation script for the QAT on-board programs.
The deps_gcc9.3 sub-directory contains dependencies required by sample codes, including:
appsdk gflags glog hb_dsp nlohmann opencv rapidjson
9.4.2.2. Sample Models
The PTQ and QAT model releases of the AI Benchmark sample package (model_zoo) are located in the horizon_j5_open_explorer release at ddk/samples/ai_toolchain/model_zoo/runtime/ai_benchmark/ptq and ddk/samples/ai_toolchain/model_zoo/runtime/ai_benchmark/qat respectively.
It contains commonly used classification, detection, segmentation and optical flow prediction models, and the naming rule of the models is {model_name}_{backbone}_{input_size}_{input_type}.
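As a quick illustration of this naming rule, the sketch below composes a file name from its parts. It is a hypothetical helper for illustration only, not a tool shipped with the package; the backbone part is made optional because several released models omit it from their file names.

```python
def model_filename(model_name, input_size, input_type, backbone=None, ext="bin"):
    """Compose {model_name}_{backbone}_{input_size}_{input_type}.{ext}.

    Hypothetical helper for illustration only; `backbone` is optional
    because several released models omit it from their file names.
    """
    parts = [model_name] + ([backbone] if backbone else []) + [input_size, input_type]
    return "_".join(parts) + "." + ext

print(model_filename("mobilenetv1", "224x224", "nv12"))
# mobilenetv1_224x224_nv12.bin
print(model_filename("yolov3", "416x416", "nv12", backbone="darknet53"))
# yolov3_darknet53_416x416_nv12.bin
```

Both outputs match entries in the model tables below.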
The PTQ models in model_zoo are compiled by the PTQ model conversion sample package (i.e., ddk/samples/ai_toolchain/horizon_model_convert_sample/).
For detailed information about the original models, see the README document in the PTQ model conversion sample package, or read PTQ Model Conversion Samples Guide .
| PTQ MODEL | MODEL NAME |
|---|---|
| centernet_resnet101 | centernet_resnet101_512x512_nv12.bin |
| deeplabv3plus_efficientnetb0 | deeplabv3plus_efficientnetb0_1024x2048_nv12.bin |
| deeplabv3plus_efficientnetm1 | deeplabv3plus_efficientnetm1_1024x2048_nv12.bin |
| deeplabv3plus_efficientnetm2 | deeplabv3plus_efficientnetm2_1024x2048_nv12.bin |
| efficientnasnet_m | efficientnasnet_m_300x300_nv12.bin |
| efficientnasnet_s | efficientnasnet_s_280x280_nv12.bin |
| efficientnet_lite0 | efficientnet_lite0_224x224_nv12.bin |
| efficientnet_lite1 | efficientnet_lite1_240x240_nv12.bin |
| efficientnet_lite2 | efficientnet_lite2_260x260_nv12.bin |
| efficientnet_lite3 | efficientnet_lite3_280x280_nv12.bin |
| efficientnet_lite4 | efficientnet_lite4_300x300_nv12.bin |
| fastscnn_efficientnetb0 | fastscnn_efficientnetb0_1024x2048_nv12.bin |
| preq_qat_fcos_efficientnetb0 | fcos_efficientnetb0_512x512_nv12.bin |
| preq_qat_fcos_efficientnetb2 | fcos_efficientnetb2_768x768_nv12.bin |
| preq_qat_fcos_efficientnetb3 | fcos_efficientnetb3_896x896_nv12.bin |
| googlenet | googlenet_224x224_nv12.bin |
| mobilenetv1 | mobilenetv1_224x224_nv12.bin<br>mobilenetv1_128x128_resizer_nv12.bin |
| mobilenetv2 | mobilenetv2_224x224_nv12.bin |
| resnet18 | resnet18_224x224_nv12.bin |
| ssd_mobilenetv1 | ssd_mobilenetv1_300x300_nv12.bin |
| vargconvnet | vargconvnet_224x224_nv12.bin |
| yolov2_darknet19 | yolov2_darknet19_608x608_nv12.bin |
| yolov3_darknet53 | yolov3_darknet53_416x416_nv12.bin |
| yolov3_vargdarknet | yolov3_vargdarknet_416x416_nv12.bin |
| yolov5x | yolov5x_672x672_nv12.bin |
| QAT MODEL | MODEL NAME |
|---|---|
| bev_mt_gkt | bev_gkt_mixvargenet_multitask_nuscenes/compile/model.hbm |
| bev_mt_ipm | bev_ipm_efficientnetb0_multitask_nuscenes/compile/model.hbm |
| bev_mt_ipm_temporal | bev_ipm_4d_efficientnetb0_multitask_nuscenes/compile/model.hbm |
| bev_mt_lss | bev_lss_efficientnetb0_multitask_nuscenes/compile/model.hbm |
| centerpoint_pointpillar | centerpoint_pointpillar_nuscenes/compile/model.hbm |
| densetnt_argoverse1 | densetnt_vectornet_argoverse1/compile/model.hbm |
| detr_efficientnetb3 | detr_efficientnetb3_mscoco/compile/model.hbm |
| detr_resnet50 | detr_resnet50_mscoco/compile/model.hbm |
| detr3d_efficientnetb3 | detr3d_efficientnetb3_nuscenes/compile/model.hbm |
| fcos_efficientnetb0 | fcos_efficientnetb0_mscoco/compile/model.hbm |
| fcos3d_efficientnetb0 | fcos3d_efficientnetb0_nuscenes/compile/model.hbm |
| ganet | ganet_mixvargenet_culane/compile/model.hbm |
| keypoint_efficientnetb0 | keypoint_efficientnetb0_carfusion/compile/model.hbm |
| lidar_multitask | centerpoint_mixvargnet_multitask_nuscenes/compile/model.hbm |
| mixvargenet | mixvargenet_imagenet/compile/model.hbm |
| mobilenet_unet | unet_mobilenetv1_cityscapes/compile/model.hbm |
| mobilenetv1 | mobilenetv1_imagenet/compile/model.hbm |
| mobilenetv2 | mobilenetv2_imagenet/compile/model.hbm |
| petr_efficientnetb3 | petr_efficientnetb3_nuscenes/compile/model.hbm |
| pwcnet_opticalflow | pwcnet_pwcnetneck_flyingchairs/compile/model.hbm |
| qim | motr_efficientnetb3_mot17/qim/compile/model.hbm |
| resnet18 | resnet18_imagenet/compile/model.hbm |
| resnet50 | resnet50_imagenet/compile/model.hbm |
| retinanet | retinanet_vargnetv2_fpn_mscoco/compile/model.hbm |
| swint | horizon_swin_transformer_imagenet/compile/model.hbm |
| motr | motr_efficientnetb3_mot17/compile/model.hbm |
| vargnetv2 | vargnetv2_imagenet/compile/model.hbm |
| yolov3_mobilenetv1 | yolo_mobilenetv1_voc/compile/model.hbm |
| pointpillars_kitti_car | pointpillars_kitti_car/compile/model.hbm |
9.4.2.3. Public Datasets
The public datasets required by the ABP are VOC, COCO, ImageNet, Cityscapes, FlyingChairs, KITTI3D, CULane, nuScenes, MOT17, CarFusion, Argoverse1 and SceneFlow. Access is as follows; if you have any questions during the data preparation process, please contact Horizon.
VOC: http://host.robots.ox.ac.uk/pascal/VOC/ (use VOC2012)
COCO: https://cocodataset.org/#download
ImageNet: https://www.image-net.org/download.php
Cityscapes: https://github.com/mcordts/cityscapesScripts
FlyingChairs: https://lmb.informatik.uni-freiburg.de/resources/datasets/FlyingChairs.en.html
KITTI3D: https://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d
CULane: https://xingangpan.github.io/projects/CULane.html
nuScenes: https://www.nuscenes.org/nuscenes#download
mot17: https://opendatalab.com/MOT17
carfusion: http://www.cs.cmu.edu/~ILIM/projects/IM/CarFusion/cvpr2018/index.html
argoverse1: https://www.argoverse.org/av1.html
SceneFlow: https://lmb.informatik.uni-freiburg.de/resources/datasets/SceneFlowDatasets.en.html
9.4.3. Development Environment
9.4.3.1. Prepare the Development Board
1. After getting the development board, follow the descriptions in the System Image Upgrade section to upgrade the system image to the version recommended by the sample package.
2. Make sure the local dev machine can connect remotely to the dev board.
9.4.3.2. Compilation
Install the gcc-ubuntu-9.3.0-2020.03-x86_64-aarch64-linux-gnu cross-compilation toolchain in the current environment, then run the build_ptq_j5.sh script in the code folder to compile the on-board executable programs. The executable programs and their dependencies are automatically copied into the aarch64 sub-folder of the j5/ptq/script folder.
Note
The build_ptq_j5.sh script expects the cross-compilation toolchain in the /opt folder. If you install it into some other location, please modify the build_ptq_j5.sh script accordingly:
export CC=/opt/gcc-ubuntu-9.3.0-2020.03-x86_64-aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc
export CXX=/opt/gcc-ubuntu-9.3.0-2020.03-x86_64-aarch64-linux-gnu/bin/aarch64-linux-gnu-g++
9.4.4. How to Use
9.4.4.1. Evaluation Scripts
Evaluation sample scripts are in the script and tools folders. The script folder contains the scripts used on the dev board for evaluating frequently used classification, detection, segmentation, optical flow, tracking, lidar multitask, BEV and depth estimation models. There are three scripts under each model:
The fps.sh script implements FPS statistics (multi-thread scheduling; you can specify the number of threads as needed).
The latency.sh script implements single-frame latency statistics (one thread, single frame).
The accuracy.sh script is used for evaluating model accuracy.
script:
├── aarch64 # executable files generated by the compilation and dependencies
│ ├── bin
│ └── lib
├── env.sh # base config
├── config
│ ├── model
│ │ ├── data_name_list # image name config
│ │ └── input_init # model input config
│ ├── preprocess
│ │ └── centerpoint_preprocess_5dim.json # preprocess config
│ └── reference_points # reference points config
│     ├── bev_mt_gkt
│     ├── bev_mt_ipm
│     ├── bev_mt_ipm_temporal
│     └── bev_mt_lss
├── detection # detection models
│ ├── fcos_efficientnetb0 # there are other models in this directory, just use this model directory as a reference
│ │ ├── accuracy.sh
│ │ ├── fps.sh
│ │ ├── latency.sh
│ │ ├── workflow_accuracy.json
│ │ ├── workflow_fps.json
│ │ ├── workflow_latency.json
│ ......
├── bev # bev models (qat)
├── classification # classification models
├── disparity_pred # disparity pred models (qat)
├── multitask # multitask models
├── segmentation # segmentation models
├── opticalflow # optical-flow models (qat)
├── tracking # tracking models (qat)
├── traj_pred # traj pred models (qat)
└── README.md
The PTQ tools folder contains the scripts used for accuracy evaluation, mainly the precision calculation scripts under python_tools.
tools:
python_tools
└── accuracy_tools
├── cityscapes_metric.py
├── cls_eval.py
├── coco_metric.py
├── config.py
├── coco_det_eval.py
├── parsing_eval.py
├── voc_det_eval.py
└── voc_metric.py
The QAT tools folder contains the scripts used for accuracy evaluation, mainly pre-processing scripts and precision calculation scripts.
tools:
├── eval_preprocess
│ ├── util
│ ├── bev_preprocess.py
│ ├── centerpoint_preprocess.py
│ ├── detr_process.py
│ ├── densetnt_process.py
│ ├── fcos_process.py
│ ├── fcos3d_process.py
│ ├── ganet_process.py
│ ├── lidar_preprocess.py
│ ├── motr_process.py
│ ├── imagenet.py
│ ├── pwcnet_process.py
│ ├── pointpillars_process.py
│ ├── retinanet_process.py
│ ├── stereonet_preprocess.py
│ └── voc.py
├── python_tools
│ └── accuracy_tools
│ ├── argoverse_util
│ ├── nuscenes_metric_pro
│ ├── whl_package
│ ├── bev_eval.py
│ ├── centerpoint_eval.py
│ ├── cityscapes_metric.py
│ ├── cls_eval.py
│ ├── coco_metric.py
│ ├── config.py
│ ├── densetnt_eval.py
│ ├── detr_eval.py
│ ├── fcos_eval.py
│ ├── fcos3d_eval.py
│ ├── ganet_eval.py
│ ├── lidar_multitask_eval.py
│ ├── motr_eval.py
│ ├── parsing_eval.py
│ ├── pwcnet_eval.py
│ ├── retinanet_eval.py
│ ├── voc_metric.py
│ ├── kitti3d_metric.py
│ ├── pointpillars_eval.py
│ ├── stereonet_eval.py
│ └── yolov3_eval.py
└── README.md
Attention
Before the evaluation, run the following commands to copy the ptq (or qat) directory to the dev board.
scp -r ddk/samples/ai_benchmark/j5/ptq root@192.168.1.1:/userdata/ptq/
scp -r ddk/samples/ai_benchmark/j5/qat root@192.168.1.1:/userdata/qat/
9.4.4.2. Performance Evaluation
Performance evaluation covers two metrics: latency and FPS.
9.4.4.2.1. How to Use Performance Evaluation Scripts
To evaluate the latency:
In the directory of the to-be-evaluated model, run sh latency.sh to evaluate single frame latency, as shown below:
I0419 02:35:07.041095 39124 output_plugin.cc:80] Infer latency: [avg: 13.124ms, max: 13.946ms, min: 13.048ms], Post process latency: [avg: 3.584ms, max: 3.650ms, min: 3.498ms].
Note
Infer latency denotes the time consumed by model inference; Post process latency denotes the time consumed by post-processing.
To evaluate the FPS:
This function uses multi-threaded concurrency and is designed to let the model reach its peak performance on the BPU. Due to multi-thread concurrency and data sampling, the frame rate is low during the start-up phase, then increases and gradually stabilizes, fluctuating within 0.5%.
To test the frame rate, go to the model directory and run sh fps.sh, as shown below.
I0419 02:35:00.044417 39094 output_plugin.cc:109] Throughput: 1129.39fps # model frame rate
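When collecting results across many runs, the log lines shown above can be parsed automatically. The sketch below is a hypothetical helper, not part of the ABP; it assumes the glog-style output format printed by output_plugin.cc as shown above.

```python
import re

# Patterns matching the ABP's glog-style output lines (format as shown above).
LATENCY_RE = re.compile(
    r"Infer latency: \[avg: ([\d.]+)ms, max: ([\d.]+)ms, min: ([\d.]+)ms\]"
)
THROUGHPUT_RE = re.compile(r"Throughput: ([\d.]+)fps")

def parse_latency(line):
    """Return (avg, max, min) infer latency in ms, or None if absent."""
    m = LATENCY_RE.search(line)
    return tuple(map(float, m.groups())) if m else None

def parse_throughput(line):
    """Return the reported frame rate in fps, or None if absent."""
    m = THROUGHPUT_RE.search(line)
    return float(m.group(1)) if m else None

line1 = ("I0419 02:35:07.041095 39124 output_plugin.cc:80] Infer latency: "
         "[avg: 13.124ms, max: 13.946ms, min: 13.048ms], Post process latency: "
         "[avg: 3.584ms, max: 3.650ms, min: 3.498ms].")
line2 = "I0419 02:35:00.044417 39094 output_plugin.cc:109] Throughput: 1129.39fps"

print(parse_latency(line1))    # (13.124, 13.946, 13.048)
print(parse_throughput(line2)) # 1129.39
```

Feed the saved script output line by line to build a latency/FPS table across models.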
9.4.4.2.2. About Command-line Parameters
The fps.sh script is shown as below:
#!/bin/sh
source ../../env.sh
export SHOW_FPS_LOG=1
export STAT_CYCLE=100 # environment variable: fps log printing frequency
${app} \
--config_file=workflow_fps.json \
--log_level=1
The latency.sh script is shown as below:
#!/bin/sh
source ../../env.sh
export SHOW_LATENCY_LOG=1 # environment variable: print latency logs
export STAT_CYCLE=50 # environment variable: latency log printing frequency
${app} \
--config_file=workflow_latency.json \
--log_level=1
9.4.4.2.3. About the Configuration File
Attention
When the max_cache parameter takes effect, images are pre-processed and read into memory. To ensure your application runs stably, do not set too large a value; we recommend a value of no more than 30.
Take the centernet_resnet101 model as an example.
The content of the fps workflow configuration file is shown below:
{
"input_config": {
"input_type": "image", # input data format, support image or bin file
"height": 512, # input data height
"width": 512, # input data width
"data_type": 1, # input data type:HB_DNN_IMG_TYPE_NV12
"image_list_file": "../../../mini_data/coco/coco.lst", # path to the lst file
"need_pre_load": true, # whether to read the dataset using the preload method
"limit": 12,
"need_loop": true, # whether to use cyclic read data for evaluation
},
"output_config": {
"output_type": "image", # output data format
"image_list_enable": true,
"in_order": false # whether to output in order
},
"workflow": [
{
"method_type": "InferMethod", # inference method
"unique_name": "InferMethod",
"method_config": {
"core": 0, # core id
"model_file": "../../../model/runtime/centernet_resnet101/centernet_resnet101_512x512_nv12.bin" # path to model file
}
},
{
"thread_count": 8, # post-process thread numbers
"method_type": "PTQCenternetMaxPoolSigmoidPostProcessMethod", # post-process method
"unique_name": "PTQCenternetMaxPoolSigmoidPostProcessMethod",
"method_config": { # post-process parameters
"class_num": 80,
"score_threshold": 0.4,
"topk": 100,
"det_name_list": "../../config/model/data_name_list/coco_classes.names"
}
}
]
}
The content of the latency workflow configuration file is shown below:
{
"input_config": {
"input_type": "image",
"height": 512,
"width": 512,
"data_type": 1,
"image_list_file": "../../../mini_data/coco/coco.lst",
"need_pre_load": true,
"limit": 2,
"need_loop": true,
"max_cache": 10
},
"output_config": {
"output_type": "image",
"enable_view_output": false,
"view_output_dir": "./output_dir",
"image_list_enable": true
},
"workflow": [
{
"method_type": "InferMethod",
"unique_name": "InferMethod",
"method_config": {
"core": 0,
"model_file": "../../../model/runtime/centernet_resnet101/centernet_resnet101_512x512_nv12.bin"
}
},
{
"thread_count": 1,
"method_type": "PTQCenternetMaxPoolSigmoidPostProcessMethod",
"unique_name": "PTQCenternetMaxPoolSigmoidPostProcessMethod",
"method_config": {
"class_num": 80,
"score_threshold": 0.4,
"topk": 100,
"det_name_list": "../../config/model/data_name_list/coco_classes.names"
}
}
]
}
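Note that the annotated workflow configs above are not strict JSON once `#` comments are present. A hedged sketch (an illustration, not an ABP tool, and assuming no `#` characters inside string values) for loading such a file on the dev machine and sanity-checking max_cache against the recommendation above:

```python
import json
import re

def load_workflow(text):
    """Parse a comment-annotated workflow config like the ones above.

    The annotated examples carry '#' comments (and occasionally trailing
    commas), which strict JSON rejects, so strip both before parsing.
    Assumes string values contain no '#' characters.
    """
    no_comments = re.sub(r"#[^\n]*", "", text)
    no_trailing = re.sub(r",(\s*[}\]])", r"\1", no_comments)
    return json.loads(no_trailing)

example = """
{
    "input_config": {
        "input_type": "image",  # input data format
        "max_cache": 10,
    },
    "workflow": []
}
"""

cfg = load_workflow(example)
# keep max_cache modest, as the Attention note above recommends (<= 30)
assert cfg["input_config"].get("max_cache", 0) <= 30
print(cfg["input_config"]["max_cache"])  # 10
```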
9.4.4.2.4. Result Visualization
If you want to see the effect of a single model inference, you can modify workflow_latency.json and re-run the latency.sh script to generate the visualized results in the output_dir directory.
Attention
When visualization output is enabled, the script runs more slowly because results are dumped; only the latency.sh script supports dumping.
9.4.4.2.4.1. Visualization Steps
Modify the latency configuration file:
"output_config": {
"output_type": "image",
"enable_view_output": true, # turn on visualization
"view_output_dir": "./output_dir", # visualization result output path
"image_list_enable": true,
"in_order": false
}
Execute the latency.sh script
sh latency.sh
Attention
The visualization of the bev model requires the scene information and the paths of the homography matrices to be specified. A homography matrix is used for the conversion between the camera perspective and the bird’s-eye view; different scenes have their own homography matrices.
We recommend modifying the workflow_latency.json configuration file of the bev model as follows:
"output_config": {
"output_type": "image",
"enable_view_output": true, # turn on visualization
"view_output_dir": "./output_dir", # visualization result output path
"bev_ego2img_info": [
"../../config/visible/bev/scenes.json", # scene information of input file
"../../config/visible/bev/boston.bin", # homography matrix of the boston scene
"../../config/visible/bev/singapore.bin" # homography matrix of the singapore scene
],
"image_list_enable": true,
"in_order": false
}
9.4.4.2.4.2. Visualization
The original page shows an example visualization image for each model category: classification, detection 2d, detection 3d, segmentation, keypoint, lane line, optical flow, lidar, lidar multitask, bev, traj_pred and disparity_pred. (Images not reproduced here.)
Attention
If you need to visualize images other than those in mini_data during trajectory prediction visualization, you need to configure additional road information and trajectory information files in minidata/argoverse1/visualization. You can use the densetnt_process.py preprocessing script to generate the configuration files by setting the --is-gen-visual-config parameter to true.
9.4.4.3. Model Accuracy Evaluation
Take the following 5 steps to perform the model accuracy evaluation:
1. Data pre-processing.
2. Data mounting.
3. lst file generation.
4. Model inference.
5. Model accuracy computation.
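Steps 2 and 3 depend on your setup. As an illustration, the hypothetical sketch below generates an .lst file by listing the pre-processed .bin files under a (mounted) directory; the path names in the usage comment are assumptions, not paths shipped with the package.

```python
import os

def write_lst(pre_dir, lst_path):
    """Write one pre-processed .bin file path per line, sorted for reproducibility.

    Illustrative helper, not an ABP tool; point pre_dir at the output
    directory of the data pre-processing step.
    """
    bins = sorted(
        os.path.join(root, name)
        for root, _, names in os.walk(pre_dir)
        for name in names
        if name.endswith(".bin")
    )
    with open(lst_path, "w") as f:
        for path in bins:
            f.write(path + "\n")
    return len(bins)

# e.g. write_lst("/mnt/pre_mobilenetv1", "mobilenetv1.lst")  # illustrative paths
```

The resulting .lst file can then be referenced from the workflow config's image_list_file field.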
9.4.4.3.1. Data Pre-processing
9.4.4.3.1.1. PTQ Model Data Pre-processing
To pre-process the PTQ model data, run the hb_eval_preprocess tool in the x86 environment. Pre-processing refers to the special operations applied to images before they are fed into the model, e.g. resize, crop and padding. The tool is integrated into the horizon_tc_ui tool and is available after installation via the install script. After the raw dataset is pre-processed by the tool, the corresponding set of pre-processed .bin files for the model is generated.
You can directly run hb_eval_preprocess --help for help.
Tip
For the hb_eval_preprocess tool’s command-line parameters, you can type hb_eval_preprocess -h, or see the hb_eval_preprocess Tool section in the PTQ tools guide.
The datasets corresponding to each model in the sample package are described in detail below, as well as the pre-processing operations for the corresponding datasets.
9.4.4.3.1.1.1. VOC Dataset
The VOC dataset is used for evaluating the ssd_mobilenetv1 model. You can download this dataset from the official website: VOC dataset official website download address (use VOC2012). We recommend that you decompress the downloaded dataset into the following structure. If you encounter any problems during data preparation, please contact Horizon. The samples in the ABP use the val.txt file in the Main folder, the source images in the JPEGImages folder and the annotations in the Annotations folder.
.
└── VOCdevkit # root directory
└── VOC2012 # dataset by year; only the 2012 dataset is included here, but 2007 and others also exist
├── Annotations # XML files describing the images in the JPEGImages folder
├── ImageSets # TXT files in which each line contains an image name; the ±1 at the end denotes positive/negative samples
│ ├── Action
│ ├── Layout
│ ├── Main
│ └── Segmentation
├── JPEGImages # contains source images
├── SegmentationClass # contains semantic segmentation related images
└── SegmentationObject # contains instance segmentation related images
Preprocess the dataset:
hb_eval_preprocess -m ssd_mobilenetv1 -i VOCdevkit/VOC2012/JPEGImages -v VOCdevkit/VOC2012/ImageSets/Main/val.txt -o ./pre_ssd_mobilenetv1
9.4.4.3.1.1.2. COCO Dataset
The COCO dataset is used for evaluating the centernet_resnet101, detr_efficientnetb3, detr_resnet50, yolov2_darknet19, yolov3_darknet53, yolov3_vargdarknet, yolov5x, preq_qat_fcos_efficientnetb0, preq_qat_fcos_efficientnetb2 and preq_qat_fcos_efficientnetb3 models. You can download this dataset from the official website: COCO dataset official website download address. We recommend that you decompress the downloaded dataset into the following structure. If you encounter any problems during the data preparation process, please contact Horizon. The samples mainly use the instances_val2017.json annotation file under the annotations folder and the source images under the images folder:
.
├── annotations # contains annotation data
└── images # contains source images
Preprocess the dataset:
hb_eval_preprocess -m model_name -i coco/coco_val2017/images -o ./pre_model_name
9.4.4.3.1.1.3. ImageNet Dataset
The ImageNet dataset is used for evaluating classification models, e.g. efficientnasnet_m, efficientnasnet_s, efficientnet_lite0, efficientnet_lite1, efficientnet_lite2, efficientnet_lite3, efficientnet_lite4, googlenet, mobilenetv1, mobilenetv2, resnet18 and vargconvnet. You can download this dataset from the official website: ImageNet dataset official website download address. We recommend that you decompress the downloaded dataset into the following structure. If you encounter any problems during data preparation, please contact Horizon. The example mainly uses the annotation file val.txt and the source images in the val directory:
.
├── val.txt
└── val
Preprocess the dataset:
hb_eval_preprocess -m model_name -i imagenet/val -o ./pre_model_name
9.4.4.3.1.1.4. Cityscapes Dataset
The Cityscapes dataset is used for evaluating the deeplabv3plus_efficientnetb0, deeplabv3plus_efficientnetm1, deeplabv3plus_efficientnetm2 and fastscnn_efficientnetb0 models. You can download this dataset from the official website: Cityscapes dataset official website download address. We recommend that you decompress the downloaded dataset into the following structure. If you encounter any problems during data preparation, please contact Horizon. The samples in the ABP use the annotation files in the ./gtFine/val folder and the source images in the ./leftImg8bit/val folder.
.
├── gtFine
│ └── val
│ ├── frankfurt
│ ├── lindau
│ └── munster
└── leftImg8bit
└── val
├── frankfurt
├── lindau
└── munster
Preprocess the dataset:
hb_eval_preprocess -m model_name -i cityscapes/leftImg8bit/val -o ./pre_model_name
9.4.4.3.1.2. QAT Model Data Pre-processing
To pre-process the QAT model data, execute the pre-processing scripts in ai_benchmark_j5/j5/qat/tools/eval_preprocess in the x86 environment.
The datasets corresponding to each model in the sample package are described in detail below, as well as the pre-processing operations for the corresponding datasets.
9.4.4.3.1.2.1. FlyingChairs Dataset
The FlyingChairs dataset is used for evaluating the pwcnet_opticalflow model. You can download the dataset from the official website: FlyingChairs dataset official website download address. We recommend that you decompress the downloaded dataset into the following structure. If you encounter any problems during data preparation, please contact Horizon. {id}_img1.ppm and {id}_img2.ppm form an image pair; each image is 512 pixels wide and 384 pixels high. {id} is a serial number from 00001 to 22872, and each image pair is labeled by {id}_flow.flo. The FlyingChairs_train_val.txt file divides the training set and the validation set; a label value of 2 indicates the validation set. The structure of the original dataset is as follows:
.
├── FlyingChairs_release
│ └── data
│ ├── 00001_img1.ppm
│ ├── 00001_img2.ppm
│ └── 00001_flow.flo
└── FlyingChairs_train_val.txt
Preprocess the dataset:
#!/bin/sh
python3 pwcnet_process.py --input-path=./flyingchairs/FlyingChairs_release/data/ --val-file=./flyingchairs/FlyingChairs_train_val.txt --output-path=./pre_pwcnet_opticalflow
9.4.4.3.1.2.2. ImageNet Dataset
The ImageNet dataset is used to evaluate mixvargenet, mobilenetv1, mobilenetv2, resnet50, swint and vargnetv2 QAT classification models.
Preprocess the dataset:
#!/bin/sh
python3 imagenet.py --image-path=./standard_imagenet/val/ --save-path=./pre_model_name
9.4.4.3.1.2.3. Cityscapes Dataset
The Cityscapes dataset is used to evaluate the mobilenet_unet QAT segmentation model. The validation set data can be used directly without pre-processing.
9.4.4.3.1.2.4. VOC Dataset
The VOC dataset is used to evaluate the yolov3_mobilenetv1 QAT detection model.
Preprocess the dataset:
#!/bin/sh
python3 voc.py --image-path=./VOCdevkit/VOC2012/JPEGImages/ --save-path=./pre_yolov3_mobilenetv1
9.4.4.3.1.2.5. COCO Dataset
The COCO dataset is used to evaluate the fcos_efficientnetb0 and retinanet QAT detection models.
Preprocess the dataset:
#!/bin/sh
python3 fcos_process.py --image-path=./mscoco/images/val2017/ --label-path=./mscoco/images/annotations/instances_val2017.json --save-path=./pre_fcos_efficientnetb0
#!/bin/sh
python3 retinanet_process.py --image-path=./mscoco/images/val2017/ --label-path=./mscoco/images/annotations/instances_val2017.json --save-path=./pre_retinanet
9.4.4.3.1.2.6. kitti3d Dataset
The kitti3d dataset is used to evaluate the pointpillars_kitti_car detection model. You can download this dataset from the official website: kitti3d dataset official website download address. We recommend that you download the following compressed packages. If you encounter any problems during data preparation, please contact Horizon.
.
├── kitti3d
├── data_object_calib.zip # camera calibration matrices of object data set
├── data_object_image_2.zip # left color images of object data set
├── data_object_label_2.zip # training labels of the object dataset
└── data_object_velodyne.zip # velodyne point clouds
It is recommended that you decompress the downloaded dataset into the following structure:
.
├── kitti3d_origin
├── ImageSets
│ ├── test.txt
│ ├── train.txt
│ ├── trainval.txt
│ └── val.txt
├── testing
│ ├── calib
│ ├── image_2
│ └── velodyne
└── training
├── calib
├── image_2
├── label_2
└── velodyne
Preprocess the dataset:
#!/bin/sh
python3 pointpillars_process.py --data-path=./kitti3d_origin --save-path=./pre_kitti3d --height=1 --width=150000
9.4.4.3.1.2.7. culane Dataset
The culane dataset is used to evaluate the ganet detection model. You can download this dataset from the official website: culane dataset official website download address. We recommend that you download the following compressed packages. If you encounter any problems during data preparation, please contact Horizon.
.
├── culane
├── annotations_new.tar.gz
├── driver_23_30frame.tar.gz
├── driver_37_30frame.tar.gz
├── driver_100_30frame.tar.gz
├── driver_161_90frame.tar.gz
├── driver_182_30frame.tar.gz
├── driver_193_90frame.tar.gz
├── laneseg_label_w16.tar.gz
└── list.tar.gz
annotations_new.tar.gz must be unpacked last, as it corrects the original annotation files. After unpacking the original dataset, we recommend that you organize it into the following structure:
.
├── culane # root directory
├── driver_23_30frame # datasets and annotations
│ ├── 05151640_0419.MP4 # a section of the dataset, containing each frame of the picture
│ │ ├──00000.jpg # source images
│ │ ├──00000.lines.txt # annotation file where each line gives the x,y coordinates of the lane marking keypoints
│ ......
├── driver_37_30frame
├── driver_100_30frame
├── driver_161_90frame
├── driver_182_30frame
├── driver_193_90frame
├── laneseg_label_w16 # lane segment labels
└── list # train, validate, test list
Preprocess the dataset:
#!/bin/sh
python3 ganet_process.py --image-path=./culane --save-path=./pre_culane
9.4.4.3.1.2.8. nuscenes Dataset¶
The nuscenes dataset is used to evaluate the fcos3d and centerpoint_pointpillar detection models, as well as the bev_mt_gkt, bev_mt_lss, bev_mt_ipm, bev_mt_ipm_temporal, detr3d_efficientnetb3 and petr_efficientnetb3 models. You can download it from the nuscenes dataset official website download address. Please contact Horizon if you have any questions during the data preparation process. We recommend that you download the following compressed packages:
├── Nuscenes
├── nuScenes-map-expansion-v1.3.zip
├── nuScenes-map-expansion-v1.2.zip
├── nuScenes-map-expansion-v1.1.zip
├── nuScenes-map-expansion-v1.0.zip
├── v1.0-mini.tar
├── v1.0-test_blobs.tar
├── v1.0-test_meta.tar
├── v1.0-trainval01_blobs.tar
├── v1.0-trainval02_blobs.tar
├── v1.0-trainval03_blobs.tar
├── v1.0-trainval04_blobs.tar
├── v1.0-trainval05_blobs.tar
├── v1.0-trainval06_blobs.tar
├── v1.0-trainval07_blobs.tar
├── v1.0-trainval08_blobs.tar
├── v1.0-trainval09_blobs.tar
├── v1.0-trainval10_blobs.tar
└── v1.0-trainval_meta.tar
For the lidar multitask model, you also need to download the lidar segmentation labels (lidarseg) from the official website and update v1.0-trainval according to the nuscenes official website tutorial. Horizon recommends that you decompress the downloaded dataset into the following structure:
.
├── Nuscenes
├── can_bus
├── lidarseg
├── maps
├── nuscenes
│ └── meta
│ ├── maps
│ ├── v1.0-mini
│ └── v1.0-trainval
├── samples
├── sweeps
├── v1.0-mini
└── v1.0-trainval
Preprocess the dataset:
Attention
In addition to generating pre-processed images, fcos3d_process.py also processes the intrinsic parameters of the cameras used and generates the corresponding camera intrinsics configuration files.
In addition to generating preprocessed data, centerpoint_preprocess.py, bev_preprocess.py and lidar_preprocess.py also generate a val_gt_infos.pkl file under the preprocessed data path for accuracy calculation.
bev_preprocess.py needs the model name specified through --model; the options are bev_mt_gkt, bev_mt_lss, bev_mt_ipm, bev_mt_ipm_temporal, detr3d and petr.
#!/bin/sh
python3 fcos3d_process.py --src-data-dir=./Nuscenes --file-path=../../script/config/model/data_name_list/nuscenes_names.txt --save-path=./processed_fcos3d_images
#!/bin/sh
python3 centerpoint_preprocess.py --data-path=./Nuscenes --save-path=./nuscenes_lidar_val
#!/bin/sh
python3 bev_preprocess.py --model=model_name --data-path=./Nuscenes --meta-path=./Nuscenes/meta --reference-path=../../script/config/reference_points --save-path=./nuscenes_bev_val
#!/bin/sh
python3 lidar_preprocess.py --data-path=./Nuscenes --save-path=./nuscenes_lidar_val
9.4.4.3.1.2.9. mot17 Dataset¶
The mot17 dataset is used to evaluate the motr model. You can download it from the mot17 dataset official website download address. Please contact Horizon if you have any questions during the data preparation process. We recommend that you prepare the data into the following structure:
.
├── valdata # root directory
├── gt_val
│ ├── MOT17-02-SDP
│ ├── MOT17-04-SDP
│ ├── MOT17-05-SDP
│ ├── MOT17-09-SDP
│ ├── MOT17-10-SDP
│ ├── MOT17-11-SDP
│ ├── MOT17-13-SDP
├── images
│ └── train
│ ├── MOT17-04-SDP
│ ├── MOT17-05-SDP
│ ├── MOT17-09-SDP
│ ├── MOT17-10-SDP
│ ├── MOT17-11-SDP
│ ├── MOT17-13-SDP
└── mot17.val
Preprocess the dataset:
#!/bin/sh
python3 motr_process.py --image-path=./valdata/images/train --save-path=./processed_motr
9.4.4.3.1.2.10. carfusion Dataset¶
The carfusion dataset is used to evaluate the keypoint_efficientnetb0 model. You can download it from the carfusion dataset official website download address. Please contact Horizon if you have any questions during the data preparation process. We recommend that you organize the downloaded data into the following structure:
.
├── carfusion # Root directory
├── train
└── test
Preprocess the dataset:
#!/bin/sh
# First generate the data required for evaluation (if you are using this dataset for the first time, you must run the following script first)
python3 gen_carfusion_data.py --src-data-path=carfusion --out-dir=cropped_data --num-workers 2
The directory after executing the first script is as follows:
.
├── cropped_data # Root directory
├── test
└── simple_anno
Ensure that the data root directory is at the same level as cropped_data, and then run the following command:
#!/bin/sh
python3 keypoints_preprocess.py --data-root=./ --label-path=cropped_data/simple_anno/keypoints_test.json --save-path=./processed_carfusion
9.4.4.3.1.2.11. argoverse1 Dataset¶
The argoverse1 dataset is used to evaluate the densetnt model. You can download it from the argoverse1 dataset official website download address. Please contact Horizon if you have any questions during the data preparation process. We recommend that you organize the downloaded data into the following structure:
.
├── arogverse-1 # Root directory
├── map_files
└── val
Preprocess the dataset:
Attention
In addition to the preprocessed input, densetnt_process.py generates the meta files needed for evaluation under the src-path directory. The evaluation only requires the --src-path and --dst-path parameters; the other parameters can be ignored.
#!/bin/sh
python3 densetnt_process.py --src-path=arogverse-1 --dst-path=processed_arogverse1
9.4.4.3.1.2.12. SceneFlow Dataset¶
The SceneFlow dataset is used to evaluate the stereonet_plus model. You can download it from the SceneFlow dataset official website download address. Please contact Horizon if you have any questions during the data preparation process. We recommend that you organize the downloaded data into the following structure:
.
├── SceneFlow # Root directory
├── FlyingThings3D
│ ├── disparity
│ ├── frames_finalpass
└── SceneFlow_finalpass_test.txt
Preprocess the dataset:
Attention
In addition to generating preprocessed data, stereonet_preprocess.py also generates a val_gt_infos.pkl file under the preprocessed data path for accuracy calculation.
#!/bin/sh
python3 stereonet_preprocess.py --data-path=SceneFlow/ --data-list=SceneFlow/SceneFlow_finalpass_test.txt --save-path=sceneflow_val
Tip
Before use, please modify the dataset path and save path in the script to make the script run properly.
9.4.4.3.2. Model Mounting¶
Because the datasets are huge, it is recommended to mount them for the dev board to load, rather than copying them onto the board.
Server PC terminal:
Attention
Root permission is required to run these commands.
Add the following line to /etc/exports:
/nfs *(insecure,rw,sync,all_squash,anonuid=1000,anongid=1000,no_subtree_check)
where /nfs denotes the export path on the local machine and can be replaced by a user-specified directory. Then run exportfs -a -r to bring /etc/exports into effect.
Board terminal:
Create the directory to be mounted: mkdir -p /mnt. Then run mount -t nfs {PC terminal IP}:/nfs /mnt -o nolock to mount the /nfs folder of the PC to the /mnt folder on the dev board.
In this way, mount the folder that contains the preprocessed data to the dev board, and create a soft link to it as the /data folder in the /ptq or /qat folder (at the same directory level as /script) on the dev board.
9.4.4.3.3. Lst File Generation¶
The accuracy computation script in the sample runs as follows:
1. According to the value of image_list_file in workflow_accuracy.json, find the lst file of the corresponding dataset.
2. Load each preprocessed file according to the path information stored in the lst file, then perform the inference.
Therefore, after generating the preprocessed files, you need to generate the corresponding lst file and write the path of each preprocessed file into it; these paths depend on where the dataset is stored on the board.
We recommend storing each lst file at the same level as the corresponding ./data/dataset_name/pre_model_name folder.
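For reference, generating an lst file amounts to listing every preprocessed file path, sorted, one per line. A minimal Python sketch of this step (the directory paths and the .bin suffix are examples, not fixed names):

```python
import os

def write_lst(pre_dir: str, lst_path: str, suffix: str = ".bin") -> int:
    """Collect every preprocessed file under pre_dir and write its path
    (one per line, sorted by name) into lst_path."""
    entries = []
    for root, _dirs, files in os.walk(pre_dir):
        for name in files:
            if name.endswith(suffix):
                entries.append(os.path.join(root, name))
    entries.sort()
    with open(lst_path, "w") as f:
        for entry in entries:
            f.write(entry + "\n")
    return len(entries)

# Hypothetical usage, equivalent to a `find <pre_dir> -name "*bin*" > <lst>`:
# write_lst("../../../data/coco/pre_centernet_resnet101",
#           "../../../data/coco/pre_centernet_resnet101.lst")
```

Whether the written paths match the workflow_accuracy.json configuration still depends on where the data is mounted on the board, exactly as described above.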
The structure of the PTQ pre-processed data set is as follows:
|-- ptq
| |-- data
| | |-- cityscapes
| | | |-- pre_deeplabv3plus_efficientnetb0
| | | | |-- xxxx.bin # pre-processed binary file
| | | | |-- ....
| | | |-- pre_deeplabv3plus_efficientnetb0.lst # lst file: record the path of each preprocessing file
| | | |-- pre_deeplabv3plus_efficientnetm1
| | | |-- pre_deeplabv3plus_efficientnetm1.lst
| | | |-- pre_deeplabv3plus_efficientnetm2
| | | |-- pre_deeplabv3plus_efficientnetm2.lst
| | | |-- pre_fastscnn_efficientnetb0
| | | |-- pre_fastscnn_efficientnetb0.lst
| | |-- coco
| | | |-- pre_centernet_resnet101
| | | | |-- xxxx.bin
| | | | |-- ....
| | | |-- pre_centernet_resnet101.lst
| | | |-- pre_yolov2_darknet19
| | | |-- pre_yolov2_darknet19.lst
| | | |-- pre_yolov3_darknet53
| | | |-- pre_yolov3_darknet53.lst
| | | |-- pre_yolov3_vargdarknet
| | | |-- pre_yolov3_vargdarknet.lst
| | | |-- pre_yolov5x
| | | |-- pre_yolov5x.lst
| | | |-- pre_preq_qat_fcos_efficientnetb0
| | | |-- pre_preq_qat_fcos_efficientnetb0.lst
| | | |-- pre_preq_qat_fcos_efficientnetb2
| | | |-- pre_preq_qat_fcos_efficientnetb2.lst
| | | |-- pre_preq_qat_fcos_efficientnetb3
| | | |-- pre_preq_qat_fcos_efficientnetb3.lst
| | |-- imagenet
| | | |-- pre_efficientnasnet_m
| | | | |-- xxxx.bin
| | | | |-- ....
| | | |-- pre_efficientnasnet_m.lst
| | | |-- pre_efficientnasnet_s
| | | |-- pre_efficientnasnet_s.lst
| | | |-- pre_efficientnet_lite0
| | | |-- pre_efficientnet_lite0.lst
| | | |-- pre_efficientnet_lite1
| | | |-- pre_efficientnet_lite1.lst
| | | |-- pre_efficientnet_lite2
| | | |-- pre_efficientnet_lite2.lst
| | | |-- pre_efficientnet_lite3
| | | |-- pre_efficientnet_lite3.lst
| | | |-- pre_efficientnet_lite4
| | | |-- pre_efficientnet_lite4.lst
| | | |-- pre_googlenet
| | | |-- pre_googlenet.lst
| | | |-- pre_mobilenetv1
| | | |-- pre_mobilenetv1.lst
| | | |-- pre_mobilenetv2
| | | |-- pre_mobilenetv2.lst
| | | |-- pre_resnet18
| | | |-- pre_resnet18.lst
| | | |-- pre_vargconvnet
| | | |-- pre_vargconvnet.lst
| | |-- voc
| | | |-- pre_ssd_mobilenetv1
| | | | |-- xxxx.bin
| | | | |-- ....
| | | |-- pre_ssd_mobilenetv1.lst
| |-- model
| | |-- ...
| |-- script
| | |-- ...
The structure of the QAT pre-processed data set is as follows:
|-- qat
| |-- data
| | |-- carfusion
| | | |-- pre_keypoints
| | | | |-- xxxx # pre-processed binary file
| | | | |-- ....
| | | |-- pre_carfusion.lst # lst file: record the path of each preprocessing file
| | |-- cityscapes
| | | |-- pre_mobilenet_unet
| | | | |-- xxxx
| | | | |-- ....
| | | |-- pre_mobilenet_unet.lst
| | |-- coco
| | | |-- pre_detr_efficientnetb3
| | | | |-- xxxx
| | | | |-- ....
| | | |-- pre_detr_efficientnetb3.lst
| | | |-- pre_detr_resnet50
| | | |-- pre_detr_resnet50.lst
| | | |-- pre_fcos_efficientnetb0
| | | |-- pre_fcos_efficientnetb0.lst
| | | |-- pre_retinanet
| | | |-- pre_retinanet.lst
| | |-- imagenet
| | | |-- pre_mixvargenet
| | | | |-- xxxx
| | | | |-- ....
| | | |-- pre_mixvargenet.lst
| | | |-- pre_mobilenetv1
| | | |-- pre_mobilenetv1.lst
| | | |-- pre_mobilenetv2
| | | |-- pre_mobilenetv2.lst
| | | |-- pre_resnet50
| | | |-- pre_resnet50.lst
| | | |-- pre_swint
| | | |-- pre_swint.lst
| | | |-- pre_vargnetv2
| | | |-- pre_vargnetv2.lst
| | |-- flyingchairs
| | | |-- pre_pwcnet_opticalflow
| | | | |-- xxxx.bin
| | | | |-- ....
| | | |-- pre_pwcnet_opticalflow.lst
| | |-- kitti3d
| | | |-- pre_pointpillars_kitti_car
| | | | |-- xxxx.bin
| | | | |-- ....
| | | |-- pre_pointpillars_kitti_car.lst
| | |-- voc
| | | |-- pre_yolov3_mobilenetv1
| | | | |-- xxxx
| | | | |-- ....
| | | |-- pre_yolov3_mobilenetv1.lst
| | |-- culane
| | | |-- pre_ganet
| | | | |-- xxxx
| | | | |-- ....
| | | |-- pre_ganet.lst
| | |-- nuscenes
| | | |-- pre_nuscenes
| | | | |-- xxxx
| | | | |-- ....
| | | |-- pre_nuscenes.lst
| | | |-- fcos3d_nuscenes_camconfig.txt
| | |-- nuscenes_bev
| | | |-- images
| | | | |-- xxxx
| | | | |-- ....
| | | |-- reference_points
| | | | |-- xxxx
| | | | |-- ....
| | | |-- images.lst
| | | |-- reference_points.lst
| | |-- nuscenes_lidar
| | | |-- pre_nuscenes_lidar
| | | | |-- xxxx
| | | | |-- ....
| | | |-- pre_nuscenes_lidar.lst
| | |-- mot17
| | | |-- motr
| | | | |-- pre_motr
| | | | | |-- xxxx
| | | | | |-- ....
| | | | |-- MOT17-02-SDP.lst
| | | | |-- MOT17-04-SDP.lst
| | | | |-- MOT17-05-SDP.lst
| | | | |-- MOT17-09-SDP.lst
| | | | |-- MOT17-10-SDP.lst
| | | | |-- MOT17-11-SDP.lst
| | | | |-- MOT17-13-SDP.lst
| | |-- argoverse1
| | | |-- densetnt
| | | | |-- pre_densetnt
| | | | | |-- xxxx
| | | | | |-- ....
| | | | |-- pre_densetnt.lst
| | |-- sceneflow
| | | |-- left
| | | | |-- xxxx
| | | | |-- ....
| | | |-- right
| | | | |-- xxxx
| | | | |-- ....
| | | |-- mini_left.lst
| | | |-- mini_right.lst
| |-- model
| | |-- ...
| |-- script
| | |-- ...
The corresponding lst files can be generated as follows (except for argoverse1, bev, mot17 and stereonet_plus):
find ../../../data/coco/pre_centernet_resnet101 -name "*bin*" > ../../../data/coco/pre_centernet_resnet101.lst
Note
The pattern after -name needs to be adjusted to match the format of the preprocessed dataset files, such as bin or png.
The path stored in the generated lst file is a relative path: ../../../data/coco/pre_centernet_resnet101/ , which can match the workflow_accuracy.json default configuration path.
If you change the storage location of the preprocessed dataset, you need to ensure that the corresponding lst file can still be found through workflow_accuracy.json, and that the program can read each preprocessed file according to the path information in the lst file.
argoverse1:
sh generate_acc_lst.sh
The path stored in the generated lst file is a relative path: ../../../data/argoverse1/densetnt/ , which can match the workflow_accuracy.json default configuration path.
bev:
find ../../../data/nuscenes_bev/images -name "*bin*" | sort > ../../../data/nuscenes_bev/images.lst
find ../../../data/nuscenes_bev/reference_points0 -name "*bin*" | sort > ../../../data/nuscenes_bev/reference_points0.lst
Take bev_mt_ipm as an example. This model has two types of input, images and reference points, and the input image and reference points of the same frame share the same file name.
To keep the inputs aligned, pipe the output of the find command through sort so that both lst files are ordered by name.
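Since this alignment depends entirely on identically named, identically ordered entries, a small sanity check (a sketch; the lst paths in the usage comment are hypothetical) can confirm that two lst files line up frame by frame:

```python
import os

def check_aligned(lst_a: str, lst_b: str) -> bool:
    """Return True if the two lst files list the same frames in the same
    order, compared by file basename with the extension stripped."""
    def stems(path):
        with open(path) as f:
            return [os.path.splitext(os.path.basename(line.strip()))[0]
                    for line in f if line.strip()]
    return stems(lst_a) == stems(lst_b)

# e.g. check_aligned("../../../data/nuscenes_bev/images.lst",
#                    "../../../data/nuscenes_bev/reference_points0.lst")
```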
In addition to images and reference points, detr3d_efficientnetb3 and petr_efficientnetb3 also have coords, masks and position embedding inputs. The method of generating lst is as follows:
detr3d_efficientnetb3:
find ../../../data/nuscenes_bev/coords0 -name "*bin*" | sort > ../../../data/nuscenes_bev/coords0.lst
find ../../../data/nuscenes_bev/coords1 -name "*bin*" | sort > ../../../data/nuscenes_bev/coords1.lst
find ../../../data/nuscenes_bev/coords2 -name "*bin*" | sort > ../../../data/nuscenes_bev/coords2.lst
find ../../../data/nuscenes_bev/coords3 -name "*bin*" | sort > ../../../data/nuscenes_bev/coords3.lst
find ../../../data/nuscenes_bev/masks -name "*bin*" | sort > ../../../data/nuscenes_bev/masks.lst
petr_efficientnetb3:
find ../../../data/nuscenes_bev/pos_embed -name "*bin*" | sort > ../../../data/nuscenes_bev/pos_embed.lst
Attention
For the bev model, the reference_points of detr3d_efficientnetb3 and petr_efficientnetb3 are used for model post processing. You need to configure the correct path in workflow_accuracy.json to ensure that the program can read the corresponding reference points file.
In addition, bev_mt_ipm_temporal is a temporal model whose inputs must be fed in order. Therefore, we provide the script gen_file_list.sh for generating its lst files; the usage is as follows:
sh gen_file_list.sh
The path stored in the generated lst file is a relative path: ../../../data/nuscenes_bev/ , which can match the workflow_accuracy.json default configuration path.
If you change the storage location of the preprocessed dataset, you need to ensure that the corresponding lst file can still be found through workflow_accuracy.json, and that the program can read each preprocessed file according to the path information in the lst file.
mot17:
sh generate_acc_lst.sh
The path stored in the generated lst file is a relative path: ../../../data/mot17/motr/ , which can match the workflow_accuracy.json default configuration path.
If you change the storage location of the preprocessed dataset, you need to ensure that the corresponding lst file can still be found through workflow_accuracy.json, and that the program can read each preprocessed file according to the path information in the lst file.
Sceneflow:
find ../../../data/sceneflow/left -name "*png*" | sort > ../../../data/sceneflow/left.lst
find ../../../data/sceneflow/right -name "*png*" | sort > ../../../data/sceneflow/right.lst
Take stereonet_plus as an example. The left and right input images of the same frame share the same file name.
To keep the inputs aligned, pipe the output of the find command through sort so that both lst files are ordered by name.
9.4.4.3.4. Model Inference¶
9.4.4.3.4.1. About Command-line Parameters¶
The accuracy.sh script is shown as below:
#!/bin/sh
source ../../base_config.sh # load basic configurations
export SHOW_FPS_LOG=1 # specify environment variable, print fps level log
${app} \ # executable program defined in the accuracy.sh script
--config_file=workflow_accuracy.json \ # load workflow configuration file of accuracy evaluation
--log_level=2 # specify log level
9.4.4.3.4.2. About Configuration File¶
Take the centernet_resnet101 model as an example. The content of the accuracy workflow configuration file is shown below:
{
"input_config": {
"input_type": "preprocessed_image", # input data type, supporting images or bin files, use pre-processed data for accuracy evaluation
"height": 512,
"width": 512,
"data_type": 1,
"image_list_file": "../../../data/coco/coco.lst",
"need_pre_load": false,
"limit": 14,
"need_loop": false,
"max_cache": 10
},
"output_config": {
"output_type": "eval",
"eval_enable": true,
"output_file": "./eval.log" # prediction results document
},
"workflow": [
{
"method_type": "InferMethod",
"unique_name": "InferMethod",
"method_config": {
"core": 0,
"model_file": "../../../model/runtime/centernet_resnet101/centernet_resnet101_512x512_nv12.bin"
}
},
{
"thread_count": 4,
"method_type": "PTQCenternetMaxPoolSigmoidPostProcessMethod",
"unique_name": "PTQCenternetMaxPoolSigmoidPostProcessMethod",
"method_config": {
"class_num": 80,
"score_threshold": 0.0,
"topk": 100,
"det_name_list": "../../config/model/data_name_list/coco_classes.names"
}
}
]
}
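Note that the annotated listing above is not strictly valid JSON because of the inline # comments. When inspecting such a config on the PC side, a small helper (a sketch; it assumes, as in these configs, that no string value contains a # character) can strip the comments before loading and check that the referenced lst file is reachable:

```python
import json
import os

def load_workflow(path: str) -> dict:
    """Load a workflow config, tolerating the inline '#' comments used
    in annotated listings (assumes '#' never appears inside a string)."""
    with open(path) as f:
        text = "\n".join(line.split("#", 1)[0] for line in f)
    return json.loads(text)

def check_image_list(cfg: dict) -> bool:
    """Verify that the lst file referenced by input_config exists."""
    return os.path.exists(cfg["input_config"]["image_list_file"])
```

The on-board workflow configs shipped with the ABP do not contain comments, so plain json.load also works for those.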
After the data has been mounted, log in dev board and run the accuracy.sh script in the centernet_resnet101 directory, as shown below:
root@j5dvb-hynix8G:/userdata/ptq/script/detection/centernet_resnet101# sh accuracy.sh
../../aarch64/bin/example --config_file=workflow_accuracy.json --log_level=2
...
I0419 03:14:51.158655 39555 infer_method.cc:107] Predict DoProcess finished.
I0419 03:14:51.187361 39556 ptq_centernet_post_process_method.cc:558] PTQCenternetPostProcessMethod DoProcess finished, predict result: [{"bbox":[-1.518860,71.691170,574.934631,638.294922],"prob":0.750647,"label":21,"class_name":"
I0118 14:02:43.636204 24782 ptq_centernet_post_process_method.cc:558] PTQCenternetPostProcessMethod DoProcess finished, predict result: [{"bbox":[3.432283,164.936249,157.480042,264.276825],"prob":0.544454,"label":62,"class_name":"
...
Inference results are saved into the eval.log file dumped by the dev board program.
9.4.4.3.5. Model Accuracy Computing¶
Attention
Please perform the accuracy calculation in docker environment or Linux environment.
9.4.4.3.5.1. PTQ Model Accuracy Computing¶
For PTQ models, the accuracy computing scripts are in the accuracy_tools subfolder of the ptq/tools/python_tools folder:
The cls_eval.py script is used for computing accuracy of classification models.
The coco_det_eval.py script is used for computing the accuracy of models evaluated using the COCO dataset.
The parsing_eval.py script is used for computing the accuracy of segmentation models evaluated using the Cityscapes dataset.
The voc_det_eval.py script is used for computing the accuracy of detection models using the VOC dataset.
9.4.4.3.5.1.1. Classification Models¶
Method to compute the accuracy of those models using the CIFAR-10 and ImageNet datasets is shown as below:
#!/bin/sh
python3 cls_eval.py --log_file=eval.log --gt_file=val.txt
Note
log_file refers to the inference result file of classification models.
gt_file refers to the annotation files of the CIFAR-10 and ImageNet datasets.
9.4.4.3.5.1.2. Detection Models¶
Method to compute the accuracy of those models using the COCO dataset is shown as below:
#!/bin/sh
python3 coco_det_eval.py --eval_result_path=eval.log --annotation_path=instances_val2017.json
Note
eval_result_path refers to the inference result file of detection models.
annotation_path refers to the annotation file of the COCO dataset.
Method to compute the accuracy of those detection models using the VOC dataset is shown as below:
#!/bin/sh
python3 voc_det_eval.py --eval_result_path=eval.log --annotation_path=../Annotations --val_txt_path=../val.txt
Note
eval_result_path refers to the inference result file of detection models.
annotation_path refers to the annotation file of the VOC dataset.
val_txt_path refers to the val.txt file in the ImageSets/Main folder of the VOC dataset.
9.4.4.3.5.1.3. Segmentation Models¶
Method to compute the accuracy of those segmentation models using the Cityscapes dataset is shown as below:
#!/bin/sh
python3 parsing_eval.py --log_file=eval.log --gt_path=cityscapes/gtFine/val
Note
log_file refers to the inference result file of segmentation models.
gt_path refers to the annotation files of the Cityscapes dataset.
9.4.4.3.5.2. QAT Model Accuracy Computing¶
For QAT models, the accuracy computing scripts are in the accuracy_tools subfolder of the qat/tools/python_tools folder:
bev_eval.py is used to calculate the accuracy of bev model.
centerpoint_eval.py is used to calculate the accuracy of centerpoint lidar 3D model.
cls_eval.py is used to calculate the accuracy of the classification model.
detr_eval.py is used to calculate the accuracy of detr model.
retinanet_eval.py is used to calculate the accuracy of retinanet model.
fcos_eval.py is used to calculate the accuracy of fcos model.
motr_eval.py is used to calculate the accuracy of motr model.
parsing_eval.py is used to calculate the accuracy of segmentation model.
pwcnet_eval.py is used to calculate the accuracy of opticalflow model.
yolov3_eval.py is used to calculate the accuracy of yolov3 model.
pointpillars_eval.py is used to calculate the accuracy of pointpillars model.
ganet_eval.py is used to calculate the accuracy of ganet model.
fcos3d_eval.py is used to calculate the accuracy of fcos3d model.
keypoints_eval.py is used to calculate the accuracy of keypoints model.
lidar_multitask_eval.py is used to calculate the accuracy of lidar multitask model.
densetnt_eval.py is used to calculate the accuracy of densetnt model.
stereonet_eval.py is used to calculate the accuracy of the stereonet_plus model.
9.4.4.3.5.2.1. Bev Models¶
Method to compute the accuracy of those models using the Nuscenes dataset is shown as below:
#!/bin/sh
python3 bev_eval.py --det_eval_path=bev_det_eval.log --seg_eval_path=bev_seg_eval.log --gt_files_path=./nuscenes_bev_val/val_gt_infos.pkl --meta_dir=./Nuscenes/meta/
# detr3d_efficientnetb3 and petr_efficientnetb3 are bev detection models, and there is no need to specify --seg_eval_path.
python3 bev_eval.py --det_eval_path=eval.log --gt_files_path=./nuscenes_bev_val/val_gt_infos.pkl --meta_dir=./Nuscenes/meta/
Note
det_eval_path refers to the detection inference result file of bev models.
seg_eval_path refers to the segmentation inference result file of bev models.
gt_files_path refers to the gt file generated by preprocessing the nuscenes dataset.
meta_dir refers to the Nuscenes dataset meta data.
9.4.4.3.5.2.2. Classification Models¶
Method to compute the accuracy of those models using the CIFAR-10 and ImageNet datasets is shown as below:
#!/bin/sh
python3 cls_eval.py --log_file=eval.log --gt_file=val.txt
Note
log_file refers to the inference result file of classification models.
gt_file refers to the annotation files of the CIFAR-10 and ImageNet datasets.
9.4.4.3.5.2.3. Detection Models¶
Method to compute the accuracy of those models using the COCO dataset is shown as below:
#!/bin/sh
python3 fcos_eval.py --eval_result_path=eval.log --annotation_path=instances_val2017.json --image_path=./mscoco/images/val2017/
# The qat fcos model needs to add --is_qat=True
python3 fcos_eval.py --eval_result_path=eval.log --annotation_path=instances_val2017.json --image_path=./mscoco/images/val2017/ --is_qat=True
Note
eval_result_path refers to the inference result file of detection models.
annotation_path refers to the annotation file of the COCO dataset.
image_path refers to the COCO dataset source data.
is_qat indicates whether the results being evaluated come from the qat fcos model.
#!/bin/sh
python3 retinanet_eval.py --eval_result_path=eval.log --annotation_path=instances_val2017.json --image_path=./mscoco/images/val2017/
Note
eval_result_path refers to the inference result file of detection models.
annotation_path refers to the annotation file of the COCO dataset.
image_path refers to the COCO dataset source data.
Method to compute the accuracy of those detection models using the VOC dataset is shown as below:
#!/bin/sh
python3 yolov3_eval.py --eval_result_path=eval.log --annotation_path=../Annotations --val_txt_path=../val.txt --image_height=416 --image_width=416
Note
eval_result_path refers to the inference result file of detection models.
annotation_path refers to the annotation file of the VOC dataset.
val_txt_path refers to the val.txt file in the ImageSets/Main folder of the VOC dataset.
image_height refers to the image height.
image_width refers to the image width.
Method to compute the accuracy of those detection models using the KITTI dataset is shown as below:
#!/bin/sh
python3 pointpillars_eval.py --eval_result_path=eval.log --annotation_path=./val_gt_infos.pkl
Note
eval_result_path refers to the inference result file of detection models.
annotation_path refers to the val_gt_infos.pkl annotation file of the KITTI dataset.
Method to compute the accuracy of those detection models using the culane dataset is shown as below:
#!/bin/sh
python3 ganet_eval.py --eval_path=eval.log --image_path=./culane
Note
eval_path refers to the inference result file of detection models.
image_path refers to the culane dataset root directory.
Method to compute the accuracy of those models using the Nuscenes dataset is shown as below:
#!/bin/sh
python3 fcos3d_eval.py --eval_result_path=eval.log --image_path=./Nuscenes
Note
eval_result_path refers to the inference result file of detection models.
image_path refers to the Nuscenes dataset source data.
#!/bin/sh
python3 centerpoint_eval.py --predict_result_path=eval.log --gt_files_path=./nuscenes_lidar_val/val_gt_infos.pkl --meta_dir=./Nuscenes/meta/
Note
predict_result_path refers to the inference result file of detection models.
gt_files_path refers to the gt file generated by preprocessing the nuscenes dataset.
meta_dir refers to the Nuscenes dataset meta data.
Method to compute the accuracy of those detection models using the carfusion dataset is shown as below:
#!/bin/sh
python3 keypoints_eval.py --anno_path=./processed_carfusion/processed_anno.json --eval_result_path=eval.log
Note
anno_path refers to the processed_anno.json file generated by preprocessing.
eval_result_path refers to the inference result file of detection models.
9.4.4.3.5.2.4. Segmentation Models¶
Method to compute the accuracy of those segmentation models using the Cityscapes dataset is shown as below:
#!/bin/sh
python3 parsing_eval.py --log_file=eval.log --gt_path=cityscapes/gtFine/val
Note
log_file refers to the inference result file of segmentation models.
gt_path refers to the annotation files of the Cityscapes dataset.
9.4.4.3.5.2.5. Opticalflow Models¶
Method to compute the accuracy of those pwcnet_opticalflow models using the FlyingChairs dataset is shown as below:
#!/bin/sh
python3 pwcnet_eval.py --log_file=eval.log --gt_path=./flyingchairs/FlyingChairs_release/data/ --val_file=./flyingchairs/FlyingChairs_train_val.txt
Note
log_file refers to the inference result file of optical-flow estimation models.
val_file refers to the annotation file of the FlyingChairs dataset.
gt_path refers to the FlyingChairs dataset source data.
9.4.4.3.5.2.6. Tracking Models¶
Method to compute the accuracy of those models using the mot17 dataset is shown as below:
#!/bin/sh
python3 motr_eval.py --eval_result_path=eval_log --gt_val_path=valdata/gt_val
Note
eval_result_path refers to the inference result file of tracking models.
gt_val_path refers to the annotation files of the mot17 dataset.
9.4.4.3.5.2.7. Multitask Models¶
Method to compute the accuracy of those models using the Nuscenes dataset is shown as below:
#!/bin/sh
python3 lidar_multitask_eval.py --det_eval_path=det_eval.log --seg_eval_path=seg_eval.log --gt_files_path=./nuscenes_lidar_val/val_gt_infos.pkl --data_dir=./Nuscenes
Note
det_eval_path refers to the detection inference result file of the models.
seg_eval_path refers to the segmentation inference result file of the models.
gt_files_path refers to the gt file generated by preprocessing the nuscenes dataset.
data_dir refers to the Nuscenes dataset.
9.4.4.3.5.2.8. Traj Pred Models¶
Method to compute the accuracy of those models using the argoverse1 dataset is shown as below:
#!/bin/sh
python3 densetnt_eval.py --eval_result_path=eval.log --meta_path=argoverse1/meta
Note
eval_result_path refers to the inference result file of trajectory prediction models.
meta_path refers to the annotation files of the argoverse1 dataset; after running the preprocessing script, they are generated in the meta directory of the original dataset.
9.4.4.3.5.2.9. Disparity Pred Models¶
Method to compute the accuracy of those models using the Sceneflow datasets is shown as below:
#!/bin/sh
python3 stereonet_eval.py --log_file=eval.log --gt_files_path=val_gt_infos.pkl
Note
log_file refers to the inference result file of disparity prediction models.
gt_files_path refers to the gt file generated by preprocessing the Sceneflow dataset.
9.4.5. Model Integration¶
9.4.5.1. Pre-processing¶
9.4.5.1.1. Add Pre-processing File¶
The pre-processing .cc files are in the ai_benchmark/code/src/method/ directory.
While .h header files are in the ai_benchmark/code/include/method/ directory.
|--ai_benchmark
| |--code # source code of samples
| | |--include
| | | |--method # add your header files into this folder
| | | | |--qat_fcos_post_process_method.h
| | | | |--......
| | |--src
| | | |--method # add your .cc postprocess files into this folder
| | | | |--qat_fcos_post_process_method.cc
| | | | |--......
9.4.5.1.2. Add Pre-processing Configuration File¶
|--ai_benchmark
| |--j5/qat/script # sample script folder
| | |--config
| | | |--preprocess
| | | | |--centerpoint_preprocess_5dim.json # pre-processing configuration file
Whether the pre-processing of centerpoint_pointpillar runs on the CPU or the DSP depends on the centerpoint_preprocess_5dim.json configuration file.
If run_on_dsp in the configuration file is set to true, pre-processing runs on the DSP; otherwise it runs on the CPU.
9.4.5.1.3. To Evaluate the Latency of Pre-processing¶
Run sh latency.sh to evaluate the single-frame latency of pre-processing, as shown below:
I0615 13:30:40.772293 3670 output_plugin.cc:91] Pre process latency: [avg: 20.295ms, max: 28.690ms, min: 18.512ms], Infer latency: [avg: 25.053ms, max: 31.943ms, min: 24.702ms], Post process latency: [avg: 52.760ms, max: 54.099ms, min: 51.992ms].
Note
Pre process denotes the time consumed by pre-processing.
Infer denotes the time consumed by model inference.
Post process denotes the time consumed by post-processing.
9.4.5.2. Post-processing¶
Post-processing integration consists of 2 steps. Let’s take the integration of the CenterNet model as an example:
1. Add the post-processing file ptq_centernet_post_process_method.cc and the header file ptq_centernet_post_process_method.h.
2. Add a model execution script and a configuration file.
9.4.5.2.1. Add Post-processing File¶
The post-processing code can reuse any of the post-processing files in the src/method directory;
you only need to modify the InitFromJsonString function and the PostProcess function.
InitFromJsonString loads the post-processing parameters from workflow.json, and you can customize the corresponding input parameters.
PostProcess implements the post-processing logic.
The post-processing .cc files are in the ai_benchmark/code/src/method/ directory.
While .h header files are in the ai_benchmark/code/include/method/ directory.
|--ai_benchmark
| |--code # source code of samples
| | |--include
| | | |--method # add your header files into this folder
| | | | |--qat_fcos_post_process_method.h
| | | | |--......
| | |--src
| | | |--method # add your .cc postprocess files into this folder
| | | | |--qat_fcos_post_process_method.cc
| | | | |--......
9.4.5.2.2. Add Model Execution and Configuration Files¶
Directory structure of the scripts is shown below (except for centerpoint_pointpillar and motr):
|--ai_benchmark
| |--j5/ptq/script # sample script folder
| | |--detection
| | | |--centernet_resnet101
| | | | |--accuracy.sh # accuracy evaluation script
| | | | |--fps.sh # performance evaluation script
| | | | |--latency.sh # single-frame latency sample script
| | | | |--workflow_accuracy.json # accuracy configuration file
| | | | |--workflow_fps.json # performance configuration file
| | | | |--workflow_latency.json # single-frame latency configuration file
Directory structure of centerpoint_pointpillar scripts is shown as below:
|--ai_benchmark
| |--j5/qat/script # sample script folder
| | |--detection
| | | |--centerpoint_pointpillar
| | | | |--accuracy.sh # accuracy evaluation script
| | | | |--dsp_image
| | | | | |--dsp_deploy.sh # dsp deployment script
| | | | | |--vdsp0 # image of dsp core 0
| | | | | |--vdsp1 # image of dsp core 1
| | | | |--fps.sh # performance evaluation script
| | | | |--latency.sh # single-frame latency sample script
| | | | |--README.md # dsp deployment introduction document
| | | | |--workflow_accuracy # accuracy configuration folder
| | | | |--workflow_fps.json # performance configuration file
| | | | |--workflow_latency.json # single-frame latency configuration file
To run on the DSP, you need to execute dsp_deploy.sh first to deploy the DSP environment. For a detailed introduction to DSP deployment, please refer to README.md.
Directory structure of motr scripts is shown as below:
|--ai_benchmark
| |--j5/qat/script # sample script folder
| | |--tracking
| | | |--motr
| | | | |--accuracy.sh # accuracy evaluation script
| | | | |--fps.sh # performance evaluation script
| | | | |--generate_acc_lst.sh # generate accuracy lst script
| | | | |--latency.sh # single-frame latency sample script
| | | | |--workflow_accuracy # accuracy configuration folder
| | | | |--workflow_fps.json # performance configuration file
| | | | |--workflow_latency.json # single-frame latency configuration file
9.4.6. Helper Tools¶
9.4.6.1. Log¶
There are 2 types of logs: sample log and DNN log. The sample log is produced by the ABP deliverables, while the DNN log is produced by the embedded runtime library. Developers can enable either as needed.
9.4.6.1.1. Sample Log¶
1. Log level.
Both glog and vlog are used in the sample log, and there are 4 customized log levels:
0 (SYSTEM): in sample code, this log level is used for generating error information.
1 (REPORT): in sample code, this log level is used for generating performance data.
2 (DETAIL): in sample code, this log level is used for generating the current system status.
3 (DEBUG): in sample code, this log level is used for generating debugging information.
Rule for log levels: when a log level P is specified, any log message whose level Q ranks at or below P is output; messages ranking above P are blocked.
The default ranking of log levels is DEBUG > DETAIL > REPORT > SYSTEM.
2. Set log levels.
When running samples, specify the log_level parameter to set the log level.
For example, if log_level=0, only SYSTEM logs are dumped; if log_level=3,
then DEBUG, DETAIL, REPORT and SYSTEM logs are all dumped.
9.4.6.1.2. Dnn Log¶
For the configuration of dnn logs, please read the Configuration Info section in the BPU SDK API DOC.
9.4.6.2. OP Time Consumption¶
9.4.6.2.1. General Descriptions¶
Use the HB_DNN_PROFILER_LOG_PATH environment variable to collect statistics on OP performance.
The type and value of this environment variable are described below:
HB_DNN_PROFILER_LOG_PATH=${path}: denotes the output path of the OP statistics.
After the program is executed, a profiler.log file is generated in that path.
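A minimal usage sketch, assuming the variable is exported before launching the sample (the "./" value is just an example path):

```shell
# Enable per-OP profiling for subsequent runs; profiler.log is written
# into this directory after the program exits normally.
export HB_DNN_PROFILER_LOG_PATH=./
echo "profiler output dir: $HB_DNN_PROFILER_LOG_PATH"
```

After exporting the variable, launch the sample as usual (e.g. via fps.sh on the board).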
9.4.6.2.2. Sample¶
Taking mobilenetv1 as an example: start 1 thread to run the model, set export HB_DNN_PROFILER_LOG_PATH=./,
and the statistical output will be as follows:
{
  "perf_result": {
    "FPS": 3492.5347070636512,
    "average_latency": 2.1842353343963623
  },
  "running_condition": {
    "core_id": 0,
    "frame_count": 200,
    "model_name": "/home/jenkins/workspace/oolchain_tc_sys_release_j5_1.7.0/j5_toolchain/samples/03_classification/01_mobilenet/mapper/model_output/mobilenetv1_224x224_nv12",
    "run_time": 57.265,
    "thread_num": 1
  }
}
***
{
  "processor_latency": {
    "BPU_inference_time_cost": {
      "avg_time": 1.985635,
      "max_time": 2.055,
      "min_time": 1.047
    },
    "CPU_inference_time_cost": {
      "avg_time": 0.07651,
      "max_time": 0.23300000000000004,
      "min_time": 0.069
    }
  },
  "model_latency": {
    "BPU_MOBILENET_subgraph_0": {
      "avg_time": 1.985635,
      "max_time": 2.055,
      "min_time": 1.047
    },
    "Dequantize_fc7_1_HzDequantize": {
      "avg_time": 0.03118,
      "max_time": 0.081,
      "min_time": 0.029
    },
    "MOBILENET_subgraph_0_output_layout_convert": {
      "avg_time": 0.009460000000000001,
      "max_time": 0.033,
      "min_time": 0.008
    },
    "Preprocess": {
      "avg_time": 0.005860000000000001,
      "max_time": 0.037,
      "min_time": 0.004
    },
    "Softmax_prob": {
      "avg_time": 0.030010000000000002,
      "max_time": 0.082,
      "min_time": 0.028
    }
  },
  "task_latency": {
    "TaskPendingTime": {
      "avg_time": 0.019245,
      "max_time": 0.145,
      "min_time": 0.008
    },
    "TaskRunningTime": {
      "avg_time": 2.12983,
      "max_time": 2.208,
      "min_time": 1.427
    }
  }
}
The above output contains model_latency and task_latency.
model_latency contains the time consumed by each operator of the model,
while task_latency contains the time consumed by each task of the model.
Note
The profiler.log file is only generated when the program exits normally.
9.4.6.3. Dump Tool¶
Set the HB_DNN_DUMP_PATH environment variable to dump the input and output of each node during inference.
The dump tool can check whether there are consistency problems between the simulator and the real board,
i.e. whether the outputs of the real board and the simulator are exactly the same, given the same model and the same inputs.
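A minimal usage sketch, assuming the variable points at a writable directory (the directory name "dnn_dump" is our example, not a fixed name):

```shell
# Choose a dump directory and enable node-level input/output dumping
# for the next inference run.
mkdir -p ./dnn_dump
export HB_DNN_DUMP_PATH=./dnn_dump
echo "dump dir: $HB_DNN_DUMP_PATH"
```

The same variable can be set on both the simulator and the board, and the resulting dumps compared file by file.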