Training a custom object detection model with YOLOv5 v5.0 and porting it to the Huawei Atlas 200DK
Thanks to continuous improvements and fixes from the community, this open-source project has reached its fifth release, so we use v5.0 for this experiment. First click the master dropdown in the top-left corner of the repository page and switch to the v5.0 tag, as shown in the figure below. Once the version is selected, click the Code button in the top-right corner and download the code. With that, the whole project is ready.
Here is an overview of the repository layout:
1.3 Installing the environment and dependencies
I have already written a very detailed blog post on setting up a deep learning environment. One point worth mentioning: to train a dataset on a GPU you normally need matching CUDA and cuDNN installs, but that post configures everything through Anaconda, so there is no need to download CUDA separately from NVIDIA's website. The post is "Installing the PyTorch and Paddle deep learning environments with Anaconda, plus installing PyCharm, without a separate CUDA/cuDNN install (a step-by-step guide for beginners)".
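As a minimal sketch of that setup (the environment name and Python version below are only examples, not taken from that post), you create and activate a conda environment first and then install the project requirements inside it:
conda create -n yolov5 python=3.8
conda activate yolov5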
Open the requirements.txt file; it lists many dependency packages together with their version constraints. Open the PyCharm terminal and enter the following command to install them all:
pip install -r requirements.txt
With that, the deep learning environment and all dependency packages are installed.
Under the VOCData directory, create four folders: Annotations, images, ImageSets and labels.
images holds the original image dataset, Annotations holds the xml files produced by annotation, labels holds the txt files with the converted label content, and ImageSets holds the split into training and test sets.
├── VOCData
│ ├── Annotations  xml label files for the detection task, one per image, with the same base name as the image
│ ├── images  the .jpg image files
│ ├── ImageSets  dataset split files for classification and detection: train.txt, val.txt, trainval.txt, test.txt
│ │ ├── train.txt  names (absolute paths) of the images used for training
│ │ ├── val.txt  names of the images used for validation
│ │ ├── trainval.txt  union of train and val
│ │ ├── test.txt  names of the images used for testing
│ ├── labels  txt files with the label information, one per image
How are the train/test splits and the labels generated?
Start by creating just the Annotations and images folders, put all images into images and the corresponding xml annotation files into Annotations;
then run the split_train_val.py script from the directory that contains VOCData.
split_train_val.py
import os
import random
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--xml_path', default='./VOCData/Annotations', type=str, help='input xml label path')
parser.add_argument('--txt_path', default='./VOCData/labels', type=str, help='output txt label path')
opt = parser.parse_args()

trainval_percent = 1.0  # fraction of all data used for train+val (the rest goes to test)
train_percent = 0.9     # fraction of train+val used for training (the rest goes to val)
xmlfilepath = opt.xml_path
txtsavepath = opt.txt_path
total_xml = os.listdir(xmlfilepath)
if not os.path.exists(txtsavepath):
    os.makedirs(txtsavepath)

num = len(total_xml)
list_index = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(list_index, tv)
train = random.sample(trainval, tr)

file_trainval = open(txtsavepath + '/trainval.txt', 'w')
file_test = open(txtsavepath + '/test.txt', 'w')
file_train = open(txtsavepath + '/train.txt', 'w')
file_val = open(txtsavepath + '/val.txt', 'w')

# write each file name (without the .xml extension) into the split it belongs to
for i in list_index:
    name = total_xml[i][:-4] + '\n'
    if i in trainval:
        file_trainval.write(name)
        if i in train:
            file_train.write(name)
        else:
            file_val.write(name)
    else:
        file_test.write(name)

file_trainval.close()
file_train.close()
file_val.close()
file_test.close()
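A minimal way to run it (assuming your working directory is the one that contains VOCData; the arguments simply override the defaults shown above):
python split_train_val.py --xml_path ./VOCData/Annotations --txt_path ./VOCData/labels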
Then run the txt2yolo_label.py script to generate the txt files in labels.
txt2yolo_label.py:
import xml.etree.ElementTree as ET
from tqdm import tqdm
import os
from os import getcwd
sets = ['train', 'val', 'test']
classes = ['person', 'other_clothes', 'reflective_clothes', 'hat']
def convert(size, box):
    # convert a VOC box (xmin, xmax, ymin, ymax) into normalized YOLO format (x_center, y_center, w, h)
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h

def convert_annotation(image_id):
    try:
        in_file = open('VOCData/Annotations/%s.xml' % (image_id), encoding='utf-8')
        out_file = open('VOCData/labels/%s.txt' % (image_id), 'w', encoding='utf-8')
        tree = ET.parse(in_file)
        root = tree.getroot()
        size = root.find('size')
        w = int(size.find('width').text)
        h = int(size.find('height').text)
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            cls = obj.find('name').text
            if cls not in classes or int(difficult) == 1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
                 float(xmlbox.find('ymax').text))
            b1, b2, b3, b4 = b
            # clip boxes that extend beyond the image border
            if b2 > w:
                b2 = w
            if b4 > h:
                b4 = h
            b = (b1, b2, b3, b4)
            bb = convert((w, h), b)
            out_file.write(str(cls_id) + " " +
                           " ".join([str(a) for a in bb]) + '\n')
    except Exception as e:
        print(e, image_id)

wd = getcwd()
for image_set in sets:
    if not os.path.exists('./VOCData/labels/'):
        os.makedirs('./VOCData/labels/')
    image_ids = open('./VOCData/labels/%s.txt' %
                     (image_set)).read().strip().split()
    list_file = open('./VOCData/%s.txt' % (image_set), 'w')
    for image_id in tqdm(image_ids):
        list_file.write('./VOCData/images/%s.jpg\n' % (image_id))
        convert_annotation(image_id)
    list_file.close()
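Each line written into a labels/*.txt file follows the YOLO convention: the class index followed by the box center x, center y, width and height, all normalized by the image size into the 0-1 range. A label file for a single object might therefore contain a line like the following (the numbers are purely illustrative):
3 0.512 0.430 0.118 0.205
Here 3 is the index of hat in the classes list above.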
2.2 Obtaining pretrained weights
To shorten training time and reach better accuracy, we usually load pretrained weights before training. YOLOv5 v5.0 provides several pretrained weight files, and you can pick the one that fits your needs. The figure below lists their names and sizes; as you would expect, the larger the pretrained weights, the higher the accuracy tends to be, but the slower the detection. The pretrained weights can be downloaded from this link; for training our own dataset here we use yolov5s.pt.
Open this file and modify its parameters. First comment out the line marked by arrow 1 (I have already commented it out); if you leave it in, training will throw an error. At arrow 2, fill in the paths of the training and validation sets (absolute paths are safest, since relative paths sometimes fail in odd ways depending on the directory layout). At arrow 3, fill in the number of classes to detect; here I detect safety helmets and persons, so it is 2. Finally, at arrow 4, fill in the names of the classes to recognize (they must be in English, otherwise they come out garbled and are not recognized). With that, the yaml file under the data directory is ready.
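As a rough sketch of what the edited data yaml might look like (the paths are placeholders and the class list must match your own dataset; the two-class helmet/person setup described above is used as the example):
train: D:/yolov5/VOCData/train.txt   # list of training images
val: D:/yolov5/VOCData/val.txt       # list of validation images
nc: 2                                # number of classes
names: ['person', 'hat']             # class names, in English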
Then find the entry point of the main function in train.py, which holds the model's main parameters. They are explained below.
if __name__ == '__main__':
"""
opt main parameters:
--weights: path to the initial weights file
--cfg: path to the model yaml file
--data: path to the data yaml file
--hyp: path to the hyperparameter file
--epochs: number of training epochs
--batch-size: number of images fed in per batch
--img-size: input image size
--rect: rectangular training, default False
--resume: resume training from the last interrupted run
--nosave: do not save intermediate models, default False
--notest: skip testing, default False
--noautoanchor: disable the automatic anchor check, default False
--evolve: evolve hyperparameters, default False
--bucket: Google Cloud bucket, usually not used
--cache-images: cache images in memory beforehand to speed up training, default False
--image-weights: use weighted image selection for training
--device: training device; cpu, 0 (a single GPU, cuda:0), or 0,1,2,3 (multiple GPUs)
--multi-scale: multi-scale training, default False
--single-cls: treat the dataset as having a single class, default False
--adam: use the Adam optimizer
--sync-bn: use cross-GPU SyncBatchNorm, only in DDP mode
--local_rank: DDP parameter, do not modify
--workers: maximum number of dataloader workers
--project: directory the trained model is saved to
--name: name of the directory the model is saved to
--exist-ok: whether the model directory may already exist; if it does not, it is created
"""
parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default='yolov5s.pt', help='initial weights path')
parser.add_argument('--cfg', type=str, default='', help='model.yaml path')
parser.add_argument('--data', type=str, default='data/coco128.yaml', help='data.yaml path')
parser.add_argument('--hyp', type=str, default='data/hyp.scratch.yaml', help='hyperparameters path')
parser.add_argument('--epochs', type=int, default=300)
parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs')
parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes')
parser.add_argument('--rect', action='store_true', help='rectangular training')
parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
parser.add_argument('--notest', action='store_true', help='only test final epoch')
parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')
parser.add_argument('--adam', action='store_true', help='use torch.optim.Adam() optimizer')
parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')
parser.add_argument('--workers', type=int, default=8, help='maximum number of dataloader workers')
parser.add_argument('--project', default='runs/train', help='save to project/name')
parser.add_argument('--entity', default=None, help='W&B entity')
parser.add_argument('--name', default='exp', help='save to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--quad', action='store_true', help='quad dataloader')
parser.add_argument('--linear-lr', action='store_true', help='linear LR')
parser.add_argument('--label-smoothing', type=float, default=0.0, help='Label smoothing epsilon')
parser.add_argument('--upload_dataset', action='store_true', help='Upload dataset as W&B artifact table')
parser.add_argument('--bbox_interval', type=int, default=-1, help='Set bounding-box image logging interval for W&B')
parser.add_argument('--save_period', type=int, default=-1, help='Log model after every "save_period" epoch')
parser.add_argument('--artifact_alias', type=str, default="latest", help='version of dataset artifact to be used')
opt = parser.parse_args()
To train your own model, only the following few parameters need changing. First put the path of the weights file into the --weights argument, then put the path of the modified model yaml (yolov5s_hat.yaml under models) into --cfg, and finally put the path of the data hat.yaml file into --data. These are the parameters that must be changed.
parser.add_argument('--weights', type=str, default='weights/yolov5s.pt', help='initial weights path')
parser.add_argument('--cfg', type=str, default='models/yolov5s_hat.yaml', help='model.yaml path')
parser.add_argument('--data', type=str, default='data/hat.yaml', help='data.yaml path')
A few more parameters can be adjusted to your own needs:
First the number of training epochs; here the model is trained for 300 epochs.
parser.add_argument('--epochs', type=int, default=300)
Next are the batch size and the number of dataloader workers. These depend on your machine, so set them according to your own hardware. For reference, my laptop is a Legion R9000 with an RTX 3060 GPU and an 8-core CPU. With the default batch size of 16 and 8 workers it runs out of GPU memory, with the error shown below:
In that case these two parameters have to be reduced. Every machine is configured differently, so adjust them to your own setup.
parser.add_argument('--batch-size', type=int, default=8, help='total batch size for all GPUs')
parser.add_argument('--workers', type=int, default=8, help='maximum number of dataloader workers')
Once everything above is set, training can start. PyCharm users, however, may see the following error, which means the virtual memory is insufficient.
The workaround: open datasets.py under the utils directory and change the parameter nw on line 81 to 0.
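For reference, in yolov5 v5.0 the line in question looks roughly like this (the exact line number may differ in your copy); setting nw to 0 disables multiprocess data loading and avoids the error:
nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, workers])  # number of workers
# change to: nw = 0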
With that done, you can run train.py to train your own model.
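Equivalently, instead of editing the defaults you can pass everything on the command line, using the paths configured above and the reduced batch size:
python train.py --weights weights/yolov5s.pt --cfg models/yolov5s_hat.yaml --data data/hat.yaml --epochs 300 --batch-size 8 --workers 8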
3.4 Viewing training with TensorBoard
yolov5 has TensorBoard logging built in, so you only need to run a command to launch TensorBoard and inspect the training. Open the PyCharm terminal and enter the following command; it prints a URL, which you can copy into a browser to watch the training process:
tensorboard --logdir=runs/train
As shown in the figure below, 100 epochs have been trained at this point.
If the model has already finished training but you still want to inspect its training process with TensorBoard, use the following command instead to see the training results:
tensorboard --logdir=runs
3.5 Inference and testing
Once training is finished, a runs folder appears in the project root. Under runs/train/exp/weights two weight files are produced: one from the last epoch and one with the best results; we will use the best weights for the inference test. Some validation images and other files are produced there as well.
Find detect.py in the project root and open it.
Then find the entry point of the main function, which holds the model's main parameters. They are explained below.
if __name__ == '__main__':
"""
--weights: path to the weights
--source: test data; an image/video path, '0' (the computer's built-in camera), or a video stream such as rtsp
--output: where the predicted images/videos are saved
--img-size: network input image size
--conf-thres: confidence threshold
--iou-thres: IoU threshold used for NMS
--device: run inference on GPU or CPU
--view-img: display the predicted images/videos, default False
--save-txt: save the predicted box coordinates as txt files, default False
--classes: keep only certain classes, e.g. 0 or 0 2 3
--agnostic-nms: class-agnostic NMS, i.e. also suppress overlapping boxes of different classes, default False
--augment: augmented inference (multi-scale, flips, i.e. TTA)
--update: if True, run strip_optimizer on all models to remove the optimizer etc. from the pt files, default False
--project: inference results are saved under runs/detect
--name: name of the folder the results are saved to
"""
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default='yolov5s.pt', help='model.pt path(s)')
parser.add_argument('--source', type=str, default='data/images', help='source') # file/folder, 0 for webcam
parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--view-img', action='store_true', help='display results')
parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--update', action='store_true', help='update all models')
parser.add_argument('--project', default='runs/detect', help='save results to project/name')
parser.add_argument('--name', default='exp', help='save results to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
opt = parser.parse_args()
Here the best weights just trained need to be passed to the inference script; then images and videos can be run through inference.
parser.add_argument('--weights', nargs='+', type=str, default='runs/train/exp/weights/best.pt', help='model.pt path(s)')
To run inference on an image, change the following parameter to the image path and run detect.py:
parser.add_argument('--source', type=str, default='000295.jpg', help='source')
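Or, without editing the defaults, run it directly from the command line with the same values:
python detect.py --weights runs/train/exp/weights/best.pt --source 000295.jpg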
After inference finishes, a detect directory is created under runs and the results are saved in its exp folder, as shown in the figure.
Testing on video works the same way; simply change the image path to a video path. To test with the webcam, just set the path to 0. However, this may still raise an error, which held me up for quite a while. The error is shown below.
Fix: first locate the file datasets.py under utils.
Open it, go to line 279, and wrap the two url parameters in str(); as shown in the figure, the webcam then runs perfectly.
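For reference, the check in LoadStreams looks roughly like this in yolov5 v5.0 (the exact line number may differ): when the source is '0' it is converted to the integer 0, so the substring test fails unless url is cast back to a string.
Before: if 'youtube.com/' in url or 'youtu.be/' in url:
After: if 'youtube.com/' in str(url) or 'youtu.be/' in str(url):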
At this point, training your own model with yolov5 is completely done.
4 Porting yolov5 to the Huawei Atlas 200 platform for image inference
4.0 Directory layout of the ported model
├── model
│ ├── yolov5
│ ├── yolov5.cfg  model configuration file
│ ├── yolov5.names  label file
│ ├── yolov5.om
│ ├── aipp_yolov5.cfg  used when converting to the om model
│ ├── atc.sh
├── main.cpp
├── run.sh
├── yolov5_example.pipeline
What needs to be changed:
{1} In yolov5.names, list the classes of your own model:
person
other_clothes
reflective_clothes
hat
{2} Modify the model configuration file yolov5.cfg.
Usually only CLASS_NUM needs to change; everything else can stay the same.
CLASS_NUM=4
BIASES_NUM=18
BIASES=10,13,16,13,33,23,30,61,62,45,59,119,116,90,156,198,373,326
SCORE_THRESH=0.6
OBJECTNESS_THRESH=0.6
IOU_THRESH=0.3
YOLO_TYPE=3
ANCHOR_DIM=3
MODEL_TYPE=1
RESIZE_FLAG=0
{3} The stream definition file yolov5_example.pipeline.
Fields that need to be changed:
"resizeHeight": "640",
"resizeWidth": "640"
"modelPath":
"postProcessConfigPath":
"labelPath":
"postProcessLibPath":
{
"classification+detection": {
"stream_config": {
"deviceId": "0"
},
"mxpi_imagedecoder0": {
"factory": "mxpi_imagedecoder",
"next": "mxpi_imageresize0"
},
"mxpi_imageresize0": {
"props": {
"parentName": "mxpi_imagedecoder0",
"resizeHeight": "640",
"resizeWidth": "640",
"resizeType": "Resizer_KeepAspectRatio_Fit"
},
"factory": "mxpi_imageresize",
"next": "mxpi_modelinfer0"
},
"mxpi_modelinfer0": {
"props": {
"parentName": "mxpi_imageresize0",
"modelPath": "/home/HwHiAiUser/zhy750v2ai_eval/750AI/yolov5_hou/models/yolov5/hou.om",
"postProcessConfigPath": "/home/HwHiAiUser/zhy750v2ai_eval/750AI/yolov5_hou/models/yolov5/show_new.cfg",
"labelPath": "/home/HwHiAiUser/zhy750v2ai_eval/750AI/yolov5_hou/models/yolov5/show_new.names",
"postProcessLibPath": "/home/HwHiAiUser/zhy750v2ai/750AI/detection/mxVision/lib/libMpYOLOv5PostProcessor.so"
},
"factory": "mxpi_modelinfer",
"next": "mxpi_dataserialize0"
},
"mxpi_dataserialize0": {
"props": {
"outputDataKeys": "mxpi_modelinfer0"
},
"factory": "mxpi_dataserialize",
"next": "appsink0"
},
"appsrc0": {
"props": {
"blocksize": "409600"
},
"factory": "appsrc",
"next": "mxpi_imagedecoder0"
},
"appsink0": {
"props": {
"blocksize": "4096000"
},
"factory": "appsink"
}
}
}
{4} main.cpp
What needs to be changed: the path of the image to be tested, the pipeline path, and the path in cv::Mat src = cv::imread("./test_helmet.jpg").
/*
* Copyright (c) 2020.Huawei Technologies Co., Ltd. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <cstring>
#include "MxBase/Log/Log.h"
#include "MxStream/StreamManager/MxStreamManager.h"
#include "opencv4/opencv2/opencv.hpp"
namespace {
APP_ERROR ReadFile(const std::string& filePath, MxStream::MxstDataInput& dataBuffer)
{
char c[PATH_MAX + 1] = { 0x00 };
size_t count = filePath.copy(c, PATH_MAX + 1);
if (count != filePath.length()) {
LogError << "Failed to copy file path(" << c << ").";
return APP_ERR_COMM_FAILURE;
}
// Get the absolute path of input file
char path[PATH_MAX + 1] = { 0x00 };
if ((strlen(c) > PATH_MAX) || (realpath(c, path) == nullptr)) {
LogError << "Failed to get image, the image path is (" << filePath << ").";
return APP_ERR_COMM_NO_EXIST;
}
// Open file with reading mode
FILE *fp = fopen(path, "rb");
if (fp == nullptr) {
LogError << "Failed to open file (" << path << ").";
return APP_ERR_COMM_OPEN_FAIL;
}
// Get the length of input file
fseek(fp, 0, SEEK_END);
long fileSize = ftell(fp);
fseek(fp, 0, SEEK_SET);
// If file not empty, read it into FileInfo and return it
if (fileSize > 0) {
dataBuffer.dataSize = fileSize;
dataBuffer.dataPtr = new (std::nothrow) uint32_t[fileSize];
if (dataBuffer.dataPtr == nullptr) {
LogError << "allocate memory with \"new uint32_t\" failed.";
return APP_ERR_COMM_FAILURE;
}
uint32_t readRet = fread(dataBuffer.dataPtr, 1, fileSize, fp);
if (readRet <= 0) {
fclose(fp);
return APP_ERR_COMM_READ_FAIL;
}
fclose(fp);
return APP_ERR_OK;
}
fclose(fp);
return APP_ERR_COMM_FAILURE;
}
std::string ReadPipelineConfig(const std::string& pipelineConfigPath)
{
std::ifstream file(pipelineConfigPath.c_str(), std::ifstream::binary);
if (!file) {
LogError << pipelineConfigPath <<" file dose not exist.";
return "";
}
file.seekg(0, std::ifstream::end);
uint32_t fileSize = file.tellg();
file.seekg(0);
std::unique_ptr<char[]> data(new char[fileSize]);
file.read(data.get(), fileSize);
file.close();
std::string pipelineConfig(data.get(), fileSize);
return pipelineConfig;
}
}
int main(int argc, char* argv[])
{
// read image file and build stream input
MxStream::MxstDataInput dataBuffer;
APP_ERROR ret = ReadFile("/home/HwHiAiUser/zhy750v2ai_eval/750AI/yolov5_hou/6.jpg", dataBuffer);
if (ret != APP_ERR_OK) {
LogError << "Failed to read image file, ret = " << ret << ".";
return ret;
}
// read pipeline config file
std::string pipelineConfigPath = "/home/HwHiAiUser/zhy750v2ai_eval/750AI/yolov5_hou/yolov5_example.pipeline";
std::string pipelineConfig = ReadPipelineConfig(pipelineConfigPath);
if (pipelineConfig == "") {
LogError << "Read pipeline failed.";
return APP_ERR_COMM_INIT_FAIL;
}
// init stream manager
MxStream::MxStreamManager mxStreamManager;
ret = mxStreamManager.InitManager();
if (ret != APP_ERR_OK) {
LogError << "Failed to init Stream manager, ret = " << ret << ".";
return ret;
}
// create stream by pipeline config file
ret = mxStreamManager.CreateMultipleStreams(pipelineConfig);
if (ret != APP_ERR_OK) {
LogError << "Failed to create Stream, ret = " << ret << ".";
return ret;
}
std::string streamName = "classification+detection";
int inPluginId = 0;
auto startTime = std::chrono::high_resolution_clock::now();
// send data into stream
ret = mxStreamManager.SendData(streamName, inPluginId, dataBuffer);
if (ret != APP_ERR_OK) {
LogError << "Failed to send data to stream, ret = " << ret << ".";
return ret;
}
// get stream output
MxStream::MxstDataOutput* output = mxStreamManager.GetResult(streamName, inPluginId);
auto endTime = std::chrono::high_resolution_clock::now();
double costMs = std::chrono::duration<double, std::milli>(endTime - startTime).count();
LogInfo << "[SendData-GetResult] cost: " << costMs << "ms. ";
LogInfo << "[SendData-GetResult] fps: " << 1000/costMs << "fps";
if (output == nullptr) {
LogError << "Failed to get pipeline output.";
return ret;
}
std::string result = std::string((char *)output->dataPtr, output->dataSize);
LogInfo << "Results:" << result;
web::json::value jsonText = web::json::value::parse(result);
if (jsonText.is_object()) {
web::json::object textObject = jsonText.as_object();
auto itInferObject = textObject.find("MxpiObject");
if (itInferObject == textObject.end() || (!itInferObject->second.is_array())) {
return 0;
}
auto iter = itInferObject->second.as_array().begin();
cv::Mat src = cv::imread("./test_helmet.jpg");
for (; iter != itInferObject->second.as_array().end(); iter++) {
if (iter->is_object()) {
auto modelInferObject = iter->as_object();
float x0 = 0;
float x1 = 0;
float y0 = 0;
float y1 = 0;
auto it = modelInferObject.find("x0");
if (it != modelInferObject.end()) {
x0 = float(it->second.as_double());
}
it = modelInferObject.find("x1");
if (it != modelInferObject.end()) {
x1 = float(it->second.as_double());
}
it = modelInferObject.find("y0");
if (it != modelInferObject.end()) {
y0 = float(it->second.as_double());
}
it = modelInferObject.find("y1");
if (it != modelInferObject.end()) {
y1 = float(it->second.as_double());
}
cv::Rect rect(x0, y0, x1 - x0, y1 - y0);
cv::rectangle(src, rect, cv::Scalar(0, 255, 0),5, cv::LINE_8,0);
}
cv::imwrite("./result.jpg", src);
}
// destroy streams
mxStreamManager.DestroyAllStreams();
delete dataBuffer.dataPtr;
dataBuffer.dataPtr = nullptr;
delete output;
return 0;
}
}
{5} run.sh
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
CUR_PATH=$(cd "$(dirname "$0")" || { warn "Failed to check path/to/run.sh" ; exit ; } ; pwd)
# Simple log helper functions
info() { echo -e "\033[1;34m[INFO ][MxStream] $1\033[1;37m" ; }
warn() { echo >&2 -e "\033[1;31m[WARN ][MxStream] $1\033[1;37m" ; }
export MX_SDK_HOME="${CUR_PATH}/../detection/mxVision"
export LD_LIBRARY_PATH="${MX_SDK_HOME}/lib":"${MX_SDK_HOME}/opensource/lib":"${MX_SDK_HOME}/opensource/lib64":"/usr/local/Ascend/ascend-toolkit/latest/acllib/lib64":"/home/data/miniD/driver/":"/home/HwHiAiUser/Ascend/acllib/lib64/":${LD_LIBRARY_PATH}
export GST_PLUGIN_SCANNER="${MX_SDK_HOME}/opensource/libexec/gstreamer-1.0/gst-plugin-scanner"
export GST_PLUGIN_PATH="${MX_SDK_HOME}/opensource/lib/gstreamer-1.0":"${MX_SDK_HOME}/lib/plugins"
# compile
g++ main.cpp -I "${MX_SDK_HOME}/include/" -I "${MX_SDK_HOME}/opensource/include/" -I "${MX_SDK_HOME}/opensource/include/opencv4" -L "${MX_SDK_HOME}/lib/" -L "${MX_SDK_HOME}/opensource/lib/" -L "${MX_SDK_HOME}/opensource/lib64/" -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -Dgoogle=mindxsdk_private -fPIC -fstack-protector-all -g -Wl,-z,relro,-z,now,-z,noexecstack -pie -Wall -lglog -lmxbase -lstreammanager -lcpprest -lmindxsdk_protobuf -lopencv_world -o main
# run
./main
exit 0
Once the files are modified, run the following command to perform image inference:
sh run.sh
4.1 Converting to an offline model -- exporting the pt file to onnx
python export.py --weights runs/exp/weights/best.pt --img 640 --batch 1 # export at 640x640 with batch size 1
Result: a yolov5s.onnx file is generated.
4.2 Simplifying the onnx model and modifying the Slice operators
First simplify the exported onnx graph with the onnx-simplifier tool, using the command:
python -m onnxsim --skip-optimization yolov5s.onnx yolov5s_sim.onnx
Result: a yolov5s_sim.onnx file is generated.
Then modify the model's Slice operators with the attached script modify_yolov5s_slice.py (listed at the end of this section), using the command:
python modify_yolov5s_slice.py yolov5s_sim.onnx
Result: a yolov5s_sim_t.onnx file is generated.
Open the onnx models from before and after the change in the netron tool and compare them. Before the change:
After the change:
Note: the node and tensor numbers in the script must be adjusted to match your actual model.
modify_yolov5s_slice.py is as follows:
import sys
import onnx
INT_MAX = sys.maxsize
model_path = sys.argv[1]
model = onnx.load(model_path)
def get_node_by_name(nodes, name: str):
    for n in nodes:
        if n.name == name:
            return n
    return -1
"""
before:
input
/ \
slice4 slice14 slice24 slice34
| | | |
slice9 slice19 slice29 slice39
\ \ / /
concat
after:
input
/ \
slice4 slice24
| |
t t
/ \ / \
slice9 slice19 slice29 slice39
| | | |
t t t t
\ \ / /
concat
"""
model.graph.node.remove(get_node_by_name(model.graph.node, "Slice_24"))
model.graph.node.remove(get_node_by_name(model.graph.node, "Slice_34"))
prob_info1 = onnx.helper.make_tensor_value_info('to_slice9', onnx.TensorProto.FLOAT, [1, 3, 640, 320])
prob_info3 = onnx.helper.make_tensor_value_info('to_slice19', onnx.TensorProto.FLOAT, [1, 3, 640, 320])
prob_info5 = onnx.helper.make_tensor_value_info('from_slice9', onnx.TensorProto.FLOAT, [1, 3, 320, 320])
prob_info6 = onnx.helper.make_tensor_value_info('from_slice19', onnx.TensorProto.FLOAT, [1, 3, 320, 320])
prob_info7 = onnx.helper.make_tensor_value_info('from_slice29', onnx.TensorProto.FLOAT, [1, 3, 320, 320])
prob_info8 = onnx.helper.make_tensor_value_info('from_slice39', onnx.TensorProto.FLOAT, [1, 3, 320, 320])
# Transpose nodes inserted after slice4 and slice24
node1 = onnx.helper.make_node(
    'Transpose',
    inputs=['131'],
    outputs=['to_slice9'],
    perm=[0, 1, 3, 2]
)
node3 = onnx.helper.make_node(
    'Transpose',
    inputs=['141'],
    outputs=['to_slice19'],
    perm=[0, 1, 3, 2]
)
# Transpose nodes inserted after slice9, slice19, slice29 and slice39
node5 = onnx.helper.make_node(
    'Transpose',
    inputs=['from_slice9'],
    outputs=['136'],
    perm=[0, 1, 3, 2]
)
node6 = onnx.helper.make_node(
    'Transpose',
    inputs=['from_slice19'],
    outputs=['146'],
    perm=[0, 1, 3, 2]
)
node7 = onnx.helper.make_node(
    'Transpose',
    inputs=['from_slice29'],
    outputs=['156'],
    perm=[0, 1, 3, 2]
)
node8 = onnx.helper.make_node(
    'Transpose',
    inputs=['from_slice39'],
    outputs=['166'],
    perm=[0, 1, 3, 2]
)
model.graph.node.append(node1)
model.graph.node.append(node3)
model.graph.node.append(node5)
model.graph.node.append(node6)
model.graph.node.append(node7)
model.graph.node.append(node8)
# change the slicing axis for slice9 and slice19
model.graph.initializer.append(onnx.helper.make_tensor('starts_9', onnx.TensorProto.INT64, [1], [0]))
model.graph.initializer.append(onnx.helper.make_tensor('ends_9', onnx.TensorProto.INT64, [1], [INT_MAX]))
model.graph.initializer.append(onnx.helper.make_tensor('axes_9', onnx.TensorProto.INT64, [1], [2]))
model.graph.initializer.append(onnx.helper.make_tensor('steps_9', onnx.TensorProto.INT64, [1], [2]))
newnode1 = onnx.helper.make_node(
    'Slice',
    name='Slice_9',
    inputs=['to_slice9', 'starts_9', 'ends_9', 'axes_9', 'steps_9'],
    outputs=['from_slice9'],
)
model.graph.node.remove(get_node_by_name(model.graph.node, "Slice_9"))
model.graph.node.insert(9, newnode1)
newnode2 = onnx.helper.make_node(
    'Slice',
    name='Slice_19',
    inputs=['to_slice19', 'starts_9', 'ends_9', 'axes_9', 'steps_9'],
    outputs=['from_slice19'],
)
model.graph.node.remove(get_node_by_name(model.graph.node, "Slice_19"))
model.graph.node.insert(19, newnode2)
# change the slicing axis for slice29 and slice39
model.graph.initializer.append(onnx.helper.make_tensor('starts_29', onnx.TensorProto.INT64, [1], [1]))
model.graph.initializer.append(onnx.helper.make_tensor('ends_29', onnx.TensorProto.INT64, [1], [INT_MAX]))
model.graph.initializer.append(onnx.helper.make_tensor('axes_29', onnx.TensorProto.INT64, [1], [2]))
model.graph.initializer.append(onnx.helper.make_tensor('steps_29', onnx.TensorProto.INT64, [1], [2]))
newnode3 = onnx.helper.make_node(
    'Slice',
    name='Slice_29',
    inputs=['to_slice9', 'starts_29', 'ends_29', 'axes_29', 'steps_29'],
    outputs=['from_slice29'],
)
model.graph.node.remove(get_node_by_name(model.graph.node, "Slice_29"))
model.graph.node.insert(29, newnode3)
newnode4 = onnx.helper.make_node(
    'Slice',
    name='Slice_39',
    inputs=['to_slice19', 'starts_29', 'ends_29', 'axes_29', 'steps_29'],
    outputs=['from_slice39'],
)
model.graph.node.remove(get_node_by_name(model.graph.node, "Slice_39"))
model.graph.node.insert(39, newnode4)
# onnx.checker.check_model(model)
onnx.save(model, sys.argv[1].split('.')[0] + "_t.onnx")
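The tensor ids ('131', '141', '136', ...) and node names (Slice_9, Slice_24, ...) above belong to one particular exported model. Besides inspecting the graph in netron, a small helper such as the following sketch (my addition, not part of the original script) can list the Slice nodes of your own simplified onnx file so you know which names to substitute:
import sys
import onnx

model = onnx.load(sys.argv[1])  # e.g. yolov5s_sim.onnx
for node in model.graph.node:
    if node.op_type == 'Slice':
        # print each Slice node together with its input and output tensor names
        print(node.name, 'inputs:', list(node.input), 'outputs:', list(node.output))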
4.3 Converting the onnx file to an om file
(1) Conversion from the terminal
atc --model=yolov5s_helmet_sim.onnx --framework=5 --output=yolov5som_helmet --soc_version=Ascend310
If your yolov5 version is v6.0, the conversion command is:
atc --model=yolov5s_helmet_sim.onnx --framework=5 --output=yolov5som_helmet --soc_version=Ascend310 --out_nodes="Transpose_260:0;Transpose_308:0;Transpose_356:0"
(2) Conversion via script. First create aipp_yolov5.cfg and atc.sh in the same directory as the onnx model. The AIPP config makes ATC build the preprocessing into the om model: it converts the YUV420SP input to RGB (csc_switch) and scales each channel by 1/255 (var_reci_chn = 0.0039216).
aipp_yolov5.cfg:
aipp_op {
aipp_mode : static
related_input_rank : 0
input_format : YUV420SP_U8
src_image_size_w : 640
src_image_size_h : 640
crop : false
csc_switch : true
rbuv_swap_switch : false
matrix_r0c0 : 256
matrix_r0c1 : 0
matrix_r0c2 : 359
matrix_r1c0 : 256
matrix_r1c1 : -88
matrix_r1c2 : -183
matrix_r2c0 : 256
matrix_r2c1 : 454
matrix_r2c2 : 0
input_bias_0 : 0
input_bias_1 : 128
input_bias_2 : 128
var_reci_chn_0 : 0.0039216
var_reci_chn_1 : 0.0039216
var_reci_chn_2 : 0.0039216
}
atc.sh:
export PATH=/usr/local/python3.7.5/bin:/home/local/Ascend/ascend-toolkit/latest/atc/ccec_compiler/bin:/home/local/Ascend/ascend-toolkit/latest/atc/bin:$PATH
export PYTHONPATH=/home/local/Ascend/ascend-toolkit/latest/atc/python/site-packages:/home/local/Ascend/ascend-toolkit/latest/atc/python/site-packages/auto_tune.egg/auto_tune:/home/local/Ascend/ascend-toolkit/latest/atc/python/site-packages/schedule_search.egg
export LD_LIBRARY_PATH=/home/local/Ascend/ascend-toolkit/latest/atc/lib64:$LD_LIBRARY_PATH
export ASCEND_OPP_PATH=/home/local/Ascend/ascend-toolkit/latest/opp
export SLOG_PRINT_TO_STDOUT=1
/home/local/Ascend/ascend-toolkit/latest/atc/bin/atc --model=./hou_sim_t.onnx \
--framework=5 \
--output=./hou \
--input_format=NCHW \
--input_shape="images:1,3,640,640" \
--enable_small_channel=1 \
--insert_op_conf=./aipp_yolov5.cfg \
--soc_version=Ascend310 \
--log=info
Run atc.sh with the command sh atc.sh; this generates the hou.om model.