[Pitfall Notes] A beginner's environment setup for yolov5 and face_recognition

孟浩慨
2023-12-01

I signed up for the campus RoboCup contest mostly to test the waters and get a taste of AI-based recognition. The contest offered several implementation options, and I picked the one where you build the networks yourself. Setting up the environment turned out to be a bumpy ride, so I'm recording it here.
The contest has two parts: face recognition and object detection. Face recognition uses face_recognition, object detection uses yolov5. Partly because I like keeping my own machine clean and partly because the project calls for it, I rented an Ubuntu server and did all the tinkering there, installing whatever environments I pleased.

Server setup:

apt-get update
apt install git
Then installed Anaconda following a guide found on Baidu.

Object detection

https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data
Project link above. I followed the readme and hit one problem: PyTorch installed, but it simply would not import.
Before renting the server I had tried to set up the environment on an online coding site (Replit), but even after torch finally finished installing, every import just gave "No module named torch". I installed it four times and even filled the disk...
Every fix I found on Baidu required conda, and the online coding site had neither conda nor sudo, so: rent a server, install Anaconda. (Honestly I never had a good impression of Anaconda, with all those useless packages eating my space. In my mind Python is for handy little everyday tools, and big projects belong to other languages. But with a rented server I could trash things however I liked.)
And then... it just worked...
Along the way I was forced to pick up some conda basics.


Face recognition:

https://github.com/ageitgey/face_recognition
↑ The face_recognition project.
First pip install cmake and boost, then, as the readme says, simply pip install face_recognition.
This one took forever: dlib kept failing to install, and everything on Baidu blamed the Python version. After two days of fiddling, it somehow ended up working on Python 3.10.
Here is what I did:
    First, conda create a Python 3.10 virtual environment.
    pip install face_recognition; everything else went smoothly, only dlib failed.
    Since dlib kept failing, my first suspect was the Python version (I was on 3.10). So I created a new environment with Python 3.6 and tried again, and still got an error:
    C++: fatal error: Killed signal terminated program cc1plus (I had seen this error before, but ignored it and never thought along these lines)
    That turned out to be an out-of-memory problem, so I followed https://blog.csdn.net/weixin_44796670/article/details/121234446 to extend memory with swap space (aside: the server only has 2 GB of RAM, and earlier runs kept killing my remote connection even under nohup; I had no idea memory could be extended this way - there is a sketch of the commands a bit further below the log). The build then succeeded; log excerpt:

[ 14%] Building CXX object dlib_build/CMakeFiles/dlib.dir/logger/logger_config_file.cpp.o
[ 15%] Building CXX object dlib_build/CMakeFiles/dlib.dir/misc_api/misc_api_kernel_1.cpp.o
make[2]: *** wait: No child processes.  Stop.
make[2]: *** Waiting for unfinished jobs....
make[2]: *** wait: No child processes.  Stop.
make[1]: *** [CMakeFiles/Makefile2:144: dlib_build/CMakeFiles/dlib.dir/all] Error 2
make: *** [Makefile:84: all] Hangup
SIGHUP
CMake Error: Generator: execution of make failed. Make command was: /usr/bin/make -j1
... # tons of errors omitted; it errored out but kept struggling
ERROR: Failed building wheel for dlib
Running setup.py clean for dlib
Failed to build dlib
Installing collected packages: dlib, Click, face-recognition
Running setup.py install for dlib: started
Running setup.py install for dlib: still running...
Running setup.py install for dlib: finished with status 'done'
DEPRECATION: dlib was installed using the legacy 'setup.py install' method, because a wheel could not be built for it. A possible replacement is to fix the wheel build issue reported above. You can find discussion regarding this at https://github.com/pypa/pip/issues/8368.
Successfully installed Click-8.0.4 dlib-19.24.0 face-recognition-1.3.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
# and then it succeeded

While this was running (roughly when the first error appeared in the log above), I installed boost as suggested online. I have no idea whether that contributed to the success (maybe forgetting it was why earlier attempts kept failing? But I never saw an error pointing at it... and the official docs never mention it either).
And then I found that... the Python 3.10 environment could suddenly import face_recognition too... and it still worked after deleting the 3.6 environment... quite abrupt.
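For reference, the memory extension from that CSDN post boils down to the standard Linux swap-file recipe. This is a rough sketch from memory, not the exact steps in the post, and the size is my own choice:

dd if=/dev/zero of=/swapfile bs=1M count=4096   # create a 4 GB file to use as swap
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile   # after this, free -h should show the extra swap
# add an entry to /etc/fstab if the swap should survive a reboot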

A few other notes: the first time around I used the server without Anaconda, and repeated dlib installs wasted a lot of space, so I reset the server, after which ssh refused to connect. Only after ssh-keygen -R [server IP] (which removes the old host-key record) could I ssh in again.

Huh, multiple people can be logged in as root on Linux at the same time.

Update, 2022/10/1

We made it through the RoboCup preliminaries! Here's an update on training your own model with yolo.

mytrain.yaml 

train: /root/beshar/yolov5/trainSetting/train.txt
val: /root/beshar/yolov5/trainSetting/val.txt

#number of classes
nc: 26

#class names
names: ['shampoo', 'coffee', 'laundry_detergent', 'chocolate', 'orion_friends', 'cola', 'water_glass', 'folder', 'AD_calcium_milk', 'slippers', 'fruit_knife', 'dish_soap', 'dish soap', 'prawn_crackers', 'book', 'biscuits', 'paper_napkin', 'soda', 'toilet_water', 'potato_chips', 'water', 'melon_seeds', 'pen', 'hammer', 'toothpaste', 'fan']

This is the training-data config file. train points to the training set and val to the validation set; in those txt files, each line is the path of one image. names is the list of class labels, and nc is the length of the names list. The contents of the txt files are described later.
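A tiny sanity check I like to run on this file (my own helper, not part of yolov5; it assumes PyYAML, which yolov5 already requires): nc has to equal the number of entries in names, otherwise training complains.

# my own sanity check, not part of yolov5: nc must equal the number of names
import yaml

with open('/root/beshar/yolov5/trainSetting/mytrain.yaml') as f:
    cfg = yaml.safe_load(f)
assert cfg['nc'] == len(cfg['names']), f"nc={cfg['nc']} but names has {len(cfg['names'])} entries"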

yolov5s.yaml

# YOLOv5  by Ultralytics, GPL-3.0 license

# Parameters
nc: 26  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

This is the model structure config (not the weights themselves). You can copy it straight from the model configs that ship with the yolov5 project. Just remember to change nc so it matches mytrain.yaml.

File organization

trainSetting
 ├---images
 |    ├---1.jpg
 |    ├---2.jpg
 |    ├---……
 ├---labels
 |    ├---1.txt
 |    ├---2.txt
 |    ├---……
 ├---train.txt
 ├---val.txt
 ├---mytrain.yaml
 ├---yolov5s.yaml

The "images" and "labels" folders hold all the images and all the labels respectively; it's best not to rename them. It's also best to organize the files exactly like this, because none of the config files seem to mention where the labels live - the training code apparently looks them up on its own. If you rename the folders or put the labels somewhere else, the labels will very likely not be found. Note that each label file must have the same name as its corresponding image. The key pieces are the contents of train.txt and val.txt, where each line is an image path, for example:

In train.txt:
/root/beshar/yolov5/trainSetting/images/12597.jpg
/root/beshar/yolov5/trainSetting/images/7840.jpg
/root/beshar/yolov5/trainSetting/images/11908.jpg
/root/beshar/yolov5/trainSetting/images/7975.jpg
/root/beshar/yolov5/trainSetting/images/10270.jpg
/root/beshar/yolov5/trainSetting/images/3818.jpg
/root/beshar/yolov5/trainSetting/images/9384.jpg
/root/beshar/yolov5/trainSetting/images/6641.jpg
/root/beshar/yolov5/trainSetting/images/4746.jpg
/root/beshar/yolov5/trainSetting/images/11053.jpg
... # many more image paths omitted
In val.txt:
/root/beshar/yolov5/trainSetting/images/3041.jpg
/root/beshar/yolov5/trainSetting/images/8569.jpg
/root/beshar/yolov5/trainSetting/images/6977.jpg
/root/beshar/yolov5/trainSetting/images/4770.jpg
/root/beshar/yolov5/trainSetting/images/5245.jpg
/root/beshar/yolov5/trainSetting/images/8169.jpg
... # remaining image paths omitted

I split the training and validation sets roughly 9:1. Generating the two files is just a small Python script:

# generate train.txt and val.txt (roughly a 9:1 split of the images folder)
import os
import random
list = os.listdir('/root/beshar/yolov5/trainSetting/images')
train = open('/root/beshar/yolov5/trainSetting/train.txt','w')
val = open('/root/beshar/yolov5/trainSetting/val.txt','w')

for i in range(0,len(list)) :
    file = os.path.join('/root/beshar/yolov5/trainSetting/images',list[i])
    if random.random() < 0.1:
        val.write(file+'\n')
    else:
        train.write(file+'\n')

train.close()
val.close()
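One more optional check, a sketch of my own rather than anything from yolov5: since labels are matched by file name, it's worth confirming that every image really has a label file with the same stem under labels/.

# my own check, not part of yolov5: every image should have a matching label file
import os

img_dir = '/root/beshar/yolov5/trainSetting/images'
lbl_dir = '/root/beshar/yolov5/trainSetting/labels'
for name in os.listdir(img_dir):
    stem = os.path.splitext(name)[0]
    if not os.path.isfile(os.path.join(lbl_dir, stem + '.txt')):
        print('missing label for', name)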

Start training

python /root/beshar/yolov5/train.py --img 640 --batch 16 (how many images are fed to the network at once) --epochs 300 (how many passes over the data) --data {path to mytrain.yaml} --cfg {path to yolov5s.yaml} --weights {path to the .pt file}

About that last argument, the ".pt file": for a first training run you can download a .pt from the yolov5 GitHub project. After a run, a new .pt file is produced, usually at yolov5/runs/train/exp/weights/best.pt; that is the result of the run, and if you want to keep training on top of it, pass the path of that best.pt instead. The parameters I actually used for one run:

sysctl vm.swappiness=8
python /root/beshar/yolov5/train.py --img 640 --batch 1 --epochs 2 --data /root/beshar/yolov5/trainSetting/mytrain.yaml --cfg /root/beshar/yolov5/trainSetting/yolov5s.yaml --weights /root/beshar/yolov5/runs/train/exp7/weights/best.pt

(The first line, and the batch value of 1, are there because the server only has 2 GB of RAM. In my testing, batch size was the one parameter that directly determined how much memory was used.)

When training, you can prepend nohup and append & to keep the job running in the background on the server.
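For example, wrapping the same command as above:

nohup python /root/beshar/yolov5/train.py --img 640 --batch 1 --epochs 2 --data /root/beshar/yolov5/trainSetting/mytrain.yaml --cfg /root/beshar/yolov5/trainSetting/yolov5s.yaml --weights /root/beshar/yolov5/runs/train/exp7/weights/best.pt &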

Using the model

import os
# besharImg is the folder holding the images to run detection on
root = '/root/beshar/besharImg'
list=os.listdir(root)
for i in range(0,len(list)):
    os.system(f"python /root/beshar/yolov5/detect.py --source {os.path.join(root,list[i])} --weights /root/beshar/yolov5/runs/train/exp2/weights/best.pt --project /root/beshar/output --conf-thres 0.5 --exist-ok")

A few of the parameters:

--source: path of the image to run detection on

--weights: which model (weights file) to use

--project: custom output location for the results. On my server the default is /root/beshar/yolov5/runs/detect

--conf-thres: minimum confidence; detections below this value are not drawn on the image

--exist-ok: takes no value; if present, each detection's results go into the existing folder (by default, each run creates a new sub-folder)
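Incidentally, if I remember correctly detect.py also accepts a whole folder as --source, which would avoid calling it once per image. A sketch under that assumption, using the same paths as above:

import os
# one call over the whole folder instead of one call per image
os.system("python /root/beshar/yolov5/detect.py"
          " --source /root/beshar/besharImg"
          " --weights /root/beshar/yolov5/runs/train/exp2/weights/best.pt"
          " --project /root/beshar/output --conf-thres 0.5 --exist-ok")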

Using face_recognition

import face_recognition
from PIL import Image, ImageDraw
import numpy as np
import os

knownbase = '/root/beshar/faces/known'
unknowbase = '/root/beshar/input'
knowlist = os.listdir(knownbase)
known_face_encodings = []
known_face_names = []

for i in knowlist:
    known_face_encodings.append(face_recognition.face_encodings(face_recognition.load_image_file(os.path.join(knownbase,i)))[0])
    known_face_names.append(i[:-4])

unknowlist = os.listdir(unknowbase)
print(unknowlist)
for i in range(0,len(unknowlist)):
    unknown_image = face_recognition.load_image_file(os.path.join(unknowbase,unknowlist[i]))
    face_locations = face_recognition.face_locations(unknown_image)
    face_encodings = face_recognition.face_encodings(unknown_image, face_locations)
    pil_image = Image.fromarray(unknown_image)
    draw = ImageDraw.Draw(pil_image)
    name = "Unknown"
    for (top, right, bottom, left), face_encoding in zip(face_locations, face_encodings):
        matches = face_recognition.compare_faces(known_face_encodings, face_encoding)
        face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)
        best_match_index = np.argmin(face_distances)
        if matches[best_match_index]:
            name = known_face_names[best_match_index]
        draw.rectangle(((left, top), (right, bottom)), outline=(0, 0, 255))
        text_width, text_height = draw.textsize(name)  # note: textsize() only exists in Pillow < 10; newer Pillow uses textbbox()
        draw.rectangle(((left, bottom - text_height - 10), (right, bottom)), fill=(0, 0, 255), outline=(0, 0, 255))
        draw.text((left + 6, bottom - text_height - 5), name, fill=(255, 255, 255, 255))
    del draw
    pil_image.save(f"/root/beshar/besharImg/{unknowlist[i]}")
    print(i)

Why not explain face_recognition in more detail? Because it simply can't compete with Baidu's face recognition API. I also tried Baidu's EasyDL for object detection, but it fell apart during the contest: it boxed only one object per image, and got the class wrong at that. What we actually used in the contest was the Baidu face recognition API plus yolov5.
