Why is my Haar cascade model inaccurate? Python + OpenCV

Published 2024-10-17 04:32:50


I am using Cascade Trainer GUI to produce the XML file. I have 100 positive and 400 negative images. Training took only about five minutes, and the result is not accurate. The object I trained the model on is a small screwdriver. The generated .xml file is only 31.5 KB. See the image. [screenshot: detection result]

On top of that, the rectangles drawn in the frame are tiny, let alone inaccurate.

Besides adding more positive and negative images, how should I build a more accurate model? Eventually I also need to do object tracking. Thanks.

#import numpy as np
import cv2
import time
"""
This program uses openCV to detect faces, smiles, and eyes. It uses haarcascades which are public domain. Haar cascades rely on
xml files which contain model training data. An xml file can be generated through training many positive and negative images. 
Try your built-in camera with 'cap = cv2.VideoCapture(0)' or use any video. cap = cv2.VideoCapture("videoNameHere.mp4")
"""

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_eye.xml')
smile = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_smile.xml')
screw = cv2.CascadeClassifier('cascade.xml')
cap = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_SIMPLEX

prev_frame_time, new_frame_time = 0,0
while True:
    ret, img = cap.read()
    if not ret:  # camera disconnected or video ended
        break
    img = cv2.resize(img, (1920, 1080))
    # faces = face_cascade.detectMultiScale(img, 1.5, 5)
    # eyes = eye_cascade.detectMultiScale(img, 1.5, 6)
    # smiles = smile.detectMultiScale(img, 1.1, 400)
    screws = screw.detectMultiScale(img, 1.2, 3)

    new_frame_time = time.time()
    try:
        fps = int(1 / (new_frame_time - prev_frame_time))
    except ZeroDivisionError:  # first frame: no previous timestamp yet
        fps = 0
    cv2.putText(img,"FPS: "+str(fps),(10,450), font, 3, (0,0,0), 5, cv2.LINE_AA)

    # for (x, y, w, h) in smiles:
    #     cv2.rectangle(img, (x, y), (x+w, y+h), (0, 69, 255), 2)
    #     cv2.putText(img, "smile", (int(x-.1*x), int(y-.1*y)), font, 1, (255, 255, 255), 2)

    for (x,y,w,h) in screws:
        cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,255),2)
        cv2.putText(img,"screwdriver",(int(x-.1*x),int(y-.1*y)),font,1,(255,0,255),2)
  
    # for (x, y, w, h) in faces:
    #     cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
    #     cv2.putText(img, "FACE", (int(x-.1*x), int(y-.1*y)), font, 1, (255, 255, 255), 2)
    #     roi_color = img[y:y+h, x:x+w]
    #     eyes = eye_cascade.detectMultiScale(roi_color)
    #     for (ex, ey, ew, eh) in eyes:
    #         cv2.rectangle(roi_color, (ex, ey), (ex+ew, ey+eh), (0, 255, 0), 2)
    
    
    cv2.imshow('img',img)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break
    prev_frame_time = new_frame_time

cap.release()
cv2.destroyAllWindows()


2 answers

If the image above is "typical", this will never work with a cascade.

Cascades need consistent texture and pose, and your scene has neither.

(I also suspect you do not really have 100 positive images, but tried to "synthesize" them from one or a few photos, which turns out not to work in real life.)

Don't waste more time on this. Collect more (real!) images, and read up on object-detection CNNs such as SSD or YOLO, which are far more robust in your situation.

Most resources on this topic recommend 3000-5000 positive and negative images each. That is very likely the reason for the low accuracy.
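If you do scale the dataset up, the command-line tools give you more control over those sample counts than the GUI. A sketch of the usual two steps (all paths and numbers are placeholders to adapt; note that `opencv_createsamples` and `opencv_traincascade` ship with the OpenCV 3.4 branch and were removed from the OpenCV 4 applications):

```shell
# Pack the annotated positives into a .vec file at the training window size.
opencv_createsamples -info positives.txt -vec pos.vec -num 3000 -w 24 -h 24

# Train; -numPos is kept below the total in the .vec so later stages
# still have fresh samples after earlier stages consume some.
opencv_traincascade -data cascade_out -vec pos.vec -bg negatives.txt \
    -numPos 2700 -numNeg 5000 -numStages 15 -w 24 -h 24
```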

Some resources:

  1. Link 1 - sonots
  2. Link 2 - opencv-user-blog
  3. Link 3 - computer vision software
  4. Link 4 - pythonprogramming.net
