Ping Pong Ball Detection

Object Detection

Abstract

For part of one of my projects, we needed to detect a ping pong ball through a live webcam feed so that our robot could navigate to it. Thus, this is the painful journey of trial, testing and crying that I went through...

Enjoy!

What I ended up doing

In the end, the sweet spot combination that I used was a...

Raspberry Pi 3

Movidius(Intel's Neural Compute Stick)

Webcam

And darknet's tiny-yolov2 model, converted to a caffemodel, which is then converted to a "graph" (Movidius's model format)

If you just want to know how the above is done click HERE

If you are still here, then you are just a sadist who loves to hear people suffer. Oh well, I'm writing this so that I won't have to repeat whatever I did to reach this state either, so let's go

Chapter 1: And so it begins...

For the project, we needed a small form factor object detection device that could read a live webcam feed, locate a ping pong ball, and send the appropriate signal depending on the position of the ball in the frame
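As an aside, the "send a signal depending on the position of the ball" part boils down to very little code. Here is a minimal sketch of the idea; the function name, dead-zone value and return strings are all made up for illustration and are not from our actual robot code:

```python
# Hypothetical helper: turn the detected ball's bounding box into a
# steering command. box_x and box_w are the box's left edge and width
# in pixels, frame_w is the frame width. All names are illustrative.

def steering_command(box_x, box_w, frame_w, dead_zone=0.2):
    """Return 'left', 'forward' or 'right' based on the box centre."""
    centre = box_x + box_w / 2.0
    offset = centre / frame_w - 0.5   # -0.5 (far left) .. +0.5 (far right)
    if offset < -dead_zone / 2:
        return "left"
    if offset > dead_zone / 2:
        return "right"
    return "forward"

print(steering_command(300, 40, 640))  # ball dead centre -> forward
```

The dead zone just stops the robot from twitching left and right when the ball is already roughly centred.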

I was doing this right AFTER I had finished training the object detection model on my computer using darknet's yolov2. After a little Googling, I found out that it was possible to use darknet's model with OpenCV 4.0.0-pre to run detection on a webcam stream. Just like that, the first solution came to mind and I immediately started prototyping.

Ubuntu 16.04

OpenCV 4.0.0-pre

a lil python code (THAT IS NOT MINE)

And voilà, it worked! Here is the code

OpenCV Object Detection with Darknet's yolov2

Just install the latest OpenCV with its contrib repo and run the python file from a terminal

DISCLAIMER: I DID NOT WRITE THIS CODE, this guy did --> Arun Ponnusamy
python OpenCVxDarknet.py --weight [.weight file] --config [.cfg file] --classes [file with all the names of the classes]

Just change all the [things like this] to the path of the respective file
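For context, after the network's forward pass a script like this filters out low-confidence detections and then runs non-maximum suppression so the same ball isn't boxed five times. OpenCV scripts typically call cv2.dnn.NMSBoxes for this; the helper below is my own plain-Python illustration of that step, not Arun's code:

```python
# Illustrative post-processing for detector output: keep confident
# boxes, then drop boxes that overlap an already-kept stronger box.

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def nms(detections, conf_threshold=0.5, iou_threshold=0.4):
    """detections: list of (confidence, (x, y, w, h)); returns kept boxes."""
    kept = []
    for conf, box in sorted(detections, reverse=True):
        if conf < conf_threshold:
            break  # sorted by confidence, so everything after is weaker
        if all(iou(box, k) < iou_threshold for k in kept):
            kept.append(box)
    return kept
```

So two near-identical boxes around one ball collapse into the single strongest one, which matters when you only want to chase one ball.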

It worked fine, so I decided to bring the thing over to a Raspberry Pi, which is what I actually needed it to run on

So I loaded up my Raspberry Pi 1 B (yes, "ONE") and tried installing OpenCV

...

COMPILING OPENCV TOOK THREE WHOLE DAYS

I literally let my Raspberry Pi run for three whole days just compiling OpenCV from source

Not only that, IT FAILED AT 90%!!!!

After crying myself to sleep, I comforted myself by installing the pre-compiled OpenCV

sudo apt-get install python-opencv

After that, I opened up the Python interactive interpreter and ran the following

import cv2
print cv2.__version__

And what was printed made me tear up a lil

OpenCV 2.4.0

THIS IS NOT THE VERSION I NEED

Btw, running the object detection python file will just lead to an error, so don't bother

With this, I needed another plan. Making the best of what I had, I found out that I could run circle detection with OpenCV 2.4.0. And since a ping pong ball is a circle... yeah

Chapter 2: Detecting Circles

Again, I Googled for code to detect circles using OpenCV and came across Hough circle detection. So I downloaded it and modified it to use a webcam feed instead

Here is the code which again I DIDN'T WRITE

Detecting round stuff with OpenCV
#If you get an error saying there is no module named cv2.cv (and hence no cv2.cv.CV_HOUGH_GRADIENT), change it to this
cv2.HOUGH_GRADIENT

Again, it worked, but knowing the limitations of my Raspberry Pi 1
(yes, again, ONE) I tried to optimize the code

The first optimization was only allowing it to detect up to 2 circles, and the second was lowering the resolution of the image it would be processing
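One detail the second optimization forces on you: if you run circle detection on a downscaled frame, the circle coordinates come back in small-frame pixels and have to be scaled back up before you can use them. A small sketch of that bookkeeping (pure Python, helper name and values are my own, not from the original script):

```python
# Illustrative helper: take circles detected on a downscaled frame and
# map them back to full-resolution coordinates, keeping at most two.
# `scale` is the factor the frame was shrunk by (e.g. 0.5 for half size).

def rescale_circles(circles, scale, max_circles=2):
    """circles: iterable of (x, y, r) tuples from the small frame."""
    out = []
    for x, y, r in list(circles)[:max_circles]:
        out.append((round(x / scale), round(y / scale), round(r / scale)))
    return out

# A circle at (50, 40) with radius 10 on a half-size frame is really
# at (100, 80) with radius 20 on the original frame.
print(rescale_circles([(50, 40, 10), (10, 10, 5), (99, 99, 9)], 0.5))
```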

This was met with... at least it worked :D. But the image was more like a slideshow than a video. And... it crashed whenever it detected any circle at all... so there was that

[INTERNAL SCREAMING]

Chapter 3: Ascending from Hell

After realising that NO ONE, and I mean NO ONE THAT HAS A PROPER WORKING MIND, will actually use a Raspberry Pi 1 for machine vision and expect it to puke out a usable framerate, I went back to my teacher and borrowed a Raspberry Pi 3 B instead.

Upgrading from a RasPi 1 to a RasPi 3 was like going from a Nokia flip phone to an iPhone XS (not that I have used one before, but I am pretty sure it would feel the same)

Then I proceeded to repeat the earlier approach, using OpenCV 4.0.0-pre's dnn module to run inference on darknet's model for object detection

This time, the whole installation took a little under 3 hours and it worked. I repeat, THE COMPILATION DID NOT STOP AT 90%!!!!

I then proceeded to test the thing out and it worked... at least for the purposes of a slideshow. The FPS was awful and the latency was no better. The video had a 7 second delay, as if I was looking into the past. And the framerate was, like I said, a slideshow

Back to the drawing board

I then vaguely remembered our teacher having this thing called a "Movidius", which is supposed to be an inference accelerator. Hence, I asked to borrow it and proceeded with testing

Chapter 4: Hope

So firstly, you have to install the SDK for the Movidius

BTW, the Movidius is an Intel NCS (Neural Compute Stick). It is basically a GPU (or VPU, for Vision Processing Unit, as they call it) that can be plugged in via a USB port and somehow, by magic, accelerates machine learning inference? At this point, I was just hoping for a miracle that could give me the solution I needed

So for the installation of NCSDK 2.0, do the following

git clone -b ncsdk2 https://github.com/Movidius/ncsdk
cd ncsdk
make install

BUT BEFORE YOU DO THE ABOVE, there is something you can do to accelerate the installation

Just type the following in a terminal and run it

sudo nano /etc/dphys-swapfile

This will open the file, scroll down to CONF_SWAPSIZE and change 100 to something like 1024

Then run the following

sudo /etc/init.d/dphys-swapfile restart

You may now install ncsdk 2.0

The installation actually takes quite a while so just wait

Next, if your model is from Caffe or TensorFlow, it is easier. For TensorFlow... I don't know, just Google it, it is supported. For Caffe, just run the following to compile your caffemodel into a Movidius "graph" (you will use this later for inference)

mvNCCompile [.prototxt] -w [.caffemodel] -s 12
#Example mvNCCompile yoloV2Tiny20.prototxt -w yoloV2Tiny20.caffemodel -s 12

There is quite a lot of code on the internet that can take this "graph" and run inference with it, but I was not using a caffemodel; I trained mine with darknet's yolov2 model, WHICH IS NOT SUPPORTED (at least at the time of writing)

Thus, I needed to research my way around making it possible, and finally came up with a solution

Chapter 5: Finale

The workaround I came up with is to convert my yolo model into a caffemodel, which I can then convert into an NCS model for inference

Firstly, clone this repo

git clone https://github.com/duangenquan/YoloV2NCS.git
This only works with Darknet's Yolov2 tiny (which is what I used)

Navigate into the cloned folder

Then take your darknet .weights and .cfg files and copy them into models/yolomodels

Make sure that both the .weights and the .cfg file share the same base name, e.g. yoloTiny.weights and yoloTiny.cfg

Inside the models folder, run the following in terminal

source convertyo.sh

After the conversion is complete, your caffemodel and prototxt files will be in the caffemodels folder

Go back to root folder of the repo and navigate to src/Region.cpp

Change the class names to your own class names (they are stored in the form of a list)

Go back to root again, and navigate to detectionExample/ObjectWrapper.py

Change self.classes to the correct number of classes

Change self.threshold to a very low number... (for safety)

In the same folder, edit Main.py

Change "cap = cv2.VideoCapture(videofile)" in line 43 to "cap = cv2.VideoCapture(0)"

Navigate back to the root folder and generate the graph using the same command as before. This is so that the graph file will be generated in the repo's root folder

NOTE: You can move items using the terminal with => mv (source path) (destination path)
mvNCCompile [.prototxt] -w [.caffemodel] -s 12

Still in the root folder of the repo, build the project

make

Then run the detection code

python3 detectionExample/Main.py --video 0 --graph [graph name]

And then you are done :D