Abstract
For part of one of my projects, we needed to detect a ping pong ball using a live webcam feed so that our robot could navigate to it. Thus, this is the painful journey of trial, testing and crying that I went through...
Enjoy!
What I ended up doing
In the end, the sweet-spot combination that I used was...
Raspberry Pi 3
Movidius (Intel's Neural Compute Stick)
Webcam
And darknet's tiny-yolov2 model, converted to a caffemodel, which is then converted to a "graph" (Movidius's model format)
If you just want to know how the above is done click HERE
If you are still here, then you are just a sadist who loves to hear people suffer. Oh well, I'm writing this so that I won't have to repeat whatever I did to reach this state, so let's go
Chapter 1: And so it begins...
For the project, we needed a small form factor object detection device that could read a live webcam feed, locate a ping pong ball, and send whatever signal it needs to, depending on the position of the ball in the video
I was doing this AFTER I had just finished training the object detection model on my computer using darknet's yolov2. So after a little Googling, I found out that it was possible to use darknet's model with OpenCV 4.0.0-pre to run detection on a webcam stream. Just like that, the first solution came to mind and I immediately started prototyping.
Ubuntu 16.04
OpenCV 4.0.0-pre
a lil python code (THAT IS NOT MINE)
And voilà, it worked! Here is the code
OpenCV Object Detection with Darknet's yolov2
Just install the latest OpenCV with its contrib repo and run the python file from a terminal
Just change all the [things like this] to the directory of the respective file
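The gist boils down to something like this (a minimal sketch, not the exact gist — the [bracketed] paths follow the same convention as above, and the 0.5 threshold is just a starting point to tune):

import cv2

# load darknet's cfg/weights pair with OpenCV's dnn module
net = cv2.dnn.readNetFromDarknet("[path to .cfg]", "[path to .weights]")

cap = cv2.VideoCapture(0)  # webcam feed
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # yolov2 expects a 416x416 RGB input scaled to [0, 1]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outs = net.forward(net.getUnconnectedOutLayersNames())
    h, w = frame.shape[:2]
    for out in outs:
        for det in out:  # each row: [cx, cy, bw, bh, objectness, class scores...]
            if det[4] > 0.5:
                cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
                cv2.rectangle(frame,
                              (int(cx - bw / 2), int(cy - bh / 2)),
                              (int(cx + bw / 2), int(cy + bh / 2)),
                              (0, 255, 0), 2)
    cv2.imshow("detections", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()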
It worked fine, so I decided to bring the thing over to a Raspberry Pi, just like what I needed it to do
So I loaded up my Raspberry Pi 1 B (yes, "ONE") and tried installing OpenCV
...
THE COMPILATION OF OPENCV TOOK 3 WHOLE DAYS
I literally let my Raspberry Pi run for 3 whole days just compiling OpenCV from source
Not only that, IT FAILED AT 90%!!!!
After crying myself to sleep, I comforted myself by installing the pre-compiled OpenCV
After that I opened up the Python interactive interpreter and ran the following
import cv2
print cv2.__version__
And what was printed made me tear up a lil
THIS IS NOT THE VERSION I NEED
Btw, running the object detection python file will just lead to an error, so don't bother
With this, I needed another plan. So, making the best of what I had, I found out that I could run circle detection using OpenCV 2.4.0. And since a ping pong ball is a circle.... Yeah
Chapter 2: Detecting Circles
Again, I Googled for code to detect circles using OpenCV and came across Hough circle detection (cv2.HoughCircles). So I downloaded an example and modified it so that it uses a webcam feed instead
Here is the code, which again I DIDN'T WRITE
Detecting round stuff with OpenCV (cv2.HOUGH_GRADIENT)
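The gist of it looks roughly like this (a sketch, not the actual gist — the Hough parameters here are guesses you will have to tune):

import cv2

cap = cv2.VideoCapture(0)  # webcam feed
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (9, 9), 2)  # smoothing cuts down false circles
    # on OpenCV 2.4 the flag is spelled cv2.cv.CV_HOUGH_GRADIENT instead
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 1.2, 100,
                               param1=100, param2=30, minRadius=5, maxRadius=100)
    if circles is not None:
        for x, y, r in circles[0]:
            cv2.circle(frame, (int(x), int(y)), int(r), (0, 255, 0), 2)
    cv2.imshow("circles", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()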
Again, it worked, but knowing the limitations of my Raspberry Pi 1 (yes, again, ONE) I tried to optimize the code
The first optimization was only allowing it to detect up to 2 circles, and the second was lowering the resolution of the image it would be processing
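In terms of the sketch above, the two tweaks amount to something like this (320x240 is just my guess at a sane resolution for a Pi 1):

# before the loop: ask the webcam for smaller frames
# (3 and 4 are the property ids for frame width and height)
cap.set(3, 320)
cap.set(4, 240)

# inside the loop: only keep the first 2 circles found
if circles is not None:
    for x, y, r in circles[0][:2]:
        cv2.circle(frame, (int(x), int(y)), int(r), (0, 255, 0), 2)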
This was met with... At least it worked :D. But the image was more like a slideshow than a video. And... it crashed when it detected any circle at all... so there was that
[INTERNAL SCREAMING]
Chapter 3: Ascending from Hell
After realising that NO ONE, and I mean NO ONE WITH A PROPERLY WORKING MIND, would actually use a Raspberry Pi 1 for machine vision and expect it to puke out a usable framerate, I went back to my teacher and borrowed a Raspberry Pi 3 B instead.
Upgrading from a RasPi 1 to a RasPi 3 was like going from a Nokia flip phone to an iPhone Xs (not that I have used one before), but I am pretty sure it would feel the same
Then I proceeded to repeat the setup, using OpenCV 4.0.0-pre's dnn module to run inference with darknet's model for object detection
This time, the whole installation took a little under 3 hours and it worked. I repeat, THE COMPILATION DID NOT STOP AT 90%!!!!
I then proceeded to test the thing out and it worked... at least for the purposes of a slideshow. The FPS was awful and the latency was no better. The video had a 7-second delay, as if I was looking into the past. And the framerate was no better; like I said, a slideshow
Back to the drawing board
I then vaguely remembered our teacher having this thing called the "Movidius", which is supposed to be an inference accelerator. Hence, I asked to borrow it and proceeded to test it out
Chapter 4: Hope
So firstly, you have to install the SDK for the Movidius
BTW, the Movidius is an Intel NCS (Neural Compute Stick). It is basically a GPU (or VPU, for Vision Processing Unit, as they call it) that can be plugged into a USB port and somehow, by magic, it accelerates machine learning inference? At this point, I was just hoping for a miracle that could give me the solution that I needed
So for the installation of NCSDK 2.0
Do the following (NCSDK 2.x lives on the ncsdk2 branch of the movidius/ncsdk repo, so clone that first)
git clone -b ncsdk2 https://github.com/movidius/ncsdk.git
cd ncsdk
make install
BUT BEFORE YOU DO THE ABOVE, there is something you can do to accelerate the installation
Just type the following in a terminal and run it
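On Raspbian the swap size lives in /etc/dphys-swapfile, so open it with an editor, e.g.

sudo nano /etc/dphys-swapfile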
This will open the file; scroll down to CONF_SWAPSIZE and change 100 to something like 1024
Then run the following
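That is, restart the swap service so the new size takes effect:

sudo /etc/init.d/dphys-swapfile restart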
You may now install NCSDK 2.0
The installation actually takes quite a while so just wait
Next, if your model is from Caffe or TensorFlow, it is easier. For TensorFlow..... I don't know, just Google it, it is supported. For Caffe, just run the following to compile your caffemodel into a Movidius "graph" (you will later use this for inference)
# Example (-s sets how many of the stick's SHAVE cores to use)
mvNCCompile yoloV2Tiny20.prototxt -w yoloV2Tiny20.caffemodel -s 12
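For reference, once you have a graph file, inference with the NCSDK 2 Python API looks roughly like this (a sketch — the 416x416 input size is an assumption, and both the preprocessing and the decoding of the raw output depend entirely on your model):

import numpy
from mvnc import mvncapi as mvnc

# open the first NCS stick found
device = mvnc.Device(mvnc.enumerate_devices()[0])
device.open()

# load the compiled graph and allocate input/output FIFOs
with open('graph', 'rb') as f:
    graph_buffer = f.read()
graph = mvnc.Graph('yolo')
fifo_in, fifo_out = graph.allocate_with_fifos(device, graph_buffer)

# the input tensor must match what the model was compiled for
tensor = numpy.zeros((416, 416, 3), dtype=numpy.float32)  # stand-in for a real frame
graph.queue_inference_with_fifo_elem(fifo_in, fifo_out, tensor, None)
output, user_obj = fifo_out.read_elem()  # raw detections, still need decoding

# clean up
fifo_in.destroy()
fifo_out.destroy()
graph.destroy()
device.close()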
Then there is quite a lot of code on the internet that can just take this "graph" and run inference with it. But I was not using a caffemodel; I trained mine with darknet's yolov2 model, WHICH IS NOT SUPPORTED (at least at the time of writing)
Thus, I needed to research my way around it, and finally came up with a solution
Chapter 5: Finale
The workaround I came up with was to convert my yolo model into a caffemodel, which I could then convert into an NCS graph for inference
Firstly, clone this repo
Navigate to the cloned folder and open it
Then take your darknet .weights and .cfg files and copy them into models/yolomodels
Make sure that both files share the same base name, e.g. yoloTiny.weights and yoloTiny.cfg
Inside the models folder, run the following in a terminal
After the conversion is complete, your caffemodel and prototxt files will be in the caffemodels folder
Go back to the root folder of the repo and open src/Region.cpp
Change the class names to your own (they are in the form of a list)
Go back to the root again, and open detectionExample/ObjectWrapper.py
Change self.classes to the correct number of classes
Change self.threshold to a very low number... (for safety)
In the same folder, edit Main.py
Change "cap = cv2.VideoCapture(videofile)" in line 43 to "cap = cv2.VideoCapture(0)"
Navigate back to the root folder and generate the graph using the same mvNCCompile command as before, pointing at the files in models/caffemodels. This is so that the graph file is generated in the repo's root folder
Still in the root folder of the repo, build the project by running make
Then run the detection code
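The exact command depends on the repo, but from the root folder it should be something along the lines of

python3 ./detectionExample/Main.py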
And then you are done :D