Tools & Things To Know About Adaboost

As promised in my last blog, I re-wrote my video viewing program to handle facial detection from more angles (it was originally written just for frontal face images); a link to the code is provided at the bottom of this web page. My last xml file was created with just 700 images of faces; the one on this page uses 749 images. Not a whole lot more, but the images (as well as my code) were a lot cleaner this time. I wrote several utilities to create a cleaner haarcascade xml file; once again, links are provided at the bottom of this web page.

I actually got more smile hits in my last blog, but this time I lowered the error rate (the maximum false alarm rate) used in creating my haarcascade file:

haartraining.exe -data cascades -vec vector/facevector.vec -bg bg.txt -npos 749 -nneg 1610 -nstages 10 -mem 8192 -maxfalsealarm 0.34 -mode ALL -w 25 -h 25 -nonsym

My face images are 125 px X 125 px, but some of them aren't (to keep the same aspect ratio as the original), hence the -nonsym parameter (for non-symmetrical). The 2001 paper by Viola and Jones said they used 384 px X 288 px images at a rate of 15 fps (frames per second). VJ also used a sample size of 24 px (24 X 24), which goes into 288 an integer number of times: 12. I'm using 25 px sampling, which goes into my 125 px wide images 5 times each. Notice that I'm using the default AdaBoost for OpenCV: GAB (Gentle AdaBoost). There's a Real AdaBoost mode (-bt RAB) for OpenCV, so I might use that for my next blog (Real AdaBoost being the AdaBoost used by Viola and Jones in 2001 - I think). Notice that I used a Max False Alarm rate of 0.34; the default is 0.5. I found that the default value for maxfalsealarm caused my haarTraining session to end early (in this case, before 10 stages were created), but setting it at 0.34 seemed just right.
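To see why the per-stage false alarm rate matters, remember that the stages are cascaded: a background window only survives if every stage passes it, so the overall false alarm rate is roughly the per-stage rate raised to the number of stages. Here's a quick sketch of that arithmetic, using the 0.34 and 10-stage figures from the command above (the multiplicative model is the standard cascade assumption, not something I measured):

```python
# Rough cascade arithmetic: a background window must pass every
# stage to survive the whole cascade, so the overall false alarm
# rate is approximately the per-stage rate to the power of the
# number of stages.
def overall_false_alarm(per_stage_rate, n_stages):
    return per_stage_rate ** n_stages

# Values from the haartraining command above.
default = overall_false_alarm(0.5, 10)   # OpenCV's default maxfalsealarm
mine = overall_false_alarm(0.34, 10)     # the 0.34 setting

print(f"default 0.5^10  ~ {default:.2e}")   # ~9.77e-04
print(f"mine    0.34^10 ~ {mine:.2e}")      # ~2.06e-05
```

So the lower per-stage rate buys almost two orders of magnitude fewer false alarms overall, at the cost of each stage having to work harder (which is also why training can end early if the target is too loose or hang if it's too strict).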

So, how did I get my facial images? In the 2001 Viola Jones paper, they said they captured the images off of the Internet; me too. In the past I'd crawl some website that I knew had a lot of pictures of people smiling, and download the images. I'd have to go through each image and make sure it was indeed a picture of somebody smiling. I finally decided to just Google "Woman Smiling", "Man Smiling", etc. (e.g., smiling salesman) and download the pictures that Google found. It turns out that Google supplies a neat way of downloading pictures, which I discovered at

https://32hertz.blogspot.com/2015/03/download-all-images-from-google-search.html

...and I put all of the image URLs in a text file, data.txt, and downloaded the images with a Perl script, getFiles.pl
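getFiles.pl isn't reproduced here, but the idea is simple enough to sketch. This is a hypothetical Python equivalent, not the actual script (the one-URL-per-line data.txt format is my assumption):

```python
# Hypothetical sketch of a URL-list downloader: read one image URL
# per line from data.txt and save each file locally. The real
# getFiles.pl is a Perl script; this just illustrates the idea.
import os
import urllib.request

def parse_urls(text):
    # Keep non-empty lines that look like URLs; skip anything else.
    return [line.strip() for line in text.splitlines()
            if line.strip().startswith("http")]

def download_all(list_file="data.txt", out_dir="images"):
    os.makedirs(out_dir, exist_ok=True)
    with open(list_file) as f:
        urls = parse_urls(f.read())
    for i, url in enumerate(urls):
        # Name files by index; real filenames could collide.
        dest = os.path.join(out_dir, f"img_{i:04d}.jpg")
        try:
            urllib.request.urlretrieve(url, dest)
        except OSError as e:
            print(f"skipping {url}: {e}")

# After writing data.txt, call download_all() to fetch everything.
print(parse_urls("http://example.com/a.jpg\nnot-a-url\n"))
```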

Naturally, the images I download can have all sorts of things in them that I don't want to consider as being part of a smile; e.g., a Chevy Silverado (I used that as an example in an earlier blog). According to the 2001 paper by Viola and Jones, their algorithm takes care of extraneous information in pictures; but the best laid schemes o' mice an' men gang aft a-gley. So my script facial.py cuts out just the face section of the image (and converts it to grayscale).
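facial.py itself isn't shown here, but the two operations it performs are easy to sketch. This is a minimal numpy version under my own assumptions: the luminance weights are the standard Rec. 601 ones, and crop_box is an illustration with a made-up box (the real script crops a detected face region, presumably found with an OpenCV cascade):

```python
import numpy as np

def to_grayscale(rgb):
    # Standard Rec. 601 luma weights for RGB -> grayscale.
    return (rgb @ np.array([0.299, 0.587, 0.114])).astype(np.uint8)

def crop_box(img, x, y, w, h):
    # Clip the requested box to the image bounds before slicing;
    # a clipped box can come out non-square.
    H, W = img.shape[:2]
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, W), min(y + h, H)
    return img[y0:y1, x0:x1]

# Tiny demo on a synthetic 10x10 RGB image.
rgb = np.zeros((10, 10, 3), dtype=np.uint8)
rgb[..., 0] = 255                      # pure red image
gray = to_grayscale(rgb)
face = crop_box(gray, 6, 6, 8, 8)      # box runs off the edge
print(gray[0, 0], face.shape)          # 76 (4, 4)
```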

There are five other utilities that are extremely important for creating haarcascade files from images:

facialSingles.py, renameFiles.pl, listFiles.pl, renameDisparateFiles.pl, and convert2Grayscale.py

That does it for the utilities I wrote for creating haarcascade xml files. I'm also using the utilities I downloaded from

https://www.cs.auckland.ac.nz/~m.rezaei/Tutorials/Creating_a_Cascade_of_Haar-Like_Classifiers_Step_by_Step.pdf

but those are the applications for creating haarcascade files, and they can be replaced with the executables that come with the opencv package: opencv_createsamples and opencv_traincascade. There are several other executables supplied with the package; they can be found at /opencv/build/x64/vc14/bin (at least in the Windows version of opencv).

I followed a rule of thumb I came across on the Internet: use twice as many negative as positive images. If you don't want to use more negatives than positives, you can lower the parameter value for maxfalsealarm, but you have to play around with it, because the training session will hang up on you. Furthermore, the value you use will probably have to incorporate all 6 digits of maxfalsealarm; it's a lot easier to just make sure you have twice as many negatives as positives. Six digits suggests that this program uses single precision floating point numbers; IEEE 754. Viola and Jones said they used a 700 MHz Pentium III processor; the Coppermine core. This processor introduced an instruction set (SSE) that allowed extremely high speed single precision computations (it added eight 128-bit registers, each packing four 32-bit IEEE 754 numbers, for use with the SSE instructions). Today's 64 bit processors also handle double precision in those vector registers (two 64-bit doubles per 128-bit register, starting with SSE2), but VJ had to stick with single precision to get reasonable AdaBoost performance. So it's possible (maybe probable) that opencv_createsamples and opencv_traincascade support double precision numbers, but I haven't checked that out yet.
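The six-digit point follows from the format itself: an IEEE 754 single has a 24-bit significand (23 stored bits plus an implicit leading 1), which works out to roughly 6-7 significant decimal digits. A quick numpy check:

```python
import numpy as np

# A single precision float has a 24-bit significand, so its machine
# epsilon is 2**-23 ~ 1.19e-7: about 6-7 significant decimal digits,
# which matches the 6 digits of maxfalsealarm.
eps32 = np.finfo(np.float32).eps
print(eps32)                           # ~1.1920929e-07
print(np.finfo(np.float32).precision)  # guaranteed decimal digits: 6

# 0.34 is not exactly representable; single precision rounds it
# differently than double precision does:
print(np.float32(0.34) == 0.34)        # False
```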

createPosTxt, video_streamer14.py, getFiles.pl, and facial.py

Return To My Blog Page       Haarcascade Smiles File       Return To My Programming Page