How can I improve the processing speed of the BeagleBoard-xM?

Hi All,

I am using a BeagleBoard-xM rev C3 running Angstrom Linux. I have an image processing project that uses OpenCV. I compiled and ran the algorithm on my Ubuntu 10.04 system, where the main processing takes about 1.5 seconds. When I run the same algorithm on the BeagleBoard, it takes about 60 seconds. What can I do next?

Please help me.

sumith

Profile, optimize, repeat.
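As a starting point for the profiling step, here is a minimal timing sketch. It assumes a C++11 compiler; the helper name meanMillis is made up for illustration and is not part of OpenCV. It only measures wall-clock time around a call, so it is not a substitute for a real profiler.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `work` `repeats` times and returns the
// mean wall-clock time per run in milliseconds.
template <typename F>
double meanMillis(F work, int repeats) {
    assert(repeats > 0);
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count() / repeats;
}
```

Wrapping each stage (image grab, detection, drawing) in a lambda and passing it to meanMillis should show which stage dominates the total before anything is changed.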

Hi,

Thanks for your reply.

The process includes feature extraction for object detection. HOG features are used to train the detector.
The following function takes about 60 seconds to execute on the BeagleBoard.

 void detectMultiScale(const GpuMat& img, vector<Rect>& found_locations,
                          double hit_threshold=0, Size win_stride=Size(),
                          Size padding=Size(), double scale0=1.05,
                          int group_threshold=2);

Thank you

Do you use hardware or software floating point?

Replace double precision with single precision for a performance boost.

If you’re spending a lot of time in library image processing routines, you need to ask if the libraries were compiled with the best optimizations (NEON option, etc.). You could try compiling the libraries yourself instead of using the distro’s version.

Intelligent use of the DSP can give a big speed-up, but that is a whole other world of coding. Another difference between the BeagleBoard's ARM and your typical “big” PC is memory: your application may not use the memory cache very well, or it may simply need more than 512 MB. Choosing a better algorithm or restructuring your data may make a big improvement, but it will take some work. Using single-precision floats instead of doubles or, even better, integer arithmetic will give you a boost if your application doesn’t need the precision. And make sure you are not using expensive floating-point emulation.
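One quick way to check the floating-point and NEON points is to ask the compiler which settings it was given. The sketch below relies on GCC's predefined macros for 32-bit ARM (__SOFTFP__, __ARM_PCS_VFP, __ARM_NEON__); the function names are made up for illustration. Note that it only reports how this one file was compiled, not how the OpenCV libraries were built.

```cpp
#include <string>

// Reports the floating-point ABI this translation unit was compiled for,
// based on GCC's predefined macros.
std::string floatAbi() {
#if !defined(__arm__)
    return "not 32-bit ARM";
#elif defined(__SOFTFP__)
    return "soft (floating point emulated in software)";
#elif defined(__ARM_PCS_VFP)
    return "hard (FP arguments passed in VFP registers)";
#else
    return "softfp (hardware FP instructions, soft-float calling convention)";
#endif
}

// True if the compiler was allowed to generate NEON instructions.
bool builtWithNeon() {
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
    return true;
#else
    return false;
#endif
}
```

Building this with the same flags as the application and printing both results shows whether the build is using software emulation.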

Good luck!
Martin

You need to profile things first before going off into -noatime land

Hi,

Thank You.

Your post was helpful to me. The processing speed has improved.

Speaking of expensive floating point emulation - does the Narcissus builder for Angstrom output armhf builds, or do I have to build manually?

Nathaniel Lewis

armhf is not faster than what Narcissus gives you.

Hi All,

I am using a BeagleBoard-xM rev C3 with Angstrom Linux. Currently my processing time on the BeagleBoard is 7.7 seconds for HOG detection with the function

void detectMultiScale(const GpuMat& img, vector<Rect>& found_locations,
                          double hit_threshold=0, Size win_stride=Size(),
                          Size padding=Size(), double scale0=1.05,
                          int group_threshold=2);

This is without threads.

I had to add GPIO pin 132 to start and stop the process, so I switched to a threaded implementation. Now the processing time has increased to 16 seconds.
Why did this happen?

Is there a solution?

Thank you.

I see a number of doubles in the call. I suspect that means they are in the function as well. I am serious about there being a big difference between single precision performance and double.
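To see how big the gap is on a particular board, a small micro-benchmark along these lines can help. It is only a sketch: the kernel (a sum of squares) and the names sumOfSquares and timeKernelMs are made up for illustration and are not taken from the HOG code, and the result depends heavily on compiler flags.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Same arithmetic kernel, instantiated for float or double.
template <typename T>
T sumOfSquares(const std::vector<T>& v) {
    T acc = T(0);
    for (std::size_t i = 0; i < v.size(); ++i) {
        acc += v[i] * v[i];
    }
    return acc;
}

// Mean time in milliseconds for one pass over n elements of type T.
template <typename T>
double timeKernelMs(std::size_t n, int repeats) {
    assert(n > 0 && repeats > 0);
    std::vector<T> data(n, T(0.5));
    volatile T sink = T(0);  // stored to so the work is not optimized away
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        data[0] = T(r);  // vary the input so each pass is recomputed
        sink = sumOfSquares(data);
    }
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / repeats;
}
```

Comparing timeKernelMs<float>(n, repeats) with timeKernelMs<double>(n, repeats) for the same n, built with the application's own flags, gives a rough idea of what the change could buy.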

Thank you.

I increased the priority of the main thread, and the processing time came down to an acceptable level. Does changing the priority of my process affect the performance of the BeagleBoard?
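One possible explanation for the slowdown with the GPIO thread, and this is only a guess since the thread code was not posted: the BeagleBoard-xM has a single ARM core, so a thread that polls the pin in a tight loop competes with the detection for CPU time. The sketch below sleeps between reads instead. It assumes the pin is exported through sysfs (for example /sys/class/gpio/gpio132/value); the function names are made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <fstream>
#include <string>
#include <thread>

// Reads a sysfs-style GPIO value file. Returns 0 or 1, or -1 on failure.
int readGpioValue(const std::string& path) {
    std::ifstream in(path.c_str());
    char c = 0;
    if (!(in >> c)) return -1;
    if (c == '0') return 0;
    if (c == '1') return 1;
    return -1;
}

// Waits until the file reports `wanted`, sleeping pollMs between reads so
// the waiting thread leaves the CPU free. Returns false on timeout.
bool waitForGpio(const std::string& path, int wanted, int pollMs, int timeoutMs) {
    assert(pollMs > 0);
    for (int waited = 0; waited <= timeoutMs; waited += pollMs) {
        if (readGpioValue(path) == wanted) return true;
        std::this_thread::sleep_for(std::chrono::milliseconds(pollMs));
    }
    return false;
}
```

If the control thread spends almost all of its time asleep like this, the detection thread should get nearly the whole core without any priority changes.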

Hi,

I am trying to perform object detection with the HOG descriptor algorithm. On my Ubuntu 12.04 system it runs with a processing time of 300 ms. When I tried the same on the BeagleBoard-xM rev C3, I ran into problems. The major issue is that the processing time is too high: almost 45 seconds to grab an image and detect an object. With your support I reduced the processing time to 8 seconds. I changed double precision to single precision.

But it still takes more than 8 seconds. As I said, this is due to a single function call:
hog.detectMultiScale(img, found, THRESHOLD, cv::Size(2,2), cv::Size(0,0), 1.05, 2);

which takes 8 seconds to process.

I read on some sites that this function requires a lot of processing power, so a better GPU is needed.

But in our case the BeagleBoard-xM has a 1 GHz processor and 512 MB of RAM.

(My system has a 2 GHz processor and 1 GB of RAM.)

I need to reduce the processing time from 8 seconds to below 2 seconds at least.
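The detectMultiScale call above uses a window stride of cv::Size(2,2) and a scale step of 1.05, and both settings affect how many windows have to be evaluated. The sketch below is a simplified count of sliding-window positions over the image pyramid, not OpenCV's actual code. It ignores padding and OpenCV's exact rounding, and the cap of 64 pyramid levels is an assumption. The function name countWindows is made up for illustration.

```cpp
#include <cassert>

// Simplified estimate of how many detection windows a sliding-window
// detector visits across an image pyramid. For illustration only.
long countWindows(int imgW, int imgH, int winW, int winH,
                  int stride, double scale0, int maxLevels = 64) {
    assert(stride > 0 && scale0 > 1.0);
    long total = 0;
    double scale = 1.0;
    for (int level = 0; level < maxLevels; ++level) {
        // Size of the image at this pyramid level.
        const int w = static_cast<int>(imgW / scale + 0.5);
        const int h = static_cast<int>(imgH / scale + 0.5);
        if (w < winW || h < winH) break;  // window no longer fits
        const long cols = (w - winW) / stride + 1;
        const long rows = (h - winH) / stride + 1;
        total += cols * rows;
        scale *= scale0;
    }
    return total;
}
```

For a hypothetical 640x480 frame and a 64x128 window, comparing countWindows(640, 480, 64, 128, 2, 1.05) with countWindows(640, 480, 64, 128, 8, 1.05) suggests that, in this model, a stride of 2 visits roughly an order of magnitude more positions than a stride of 8. A larger scale step reduces the number of pyramid levels in the same way. Both changes trade some detection quality for speed, so they would need testing on real images.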