NEON FPU

Hi,

Are there any benchmark results for the NEON FPU?

Has anyone tested FFTW[1] on the BeagleBoard?

[1] www.fftw.org

thanks,

    Andrés Calderón
    Cel: +57 (300) 275 3666
    Email: andres.calderon@emqbit.com
    Web: www.emqbit.com

A free NEON FFT implementation is included in the OpenMAX DL library,
which can be downloaded from ARM. Note that it is written for the ARM
toolchain, so the asm will need some rework into gas format.
Alternatively, you could use the ARM RealView eval tools, which can
interwork with gcc to build Linux apps.

http://www.arm.com/products/multimedia/openmax/
http://www.arm.com/products/multimedia/openmax/v7libraries.html

The FFT code is in:
OX002-BU-00010-r1p0-00alp0\OX002-BU-00010-r1p0-00alp0\sp\src

It would be great to find someone keen to put this NEON code (or
similar) into FFTW.

As an example of NEON FFT performance: a 256-point FFT on 16-bit signed
complex numbers takes 4.7 us (on a 500 MHz Beagle).
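
To put that figure in perspective, the time can be converted into clock
cycles. The sketch below only does arithmetic on the numbers quoted
above. Both function names are made up for this example, and the
butterfly count assumes a plain radix-2 decomposition, which may not be
how the OpenMAX DL code is actually structured.

```c
#include <assert.h>

/* Convert an elapsed time in microseconds to CPU cycles at a given
   clock rate in MHz. Hypothetical helper, for illustration only. */
double us_to_cycles(double microseconds, double clock_mhz)
{
    assert(microseconds >= 0.0 && clock_mhz > 0.0);
    return microseconds * clock_mhz;   /* us * (cycles per us) */
}

/* Number of butterflies in an n-point radix-2 FFT: (n/2) * log2(n).
   n must be a power of two. Assumes radix-2, which is an assumption
   of this example, not a statement about the ARM code. */
unsigned radix2_butterflies(unsigned n)
{
    unsigned stages = 0;
    assert(n >= 2 && (n & (n - 1)) == 0);
    for (unsigned m = n; m > 1; m >>= 1)
        stages++;                      /* counts log2(n) */
    return (n / 2) * stages;
}
```

With the figures above, us_to_cycles(4.7, 500.0) comes to about 2350
cycles and radix2_butterflies(256) to 1024, so roughly 2.3 cycles per
radix-2 butterfly under these assumptions.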

Offhand, do you know what sort of license is involved?

In other words, is this usable with a GPL project such as gnuradio?

Philip

  • Subject to the provisions of this Agreement, ARM hereby grants to YOU (either an individual or single entity), under ARM's copyright in the Software, a perpetual, non-exclusive, non-transferable, royalty free, worldwide licence to; (i) use, copy, modify, the Software for the purposes of developing or having developed software applications and; (ii) distribute and sublicense the right to use, copy and modify the software applications to third parties.
  • THE SOFTWARE IS LICENSED “AS IS”. ARM EXPRESSLY DISCLAIMS ALL REPRESENTATIONS, AND WARRANTIES EXPRESS, IMPLIED OR STATUTORY, INCLUDING BUT NOT LIMITED TO ANY WARRANTY OF SATISFACTORY QUALITY, MERCHANTABILITY, NON-INFRINGMENT OR FITNESS FOR A PARTICULAR PURPOSE.
  • Your use of this Software and the right to redistribute any software applications developed by or for YOU and which are derived from the Software may require you to obtain patent licences from third parties (“Third Party Patents”). ARM therefore requires and YOU hereby agree that prior to exercise of any of the rights to distribute any software applications in accordance with the licences granted under this Agreement, YOU shall have obtained all necessary rights and licences to Third Party Patents, of which YOU are aware of or become aware during the term of this Agreement, to enable YOU to distribute the ARM Software in accordance with the licences granted hereunder without infringing the Third Party Patents whether as a primary, secondary, indirect or contributory infringer, or otherwise, and the copyright licences contained herein are conditional on you agreeing to obtain such licences. For the purpose of interpretation of this Clause 3, any allegation by a third party that any action by YOU infringes any Third Party Patents shall be presumed as valid until properly rebutted by YOU and ARM may suspend the licences granted in Clause 1 until any such allegation is resolved in favour of YOU or YOU reach a settlement with the party making the allegation. If any breach by YOU of the provisions of this Clause 3 results in ARM being subject to a claim for infringement of any Third Party Patents, YOU shall indemnify against and hold ARM harmless from any claims, demands, damages, costs and expenses made against or suffered by ARM as a result of any such claim or action.
  • No licence, express, implied or otherwise, is granted to YOU under the provisions of Clause 1, to use the ARM tradename in connection with the Software or any products based thereon. Nothing in Clause 1 shall be construed as authority for YOU to make any representations on behalf of ARM in respect of the Software.
  • If you are downloading the Software on behalf of a company, partnership or other legal entity, you represent and warrant that you have authority to bind that entity to these terms and Conditions. If you do not have this authority you should not proceed to download the Software.
  • Any breach by YOU of the terms of this Agreement shall entitle ARM to terminate this Agreement with immediate effect. Upon termination of this Agreement, all licences granted to YOU shall cease immediately and YOU shall at ARM's option either return to ARM or destroy all copies of the Software including any modifications or derivatives thereof.
  • This Agreement shall be governed by and construed in accordance with the laws of England and Wales.

FFTW is built on the premise that it runs timing tests on the target machine to find the fastest code for that machine. For most machines of interest (desktops and servers) there are optimized SIMD codelets to do the heavy lifting in FFTW: SSE and SSE2 on Intel, AltiVec on PPC, the SPEs on Cell, and so on.
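
The "measure on the machine, then pick" idea can be sketched without FFTW itself. In FFTW this kind of measurement is requested by passing the FFTW_MEASURE flag when a plan is created; the code below is not FFTW's planner or API, only a minimal illustration in which two hypothetical candidate kernels (dot_simple and dot_unrolled) are timed and the faster one is selected.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* A candidate kernel: dot product of two float arrays.
   All names in this block are hypothetical, for illustration only. */
typedef float (*dot_fn)(const float *a, const float *b, size_t n);

/* Candidate 1: straightforward loop. */
float dot_simple(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}

/* Candidate 2: unrolled by four with separate accumulators. */
float dot_unrolled(const float *a, const float *b, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    for (; i < n; i++)                 /* leftover elements */
        s0 += a[i] * b[i];
    return (s0 + s1) + (s2 + s3);
}

/* Time every candidate on the caller's data and return the index of
   the one with the smallest elapsed CPU time. */
size_t pick_fastest(const dot_fn *cands, size_t ncands,
                    const float *a, const float *b, size_t n, int reps)
{
    size_t best = 0;
    clock_t best_time = 0;
    volatile float sink = 0.0f;        /* keeps the calls from being optimized away */

    assert(ncands > 0 && reps > 0);
    for (size_t c = 0; c < ncands; c++) {
        clock_t t0 = clock();
        for (int r = 0; r < reps; r++)
            sink += cands[c](a, b, n);
        clock_t elapsed = clock() - t0;
        if (c == 0 || elapsed < best_time) {
            best = c;
            best_time = elapsed;
        }
    }
    (void)sink;
    return best;
}
```

Which candidate wins depends on the machine and compiler, which is exactly why the choice is made at run time rather than fixed in the source.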

Without NEON SIMD optimizations, FFTW on the BeagleBoard is just a plain native compile under ARM gcc, and the performance will be poor. NEON codelets are needed to attain good throughput.

Bob McGwier

When you run GnuRadio on machines with OpenGL enabled to speed up the
wxPython and/or Qt widgets in it, do you have a license to distribute
the Nvidia restricted driver that makes OpenGL go fast?

No, of course not. So, given the NEON FFT license, the code may not be
checked into the GnuRadio repository and distributed with the GPL v3.0
code checked in there. But under the license granted (as quoted in
another email message), it will still be quite useful: individual users
can download it, install it, and run GnuRadio all day, every day.

You simply make it a requirement for GnuRadio/OMAP3530 under OE.

GPL is a distribution license. Not a usage license.

Bob

Philip Balister wrote:

The numbers I just posted suggest fftw will be horrible on the OMAP3
until we add NEON optimizations.

Philip

If anyone wants to do these FFTW optimizations, please let me know and
I can help (I know a bit about NEON here ;) )

Ian

The FFTW sources have a directory called "simd" with files for AltiVec
and SSE. Adding NEON support may be as straightforward as inserting
code here. I haven't looked at the rest of the code. It certainly
would be a start.
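
As a rough illustration of what a per-architecture SIMD file could
provide, the sketch below wraps a few real NEON intrinsics
(vld1q_f32, vst1q_f32, vaddq_f32, vmulq_f32) behind generic names.
The names V, VLD, VST, VADD and VMUL and the scale_add routine are
placeholders invented for this example, not FFTW's actual interface,
and the scalar fallback exists only so the code builds without NEON.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
typedef float32x4_t V;                      /* four packed floats */
#define VLD(p)     vld1q_f32(p)
#define VST(p, v)  vst1q_f32((p), (v))
#define VADD(a, b) vaddq_f32((a), (b))
#define VMUL(a, b) vmulq_f32((a), (b))
#else
/* Portable scalar fallback with the same (hypothetical) interface. */
typedef struct { float f[4]; } V;
static V VLD(const float *p)
{
    V v;
    for (int i = 0; i < 4; i++) v.f[i] = p[i];
    return v;
}
static void VST(float *p, V v)
{
    for (int i = 0; i < 4; i++) p[i] = v.f[i];
}
static V VADD(V a, V b)
{
    V r;
    for (int i = 0; i < 4; i++) r.f[i] = a.f[i] + b.f[i];
    return r;
}
static V VMUL(V a, V b)
{
    V r;
    for (int i = 0; i < 4; i++) r.f[i] = a.f[i] * b.f[i];
    return r;
}
#endif

/* out[i] = x[i] * w[i] + y[i], four elements at a time.
   n must be a multiple of 4. Written only against the generic names,
   so the same source serves both the NEON and the fallback build. */
void scale_add(float *out, const float *x, const float *w,
               const float *y, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        V prod = VMUL(VLD(x + i), VLD(w + i));
        VST(out + i, VADD(prod, VLD(y + i)));
    }
}
```

On a 32-bit ARM gcc the NEON path is typically only selected when NEON
is enabled at compile time (for example with -mfpu=neon); otherwise the
fallback is used.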

Philip

Hi Ian,
I want to compare the performance of a DSP and an ARM Cortex-A15 (using the NEON unit) for floating-point and digital signal processing (FFT) workloads.
Could you suggest a fair open-source benchmark for this?
So far, for the DSP, I took an FFT function (optimized asm code) from dsplib and measured the CPU cycles it requires.
Now I want to implement a similar function for the Cortex-A15. What should be done?