How to use NEON?

Hi,

I would like to know if anyone can explain how I can use NEON to speed up some floating-point operations that I need.

I found these flags

-mfloat-abi=hard -mfpu=neon

on this website (https://gist.github.com/fm4dd/c663217935dc17f0fc73c9c81b0aa845)

But they don’t seem to be doing anything…

I know that the -O3 or -Ofast flags can improve performance, but I would still like, if possible, to make my code even faster with NEON.

I noticed that the robotics cape library has an rc_neon_function.c program, and it mentions something about restrict. But what is this? Is this what is required?
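
To make the question concrete, here is a minimal sketch of the kind of loop I am trying to speed up. The function name mul_add_f32 is made up for illustration and is not taken from the robotics cape library. As far as I understand, restrict is a C99 qualifier that promises the compiler the pointers do not overlap, which I assume is what allows it to process several floats per instruction:

```c
#include <stddef.h>

/* Hypothetical helper (not from the robotics cape library):
 * computes out[i] = a[i] * b[i] + c[i] for n floats.
 * restrict tells the compiler that the four arrays never overlap,
 * so it may load and store several elements at once. */
void mul_add_f32(float *restrict out,
                 const float *restrict a,
                 const float *restrict b,
                 const float *restrict c,
                 size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] * b[i] + c[i];
    }
}
```

If I read the GCC documentation correctly, on 32-bit ARM the auto-vectorizer does not use NEON for floating point unless -funsafe-math-optimizations is also given (it is implied by -ffast-math and -Ofast), so I assume something like `gcc -O3 -mfloat-abi=hard -mfpu=neon -funsafe-math-optimizations -fopt-info-vec -c mul_add.c` would be needed, with -fopt-info-vec reporting whether the loop was vectorized. Please correct me if this is wrong.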

Regards,
Oscar