Real-time performance improvement


I have been working on a BeagleBoard for one month.

I want to run my application with the minimum MIPS requirement.

So, what cache (memory) settings are required for this?
How can I place my code in on-chip RAM (OCRAM), or allocate my
application's memory from on-chip RAM (OCRAM)?

I am using the performance monitor registers to do profiling.

In my application, store operations take more cycles than load operations.

Thanks in advance for your reply.