Performance Counters

Hi everyone,

I'm trying to set up my beagle board to do some profiling with
performance counters.

From reading the Cortex-A8 manual (Table 3-99, Signal settings for the
Performance Monitor Count Registers), it seems that for the performance
counters to work, a debug signal called DBGEN (DBGEM in the TI TRM for
the OMAP3530, a spelling mistake I assume) or NIDEN (non-invasive
debug enable) must be asserted.

Oprofile assumes that this signal is high after boot, and doesn't try
to set it at init time.

From what I can see, the only way to assert this signal is with a
debugger, and I don't have a debugger, nor do I want to have a
debugger attached.

Has anyone gotten oprofile or a custom driver working with the
performance counters on the beagleboard?

Is there a way to assert DBGEN or NIDEN from software?

Any help appreciated!
Etienne

Hi Etienne,

I have used the performance counters extensively. I do not believe
anything external needs to be set up; whatever is needed (if anything)
is set up correctly by the bootloaders. Remember that some options of
the counters can only be used from kernel mode, so if you want to use
them in Linux you will probably need to write a custom driver. I have
not used them in Linux, so I do not have such a driver.

Another approach would be to hack the Linux kernel init to globally
enable the registers for user-level use. This would get you most of
the functionality with minimal effort, but since you cannot use
interrupts from user level, you cannot handle counter overflow, which
limits you to fairly short benchmark runs (2^32 cycles).
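
To put the 2^32-cycle limit in numbers, here is a small sketch in C. The 600 MHz clock in the note below is an assumption for illustration (check the actual rate on your board), and the helper names are made up for this example. It also shows that plain unsigned subtraction still gives the right cycle delta across a single wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a free-running 32-bit cycle counter wraps at clock_hz. */
static double ccnt_wrap_seconds(double clock_hz)
{
    assert(clock_hz > 0.0);
    return 4294967296.0 / clock_hz;   /* 2^32 cycles */
}

/* Elapsed cycles between two reads of the counter. Correct across at
 * most one wrap, because uint32_t arithmetic is modulo 2^32. */
static uint32_t ccnt_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}
```

At an assumed 600 MHz, ccnt_wrap_seconds(600e6) comes out a little over 7 seconds. Setting the D bit (bit 3) of the control register makes the cycle counter tick once every 64 cycles, which stretches the window by a factor of 64 at the cost of resolution.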

Good luck,
Matthew Warton

I have written a driver based on the code in the oprofile armv7 code
in linux/arch/arm/oprofile/op_model_v7.c

The code is based on a description in an ARM knowledge base article,
but still no go.

The ARM Cortex A8 TRM shows that the performance counters are only
enabled when the DBGEN signal is asserted, and from what I can see, it
is not.

Are you under the impression that x-loader or u-boot should set up the
DBGEN signal?

Etienne

I'm adding Siarhei to the loop; he has been working with OProfile.

Hi Etienne,

As far as I know, the DBGEN signal is set up in the correct state by
default; I did not have to modify anything in relation to this signal.

Matthew Warton

Mans has a patch to enable userspace access to the performance
counters:

http://git.mansr.com/?p=linux-omap;a=commit;h=09cd9cb3e9ee39cf2501d72a8f29b31e6743f16e

Many people are using this successfully, for example for timing
sections of critical code.
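
The patch itself is not reproduced here. As a rough illustration of the mechanism, user-mode access on ARMv7 is controlled by bit 0 of the User Enable register (c9, c14, 0), which can only be written from privileged code. The sketch below is an assumption-laden outline, not the patch: the function and macro names are invented, and the register accesses are compiled only for 32-bit ARM targets so that the helper logic can be built elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#define PMU_USEREN_EN 1u   /* bit 0: allow user-mode access to the PMU */

/* Nonzero if a User Enable value has user access switched on. */
static int pmu_user_access_enabled(uint32_t useren)
{
    return (useren & PMU_USEREN_EN) != 0;
}

/* Privileged code only (e.g. kernel init): open the PMU to user mode. */
static void pmu_enable_user_access(void)
{
#if defined(__arm__)
    __asm__ __volatile__("mcr p15, 0, %0, c9, c14, 0" :: "r"(PMU_USEREN_EN));
#endif
}

/* Read the User Enable register back (returns 0 on non-ARM builds). */
static uint32_t pmu_read_useren(void)
{
    uint32_t v = 0;
#if defined(__arm__)
    __asm__ __volatile__("mrc p15, 0, %0, c9, c14, 0" : "=r"(v));
#endif
    return v;
}

/* Hypothetical guard for a userspace tool: stop early with a clear
 * message instead of faulting on the first counter access. */
static void pmu_require_user_access(void)
{
    assert(pmu_user_access_enabled(pmu_read_useren()));
}
```

If the enable bit is clear, user-mode writes to the other counter registers are treated as undefined instructions, so a check like pmu_require_user_access() is a cheap first diagnostic.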

I've just tried enabling the userspace access for the performance
counters, and writing an application to count the number of
instructions issued in each cycle.

I still get counts of 0. I tried using the software increment
instruction and still nothing.

I'm going to try the linux tree where that patch is located, and see
if that changes anything.

Etienne

OK, just tried that kernel with the provided omap3beagle_defconfig,
plus "enable userspace access to performance counters" turned on.

I tried my userspace program to measure instructions issued, still
nothing.

I'm going to post the code for the userspace program. Can someone test
it on a known working system?

This sequence of instructions resets the cycle counter to zero and
enables it:

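        mov r0, #15                     @ bits 0-3: the four event counters
        mcr p15, 0, r0, c9, c12, 2      @ Count Enable Clear: disable them
        mov r0, #1<<31                  @ bit 31: the cycle counter
        mcr p15, 0, r0, c9, c12, 1      @ Count Enable Set: enable it
        mov r0, #5                      @ E (bit 0) | C (bit 2, cycle counter reset)
        mcr p15, 0, r0, c9, c12, 0      @ write the control register (PMNC)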

Replace r0 with any free register. To read the current value into the
r0 register, use this instruction:

        mrc p15, 0, r0, c9, c13, 0
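
For use from C, the same sequence can be wrapped in inline assembly. This is a minimal sketch under a few assumptions: GCC-style inline asm, userspace access already enabled, and invented helper and macro names. The register accesses are compiled only for 32-bit ARM targets; elsewhere the functions do nothing and the read returns 0.

```c
#include <assert.h>
#include <stdint.h>

#define PMNC_E        (1u << 0)    /* enable all counters */
#define PMNC_C        (1u << 2)    /* reset the cycle counter to zero */
#define CNTEN_CCNT    (1u << 31)   /* cycle counter bit in the enable registers */
#define CNTEN_EVENTS  0xfu         /* the four event counters */

/* Reset the cycle counter to zero and start it. */
static void ccnt_start(void)
{
#if defined(__arm__)
    /* Count Enable Clear: switch the event counters off. */
    __asm__ __volatile__("mcr p15, 0, %0, c9, c12, 2" :: "r"(CNTEN_EVENTS));
    /* Count Enable Set: switch the cycle counter on. */
    __asm__ __volatile__("mcr p15, 0, %0, c9, c12, 1" :: "r"(CNTEN_CCNT));
    /* Control register: global enable plus cycle counter reset. */
    __asm__ __volatile__("mcr p15, 0, %0, c9, c12, 0" :: "r"(PMNC_E | PMNC_C));
#endif
}

/* Read the current cycle count. */
static uint32_t ccnt_read(void)
{
    uint32_t v = 0;
#if defined(__arm__)
    __asm__ __volatile__("mrc p15, 0, %0, c9, c13, 0" : "=r"(v));
#endif
    return v;
}
```

A typical use would be to call ccnt_start(), run the code under test, and then take ccnt_read() as the elapsed cycle count.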

Just to confirm, I have no trouble resetting and reading the cycle
counter.

Etienne

Hello Etienne,

I also intend to use the performance counters to get cycle counts for
an application.
As a starting point, I compiled the code you posted and tried to run it
on the OMAP3530 EVM, but I get an "illegal instruction" error for this
instruction:
        __asm__ __volatile__ (
                 "mcr p15, 0, %0, c9, c12, 1"
                 ::"r" (0x8000000f));

I am using the CodeSourcery toolchain, version 2007q3.
I am not sure what I am missing.
Any help would be very much appreciated.

Dharshan

Etienne,

I enabled userspace access to the performance counters and ran the
code you posted. I think the problem in your code is the last asm
call, which reads the cycle count: instead of mrc p15, 0, %0, c9, c13,
2, you should call mrc p15, 0, %0, c9, c13, 0.
I got the following output:
value of usren reg is: 0x1
control value before: 0x41002001
writing control value: 0x41002007
counter value = 499500
control value before: 0x41002001
writing control value: 0x41002000
counter value = 66642
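
The control values in this output can be decoded by hand. As a sketch, assuming the Cortex-A8 control register layout (implementer code in bits 31:24, number of event counters in bits 15:11, and the E, P and C bits in bits 0, 1 and 2), with invented function names:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Field extractors for a control register (PMNC) value. */
static unsigned pmnc_implementer(uint32_t v)  { return (v >> 24) & 0xffu; }
static unsigned pmnc_num_counters(uint32_t v) { return (v >> 11) & 0x1fu; }
static int pmnc_enabled(uint32_t v)           { return (v & (1u << 0)) != 0; }
static int pmnc_event_reset(uint32_t v)       { return (v & (1u << 1)) != 0; }
static int pmnc_cycle_reset(uint32_t v)       { return (v & (1u << 2)) != 0; }

/* Print a one-line summary of a control register value. */
static void pmnc_print(uint32_t v)
{
    printf("implementer 0x%02x, %u event counters, E=%d P=%d C=%d\n",
           pmnc_implementer(v), pmnc_num_counters(v),
           pmnc_enabled(v), pmnc_event_reset(v), pmnc_cycle_reset(v));
}
```

Read this way, 0x41002001 means implementer 0x41 (ARM), four event counters, counters enabled. Writing 0x41002007 additionally requests a reset of the event counters and the cycle counter, and 0x41002000 turns the counters off.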

Cheers,
Dharshan

As I stated earlier, I can read the cycle counter just fine. I want to
read the /performance counter/ registers.

The code I posted above should measure the number of issued
instructions for that test loop, and the count should accrue in
register PMCNT0 which is read from register c9, c13, 2 after selecting
PMC0 using the PMNXSEL register.
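
For readers following along, the select-then-access pattern described here can be sketched in C as below. This is not the program posted in the thread. It uses event 0x08 (instruction architecturally executed), a swapped-in choice because it is a standard ARMv7 event number, whereas the thread's program counts issued instructions. The helper names are invented, GCC-style inline asm is assumed, the isb instruction needs an ARMv7 target (for example -march=armv7-a), and the register accesses are compiled only for 32-bit ARM.

```c
#include <assert.h>
#include <stdint.h>

#define PMU_EVT_INST_EXECUTED 0x08u   /* ARMv7 event: instruction executed */

/* Select an event counter, choose its event, and enable it. The global
 * E bit in the control register must also be set for it to count. */
static void pmu_event_setup(unsigned counter, uint32_t event)
{
    assert(counter < 4);   /* PMC0..PMC3 on the Cortex-A8 */
#if defined(__arm__)
    __asm__ __volatile__("mcr p15, 0, %0, c9, c12, 5\n\tisb" :: "r"(counter));   /* PMNXSEL */
    __asm__ __volatile__("mcr p15, 0, %0, c9, c13, 1" :: "r"(event));            /* EVTSEL */
    __asm__ __volatile__("mcr p15, 0, %0, c9, c12, 1" :: "r"(1u << counter));    /* CNTENS */
#else
    (void)event;
#endif
}

/* Select an event counter and read its current value. */
static uint32_t pmu_event_read(unsigned counter)
{
    uint32_t v = 0;
    assert(counter < 4);
#if defined(__arm__)
    __asm__ __volatile__("mcr p15, 0, %0, c9, c12, 5\n\tisb" :: "r"(counter));   /* PMNXSEL */
    __asm__ __volatile__("mrc p15, 0, %0, c9, c13, 2" : "=r"(v));                /* PMCNT */
#endif
    return v;
}
```

The key point is that c9, c13, 2 does not name a fixed counter: it reads whichever counter PMNXSEL currently selects, while c9, c13, 0 is always the cycle counter.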

What does the code do for you with no modifications after enabling
userspace access?

Etienne

Hello,

I am able to read the performance counter, and the output of your
program that measures the number of instructions is as follows:
value of usren reg is: 0x1
control value before: 0x41002000
writing control value: 0x41002007
test = 499500
control value before: 0x41002001
writing control value: 0x41002000
counter value = 22668

I am using the 2.6.22.18-omap3 kernel on the OMAP3530 EVM.
However, when I run your program repeatedly, the counter value changes
every time. The same goes for the cycle count: each run gives a
different value. I suspect it has to do with the cache, and I will
check with the cache disabled.
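
Some run-to-run variation is normal under Linux, since cache state, interrupts and scheduling all affect the counts. One common way to get a steadier figure, offered here as a suggestion rather than something tested in this thread, is to repeat the measurement several times and keep the minimum. A small sketch with a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest value among n counter samples; n must be at least 1. */
static uint32_t min_count(const uint32_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    uint32_t best = samples[0];
    for (size_t i = 1; i < n; i++) {
        if (samples[i] < best)
            best = samples[i];
    }
    return best;
}
```

The minimum approximates the cost of the code with the least outside interference. It does not replace checking the cache hypothesis directly.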

Dharshan


That's excellent - thank you so much for testing that. It means my
code should be working.

Etienne

Can you just confirm that you are indeed using 2.6.22?

Hello,

I am new to Linux and am using the BeagleBoard for programming. Can you
please tell me how to configure the cycle counter to measure program
execution time?

Thanks in advance,
Ritika

Hi, I was able to enable the cycle counter in my code.