Capture one Analog channel in continuous mode at near maximum speed?

William_Hermans · June 21, 2016, 1:32am

When you’re writing directly to memory addresses (registers), you can’t tell me what is, and what isn’t. This is exactly what the PRU does when accessing peripheral modules. But you’d be surprised what you can accomplish from userland when you pay close attention to what you should not do in order to remain performant.

Anyway, reading from a memory location and putting that value into another location does not really take a lot of computational power, and if you’re using an rt kernel, the scheduler is going to run in a tighter loop, offering lower latency. But again, you have to be smart about how you go about things. Printing every sample, or even any samples, to stdout will slow things down considerably. Also, if you use a lot of API calls that have to go back and forth to / from the kernel . . . that’s going to slow things down considerably as well, etc.

John_Syn · June 21, 2016, 1:36am

You make a very good point, and we have no idea what the ADC front end bandwidth is. You would think that TI would add some provisions in the ADC setup so that it can be whatever you want as long as you do not exceed 200ksps. The 200ksps, I believe, comes from a setting where clock divider = 8, open delay = 1, sample delay = 0 and oversample = 1x. The fact that they allow a 24MHz clock, with no warning, is interesting. This was also discussed on E2E and no one from TI shut the idea down.
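The arithmetic behind those figures can be sketched as follows. The 15 ADC clock cycles per sample is an assumption, obtained by dividing the 3 MHz clock (24 MHz / 8) by 200 ksps; it is not taken from the TRM, and the function name is made up for illustration.

```c
#include <stdint.h>

/* Assumed total ADC clock cycles per sample with the settings quoted above.
 * Inferred from 24 MHz / 8 / 200 ksps = 15; check against the TRM. */
#define ADC_CYCLES_PER_SAMPLE 15u

/* Approximate sample rate in samples per second.
 * clk_hz  : ADC module clock in Hz (24 MHz in this thread)
 * clk_div : actual divide ratio (8 for the 200 ksps case), not the raw register value */
uint32_t adc_sample_rate(uint32_t clk_hz, uint32_t clk_div)
{
    if (clk_div == 0u) {
        return 0u; /* invalid divider */
    }
    return clk_hz / clk_div / ADC_CYCLES_PER_SAMPLE;
}
```

Under that assumption, adc_sample_rate(24000000, 8) gives 200000, and an undivided 24 MHz clock would give 1600000, so anything above 200 ksps means running the ADC clock faster than the 3 MHz case.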

Anyway, if I can get the DMA to work, then we can test the concept. If nothing else, at least we get a proper 200ksps ADC working on the ARM directly. No need for PRU.

Regards,
John

John_Syn · June 21, 2016, 1:43am

The latency in the RT kernel is only relevant for interrupts (preemption). The scheduler will still swap out user space apps to service background tasks and that is the same as the regular kernel. In any case, the ADC uses a sequencer to do the samples and those samples get stored in the fifo, so you have to wait until the fifo count is greater than a certain number and then read until the fifo is empty and poll again for samples. From what I recall, you were just reading the samples from the IIO driver. I don’t believe you had code to setup the sequencer, or maybe I’m wrong?
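A minimal sketch of that wait-then-drain pattern, assuming the caller has already mapped the FIFO count and data registers and passes pointers to them. The function name, the threshold parameter and the 12-bit mask on the data word are illustrative assumptions, not code from anyone in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Drain the ADC FIFO once at least `threshold` words are waiting.
 * count_reg / data_reg point at the memory-mapped FIFO count and data
 * registers (mapping them is up to the caller).
 * Returns the number of samples copied into `out`, or 0 if the FIFO
 * has not yet reached the threshold. */
size_t adc_fifo_drain(volatile const uint32_t *count_reg,
                      volatile const uint32_t *data_reg,
                      uint32_t threshold,
                      uint16_t *out, size_t out_len)
{
    size_t n = 0;

    if (*count_reg < threshold) {
        return 0; /* not enough samples yet, caller polls again */
    }
    /* Re-read the count every pass so samples that arrive while we
     * are draining are picked up too; stop when the FIFO is empty
     * or the output buffer is full. */
    while (*count_reg > 0u && n < out_len) {
        /* Assumption: the sample sits in the low 12 bits of the word. */
        out[n++] = (uint16_t)(*data_reg & 0xFFFu);
    }
    return n;
}
```

The caller would sit in a loop around this function. How that loop waits between polls (busy-wait, sleep, or interrupt) is what decides whether the FIFO overflows.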

Regards,
John

William_Hermans · June 21, 2016, 2:04am

I never shared my /dev/mem + mmap() code with the group / publicly. For what should be obvious reasons. In fact, I do not think I shared it with anyone. My reasoning is that if someone can figure out how to do this on their own, then they probably earned the right, and with that comes knowledge and responsibility.

As far as interrupts and rt. That does not matter. Interrupts happen all the time, and the more you have, for whatever reason, the slower your application will be. Regardless if your app generated that interrupt or not.

So if i understood what someone was talking about concerning remoteproc not long ago. Moving interrupts to userspace . . . would be a very very bad idea. You’d get all kind of context switching between processes / interrupts fighting for time. Then jumping into and out of kernel space, potentially copying data . . . yeah, very very bad idea.

William_Hermans · June 21, 2016, 2:27am

I seem to have lost a post here that I made. Somehow . . .

Anyway, I never shared my /dev/mem + mmap() code with anyone, and I will never post it on this group. So no one here would know what I’ve done in code concerning that. My reason is simple. In order to use code of that nature, you need to earn the right to do so, and hopefully have an understanding of what could happen if you’re not careful.

Most of that code is peripheral setup, and the rest is simply reading from the ADC buffer, and then printf() to screen. However, in order to get the best performance you never let that data get put on screen; you pipe the output of the executable to a file instead. That increases performance drastically, and is a happy medium for not having to write a bunch of read() / write() / open() calls for a simple test. Perfect? Who cares, I never did. I proved a point to myself, and that is all that matters to me. I proved to myself that /dev/mem + mmap() works fine, and if you have an application that does not need to spend a lot of CPU time doing other things, then it would work fine as it is. Reading multiple ADC channels as fast as you can should probably be done using the PRUs, simply so you can use the CPU time saved to do other things. Perhaps even display that data to the outside world from a web server.

Interrupts. They happen, and frequently. So it does not matter if your app generated interrupts or not. Your app will constantly be interrupted by them. So if you’re using an rt kernel, “return latency” will be less. Meaning, your app should be able to get things done faster.

Which brings me to another point. I hope I was misunderstanding someone earlier this week talking about remoteproc and bringing interrupts to userspace. That would be a terrible idea, and would generate all kinds of context switching between userland, kernel space, processes, interrupts, copying data to / from kernel space. . . yeah it would be a bloody mess. But you know what. That will just give me another reason to avoid what I’m already avoiding now. So, for me, no big deal. I guess.

William_Hermans · June 21, 2016, 2:49am

By the way, when I say read from the ADC buffer, I do not mean that piece of garbage /dev/iio:device0. I mean the ADC hardware buffer, FIFO0DATA, described on page 1095 of the TRM.

John_Syn · June 21, 2016, 6:22am

Ah, I thought you were talking about this solution:

http://www.embeddedhobbyist.com/2015/10/beaglebone-black-adc/

Otherwise, you would have to replicate much of the ADC driver in userspace and then loop, waiting for FIFO0COUNT > 0, reading samples from FIFO0DATA until FIFO0COUNT = 0, and doing this in a way that doesn’t hog the CPU but is still fast enough not to overflow the FIFO. At 1.5 Msps, you would have to do this in less than 21 µs, assuming an average FIFO count of 32. I just don’t see it, but maybe you have a trick that I don’t know about.
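For reference, the time budget quoted above is just the number of buffered samples divided by the sample rate. A small sketch, with a made-up helper name:

```c
/* Time in microseconds for `fifo_words` samples to accumulate at
 * `rate_sps` samples per second. This is the window in which the
 * reader has to come back and drain the FIFO. */
double fifo_fill_time_us(unsigned fifo_words, double rate_sps)
{
    if (rate_sps <= 0.0) {
        return 0.0; /* no meaningful budget for a non-positive rate */
    }
    return (double)fifo_words * 1e6 / rate_sps;
}
```

With 32 words at 1.5 Msps this works out to roughly 21.3 µs; at 200 ksps the same 32 words give 160 µs, which is a much more comfortable window for a userspace loop.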

Did you enable StepID? This way you can see if you missed any samples.

Regards,
John

DTJF · June 21, 2016, 8:36am

Hello Stewart!

(None of the libpruio built-in examples deal with rapid sampling or large amounts of data.)

The next libpruio version contains an example called rf_file, which does exactly what you are targeting. It uses the ring buffer mode to fetch data and writes them to file(s). It will be published when kernel development (>4.x) isn’t experimental any more and I can start to write install instructions.

But you can use it today. See the following thread for details: https://groups.google.com/forum/#!topic/beagleboard/kxxucJAci2c

Regarding sampling rate:

I was able to sample at 1.6 MHz (one channel). But I wasn’t able to fetch the samples from the FiFo at that speed. That means the number of samples is limited to 256. (I didn’t try DMA yet.)

The maximum transfer rate for the FiFo is different on each chip. My boards reach about 240-250 ksps. I guess TI specifies 200 ksps to be on the safe side. Targeting 800 ksps is a waste of time; for that target you’ll need additional hardware.

You can tune libpruio for higher sampling rates by adapting the number 5000 to your needs (period in [ns]) in the code of the following link https://groups.google.com/d/msg/beagleboard/kxxucJAci2c/5nwXwyXZJQAJ

BR