USB Sniffer - Isochronous transfers - Webcam

Hi everyone,

New blog post:
http://beagleboard-usbsniffer.blogspot.com/2010/07/isochronous-transfers-webcam.html

Any hints about the MUSB/DMA problem would be appreciated... I'll try to dig into that, but without documentation (I guess this is under NDA, right?), it may be a bit tricky...

Side note: I was wondering if that bug may already have been fixed in some later kernel version (I don't see anything obvious by running a diff, but there may be some side effects from other modifications, who knows...).
If I want to run a 2.6.35-rcX kernel on the BeagleBoard, does anyone know which git tree/branch I should use? linux-omap-2.6.git/master? linux-omap-2.6.git/dss2? I know that with a vanilla kernel (Linus' tree), MUSB does not work properly (even g_ether is buggy); I tried that before...

Best regards,

Nicolas

Hi Nicolas,

Great work - I'm not 100% sure I follow you on the MUSB/DMA problem. Can you
elaborate a bit further on this?

Do you know whether it's a MUSB or a DMA issue? Or a combination? It's not
totally clear to me...

Best regards
  Søren

Hi Frans,

[...]

Nicolas, as a workaround would it be possible to add some padding to
the data?

Very probably not, that would confuse the driver. The data being transferred is a JPEG image, and adding padding would surely break image decoding.

BTW: I've seen other systems where DMA was required to be a multiple of
8, 16 or sometimes even 256.
Also DMA might impose some alignment requirements on the data (e.g.
start at an 8 or even sometimes 512 byte boundary).
Did you check the alignment of the data in the cases where things do not work?

Well, I'm not sure about that, or about the alignment that kmalloc gives, since the data blocks are allocated with kmalloc (I guess at least 4 bytes, maybe more). Anyway, if it were an alignment issue, I would expect it to work more or less well, occasionally (since kmalloc addresses are probably relatively random...). Also, I don't see any other gadget driver taking any special care about alignment, so I guess it's ok.

PS: after typing the above I peeked quickly at the code:
the zero-size packet seems to be there simply to flush out any
pending data.

Ok, here is what I understand about zero-length packets: for control/bulk/interrupt transfers, they serve to signal the end of a data transfer. For example, if you send 90 bytes over an endpoint with a maximum packet size of 32, you would send three packets of 32, 32, then 26 bytes. The last packet (26 bytes) is short, so it tells the host the data transfer is over. However, if you send 96 bytes, then you would send three 32-byte packets, plus a 0-byte packet to finalize the transfer.
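
To make the rule concrete, here is a minimal sketch in C. The names (plan_transfer, struct packet_plan, zero_flag) are hypothetical and are not kernel functions; zero_flag only mimics the role of the "zero" flag in the gadget request struct. It simply reproduces the arithmetic of the 90-byte and 96-byte examples above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, for illustration only. */
struct packet_plan {
    size_t full_packets;  /* packets carrying exactly max_packet bytes */
    size_t short_bytes;   /* size of the final short packet, 0 if none */
    int    needs_zlp;     /* 1 if a zero-length packet must follow */
};

/*
 * Split a control/bulk/interrupt transfer of `length` bytes into packets.
 * A zero-length packet is only planned when the caller asked for one
 * (zero_flag) and the data ends exactly on a packet boundary, so that no
 * short packet exists to mark the end of the transfer.
 */
static struct packet_plan plan_transfer(size_t length, size_t max_packet,
                                        int zero_flag)
{
    struct packet_plan plan;

    assert(max_packet > 0);
    plan.full_packets = length / max_packet;
    plan.short_bytes  = length % max_packet;
    plan.needs_zlp    = (zero_flag && length > 0 && plan.short_bytes == 0);
    return plan;
}
```

With a maximum packet size of 32, plan_transfer(90, 32, 1) gives two full packets plus a 26-byte short packet and no zero-length packet, while plan_transfer(96, 32, 1) gives three full packets and asks for a zero-length packet.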

Isochronous endpoints work differently. Since these endpoints are best-effort, packets can be lost, and you cannot rely on short packets or zero-length packets to signal the end of a transfer. Well, in theory, isochronous is supposed to mean a constant rate of data flow (like raw audio capture), so there is no real start/end of a transfer. In reality, the USB spec allows isoc endpoints to be used for non-isochronous data streams (just like the JPEG data from my webcam).

The spec only allows 0-length packets to be sent on isoc endpoints in this case: "An isochronous IN endpoint must return a zero-length packet whenever data is requested at a faster interval than the specified interval and data is not available." (USB 2.0 spec, 5.6.4). I suppose that this should be handled automatically by the driver (or the controller).

In any case, there is a "zero" flag in the request struct in the kernel, and I don't see why a zero-length packet should be sent if that flag is not set. The PIO code seems correct to me, but not the DMA code (http://gitorious.org/beagleboard-usbsniffer/beagleboard-usbsniffer-kernel/blobs/stable-20100702/drivers/usb/musb/musb_gadget.c, line 491):
/*
  * First, maybe a terminating short packet. Some DMA
  * engines might handle this by themselves.
  */
if ((request->zero && request->length
         && request->length % musb_ep->packet_sz == 0)
#ifdef CONFIG_USB_INVENTRA_DMA
         || (is_dma && (!dma->desired_mode ||
               (request->actual & (musb_ep->packet_sz - 1))))
#endif
) {

BTW, "(request->actual & (musb_ep->packet_sz - 1))" will show some funny behaviour if musb_ep->packet_sz is not a power of 2....
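
A small sketch of why the mask is only safe for powers of 2. The helper names are hypothetical, and 768 is used purely as an illustration of a non-power-of-2 packet size (it is the isoc packet size that shows up in the usbmon log further down).

```c
#include <assert.h>
#include <stddef.h>

/* Remainder computed the way the quoted DMA condition does it.
 * Only equal to a true remainder when packet_sz is a power of two. */
static size_t remainder_by_mask(size_t actual, size_t packet_sz)
{
    assert(packet_sz > 0);
    return actual & (packet_sz - 1);
}

/* Remainder computed with the modulo operator: correct for any size. */
static size_t remainder_by_modulo(size_t actual, size_t packet_sz)
{
    assert(packet_sz > 0);
    return actual % packet_sz;
}

/* A power of two has exactly one bit set. */
static int is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}
```

For packet_sz = 512, both helpers agree (for example, 1030 gives 6 either way). For packet_sz = 768 and actual = 768, the modulo is 0 but the mask gives 768 & 767 = 512, so a transfer that ended exactly on a packet boundary would look as if it had ended on a short packet.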

One other question on the usbmon part:

0:0:250
0:768:292 0:1536:338 0:2304:362 0:3072:254

usbmon.txt says:

The word consists of 3 colon-separated decimal numbers for
   status, offset, and length respectively.

Can't really explain why your first packet is 250 bytes but the next one
has an offset of 768.
The only reason I can think of is that the data is compressed by the cam
and that 250 is the length of a compressed 768 byte framesegment (or
maybe a line or so).

Agreed, that bit is a little confusing. I wanted to explain how isochronous transfers work in the Linux kernel (on the host side), but I didn't have time to do it.

So, for bulk endpoints, if you submit an URB to ask for 2048 bytes to be transferred from the device (with an endpoint max packet size of 512 bytes), the EHCI kernel driver/controller will automatically generate multiple requests, until 4 packets have been transferred (without errors, in order: bulk transfers guarantee that). The URB callback is then called, and you can do something with the data.

Now, for isoc transfers, it's a bit different. You can also ask for 4 packets' worth of data, but any of these transfers can fail or only partially complete, so within your URB you would have 4 "iso_frame_desc" entries (look at struct urb in include/linux/usb.h). When you submit the URB, you would ask for packet 0 to be put at offset 0 in your buffer, packet 1 at offset 512, etc. When you get your callback, you need to pick the good data out of the buffer, as there may be some gaps. Why it has been implemented that way in the kernel, I don't really know (I don't really see why it would not be possible to append all the data together; maybe some controller/alignment issues).
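
As a rough userspace illustration of "picking the good data", here is a sketch in C. struct frame_desc is a simplified stand-in for the kernel's per-packet descriptors (status, offset, actual length), not the real definition, and gather_iso_data is a hypothetical helper, not a kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for one isochronous packet descriptor. */
struct frame_desc {
    int    status;         /* 0 on success, non-zero on error */
    size_t offset;         /* where this packet's slot starts in the buffer */
    size_t actual_length;  /* bytes actually received in that slot */
};

/*
 * Copy the valid bytes of every successful packet into `out`, back to back,
 * skipping failed packets and the unused tail of each slot.
 * Returns the number of bytes written; stops early if `out` would overflow.
 */
static size_t gather_iso_data(const unsigned char *buf,
                              const struct frame_desc *desc, size_t count,
                              unsigned char *out, size_t out_size)
{
    size_t total = 0;
    size_t i;

    assert(buf && desc && out);
    for (i = 0; i < count; i++) {
        if (desc[i].status != 0)
            continue;                       /* failed packet: leave a hole */
        if (desc[i].actual_length > out_size - total)
            break;                          /* not enough room left */
        memcpy(out + total, buf + desc[i].offset, desc[i].actual_length);
        total += desc[i].actual_length;
    }
    return total;
}
```

The caller still has the descriptor array, so it can tell which slots were skipped; only the copy is compacted.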

So, to get back to the usbmon log, the full line is:
ffff880080052000 2351992862 C Zi:6:075:1 0:1:831787618:0 32 0:0:250
0:768:292 0:1536:338 0:2304:362 0:3072:254 6807 = ...

What matters too is the "32", meaning that 32 packets of 768 bytes each are requested. The first one, at offset 0, came back with status 0 and a length of 250 bytes; the second one has status 0, offset 768, actual length 292; etc. I assume that this is because the camera cannot compress the image as fast as the data is requested from the bus, and simply sends everything it currently has in its FIFO when a request comes from the host.
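
For anyone post-processing such logs, a minimal sketch in C of decoding one "status:offset:length" word. The struct and function names are hypothetical; only the three-field decimal format comes from usbmon.txt as quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* One isochronous descriptor word from a usbmon text line. */
struct iso_word {
    int      status;  /* 0 on success, negative errno otherwise */
    unsigned offset;  /* slot offset inside the URB buffer */
    unsigned length;  /* actual length of the packet */
};

/*
 * Parse a single word such as "0:768:292".
 * Returns 1 on success, 0 if the word is malformed or has trailing text.
 */
static int parse_iso_word(const char *word, struct iso_word *out)
{
    int consumed = 0;

    assert(word && out);
    if (sscanf(word, "%d:%u:%u%n",
               &out->status, &out->offset, &out->length, &consumed) != 3)
        return 0;
    return word[consumed] == '\0';
}
```

Applied to the second word of the line above, "0:768:292", this yields status 0, offset 768 and length 292. The gap between the length of one packet and the offset of the next is the unused part of each 768-byte slot.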

Thanks,

Best regards,

Nicolas

> One other question on the usbmon part:
>
> 0:0:250
> 0:768:292 0:1536:338 0:2304:362 0:3072:254
>
> usbmon.txt says:
>
> The word consists of 3 colon-separated decimal numbers for
> status, offset, and length respectively.
>
> Can't really explain why your first packet is 250 bytes but the next one
> has an offset of 768.
> The only reason I can think of is that the data is compressed by the cam
> and that 250 is the length of a compressed 768 byte framesegment (or
> maybe a line or so).

Agreed, that bit is a little confusing. I wanted to explain how
isochronous transfers work in the Linux kernel (on the host side), but I
didn't have time to do it.

So, for bulk endpoints, if you submit an URB to ask for 2048 bytes to be
transferred from the device (with an endpoint max packet size of 512
bytes), the EHCI kernel driver/controller will automatically generate
multiple requests, until 4 packets have been transferred (without errors,
in order: bulk transfers guarantee that). The URB callback is then called,
and you can do something with the data.

Now, for isoc transfers, it's a bit different. You can also ask for 4
packets' worth of data, but any of these transfers can fail or only
partially complete, so within your URB you would have 4
"iso_frame_desc" entries (look at struct urb in include/linux/usb.h).
When you submit the URB, you would ask for packet 0 to be put at offset 0
in your buffer, packet 1 at offset 512, etc. When you get your callback,
you need to pick the good data out of the buffer, as there may be some
gaps. Why it has been implemented that way in the kernel, I don't really
know (I don't really see why it would not be possible to append all the
data together; maybe some controller/alignment issues).

Isn't it fair enough to leave open space in case an isochronous transfer fails due to a CRC error (or the device doesn't have any data to transmit)? Seen from a user's point of view, I would prefer to know this rather than have the data compacted without me knowing where in the stream the hole is. I admit this could be implemented in other ways, but I think the implementation follows from the hardware implementation in the EHCI host controller, which will most likely just add the packet size to the memory destination address pointer after each ISO transfer (failed/missing or not)...

Hope what I was trying to say is clear?

PS: In case you are still stuck on the MUSB/NDA issue, you might ask Jason to look at the following thread at TI E2E and contact the TI employee who has been answering it: http://e2e.ti.com/support/dsp/omap_applications_processors/f/447/p/54697/195465.aspx#195465

Best regards - Enjoy your weekends everybody
  Søren

Until we get the docs, would it be good to copy linux-omap and linux-usb, or at least Felipe, Ajay, and Swami?