BBAI-64 and PCIe switch problem(s)

alftel · July 16, 2022, 1:58am

Hello,
While I am aiming to create a pure PCIe cape that allows different modules/cards to be installed and tested, I first want to mimic the whole thing as a bench-top jig. It seems to work fine, except for one important thing - the PCIe switch, which will be at the heart of the envisioned cape. Here are my setups:

PCIe NVME SSD (working fine)
PCIe Network (Intel gigabit) (working fine)
PCIe USB 3.0 Hub (working fine)
PCIe 4-port Switch (no devices behind) (not working)

By “not working” I mean that the board does not boot at all. I do observe PCIe reset through the on-board LED on the PCIe switch board, and correct link negotiation (another LED), but that’s about it. I have already tried a few different PCIe switch boards (different port and lane counts) without any devices attached, with the same result. The 12V rail is correctly supplied to the external board in question through an adapter (an M.2 Key A/E to Key M adapter sandwiched with an M.2 Key M to PCIe connector adapter where the 12V rail is injected towards the test card, plus a high quality shielded PCIe Gen 3 certified cable). I am sure it is not a signal integrity issue - more likely something in SW/firmware land. A PCIe switch is a transparent device by nature and is supposed to work by default, unless there is no support for Type 1 configuration space and boot fails before the kernel starts.

"
PCIe Device can have either Type-0 (Endpoints) or Type-1 ( RC or Switches or Bridges) Configuration Space.

–Type-0 device can have total of 6 BARs while Type-1 can have only 2 BARs.

–BAR gives the information about address space needed by the device.

–Each BAR is 32 bits, of which the lowest 4 bits (3:0) are always Read Only.

– 2^(Position of last R/W bits from least significant bit) = Address window required by particular BAR.
"

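As a rough illustration of the BAR sizing rule in the quote above, here is a minimal Python sketch. It assumes a 32-bit memory BAR and that `readback` is the value read back from the BAR after writing all ones to it (the usual sizing probe). The function names are made up for this example, and 64-bit or I/O BARs would need extra handling.

```python
def lowest_writable_bit(readback: int) -> int:
    """Position of the lowest R/W address bit, or -1 if the BAR is unimplemented."""
    masked = readback & 0xFFFFFFF0  # bits 3:0 are read-only attribute bits
    if masked == 0:
        return -1
    # masked & -masked isolates the lowest set bit
    return (masked & -masked).bit_length() - 1


def bar_window_size(readback: int) -> int:
    """Address window in bytes requested by a 32-bit memory BAR."""
    bit = lowest_writable_bit(readback)
    return 0 if bit < 0 else 1 << bit  # 2^(position of lowest R/W bit)


if __name__ == "__main__":
    # Example readback value chosen for illustration only
    print(hex(bar_window_size(0xFFF00000)))
```

With a readback of `0xFFF00000` the lowest writable bit is bit 20, so the BAR is asking for a 1 MiB window.
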
and

"
Type 0 configuration requests are sent only to the device for which they are intended.

Type 1 configuration requests are sent to switches/bridges on the way; the last one before the actual target device will convert it to type 0.

A device that receives a type 0 request knows that this request is intended for this particular device, so it uses the bus/device number fields in the request to find out what its own bus address is.
"

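To make the Type 0 / Type 1 distinction above a bit more concrete, here is a small Python sketch that decodes the header type byte (offset 0x0E) from a raw configuration space dump and, for a bridge, applies the standard bus-number routing rule. The function names and return strings are hypothetical, chosen only for this illustration; the register offsets are the standard PCI header ones.

```python
def decode_header(config: bytes) -> dict:
    """Decode the header type and, for bridges, the bus number registers."""
    if len(config) < 0x1B:
        raise ValueError("need at least the first 0x1B bytes of config space")
    header_type = config[0x0E]
    layout = header_type & 0x7F  # bits 6:0 select the header layout
    kinds = {0: "Type 0 (endpoint)", 1: "Type 1 (bridge / switch port)"}
    info = {
        "kind": kinds.get(layout, "other"),
        "multi_function": bool(header_type & 0x80),  # bit 7
    }
    if layout == 1:
        info["primary_bus"] = config[0x18]
        info["secondary_bus"] = config[0x19]
        info["subordinate_bus"] = config[0x1A]
    return info


def route_config_request(info: dict, target_bus: int) -> str:
    """How a bridge treats a Type 1 config request aimed at target_bus."""
    if "secondary_bus" not in info:
        raise ValueError("not a Type 1 header")
    if target_bus == info["secondary_bus"]:
        return "convert to Type 0"  # target sits directly below this bridge
    if info["secondary_bus"] < target_bus <= info["subordinate_bus"]:
        return "forward as Type 1"  # target is further downstream
    return "not for this bridge"
```

On Linux the raw bytes can be read from `/sys/bus/pci/devices/<BDF>/config`, where `<BDF>` stands for the address of the device being inspected.
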
I know that this may seem like an advanced question beyond the “general” question pool, but can somebody from the core BBAI-64 development team take a look and advise?

Cheers,
Alex

benedict.hewson · July 16, 2022, 11:11am

Can you connect to the UART connector on the BB? You will then be able to tell how far along the boot process you are getting. That will probably give some hint as to what is going wrong.

silver2row · July 16, 2022, 1:17pm

@benedict.hewson ,

Are you discussing u-boot output?

Seth

P.S. If so, the older UART adapter for the BBAI, not the BBAI-64, does work on the current board in question. I tested it w/ a FTDI adapter. Works! Works w/ the AI and AI-64!

alftel · July 16, 2022, 4:48pm

Yeah, probably the UARTs for early boot and the main processor UART will show something (potentially) - any references to cables and/or part numbers for mating connectors? I do have a few FTDI dongles in the lab that I can use (both TTL and RS-232 levels - I presume the BBAI-64 does require 3.3V TTL level?). At this point anything will help - during boot with the PCIe switch attached nothing happens except the power LED being on, and since all other x1 PCIe cards do work I assume that this is not an electrical issue.

benedict.hewson · July 16, 2022, 9:38pm

Yes, the J3 connector (JST-ZH, 1.5mm pitch) carries the u-boot & kernel serial output.
There is also a little output on the MCU UART; not sure where that is coming from.

alftel · July 16, 2022, 10:24pm

I guess these ones below?
UARTS

Also, I assume that the voltage level should be 3.3V (FTDI has two versions - one with 3.3V levels and another with 5V - it would be nice to mention this in the user guide for clarity)

UART-Description

alftel · July 17, 2022, 8:06pm

While waiting for these debug cables to arrive I looked up the schematic of one of our PCIe switch boards in order to understand whether there are any potential electrical issues, such as over-current on the 3.3V rail. The interface is very clear and does not use the 3.3V rail from the M.2 slot at all - just clock, reset and TX/RX differential pairs - see picture below. The failure most likely takes place at the u-boot level. Once the cables are in hand I will try to capture any debug messages from u-boot (kernel failure is ruled out for now - Debian boot logging via bootlogd is empty). On the other hand, booting the board with the PCIe switch board detached and attaching it afterwards does not lead to any kernel halt, and the reset button on the PCIe switch board correctly exercises link negotiation for the upstream port and all downstream ports, BUT the command “echo 1 > /sys/bus/pci/rescan” hangs the board immediately. Very strange.

test-switch

alftel · July 18, 2022, 3:48am

Checking the errata for TDA4VM silicon, PCIe section, I discovered the item below - I think it explains why the board hangs when it sees a PCIe switch of any kind (I have yet to verify it with a multi-function device as well, if I can find one). Can anybody confirm my assumption that this erratum prevents use of a PCIe switch with TDA4VM (as well as any PCIe multi-function device)?

PCIe-Errata

alftel · July 21, 2022, 5:27pm

Got my serial cables and an FTDI dongle to match - here is the boot sequence with the board hang below. Since I am not getting much traction with these questions, maybe somebody can refer me to TI sources that I can ask?

boot-failure

jkridner · July 31, 2022, 3:17am

Come onto the Slack using the GSoC invite: BeagleBoard.org - gsocmeet. There might be a TI person there to check with. Also, IRC can be helpful at times. Of course, the official way to ask TI for support is https://e2e.ti.com. Make sure to pose this as a TDA4VM question, rather than a board-level question. Anyway, the errata does seem to explain your issue, so I’m not sure what else there is to discover–perhaps implementing one of the work-arounds in Linux?