Heterogeneous multicore projects and hardware requirements

Hi!

I'm Yaman Umuroglu, currently in my third undergraduate year, with a
passion for embedded Linux and low-level software work. I've been
active in the GP2X and Pandora communities (although more as a lurker
as of late) and would love to be able to work with the BeagleBoard for
GSoC.

I would like to work on one of the projects that utilize the DSP (or
possibly the GPU and/or the NEON instructions with the ARM). I think
a large part of the BeagleBoard's (and, in general, the OMAP3's)
potential lies in its heterogeneous multicores, and being able to
contribute something useful towards this would be very exciting to me.

But there is one question whose answer I could not find by myself:
would it be possible to do development for the DSP or the GPU without
actually having the hardware itself (perhaps with emulators)? Barring
the possibility that the Pandora gets shipped quite soon, and
considering what I've read on the mailing list about the shortage of
supply of BeagleBoards, it does not seem likely that I can come into
possession of the hardware in a short while.

Thanks!
Yaman

It is possible to use the TI simulator, but all approved applicants will get a BeagleBoard.

Do you have any IPC or message-passing experience? Can you digest the architecture of DSP/Link and MSGQ?

I do not have any first-hand practical experience with multiprocessor
programming (except with GPUs and shader programming, which is not
totally relevant :)), and what knowledge I have of how the OMAP3 does
inter-processor communication comes from browsing TI's documentation.
So I have an idea of how the GPP and DSP communicate via message
passing and interrupts, and what the bridge driver and XDAIS are for,
but it's all floating ideas without much solid foundation.

Thus I'm not sure how I can offer solid proof that I'm suitable for
this kind of project. Other GSoC mentoring organizations seem to
expect a demonstration of checking the code out of SVN, altering some
elementary functionality somewhere, and sending a screenshot of the
modified result. I could do something like that, but I doubt how
useful it would be as a demonstration of what I can do on this
subject - any suggestions?

Yaman

Showing that you can rebuild DSP/Link would help. If you can create a binary that attempts to use DSP/Link, I can try running it.

Hi again,

Just wanted to ask for clarification on a couple of things:

- For the app using DSP/Link, is it enough to compile one of the
DSP/Link samples in the package, or shall I try to make something
more, erm, customized? :)

- For building DSP/Link's GPP side and the app itself, should I use
the CodeSourcery toolchain, or is OpenEmbedded's toolchain preferred?

My thanks,
Yaman

Hi again,

Just wanted to ask for clarification on a couple of things:

- For the app using DSP/Link, is it enough to compile one of the
DSP/Link samples in the package, or shall I try to make something
more, erm, customized? :)

- For building DSP/Link's GPP side and the app itself, should I use
the CodeSourcery toolchain, or is OpenEmbedded's toolchain preferred?

From my point of view, being able to carry on a good conversation
about the subject is a good sign you will make things work. As far as
a qualification task, running the dsplink samples from an image built
from Narcissus would be fine with me. (Koen, can Narcissus create
images with working dsplink demos?)

Philip

Hi again,

Just wanted to ask for clarification on a couple of things:

- For the app using DSP/Link, is it enough to compile one of the
DSP/Link samples in the package, or shall I try to make something
more, erm, customized? :)

- For building DSP/Link's GPP side and the app itself, should I use
the CodeSourcery toolchain, or is OpenEmbedded's toolchain preferred?

From my point of view, being able to carry on a good conversation about the subject is a good sign you will make things work. As far as a qualification task, running the dsplink samples from an image built from Narcissus would be fine with me. (Koen, can Narcissus create images with working dsplink demos?)

You can 'opkg install ti-dsplink-examples', but it isn't exposed through Narcissus yet.

regards,

Koen, who gets blocked by Google when posting to bb-gsoc

I've succeeded in building DSP/Link 1.60 and the accompanying sample
apps, using the CodeSourcery Lite toolchain and the other (apparently
standard) dependencies (DSP/BIOS, XDCtools, TI CGT, the Linux base
port/platform support for OMAP3530). But how do I prove it? A
directory tree dump of all the build folders? Sending binaries? :)

One thing that going through this made me understand is that
apparently utilizing the DSP even for simple tasks is not quite
trivial (which is understandable considering the complexity of the
underlying architecture). I still have a lot of documentation I have
to read through to have a better overview of the DSP/Link API and how
to design a multicore processing strategy using the services it
provides.

But in the meantime, I have yet another (more general) question :)

The GSoC project proposal entitled "Compile POSIX applications on a
slave DSP" (http://elinux.org/BeagleBoard/GSoC/Ideas#Compile_POSIX_applications_on_a_slave_DSP)
looks intriguing to me; the idea of making it easier to utilize the
DSP, especially for prototyping purposes, sounds great. But the goal
description confuses me a little: "Build a C64x+ POSIX library that
utilizes resources on the ARM over DSP/Link and compile with a simple
script that looks like 'gcc'."

Why would the DSP utilize the resources on the GPP via a DSP-side
library? Wouldn't it be more useful to have a GPP-side library that
utilizes the resources on the DSP instead? Could someone explain it
to me in simpler terms? :)

I checked the DSPEasy project page (linked as "existing project") but
couldn't glean much information as to how much progress was made, or
just how "DSP ease-of-use" will happen.

My thanks in advance,

Yaman

apparently utilizing the DSP even for simple tasks is not quite
trivial (which is understandable considering the complexity of the
underlying architecture).

The Cell BE Linux was (WAS!!) nice in this regard. To make a simple
'hello world' run on an SPE you just compiled the same source with a
different compiler and ran it on the command line. The pluggable
executable format loader then invoked runspe (or whatever it was, it's
been a while), which loaded the program into the SPU and executed it
through a simple API in libspe2. And when you got the prototyping out
of the way you just used libspe2 directly to load and manage your work
kernels (or even lower-level stuff), which for the most part was also
a very simple API.

Why would the DSP utilize the resources on the GPP via a DSP-side
library? Wouldn't it be more useful to have a GPP-side library that
utilizes the resources on the DSP instead? Could someone explain it
to me in simpler terms? :)

Well, I pose this question: how would you implement 'printf' for
something executing on the DSP? Do you think that's something you
would want if you were prototyping DSP code?

!Z

Oh. No, of course I would prefer to spend time working on my own code
on the DSP instead. Thank you for pointing that out - I suppose this
made me realize how much I take printf for granted :) I also had
somehow misread "a POSIX library" as "a POSIX-compliant library". Now
it all makes much more sense.

I'd like to undertake this project and I'm ready to devote my whole
summer to it, but the architectural complexity scares me a bit as I'm
quite new to heterogeneous multicore processing in general and
OMAP3530's DSP/Link in particular.

Yaman

apparently utilizing the DSP even for simple tasks is not quite
trivial (which is understandable considering the complexity of the
underlying architecture).

The Cell BE Linux was (WAS!!) nice in this regard. To make a simple
'hello world' run on an SPE you just compiled the same source with a
different compiler and ran it on the command line. The pluggable
executable format loader then invoked runspe (or whatever it was, it's
been a while), which loaded the program into the SPU and executed it
through a simple API in libspe2. And when you got the prototyping out
of the way you just used libspe2 directly to load and manage your work
kernels (or even lower-level stuff), which for the most part was also
a very simple API.

Why would the DSP utilize the resources on the GPP via a DSP-side
library? Wouldn't it be more useful to have a GPP-side library that
utilizes the resources on the DSP instead? Could someone explain it
to me in simpler terms? :)

Well, I pose this question: how would you implement 'printf' for
something executing on the DSP? Do you think that's something you
would want if you were prototyping DSP code?

The DSPEasy project implements this, and its source should be opening
up soon. We need to ping the author, Daniel Allred. It uses MSGQ to
pass messages to and from the ARM, so that the stdio services
available through Linux on the ARM can be used. It presents a script
that behaves like GCC but loads the body of the code onto the DSP
using DSP/Link.

If you are a student looking at this project, please contact me
directly by e-mail or by IRC and let's make sure Daniel provides you
access to that code to help you write your application.

Oh. No, of course I would prefer to spend time working on my own code
on the DSP instead. Thank you for pointing that out - I suppose this
made me realize how much I take printf for granted :) I also had
somehow misread "a POSIX library" as "a POSIX-compliant library". Now
it all makes much more sense.

Well, someone has to do it if we all want to use it (and focus on our
DSP algorithms instead). The Codec Engine methodology is great once
you already have an algorithm and want to integrate it, but it is nice
to put off worrying about that until you've already debugged your
algorithm in the normal printf world.

I'd like to undertake this project and I'm ready to devote my whole
summer to it, but the architectural complexity scares me a bit as I'm
quite new to heterogeneous multicore processing in general and
OMAP3530's DSP/Link in particular.

I think if you can take a good look at the DSPEasy project, it might
help you get a good grasp on the complexity.

I've submitted a (rather rough) draft of my proposal now - further
comments / reviews appreciated when you have the time.

And regarding this part:

"If your project is successfully completed, what will its impact be on
the BeagleBoard.org community? Give 3 answers, each 1-3 paragraphs in
length. The first one should be yours. The other two should be answers
received from feedback of members of the BeagleBoard.org community, at
least one of whom should be a BeagleBoard.org GSoC mentor. Provide
email contact information for non-GSoC mentors."

I would be grateful if you could tell me how you think the project
would impact the community if successfully completed - not only to
copy and paste into the application (wink wink), but to give me a
wider perspective so I can tailor specific parts in more depth and
make things better.

Yaman

If successful, it should open up DSP development to more curious
people and not just specialised developers, which can only help to
increase the number of developers able, or at least willing, to
consider the DSP for their projects. This in turn will add extra
processing capability to those applications, which should provide a
better "user experience" for everyone. "Heterogeneous computing" is
also increasingly becoming a common industry solution for computer
applications, so this could attract further developers interested in
the area.