GSoC Interested to contribute

Hello all :slight_smile:
I'm Steven from Germany, and I'm very interested in contributing to the BeagleBoard community in this year's GSoC!
I'm interested in GPGPU with GLES, as listed on the ideas page. The BeagleBoard's GPU only supports OpenGL ES 2.0, so doing general-purpose computation with this API is not that straightforward. I could set up some samples, together with a tutorial on how to do convolution or matrix multiplication on the GPU.
This could be used to accelerate a deep learning framework.
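
To make the idea a bit more concrete: OpenGL ES 2.0 has no compute shaders, so the usual workaround is to draw a quad covering the output and let each fragment compute one output element. Below is a minimal CPU-side sketch of that formulation in Python. It is not GLES code, and the names `fragment_matmul` and `gpu_style_matmul` are made up for illustration; each call to `fragment_matmul` stands in for one fragment shader invocation.

```python
def fragment_matmul(a, b, row, col):
    """Work done by one hypothetical fragment: a single dot product.

    On the GPU, `a` and `b` would be textures and (row, col) would come
    from the fragment's position in the output framebuffer.
    """
    return sum(a[row][k] * b[k][col] for k in range(len(b)))


def gpu_style_matmul(a, b):
    """Run the per-fragment function once for every output element."""
    rows, cols = len(a), len(b[0])
    return [[fragment_matmul(a, b, r, c) for c in range(cols)]
            for r in range(rows)]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(gpu_style_matmul(a, b))
```

The point of writing it this way is that no fragment depends on another fragment's result, which is the property a fragment shader version would need.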
I'm looking for a mentor to discuss this idea and set up a reasonable scope for the project :slight_smile:
If you have ideas for what you would like to use the GPU on a BeagleBoard for, please let me know :slight_smile:

Best Regards

ds2 on IRC might be helpful. Also, can you check out

Hi Steven,
Imagination have recently published code + documentation on how to use OpenCL on the SGX530 on the BeagleBone Black. This gives the matrix-type examples you were proposing. The key thing is that on this generation of graphics core, OpenCL is not faster than C or NEON for mathematical operations. The benefit comes from being able to offload the processing from the A8 core, leaving it free for other work.
You can request a download of the materials from

Hi Steven,

How familiar are you with GLES 2 and GPGPU in GLES SL?

When I wrote up the project idea, what I had in mind was to do 2 things:
Create some sample code for computation in GLES SL plus the ARM side. Things
like matrix multiplication and convolution are what I have in mind. But
it may be useful to also do a basic example (addition of 2 matrices showing
how data flows from the ARM side along with the abstractions to the GPU and

The 2nd half of it is to generate some comparative timing numbers between the
GLES SL version and an optimized ARM version. The exact types of comparison are
open for discussion as there are several aspects that could be/should be

Please try to hang out in the #beagle-gsoc channel to speak to mentors.

Hey Hunyue,
I'm very familiar with OpenGL and I have done some GPGPU using OpenCL. OpenGL ES is just a subset of the “normal” OpenGL, so it should be no problem for me to adapt. I don't have experience with GPGPU using GLES 2, as there was no reason for me to do that when you can just use compute shaders or OpenCL/CUDA, but I have some ideas on how to do it. In general, you can use uniforms, textures or vertex arrays to send data to the GPU and read the results back from the framebuffer. I could research how to do this most effectively depending on the use case.
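
As a rough sketch of the texture route, assuming one matrix element per texel: a shader would look up element (row, col) at the centre of the corresponding texel, which in normalised texture coordinates is ((col + 0.5) / cols, (row + 0.5) / rows). The helper name `texel_center` is hypothetical, and this is plain Python to show the arithmetic only.

```python
def texel_center(row, col, rows, cols):
    """Normalised (u, v) coordinate of the centre of texel (row, col).

    Assumes the matrix is stored one element per texel, with columns
    along u and rows along v.
    """
    u = (col + 0.5) / cols
    v = (row + 0.5) / rows
    return (u, v)


# Corners of a 2x4 matrix stored in a 4-wide, 2-high texture.
print(texel_center(0, 0, rows=2, cols=4))
print(texel_center(1, 3, rows=2, cols=4))
```

Sampling at texel centres matters because sampling on a texel edge can return a neighbouring or blended value, depending on the filtering mode.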
I'm not sure how difficult it is to get all the OpenGL ES drivers running on a BeagleBoard. Do you have a specific board in mind for the project?
I could do some timings to show the difference for a computation, like matrix multiplication, running on the CPU vs. the GPU. I'm not sure if it will be faster on the SGX530 GPU, but it could still free up the CPU to do other computations.
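
For the CPU side of such a comparison, a timing harness could look roughly like the sketch below. It is plain Python rather than optimized ARM code, so the absolute numbers mean little; `matmul` and `time_call` are illustrative names, and a GPU version would be timed by passing a different function to `time_call`.

```python
import time


def matmul(a, b):
    """Naive triple-loop matrix multiplication (CPU reference)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][c] for k in range(inner)) for c in range(cols)]
            for row in a]


def time_call(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


n = 32
a = [[float(i + j) for j in range(n)] for i in range(n)]
identity = [[float(i == j) for j in range(n)] for i in range(n)]
seconds = time_call(matmul, a, identity)
print(f"{n}x{n} matmul: {seconds * 1e3:.3f} ms (best of 3)")
```

Taking the best of several runs is one way to reduce noise from other processes; for a GPU path the measurement would also need to include upload and read-back time to be a fair comparison.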

Hi Steven,

I suggest we try to talk on #beagle-gsoc if we can find a mutual time. I am on
PDT but my schedule is a bit odd.

Getting the drivers going on the BeagleBoard/BeagleBone does take some tricks
but I can help with that. I went through the process of setting it up before
and there are other resources that can help with that. For this usage, a lot
of the problems with SGX and X won't necessarily apply as we just need the
ability to render off screen. The main difference is getting the initial context
from EGL.

In my opinion, I'd suggest targeting the AM335x family of BeagleBones. The
PocketBeagle is kind of handy for this. Other mentors may suggest another
board but we can talk about it.

That seems like a good idea! I tried to write to you but got no response. When are you typically online?
In the meantime I set up a first draft for the proposal:

Hi Steven,

To help with your comparisons here are the matrix multiplication timing results for the OpenCL example in the Fun with Beagle package. They are based on an [MxN]x[NxP] multiplication.


So for the [256x256]x[256x256] case the OpenCL implementation on GPU is faster than a native A8 but still significantly slower than using Neon intrinsics. The main benefit however is that A8 is offloaded.
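
When comparing timings across sizes it can help to normalise by the amount of work. An [MxN]x[NxP] multiplication needs M*N*P multiply-adds, i.e. 2*M*N*P floating-point operations. A small sketch in Python follows; the function names are illustrative, and the 1.0 s used in the example is a placeholder, not a measured result.

```python
def multiply_adds(m, n, p):
    """Multiply-add count for an [MxN]x[NxP] matrix multiplication."""
    return m * n * p


def mflops(m, n, p, seconds):
    """Throughput in MFLOP/s, counting each multiply-add as 2 operations."""
    return 2 * multiply_adds(m, n, p) / seconds / 1e6


print(multiply_adds(256, 256, 256))
# Placeholder timing of 1.0 s, only to show how the conversion works.
print(mflops(256, 256, 256, seconds=1.0))
```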

For your proposal the interesting point for me would be whether a GL ES implementation would be faster than OpenCL. It is not something I tried as I had no experience in using GL ES for computation.


Hey Iain, thank you very much for that info!
I requested a download of the course material, so maybe I can try it out myself.
I saw that you created that course, that is very cool! Do you know if there are plans to release the OpenCL driver to the public? Also, I wonder how an OpenCL driver is possible in the first place. The datasheet for the SGX530/550 says only OpenGL ES 2.0 is supported.
Why do you think an OpenGL ES implementation could be faster? I doubt it will make much of a difference, since OpenGL ES 2.0 is not targeted at general-purpose computation.

Best Regards,

Hi Steven,
The OpenCL driver is “as is” and so there will be no further releases of libraries and no source. I’m assuming by releasing for public you mean in source code. That will not happen.
A datasheet only describes what a manufacturer wants to support in public at time of publication, hence no mention of OpenCL from Imagination or TI.
I didn’t mean to suggest OpenGL ES would be faster, I was just trying to highlight the potential value of your work in being able to relate existing OpenCL to OpenGL ES.

Hey Iain,
Thank you again. Is it okay if I use your matrix multiplication timing figure in my proposal / future eLinux tutorial? Also, do you know where I can find more information about the SGX 530/544/550, like a datasheet? At the moment I'm researching which texture formats are supported. I found this datasheet, apparently for the SGX 530 (?),
where it states RGBA8 is supported. I wonder how the other SGX GPUs compare in that regard and where I can find this information?
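
If only 8-bit RGBA turns out to be usable, one common workaround is to spread a value across the four channels of a texel. The sketch below shows the arithmetic in Python for a value in [0, 1) stored as 32-bit fixed point. The function names are hypothetical, and whether a shader on the SGX can decode this with enough precision is something that would need to be tested.

```python
def pack_unit_float(value):
    """Encode a value in [0, 1) as four bytes (R, G, B, A), high byte first."""
    if not 0.0 <= value < 1.0:
        raise ValueError("value must be in [0, 1)")
    fixed = int(value * 2**32)  # 32-bit fixed-point representation
    return tuple((fixed >> shift) & 0xFF for shift in (24, 16, 8, 0))


def unpack_unit_float(rgba):
    """Decode four bytes produced by pack_unit_float back to a float."""
    fixed = 0
    for byte in rgba:
        fixed = (fixed << 8) | byte
    return fixed / 2**32


texel = pack_unit_float(0.5)
print(texel)
print(unpack_unit_float(texel))
```

Values outside [0, 1) would first have to be scaled into that range on the ARM side and scaled back after read-back.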

Best Regards