This is the weekly report for the GSoC Project OpenGLES Acceleration for DL
Hello Everybody. Here’s my Week 0 progress along with the targets set for Week 1.
Week 0 - Community Bonding Period
Tasks Done
- Went through the BeagleBone AI-64 documentation and set up the hardware and the SSH connection.
- Introductory Video submitted.
- Understanding Vulkan APIs
Week 1 - Coding Period Begins
- Implement the backend on the local machine and benchmark its performance.
- Also implement the backend on the AI-64 board and compare the performance.
- Research the TIDL implementation.
- Make a blog post.
Week 1
Task Done:-
- Researched and became familiar with Vulkan.
- Implemented and benchmarked Darknet on the host.
- Researched the TIDL implementation.
Summary of Week 1:-
During the first week of the project, the main objective was to benchmark the performance of Darknet, an open-source neural network framework, on a host machine (laptop). Additionally, efforts were made to understand the compatibility and optimization requirements for targeting Darknet on a BeagleBoard. Furthermore, research was conducted on TIDL (Texas Instruments Deep Learning) implementation to explore its potential for optimizing deep learning execution on TI processors.
Blockers:-
For porting and testing the model, I'm using the BeagleBone AI-64. I connected the board to my laptop's USB-C port and to a 5 V adapter. However, the boot process stalled. I tried both flasher and non-flasher images on the SD card, but nothing changed. I'll discuss this with the mentors to find a solution.
Week 2:-
Tasks
- Porting Darknet to the Target Platform (BeagleBoard)
- Research TIDL APIs
- Edge AI
Week 2
Task Done:-
- Porting Darknet to the Target Platform Beaglebone AI-64.
- Research TIDL APIs.
- Went through Edge AI.
Blockers:-
After I talked to the mentors, they suggested that I use the UART and check the logs. I used Minicom to interact with the board. By default, the device name was set to /dev/modem. I changed it to /dev/ttyUSB0. I connected the board while pressing the boot button, and the logs are now visible. I then got the prompt, gave the boot command, and the board booted successfully. The prompt now asks me to update the kernel, so I am working on that.
Week 3
Tasks:-
- Sharing the cross-compiled darknet folder to the BBAI-64.
- Implementation and benchmarking on the BBAI-64.
Week 3
Task Done:-
- Sharing the cross-compiled darknet folder to the BBAI-64.
- Implementation and benchmarking on the BBAI-64.
Blocker:-
The 32 GB SD card has now been flashed with a non-flasher image. After I inserted the SD card into the board, 'df -h' reported that 80% of the card was already in use, which points to a partitioning problem.
Week 4
- Create a reproducible script to benchmark the host and compare the results.
- Start with matrix multiplication in OpenGLES.
- Inspect the TIDL code and look for necessary APIs.
Week 4
Task Done:-
- Made a reproducible script to benchmark on host and other platforms.
- Worked on Matrix Multiplication code.
Week 5
- Complete the Matrix Multiplication code.
- Benchmarking of the matrix multiplication on host and Beaglebone AI-64
Week 5
Task Done:-
- Completed the matrix multiplication code.
- Benchmarking of the matrix multiplication on host and Beaglebone AI-64.
Blockers
- I spent some time going over the OpenGLES API, which is helpful for the matrix multiplication code.
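For comparison with a GPU version, a plain CPU baseline is useful. The following is only a minimal sketch of such a baseline, not the project's actual code: the names matmul_naive and time_matmul_ms are hypothetical, matrices are assumed to be row-major, and clock() measures CPU time rather than wall-clock time.

```c
#include <assert.h>
#include <time.h>

/* C = A * B, where A is m x k, B is k x n and C is m x n (all row-major). */
void matmul_naive(const float *a, const float *b, float *c,
                  int m, int n, int k)
{
    assert(a && b && c && m > 0 && n > 0 && k > 0);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int p = 0; p < k; ++p)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Average CPU time in milliseconds per multiplication over `runs` repetitions. */
double time_matmul_ms(const float *a, const float *b, float *c,
                      int m, int n, int k, int runs)
{
    assert(runs > 0);
    clock_t start = clock();
    for (int r = 0; r < runs; ++r)
        matmul_naive(a, b, c, m, n, k);
    clock_t end = clock();
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC / runs;
}
```

Running time_matmul_ms with the same matrix sizes on the host and on the board gives numbers that can be compared directly with the OpenGLES timings.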
Week 6
- Making the forward_convolutional_layer function use OpenGLES.
- Get the reading of the input buffer parameters.
Thank you for working on this project.
Another big help would be to document your source using doxygen format. This will make your work more valuable to others when they need to work on it. Doxygen will also help you tremendously when developing by providing you with a visual guide of what is going on.
doxywizard makes setting up your project very simple.
$ sudo apt install doxygen
$ sudo apt install doxygen-gui
Week 6
- Added the algorithm for the forward pass computation.
- Added input buffer objects and allocated memory for each buffer.
Week 7
- Adding at least one layer and successfully running it on the GPU.
Thank you so much for your feedback @foxsquirrel
I will definitely try doxygen.
For the time being, you can take a look at my blog post. I have documented everything and would be thankful enough to have your feedback.
Good work.
Doxygen has several comment formats, so the comments will get picked up with the function.
/**comments for the function start here
*/
They have several ways to do it, so whatever you prefer is the best.
Leave enough doc so you can come back to the project 6 months later and be able to understand what you did originally. That is sometimes good enough for others to look at it for the first time.
I try to leave some docs at the top that give an overview of the project. If you must use some not-so-obvious and possibly confusing code to work around a rough spot, be sure to doc that too.
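As a rough illustration of the comment style described above, here is a small sketch. The function scale_array is hypothetical and exists only to show the syntax; @file, @brief, @param, @pre and @return are standard Doxygen commands.

```c
#include <assert.h>
#include <stddef.h>

/**
 * @file
 * @brief Hypothetical helper used only to illustrate Doxygen comment syntax.
 */

/**
 * @brief Multiply every element of an array by a constant factor, in place.
 *
 * @param[in,out] data   Array to scale.
 * @param[in]     n      Number of elements in @p data.
 * @param[in]     factor Value each element is multiplied by.
 * @pre @p n is not negative.
 * @return Number of elements that were scaled, or -1 if @p data is NULL.
 */
int scale_array(float *data, int n, float factor)
{
    assert(n >= 0);
    if (data == NULL)
        return -1;
    for (int i = 0; i < n; ++i)
        data[i] *= factor;
    return n;
}
```

The block directly above the function is what Doxygen attaches to it in the generated pages; the @file block at the top is the place for the overview of the source file.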
Keep up the good work.
Week 7
- Added GPU versions of fill_cpu, im2col_cpu, and General Matrix Multiplication.
Blockers
- Errors in GPU kernel code and support for OpenGLES 2.0.
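When debugging GPU kernel errors it helps to have a CPU reference to compare against. The sketch below shows the usual im2col idea (unrolling image patches into columns so that a convolution becomes one matrix multiplication). It is written from scratch for illustration, so the names im2col_pixel and im2col_sketch are hypothetical and the memory layout (channel, row, column) is an assumption, not a copy of Darknet's im2col_cpu.

```c
#include <assert.h>

/* Read one pixel; coordinates that fall into the zero padding return 0. */
float im2col_pixel(const float *im, int height, int width,
                   int row, int col, int channel, int pad)
{
    row -= pad;
    col -= pad;
    if (row < 0 || col < 0 || row >= height || col >= width)
        return 0.0f;
    return im[col + width * (row + height * channel)];
}

/*
 * Unroll ksize x ksize patches of `im` into `col`.
 * `col` must hold (channels * ksize * ksize) * out_h * out_w floats, where
 * out_h = (height + 2*pad - ksize) / stride + 1 and out_w is computed
 * the same way from width.
 */
void im2col_sketch(const float *im, int channels, int height, int width,
                   int ksize, int stride, int pad, float *col)
{
    assert(im && col && ksize > 0 && stride > 0 && pad >= 0);
    int out_h = (height + 2 * pad - ksize) / stride + 1;
    int out_w = (width + 2 * pad - ksize) / stride + 1;
    int rows = channels * ksize * ksize;

    for (int c = 0; c < rows; ++c) {
        int w_off = c % ksize;            /* column offset inside the kernel */
        int h_off = (c / ksize) % ksize;  /* row offset inside the kernel */
        int ch = c / ksize / ksize;       /* input channel */
        for (int h = 0; h < out_h; ++h) {
            for (int w = 0; w < out_w; ++w) {
                col[(c * out_h + h) * out_w + w] =
                    im2col_pixel(im, height, width,
                                 h_off + h * stride, w_off + w * stride,
                                 ch, pad);
            }
        }
    }
}
```

Feeding the same small input to the CPU sketch and to the GPU kernel, then comparing the two output buffers element by element, narrows down where a kernel goes wrong.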
Week 8
- Adding GPU support for add_bias, activate_array.
- Adding a layer (maxpool layer).
Week 8
- Added GPU support for add_bias, activate_array.
- Added maxpool layer.
Blockers
- Facing __glewActiveTexture error.
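For reference, the two operations being moved to the GPU are simple on the CPU side. This is only a sketch under assumptions: the names add_bias_sketch and leaky_activate_sketch are hypothetical, the output is assumed to be laid out as batch x filters x spatial size, and leaky ReLU with a caller-chosen slope stands in for whichever activation the layer actually uses.

```c
#include <assert.h>

/* Add biases[i] to every value of filter i, for each image in the batch. */
void add_bias_sketch(float *output, const float *biases,
                     int batch, int n, int size)
{
    assert(output && biases);
    for (int b = 0; b < batch; ++b)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < size; ++j)
                output[(b * n + i) * size + j] += biases[i];
}

/* Leaky ReLU in place: positive values pass through, others are scaled. */
void leaky_activate_sketch(float *x, int n, float slope)
{
    assert(x);
    for (int i = 0; i < n; ++i)
        if (x[i] <= 0.0f)
            x[i] *= slope;
}
```

Both loops touch each element independently, which is why they map naturally onto one shader invocation per output element.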
Week 9
- Resolve the __glewActiveTexture error.
- Successfully execute Maxpool layer.
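To check the output of a maxpool layer running on the GPU, a small CPU version can serve as the expected result. The sketch below assumes a single channel and no padding; maxpool_sketch is a hypothetical name, not a function from the project.

```c
#include <assert.h>
#include <float.h>

/*
 * Max pooling over one channel without padding.
 * `out` must hold out_h * out_w floats, where
 * out_h = (height - size) / stride + 1 and out_w is computed likewise.
 */
void maxpool_sketch(const float *in, int height, int width,
                    int size, int stride, float *out)
{
    assert(in && out && size > 0 && stride > 0);
    assert(height >= size && width >= size);
    int out_h = (height - size) / stride + 1;
    int out_w = (width - size) / stride + 1;

    for (int i = 0; i < out_h; ++i) {
        for (int j = 0; j < out_w; ++j) {
            float best = -FLT_MAX;
            /* Scan the size x size window anchored at (i*stride, j*stride). */
            for (int di = 0; di < size; ++di) {
                for (int dj = 0; dj < size; ++dj) {
                    float v = in[(i * stride + di) * width + (j * stride + dj)];
                    if (v > best)
                        best = v;
                }
            }
            out[i * out_w + j] = best;
        }
    }
}
```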
Week 9
- Resolved the __glewActiveTexture error.
- Added vertex and fragment compute shader code.
Blockers
- Facing error while linking the shaders
Week 10
- Compiling and linking the shaders
- Adding compute code for other functions
Week 10
- Successfully linked the compute shaders
- Upgraded the compute shader code for add_bias, fill_gpu.
Blockers
- No blockers as such. I was facing an issue while compiling the code, which will be resolved.
Week 11
- Adding predictions for calculating number of layers
- Function to determine the class ID from a given prediction value.
Week 11
- Added predictions for calculating the number of layers.
- Added a function to determine the class ID, confidence, and class name from prediction values.
Blockers
- No blockers
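One common way to turn prediction values into a class ID, confidence, and class name is to pick the highest score. The sketch below shows that idea only; the struct class_result and the function best_class are hypothetical names, and it assumes one score per class with a matching array of class names.

```c
#include <assert.h>

/* Result of picking the most likely class from a score array. */
typedef struct {
    int class_id;       /* index of the highest score */
    float confidence;   /* the highest score itself */
    const char *name;   /* class name at that index */
} class_result;

/* Return the class with the highest score among n candidates. */
class_result best_class(const float *scores, const char *const *names, int n)
{
    assert(scores && names && n > 0);
    int best = 0;
    for (int i = 1; i < n; ++i)
        if (scores[i] > scores[best])
            best = i;
    class_result r = { best, scores[best], names[best] };
    return r;
}
```

On ties the first class with the maximum score wins, because the comparison is strict.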
Week 12
- Modifying the detection structure to include class detection and ID fields.
- Populating and printing class detection results.