GSoC 2016 Application - "Port/implement MAV (drone) stereo image processing"

Hello!

I am Yash Oza (IRC nickname: yashoza), a third-year student at SSN College of Engineering, Chennai, pursuing my degree in the Electrical and Electronics Engineering discipline. I have been toying around with the BeagleBone for about a year now and am truly fascinated by its capabilities. I have been looking to get into open-source development, and I think the best way to start is by contributing to the BeagleBoard community through Google Summer of Code 2016.

As part of GSoC 2016, I want to pursue the project “Port/implement MAV (drone) stereo image processing, using the BeagleBone Black (via BBIO cape) as Ardupilot platform”. In a wide variety of image processing applications, explicit depth information is required in addition to the scene’s gray-value information (representing intensities, color, densities, etc.). Examples of such applications are found in 3-D vision (robot vision, photogrammetry, remote sensing systems), in medical imaging (computed tomography, magnetic resonance imaging, microsurgery), in remote handling of objects, for instance in inaccessible industrial plants or in space exploration, and in visual communications aiming at virtual presence (conferencing, education, virtual travel and shopping, virtual reality). In each of these cases, depth information is essential for accurate image analysis or for enhancing realism. In remote sensing, the terrain’s elevation needs to be accurately determined for map production; in remote handling, an operator needs precise knowledge of the three-dimensional organization of the area to avoid collisions and misplacements; and in visual communications, the quality and ease of information exchange benefit significantly from the high degree of realism provided by scenes with depth.

So I have broken down the project into different phases, as discussed with the mentors on the IRC channel.

The first phase of the project is the acquisition of data from the two mounted stereo cameras using the PRUs. The PRUs will be used for capturing the image data provided by the cameras. The camera I am planning on using is the FLIR Lepton [0]. It is an easy-to-interface camera and is widely adopted in the maker community. When the data is ready to be imported from the camera, a trigger signal (a MAVLink message) will be sent to the PRU and the data from the camera will be captured. For the data transfer, vrings would be implemented, as the observations made in [1] indicate that this would be the fastest method for communication. This is the first module, and it would be developed in such a way that later developers can use it as a standalone capture interface for stereo images. [2.5 weeks]
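
To make the capture interface concrete, below is a minimal userspace sketch (not the final PRU firmware) that reads one Lepton frame over SPI through Linux spidev. The packet handling follows the public Lepton VoSPI description, but the device node, SPI clock, and resync logic are assumptions for illustration only.

    // Minimal sketch: read one FLIR Lepton frame (80x60, VoSPI) over SPI from
    // userspace via spidev. Device path and SPI settings are assumptions; in
    // the project this capture loop would live on a PRU instead.
    #include <cstdint>
    #include <cstdio>
    #include <vector>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/spi/spidev.h>

    constexpr int PACKET_SIZE = 164;        // 4-byte header + 160 bytes (80 pixels x 16 bit)
    constexpr int PACKETS_PER_FRAME = 60;   // one packet per image line

    int main() {
        int fd = open("/dev/spidev1.0", O_RDWR);   // assumed SPI device node
        if (fd < 0) { perror("open"); return 1; }

        uint8_t mode = SPI_MODE_3;                 // Lepton VoSPI uses SPI mode 3
        uint32_t speed = 10000000;                 // 10 MHz, within the Lepton's limits
        ioctl(fd, SPI_IOC_WR_MODE, &mode);
        ioctl(fd, SPI_IOC_WR_MAX_SPEED_HZ, &speed);

        std::vector<uint16_t> frame(80 * 60);
        uint8_t packet[PACKET_SIZE];

        int row = 0;
        while (row < PACKETS_PER_FRAME) {
            if (read(fd, packet, PACKET_SIZE) != PACKET_SIZE) continue;
            if ((packet[0] & 0x0F) == 0x0F) { row = 0; continue; }   // discard packet: resync
            int line = ((packet[0] & 0x0F) << 8) | packet[1];        // packet (line) number
            if (line != row) { row = 0; continue; }                  // lost sync, restart frame
            for (int col = 0; col < 80; ++col)                       // 16-bit big-endian pixels
                frame[line * 80 + col] = (packet[4 + 2 * col] << 8) | packet[5 + 2 * col];
            ++row;
        }
        printf("captured one 80x60 frame, first pixel = %u\n", frame[0]);
        close(fd);
        return 0;
    }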

The next phase of the project would be to use the MAVLink/MAVCONN libraries to establish the communication channels that move the image data around. The library provides low-latency communication between processes (on the order of 100 microseconds). It also supports ROS and the LCM library for inter-process communication. The results reported in [2] indicate that the LCM library is very efficient and would do a good job. [1.5 weeks]
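
As a rough sketch of the message path, the snippet below packs a MAVLink message with the generated C headers and sends it over UDP. A HEARTBEAT stands in for the project-specific trigger/status message, and the include path, endpoint, and system/component IDs are assumptions.

    // Sketch of packing and sending a MAVLink message with the generated C
    // headers (common dialect assumed). The UDP endpoint is an arbitrary example.
    #include <common/mavlink.h>        // assumed include path for the MAVLink C library
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        sockaddr_in dest{};
        dest.sin_family = AF_INET;
        dest.sin_port = htons(14550);                      // typical GCS port, assumption
        inet_pton(AF_INET, "127.0.0.1", &dest.sin_addr);

        mavlink_message_t msg;
        uint8_t buf[MAVLINK_MAX_PACKET_LEN];

        // system id 1, component id 200 (arbitrary), marked as an onboard controller
        mavlink_msg_heartbeat_pack(1, 200, &msg,
                                   MAV_TYPE_ONBOARD_CONTROLLER, MAV_AUTOPILOT_INVALID,
                                   0, 0, MAV_STATE_ACTIVE);
        uint16_t len = mavlink_msg_to_send_buffer(buf, &msg);

        sendto(sock, buf, len, 0, reinterpret_cast<sockaddr*>(&dest), sizeof(dest));
        printf("sent %u-byte MAVLink heartbeat\n", len);
        close(sock);
        return 0;
    }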

The following phase of the project is the image processing part on the PowerVR SGX using the OpenCLIPP (OpenCL Integrated Performance Primitives) library [3]. OpenCLIPP is a library providing image processing primitives implemented with OpenCL for fast execution on dedicated computing devices such as GPUs. Using OpenCLIPP, a depth map will be generated from the images obtained from the stereo cameras [4]. The paper in [5] demonstrates how mobile robot navigation can be accomplished by generating a depth map (disparity map) through stereo image processing. Using the established MAVLink framework, the depth map will be sent to main memory. The depth map can be further used for other image processing applications, whose results can be overlaid on it. [2 weeks]
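
The disparity computation itself is easiest to illustrate on the CPU. The sketch below uses OpenCV's block matcher (OpenCV 3 API, mirroring [4]) purely as a reference for what the OpenCLIPP/SGX version has to produce; the file names and matcher parameters are placeholders to be tuned on real frames.

    // CPU-side reference for the disparity computation described above; the
    // project itself would run equivalent primitives on the PowerVR SGX
    // through OpenCLIPP.
    #include <opencv2/opencv.hpp>

    int main() {
        cv::Mat left  = cv::imread("left.png",  cv::IMREAD_GRAYSCALE);
        cv::Mat right = cv::imread("right.png", cv::IMREAD_GRAYSCALE);
        if (left.empty() || right.empty()) return 1;

        // 64 disparity levels, 15x15 matching block; values to be tuned on real frames
        cv::Ptr<cv::StereoBM> bm = cv::StereoBM::create(64, 15);
        cv::Mat disparity16, depthMap;
        bm->compute(left, right, disparity16);             // fixed-point (CV_16S) disparities

        disparity16.convertTo(depthMap, CV_8U, 255.0 / (64 * 16.0));   // scale for viewing
        cv::imwrite("disparity.png", depthMap);
        return 0;
    }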

After the individual modules are built and ready, they will be integrated and the entire system will be optimized. This would involve increasing the frame rate captured from the camera and increasing the resolution of the captured image. Depending on the results, and with help from my mentors, the optimal configuration will be chosen and verified. [2 weeks]

The next step would be to integrate the system into ArduPilot through MAVLink. After this comes the period where I fix any unexpected bugs, do integration checks, carry out further improvements, complete the documentation, and create tutorials so my work can be reused by anyone who needs it.

[0] https://groupgets.com/manufacturers/flir/products/flir-lepton

[1] https://hackaday.io/project/5837-pruss-support-for-newer-kernels

[2] http://people.csail.mit.edu/albert/pubs/2010-huang-lcm-iros.pdf

[3] https://github.com/CRVI/OpenCLIPP

[4] http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_calib3d/py_depthmap/py_depthmap.html

[5] http://arxiv.org/pdf/1412.6153v1.pdf

I have also drawn up a timeline, which I am still optimizing with the mentors.

  • Week 0

Aim : To get the working environment set up, connect the hardware, and achieve basic readouts from the sensors

This time frame will be used to get the basic working environment set up, including the hardware and the software. The stereo camera arrangement will be completed so that I can be up and running as the coding period starts. During this period I also aim to complete the labs suggested by Stephen Arnold and to get a better understanding of the MAVLink library, which is an instrumental part of my project.

  • Week 1, 2

Aim : To set up data reception to the PRU from the stereo camera

This would involve getting a continuous stream of images, 80x60 pixels in size, initially at 10 fps. As soon as the data is ready to be captured, a MAVLink message will be sent and the data acquisition process will start. A vring implementation would facilitate the data transfer (a host-side sketch follows below). As explained in the project briefing, the data capture module would be set up in such a way that later developers can use it as a standalone capture interface for stereo images.
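
The host-side sketch referenced above assumes the TI rpmsg_pru character-device driver (/dev/rpmsg_pru30) and its roughly 496-byte vring payload limit; the device name, trigger string, and chunking scheme are assumptions for this proposal.

    // Host-side sketch of the vring/rpmsg path: trigger the PRU and read back
    // one frame of image data through the rpmsg character device.
    #include <algorithm>
    #include <cstddef>
    #include <cstdio>
    #include <cstring>
    #include <fcntl.h>
    #include <unistd.h>

    constexpr size_t RPMSG_PAYLOAD_MAX = 496;   // 512-byte vring buffer minus header

    int main() {
        int fd = open("/dev/rpmsg_pru30", O_RDWR);   // created by the rpmsg_pru driver
        if (fd < 0) { perror("open"); return 1; }

        // Any write doubles as the capture trigger in this sketch.
        const char trigger[] = "capture";
        write(fd, trigger, sizeof(trigger));

        // One 80x60 16-bit frame (9600 bytes) arrives as a series of rpmsg chunks.
        unsigned char frame[80 * 60 * 2];
        size_t received = 0;
        while (received < sizeof(frame)) {
            unsigned char chunk[RPMSG_PAYLOAD_MAX];
            ssize_t n = read(fd, chunk, sizeof(chunk));   // blocks until the PRU replies
            if (n <= 0) break;
            size_t copy = std::min(static_cast<size_t>(n), sizeof(frame) - received);
            memcpy(frame + received, chunk, copy);
            received += copy;
        }
        printf("received %zu of %zu bytes\n", received, sizeof(frame));
        close(fd);
        return 0;
    }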

  • Week 3, 4

Aim : To establish the communication channels

The aim of this working period would be to set up the MAVLink/MAVCONN libraries to establish the communication channels used to move the image data around. This point could be a bottleneck, and the rate at which data is exchanged will have to be synchronized with the rate at which data is obtained from the camera (see the sketch below for one way to decouple the two).
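
One illustrative way to handle that bottleneck is a small bounded queue between the capture thread and the transport thread that drops the oldest frame when the transport falls behind, so capture never stalls. All names in this sketch are hypothetical.

    // Bounded frame queue decoupling capture rate from transport rate.
    #include <condition_variable>
    #include <cstdint>
    #include <deque>
    #include <mutex>
    #include <vector>

    using Frame = std::vector<uint16_t>;   // one 80x60 image

    class FrameQueue {
    public:
        explicit FrameQueue(size_t capacity) : capacity_(capacity) {}

        void push(Frame f) {                       // called from the capture thread
            std::lock_guard<std::mutex> lock(mtx_);
            if (frames_.size() == capacity_)
                frames_.pop_front();               // drop oldest instead of blocking capture
            frames_.push_back(std::move(f));
            ready_.notify_one();
        }

        Frame pop() {                              // called from the transport thread
            std::unique_lock<std::mutex> lock(mtx_);
            ready_.wait(lock, [this] { return !frames_.empty(); });
            Frame f = std::move(frames_.front());
            frames_.pop_front();
            return f;
        }

    private:
        size_t capacity_;
        std::deque<Frame> frames_;
        std::mutex mtx_;
        std::condition_variable ready_;
    };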

  • Week 5, 6

Aim : Perform stereo image processing to obtain the depth map in the SGX

The next task would be to take the images obtained from the PRU and use them to perform stereo image processing. This would be done on the PowerVR SGX using the OpenCLIPP library. From here, the depth map will be obtained and made available in main memory so that it can be used by other applications.

  • Week 7, 8

Aim : Optimizations

The following couple of weeks would be spent optimizing the data capture rate and the resolution of the captured frames. I plan on giving the user a choice of frame rate and resolution (e.g., a robot that is moving very slowly might prefer a higher resolution at a lower frame rate).

  • Week 9, 10, 11, 12

Aim : Integration with ArduPilot, Bug fixes, improvements and documentation

This period would be used for integration with ArduPilot using MAVLink. A check will also be done to see that all the integrated modules are working as expected. Any bugs encountered will be fixed and possible improvements made in this period. The documentation, code commenting checks, and the submission work will also be taken care of.

The timeline is just a rough draft, and I will work with the mentors to optimize it and make it achievable!

I request all the mentors to review my application and provide feedback! It would help to a great degree!

Yash Oza

Link to the Google doc: https://docs.google.com/document/d/1WDq5zcQAIUx6BA3M5C4u7m_gzjdtIAhn28mI-8iMgkQ/edit

You might want to ask for a BeagleBone Blue (preferred), with an alternate of a Black plus a cape such as the bbmini or a protocape with supported sensors (a Pixhawk fire cape is another option).

Otherwise looks good, so keep optimizing the timeline (add some dedicated testing periods to go with each “thing” that is testable).

Steve