GSoC 2013 Proposal and questions

jkridner · May 2, 2013, 11:34pm

Hi everyone,

My name is Matei Florin and I'm a first-year MSc student at the Faculty of Electronics, Telecommunications and Information Technology of the Politehnica University of Bucharest, Romania. I majored last year in Applied Electronics and I'm currently enrolled in a master's program called “Multimedia technologies used in biometrical and information security applications”.

I've been following the BB project for about 3 or 4 years now, but I've never had the opportunity to work with a BB of any kind because I could not afford one, so it was difficult to contribute to the community. Luckily for me, my university purchased an OMAP3 EVM, which I fell in love with and hijacked as soon as I laid eyes on it. Using the EVM I created my graduation project, called “Home Automation system using speech recognition on an embedded platform”. The project consisted of the OMAP as the brain of the system and 2 other boards which were used as sensors and actuators. The OMAP ran a speech recognition toolkit which performed voice activity detection and continuous speech recognition. Its output was fed into a command detection block, which decided whether the recognized utterance was a predefined command and, if so, sent it out to the actuators.

Are you talking about TIesr? https://gforge.ti.com/gf/project/tiesr/

The actuators were 2 PIC microcontroller boards which I also programmed. They had a wireless network between themselves and one of them had a USB connection to the OMAP. The 2 boards used different technologies, such as RGB LED control and capacitive sensing. Using voice commands you could turn the lights or TV on and off or control something else, and the system also supported distress calls. The speech recognition toolkit I used is CMU's pocketsphinx. On top of the pocketsphinx library I created some C++ wrapper classes so that I could use Qt to create a GUI for the on-board screen.

I have experience working with C/C++, Java, (some) Qt, cross-compiling, rebuilding Linux, low-level PIC programming (8- and 32-bit), some ARM assembly, and board design and layout.

For this year's GSoC I have a personal proposal, and if that is unfit I'd like to ask some questions about one of the proposed ideas.

My proposal:
I've seen OpenCV ported to the BB but I haven't seen anything regarding speech recognition, so I thought I'd pick that task up, seeing as the BB processors are powerful enough to do statistical speech recognition. I was thinking of adding speech recognition support for the BB using CMU Sphinx's pocketsphinx.

This doesn’t sound like it would take more than a day or two to build and test.

That would include (from a general point of view):

at least cross-compiling the libraries, fixing any potential issues (I had some issues with audio capture) and maybe adding some optimizations for the architecture (so far I've only used the compiler optimizations)

create a daemon-like interface which the user could launch to perform voice detection + speech recognition + command detection. Based on the detected command and a config file, another process or piece of code could be loaded or started, i.e. the user could define word(s) ↔ programs/actions relations → saying “forward” or “move forward” would toggle a pin.
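
A rough sketch of the word(s) ↔ action lookup described above, written in Python only for brevity. The `phrase = command` config format, the program paths and the function names are all hypothetical, and the recognizer itself is left out: the sketch assumes it simply hands over the recognized utterance as a string.

```python
import shlex


def parse_command_map(text):
    """Parse 'phrase = command' lines into a dict keyed by normalized phrase."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        phrase, command = line.split("=", 1)
        key = " ".join(phrase.lower().split())  # lowercase, collapse whitespace
        mapping[key] = shlex.split(command)     # argv list, ready for subprocess
    return mapping


def lookup_action(mapping, utterance):
    """Return the argv list for a recognized utterance, or None if it is not a command."""
    key = " ".join(utterance.lower().split())
    return mapping.get(key)


# Hypothetical config: both the phrases and the programs are made-up examples.
EXAMPLE_CONFIG = """
# phrase = program to run
forward = /usr/local/bin/toggle-pin 60
move forward = /usr/local/bin/toggle-pin 60
lights off = /usr/local/bin/lights off
"""

if __name__ == "__main__":
    commands = parse_command_map(EXAMPLE_CONFIG)
    print(lookup_action(commands, "Move  Forward"))
    print(lookup_action(commands, "hello world"))
```

A daemon built this way would pass the returned argument list to something like `subprocess.Popen` and ignore utterances for which the lookup returns `None`.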

I’m not sure this does enough to enable users.

Questions regarding one of the proposed ideas - “PRU Firmware loader”:

I find this project idea very interesting, but I would like to know some things in order to have a clearer idea of what needs to be done. Maybe some of these questions are stupid, but as I said I haven't worked with BBs or capes. I know that each cape has an I2C-based EEPROM which stores the ID of the board, but how is a board being plugged in detected? Is it USB OTG style, by toggling a pin? Is it based on polling the bus for known addresses?
What's the desired functionality/behavior? Should there be a daemon running which gets triggered each time it detects a board, or will it be user triggered, i.e. by calling a program which tries to detect the cape?

Have you tried contacting Matt Porter or Matt Ranostay to get feedback on their idea?