Week 2 Progress Report

Anirban_Banik · May 28, 2018, 9:19am

Hi, this is my Week 2 progress report.

Tasks Done:

Created the Encrypter Game.

https://github.com/AnirbanBanik1998/Modern_Speak_and_Spell/commit/86f1e7b66fdd3ed03037384d708ded126433bbeb

Worked on the sample audio files provided to me to generate decoding accuracy statistics.

https://github.com/AnirbanBanik1998/Modern_Speak_and_Spell/commit/e1aac8692cf5eec0a6e1f02fbf16858ee74a9ddb

https://github.com/AnirbanBanik1998/Modern_Speak_and_Spell/commit/24197352a003dfb7a71ace89cacdfabce07add46
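
For readers following along, here is a minimal sketch of how per-symbol decoding accuracy could be tallied. It is not the code from the commits above; the function name `per_symbol_accuracy`, the `(expected, decoded)` pair format, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def per_symbol_accuracy(results):
    """Return {expected_symbol: fraction decoded correctly}.

    `results` is an iterable of (expected, decoded) string pairs,
    one pair per test audio file (an assumed input format).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for expected, decoded in results:
        key = expected.strip().upper()
        totals[key] += 1
        if decoded.strip().upper() == key:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Made-up example data, not results from the project.
sample = [("A", "A"), ("A", "EIGHT"), ("X", ""), ("START", "START")]
stats = per_symbol_accuracy(sample)
```

Grouping by the expected symbol makes it easy to spot individual letters or commands whose accuracy is far below the rest.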

Working on updating the language model based on the results of each run.
Split the language model into separate command and character models to increase accuracy further, and switch between them according to the task being done.

https://github.com/AnirbanBanik1998/Modern_Speak_and_Spell/commit/6a7a98672ae50ae92fbf70048b46ea91737324b3
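
The switching idea can be sketched roughly as below. This is an illustration only, not the code in the commit: `ModelSwitcher`, the task names, and the `activate` callback are hypothetical, and the callback stands in for whatever call the real decoder provides to make a given model active.

```python
class ModelSwitcher:
    """Tracks which language model should be active for the current task."""

    def __init__(self, model_for_task, activate):
        # model_for_task: dict mapping a task name to a model name.
        # activate: callable that tells the real decoder to use a model
        # (hypothetical hook; the decoder itself is not modelled here).
        self._model_for_task = dict(model_for_task)
        self._activate = activate
        self.current = None

    def enter(self, task):
        """Switch to the model for `task`, skipping redundant switches."""
        model = self._model_for_task[task]
        if model != self.current:
            self._activate(model)
            self.current = model
        return model


activated = []  # stand-in for a decoder; records which models were requested
switcher = ModelSwitcher(
    {"menu": "commands", "spelling": "characters"},
    activate=activated.append,
)
switcher.enter("menu")
switcher.enter("spelling")
switcher.enter("spelling")  # no switch: the character model is already active
```

Keeping the task-to-model table in one place means each game only has to announce which task it is in.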

Updated documentation for the entire project.

Issues Faced:

Decoding accuracy is not up to the mark. Commands are recognized without problems, but characters are not. For some, like X, Y and Z, the accuracy is 0%, which is really bad.
The updated dictionary would also work only for the given input audio files and may not be effective for others.

I am thinking of discarding the idea of updating the model and instead working on improving the prediction capacity.

Some suggestions would be welcome.

Work to be done:

Working on the Crossword game.
Updating the work on the speech-decoder.
Updating accuracy stats after each operation, be it running the games, launching operations, or just testing.

Regards,
Anirban

HY0 · May 29, 2018, 9:21pm

<SNIP>

Issues Faced:
1. Accuracy of decoding is not up to the mark....having no problem with the
commands...but can't recognize characters. For some...like X, Y and Z the
accuracy is 0% which is really bad.

Can you elaborate a bit on this? In particular:
- Are you finally on the Pocket Bone HW w/the AGC mic?
- Assuming yes to above, have you confirmed (probably with Audacity) that there
is sufficient clean signal recorded to even try recognizing? Thinking something
along the lines of:
  - Do a recording with the same parameters as you are using in the code.
  Make sure it is the same mixer settings, word size (8, 16, 32, etc),
  stereo/mono, and rate. I think by default Pocket Sphinx wants 16 kHz sample
  rate, mono and 16 bit but that can be changed.
  - Load it into something like Audacity and see if the signal is either too
  low, clipping, or otherwise distorted.
- What is it returning for those letters given your dictionary? Nothing or?
- Try adding "Ex" for X and "Why" for Y and "Zee"/"Zed" to the dictionary and
see if those report hits. A workaround might be to map those to the letters on
the SW side if you are in a context to read letters.
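
As a rough complement to looking at the recording in an audio editor, the format and level checks above could be scripted along these lines. This is only a sketch using the Python standard library: `inspect_wav` and the `clip_fraction` threshold are hypothetical names and values, and the level analysis assumes 16-bit PCM WAV input.

```python
import array
import sys
import wave


def inspect_wav(path, clip_fraction=0.99):
    """Report format and level information for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        width = wav.getsampwidth()  # bytes per sample
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    info = {"channels": channels, "bits": width * 8, "rate": rate}
    if width != 2:
        # The level analysis below only handles 16-bit samples.
        return info

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Peak level as a fraction of full scale (1.0 means full scale).
    peak = max((abs(s) for s in samples), default=0)
    info["peak"] = peak / 32768.0
    # Count samples at or near full scale as a crude clipping indicator.
    limit = int(32767 * clip_fraction)
    info["clipped_samples"] = sum(1 for s in samples if abs(s) >= limit)
    return info
```

A very small `peak` would suggest the signal is too low, while a large `clipped_samples` count would suggest clipping. The `rate`, `channels` and `bits` fields can be compared against what the decoder is configured to expect.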

- Related to above, what are you using for 'Z'? That is, are you using 'Zzzzz'
(like the sound of snoring) or 'Zed'? In other words, is it a British or US Z?
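
The software-side mapping mentioned above might look something like this. It is only a sketch: the `SPOKEN_TO_LETTER` table and `to_letter` helper are hypothetical, and the table would need to match whatever spoken forms are actually added to the dictionary.

```python
# Hypothetical spoken-form table; extend it to match the real dictionary.
SPOKEN_TO_LETTER = {
    "EX": "X",
    "WHY": "Y",
    "ZEE": "Z",  # US pronunciation
    "ZED": "Z",  # British pronunciation
}


def to_letter(hypothesis):
    """Map a decoded word to a single letter, or None if it is not one."""
    word = hypothesis.strip().upper()
    if len(word) == 1 and word.isalpha():
        return word  # already a bare letter
    return SPOKEN_TO_LETTER.get(word)
```

This would only be applied while the program is in a context where letters are expected, so that a decoded "Why" in a command context is left alone.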

Any other detail of your setup that you think may be relevant?