Week 2 Progress Report

Hi, this is my Week 2 progress report.

Tasks Done:

  1. Created the Encrypter Game.

https://github.com/AnirbanBanik1998/Modern_Speak_and_Spell/commit/86f1e7b66fdd3ed03037384d708ded126433bbeb

  1. Worked on the sample audio files provided to me to generate decoding accuracy statistics.

https://github.com/AnirbanBanik1998/Modern_Speak_and_Spell/commit/e1aac8692cf5eec0a6e1f02fbf16858ee74a9ddb

https://github.com/AnirbanBanik1998/Modern_Speak_and_Spell/commit/24197352a003dfb7a71ace89cacdfabce07add46

  1. Working on updating the language model based on the results obtained each time.

  2. Split the language model into separate command and character models so that accuracy increases further. The code switches between these models according to the task being done.

https://github.com/AnirbanBanik1998/Modern_Speak_and_Spell/commit/6a7a98672ae50ae92fbf70048b46ea91737324b3

  1. Updated documentation for the entire project.

Issues Faced:

  1. Decoding accuracy is not up to the mark: commands are recognized without problems, but characters are not. For some, like X, Y and Z, the accuracy is 0%, which is really bad.

  2. The updated dictionary would also work only for the given input audio files and may not be effective for others.

I am thinking of discarding the idea of updating the model and instead working on improving the prediction capacity.

Some suggestions would be welcome.

Work to be done:

  1. Working on the Crossword game.
  2. Updating the work on the speech-decoder.
  3. Updating accuracy stats after each operation, be it running the games, launching operations, or just testing.
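
A minimal sketch of how accuracy stats could be updated after each operation, assuming every decode attempt has a known expected label (a command or a character). The `AccuracyStats` class and its method names are hypothetical and are not taken from the project repository.

```python
from collections import defaultdict


class AccuracyStats:
    """Running per-label decoding accuracy (hypothetical helper)."""

    def __init__(self):
        self.attempts = defaultdict(int)  # label -> number of decode attempts
        self.hits = defaultdict(int)      # label -> number of correct decodes

    def record(self, expected, decoded):
        """Log one attempt; `decoded` may be None if nothing was recognized."""
        key = expected.strip().upper()
        self.attempts[key] += 1
        if decoded is not None and decoded.strip().upper() == key:
            self.hits[key] += 1

    def accuracy(self, label):
        """Return accuracy in percent, or None if the label was never tried."""
        key = label.strip().upper()
        tried = self.attempts.get(key, 0)
        if tried == 0:
            return None
        return 100.0 * self.hits.get(key, 0) / tried

    def report(self):
        """Return a dict of label -> accuracy for all labels seen so far."""
        return {key: self.accuracy(key) for key in sorted(self.attempts)}
```

Calling `record()` from the games, the launcher, and the test runs would keep one shared table, so characters with 0% accuracy show up in `report()` regardless of which operation produced the attempt.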

Regards,
Anirban

<SNIP>

Issues Faced:
1. Accuracy of decoding is not up to the mark....having no problem with the
commands...but can't recognize characters. For some...like X, Y and Z the
accuracy is 0% which is really bad.

Can you elaborate a bit on this? In particular:
- Are you finally on the Pocket Bone HW w/ the AGC mic?
- Assuming yes to the above, have you confirmed (probably with Audacity) that
there is sufficient clean signal recorded to even try recognizing? Thinking
something along the lines of:
    - Do a recording with the same parameters as you are using in the code.
Make sure it is the same mixer settings, word size (8, 16, 32, etc.),
stereo/mono, and rate. I think by default PocketSphinx wants a 16 kHz sample
rate, mono, and 16 bit, but that can be changed.
    - Load it into something like Audacity and see if the signal is either too
low, clipping, or otherwise distorted.
- What is it returning for those letters given your dictionary? Nothing or?
- Try adding "Ex" for X and "Why" for Y and "Zee"/"Zed" to the dictionary and
see if those report hits. A workaround might be to map those to the letters on
the SW side if you are in a context to read letters.
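
The recording check suggested above could also be scripted as a rough complement to looking at the waveform in Audacity. This is only a sketch using Python's standard `wave` module. The function name `inspect_wav` and the default values (16 kHz, mono, 16 bit) are assumptions based on the defaults mentioned above, and the peak/clipping figures are computed only for 16-bit PCM files.

```python
import array
import sys
import wave


def inspect_wav(path, rate=16000, channels=1, width_bytes=2):
    """Report the format and rough signal level of a WAV file."""
    with wave.open(path, "rb") as wav:
        info = {
            "rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "width_bytes": wav.getsampwidth(),
        }
        frames = wav.readframes(wav.getnframes())

    # Does the file match the parameters the decoder is expected to use?
    info["format_ok"] = (
        info["rate"], info["channels"], info["width_bytes"]
    ) == (rate, channels, width_bytes)

    info["peak"] = None      # peak level as a fraction of full scale
    info["clipped"] = None   # fraction of samples at full scale
    if info["width_bytes"] == 2 and frames:
        samples = array.array("h")
        samples.frombytes(frames)
        if sys.byteorder == "big":
            samples.byteswap()  # WAV sample data is little-endian
        peak = max(abs(s) for s in samples)
        info["peak"] = min(peak / 32767.0, 1.0)
        info["clipped"] = sum(1 for s in samples if abs(s) >= 32767) / len(samples)
    return info
```

A very small `peak` would suggest the signal is too low, while a noticeable `clipped` fraction would suggest the input gain is too high. Neither number replaces listening to the recording.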

- Related to the above, what are you using for 'Z'? That is, are you using
'Zzzzz' (like the sound of snoring) or 'Zed'? In other words, is it a British
or US Z?
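
The software-side mapping mentioned above might look roughly like the sketch below. It assumes the decoder returns a single word as its hypothesis and that the application knows when it is expecting a letter. The table `SPOKEN_TO_LETTER` and the function `to_letter` are hypothetical names. Both "Zee" and "Zed" map to Z, so either pronunciation is accepted.

```python
# Spoken forms that the dictionary might contain, mapped back to letters.
# Only the words suggested in this thread are listed; this table is an
# illustration, not the project's actual dictionary.
SPOKEN_TO_LETTER = {
    "EX": "X",
    "WHY": "Y",
    "ZEE": "Z",
    "ZED": "Z",
}


def to_letter(hypothesis, expecting_letter):
    """Map a decoded word to a letter when the context calls for one.

    Returns the normalized word unchanged outside a letter context, the
    letter itself if the decoder already produced one, the mapped letter
    for a known spoken form, and None otherwise.
    """
    word = hypothesis.strip().upper()
    if not expecting_letter:
        return word
    if len(word) == 1 and word.isalpha():
        return word
    return SPOKEN_TO_LETTER.get(word)
```

For example, `to_letter("why", True)` gives `"Y"`, while `to_letter("why", False)` leaves the word as `"WHY"` so that ordinary commands are not affected.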

Any other detail of your setup that you think may be relevant?