What We've Done So Far
The biggest priority when beginning our project was to collect enough data and samples so that we could start building our solution. Our goal was to identify a language from its features, so we started by collecting a large range of samples to work with. From the beginning, we have stressed the importance of creating a solution that learns the language rather than a particular phrase, so we gathered various phrases from various speakers to introduce that variety.
With our samples collected, we first used the 'leave-one-out' method and ran FFT coefficients through the SVM classifier in Matlab. This gave an accuracy of 33%, which tells us that our algorithm needs much more refinement before it reaches a useful accuracy.
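To make the leave-one-out procedure concrete, here is a minimal sketch in Python with NumPy. It is not our Matlab code: it assumes FFT magnitudes as features, uses a simple nearest-centroid classifier as a stand-in for the SVM, and runs on synthetic tones rather than our recordings. The names fft_features and leave_one_out_accuracy are hypothetical.

```python
import numpy as np

def fft_features(signal, n_coeffs=128):
    """Magnitudes of the first n_coeffs FFT bins, scaled to unit length."""
    spectrum = np.abs(np.fft.rfft(signal))[:n_coeffs]
    norm = np.linalg.norm(spectrum)
    return spectrum / norm if norm > 0 else spectrum

def leave_one_out_accuracy(features, labels):
    """Hold out each sample once, fit on the rest, and score the held-out one.

    Assumption: a nearest-centroid classifier stands in for the SVM.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    correct = 0
    for i in range(len(labels)):
        train = np.ones(len(labels), dtype=bool)
        train[i] = False  # leave sample i out of training
        classes = np.unique(labels[train])
        centroids = np.array(
            [features[train & (labels == c)].mean(axis=0) for c in classes]
        )
        distances = np.linalg.norm(centroids - features[i], axis=1)
        if classes[np.argmin(distances)] == labels[i]:
            correct += 1
    return correct / len(labels)

# Synthetic stand-in data: two "classes" of noisy tones (not real speech).
rng = np.random.default_rng(0)
fs, n = 8000, 1024
t = np.arange(n) / fs
signals, labels = [], []
for label, freq in (("A", 300.0), ("B", 900.0)):
    for _ in range(5):
        phase = rng.uniform(0, 2 * np.pi)
        noise = 0.3 * rng.standard_normal(n)
        signals.append(np.sin(2 * np.pi * freq * t + phase) + noise)
        labels.append(label)

features = [fft_features(s) for s in signals]
accuracy = leave_one_out_accuracy(features, labels)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

The synthetic classes are trivially separable, so this only demonstrates the evaluation loop; real speech samples overlap far more in their raw FFT coefficients.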
Since our preliminary tests, we've researched as a team other ways to process this data and improve our results. One of the more promising advanced methods we found is computing the Mel Frequency Cepstral Coefficients of our audio samples. A study online leveraged this approach to achieve ~88% accuracy in a similar project. Another possible method is analyzing language-specific identifiers, such as Mandarin lacking an 'R' sound. We might attempt to train our algorithm to detect this feature, which is unique to one of our languages. Finally, the last method we've researched extensively is the zero crossing rate (ZCR). When implementing this in Matlab, we found that the "Go Blue!" sample had a ZCR of 2.466% and the Mandarin signal had a ZCR of 3.821%. This is a small difference, however, so we are hopeful that additional samples will reveal a more obvious identifier between the two signals.
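As an illustration of the ZCR calculation, here is a minimal Python/NumPy sketch. It assumes the ZCR is the fraction of adjacent sample pairs whose signs differ, reported as a percentage; this may not match the exact definition in our Matlab code. The function name zero_crossing_rate is hypothetical, and the input is a synthetic tone rather than one of our recordings.

```python
import numpy as np

def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    x = np.asarray(signal, dtype=float)
    if x.size < 2:
        return 0.0
    # Treat exact zeros as positive so a zero sample is not counted twice.
    signs = np.where(x >= 0, 1, -1)
    crossings = np.count_nonzero(signs[1:] != signs[:-1])
    return crossings / (x.size - 1)

# Synthetic check: a 100 Hz tone sampled at 8 kHz for one second.
# The small phase offset keeps samples from landing exactly on zero.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 100 * t + 0.1)
zcr_percent = 100 * zero_crossing_rate(tone)
print(f"ZCR: {zcr_percent:.3f}%")
```

A pure tone crosses zero twice per cycle, so its ZCR scales with frequency. For speech, the ZCR tends to rise with noisier, higher-frequency content, which is why it could differ between phrases as well as between languages.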
The biggest difficulties we've encountered so far with our project are that it took a while to get enough samples and that our first pass at a solution has low accuracy. Since we've been working with .wav files in Matlab, we haven't had trouble loading files, and we haven't had difficulty finding potential filter methods either.
So far, we've collected a large range of samples and researched methods to process our data. We now turn to implementing some of the methods we learned about to improve on our first pass at a language differentiator.