Customer Feedback - Video Indexer Trial
I had a chance to test the video indexing a bit more and passed it on to somebody who frequently writes up transcripts of interviews. He was able to try it out and, while he was certainly interested in the technology (especially the translation feature, given the large number of Spanish-language interviews that he must translate), he determined that it is still easier to transcribe interviews manually than to go through and correct the mistakes in the indexed video.
The majority of the problems come from the fact that people being interviewed by police and/or detectives often don’t speak very clearly and often use heavy slang (especially curse words, which the service almost always took to be some other word). Additionally, interviews may have to be recorded with concealed devices, which muffle the sound somewhat, or with whatever device the person has on hand (e.g. a phone), which may not have a great microphone or may prioritize nearby voices. I also tested a video that was recorded on a phone, and the service didn’t index anybody speaking in the background, even when the voice was clear, nor did it index their faces despite them being in plain sight in the video (a negligible problem for this use, but something I found odd).
So while the technology probably works for the current audience (closed captioning for videos of well-spoken individuals), the limitations of the speech recognition really show when the audio quality drops and the speakers are harder to understand.
If you hear of any increase in capability along those lines, or would like us to help test any new generation of this technology meant to tackle these problems, I’ll gladly see what we can do. It’s a solution that is desperately needed, and I’m willing to do anything within my power to help make it a reality.