Who Said What?

Podzinger is a search engine with a difference - it searches audio files. Specifically, it is set up to search “podcasts” for words and phrases, so that if, for example, you were seized by a sudden urge to hear someone talk to you about Tony Blair you could do so.

So far so useless. I download a couple of podcasts for when I’m travelling, but I don’t think I want to search the internet for random people talking on topics. Life is too short. What is more interesting is the technology, or combination of technologies, that makes this work.

The sound files are converted to text and then searched using conventional search engine techniques. The company behind it claims to have 30 years’ experience in this area. Without access to the dataset they are using, it is difficult to tell whether the system works well for different accents, dialects and languages, or whether material is going missing. However, the search results they do return certainly don’t seem to contain false positives. Unlike dictation software, this system is not trying to turn speech into text in real time, and one would expect its results to be that much better, assuming the company is throwing enough hardware at the problem.
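To make the "convert to text, then search" idea concrete, here is a rough sketch of the second half only. It assumes the speech-to-text step has already produced plain transcripts (the sample transcripts, episode names and function names below are all made up for illustration), and it uses a simple positional inverted index, which is one conventional search technique and not necessarily what Podzinger itself does.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(transcripts):
    """Map each word to the transcripts and word positions where it occurs."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in transcripts.items():
        for pos, word in enumerate(tokenize(text)):
            index[word][doc_id].append(pos)
    return index


def search_phrase(index, phrase):
    """Return the ids of transcripts containing the words of `phrase` in order."""
    words = tokenize(phrase)
    if not words:
        return []
    hits = []
    for doc_id, positions in index.get(words[0], {}).items():
        for start in positions:
            # Every later word must appear at the next consecutive position.
            if all(
                start + offset in index.get(word, {}).get(doc_id, [])
                for offset, word in enumerate(words)
            ):
                hits.append(doc_id)
                break
    return sorted(hits)


# Hypothetical transcripts standing in for the output of a speech recogniser.
transcripts = {
    "episode-1": "Today we talk about Tony Blair and the election",
    "episode-2": "Tony talked about cooking; Blair was not mentioned",
}
index = build_index(transcripts)
print(search_phrase(index, "Tony Blair"))
```

Because the index stores word positions, a phrase only matches where the words sit next to each other, so the second made-up episode is found by a search for "blair" alone but not by "Tony Blair". Any recognition errors in the transcript would of course carry straight through to the search.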

BBN Technologies, the company behind this project, puts it to more useful purposes for its government clients, such as searching international television and, presumably, helping to monitor private communications traffic as well.

All interesting stuff.