Microsoft unveils new speech recognition tools
Speech capabilities are coming to Windows Vista and Office Communications Server 2007
Microsoft used this week's SpeechTEK conference in New York to demonstrate the voice recognition capabilities coming in Windows Vista and unveil plans to integrate speech recognition capabilities into Microsoft Office Communications Server 2007.
The company said Microsoft Speech Server 2007 will be fully integrated into Office Communications Server 2007, adding speech recognition to the server's instant messaging, voice over IP, and audio and video conferencing capabilities when the server is launched in the first quarter of next year.
Neil Laver, head of Sales and Marketing for Microsoft's Unified Communications Group in the UK, said adding speech capabilities to Communications Server would help firms reduce the number of separate "silos" used to run their communications tools. "Making everything available on a single server will make it far easier to manage and maintain," he added.
Microsoft said the new product would allow customers and partners to create new applications featuring speech capabilities or exploit new APIs to extend the functionality of existing applications for Office Communications Server 2007. For example, it could allow instant messaging conversations between one party speaking and the other typing.
Separately, the company demonstrated the new Windows Speech Recognition functionality that will be available in the Windows Vista OS, giving users the ability to dictate text and issue commands. The new functionality will be available in eight different languages and will include an interactive training session that also allows the system to optimise itself for the user's voice, Microsoft added.
The demonstration may go some way towards restoring the credibility of Microsoft’s speech recognition capabilities after an earlier public demonstration of the new Vista functionality, at a financial analysts' event last month, revealed an embarrassing bug in the system.
In that demo the software transcribed "Dear Mom" as "Dear Aunt" and then rendered instructions to delete and select all as "let's set so double the killer delete select all". Microsoft later explained the bug and claimed it was fixed, but video of the failed demo was quickly made available on YouTube and other video clip sites, prompting widespread media coverage.
Despite the flawed demo, Laver insisted that speech recognition technology is rapidly becoming more reliable and is gaining customers. "We've worked at length with customers on [the speech recognition functionality in] Exchange Server 2007," he said. "[Adoption] is all about customers being able to test and pilot the technology and that will drive wider acceptance."
Dale Vile of analyst firm Freeform Dynamics said Microsoft had a long way to go to challenge the credibility of rival speech recognition software vendor Nuance. "It is really difficult to serve the dictation market as [the software] has to recognise such a broad vocabulary and there are relatively few sectors, such as legal or healthcare, where people are used to dictating," he added. "Nuance has been in this area for years and is by far the market leader."
Chris Harris-Jones of analyst firm Ovum said there were still doubts about the reliability of speech recognition systems. "It is getting better, but it is making very slow progress," he said. "Microsoft wants [the technology] to allow people to issue commands to change appointments or access emails for example, but if you are doing that it will likely be because you are travelling and there will be problems with the amount of background noise. It'll be a useful addition for some people, but I don't see people rushing out in droves to get it."