Microsoft starts testing voice dictation in latest Office apps

This content has been archived. It may no longer be accurate or relevant.

From ZDNet:

On March 12, Microsoft began testing this feature with its Office Insider testers. The @OfficeInsider account tweeted yesterday:

“Windows #OfficeInsiders, get ready to ditch your keyboard and use your voice to write documents, compose emails, and create presentations! Voice dictation is available now to #InsidersFast.”

Microsoft officials touted the coming Office dictation technology in January, saying it would be available in February 2018.

To test voice dictation, customers must be running the latest version of Office for Windows (Office 2016) and be Office 365 subscribers. The voice dictation feature, which uses speech recognition technology to convert speech to text, is available for Word 2016, PowerPoint 2016, Outlook 2016, and OneNote 2016, and in US English only for now. To test it, users must be in the Windows desktop Office Insider program.

. . . .

I’m not sure if Microsoft is using the Dictate technology developed by its Microsoft Garage incubator as the basis for the Office Dictate feature. Dictate originally was an add-in for Word, Outlook, and PowerPoint and used the same speech-recognition technology in Cortana for converting speech to text, coupled with real-time translation. I’ve asked the company if this is the case but haven’t heard back yet.

Link to the rest at ZDNet and thanks to Felix for the tip.

PG has been trying out computerized dictation software forever.

He thinks his first attempt was with some software from Kurzweil, then he had an extended and frustrating relationship with Dragon Dictate. He did find an article he wrote about Voice-Assisted Legal Research for the ABA Journal in 1994 (and hopes nobody relied on it to jump into voice recognition).

PG is perennially hopeful, but, in the absence of a smart legal assistant, has always found typing to be more satisfactory than dictating. He’ll try Microsoft’s Dictate when he gets a chance, but will be braced for disappointment.

14 thoughts on “Microsoft starts testing voice dictation in latest Office apps”

  1. I did try Dragon, and was high on the ‘oh my god’ meter when going over all the wrong guesses the silly thing made. (It needed an editor before you dared send it to an editor. 😉 )

    And I guess I could see this at home in your ‘office’ doing ‘work’, but could you imagine trying to do this in an actual office? (Or better yet, one of those ‘no walls’ open floor plans?)

  2. One has to ‘train’ Dragon NaturallySpeaking, and then it will be about 98–99% accurate, but one has to spend the time to train it on voice idiosyncrasies. Not sure why Microsoft wants to compete, as it seems a small niche.

    • Maybe because they can do it better?
      Maybe they think they can get it out of the niche and into the mainstream?

      They already have the technology from the Cortana efforts, so it’s a cheap way to add value to Office. Especially OneNote; if it works without a headset, it might be a cheap transcription tool for college students. And the PowerPoint support might be useful in business meetings.

      It’s not just Word getting dictation.

      This is something Alexa and Google can also do. Apple, not so sure, since Siri seems to have stalled, feature-wise.

      Dictation, transcription, and translation features are free spinoffs from the voice assistant natural language efforts.

      • If they can, Felix, and do it well for Mac and Windows, I’ll be one to try it.

        Translation, Felix, accurate translation: now that would be something.

        And ace-accurate transcription from audio would be great.

        How would you see PowerPoint being teamed with this?

        • Automatic meeting notes?
          Instead of just recording the audio during a presentation and embedding it as an attachment, the app could transcribe in real time. Conceivably it could keep track of different speakers and flag comments appropriately.

          Or, at creation time there would be no need to switch from mouse/pen to keyboard to enter text. Just draw the text box, dictate, apply effects…

          Smoother workflow.

          • That would be cool, Felix,
            especially since when asked to look at a video or listen to a recording, I just want to go fishing. It’s much easier to read/glance/grok a transcript. I hope your idea comes true.

    • My boss uses Dragon, and has for years. I strongly encourage him to let me edit anything important before it goes out the door.

  3. Or MS could just buy a company that already does it and roll it into Office.

    There’s precedent for that.

  4. I expect Microsoft is doing this for two reasons: 1) user expectations and 2) to get data.

    Between Google’s assistant, Alexa, and Siri, a lot of people are getting accustomed to using speech controls. If it slides from accustomed to expecting and MS doesn’t have a good option, it increases the chance of losing the market dominance of their Office franchise.

    To make it accurate, they need lots of speech data which Amazon, Apple, and Google are already getting. The more people interact with Office this way, the better they can adapt it.

    • 3) to get people locked into more fee-based Microsoft services.

      They see Amazon and Netflix raking in the dough and want/need to move people away from making a one-time payment and never paying again. (Like my Office 2000 CDs, whose code doesn’t call the mothership to confirm, and which haven’t made them another cent since the year 2000.)

    • Microsoft has its own data stream: Cortana runs on PCs and Xboxes as well as on phones.
      If anything, MS has more data in more languages than Amazon.
      (Compare how many languages Alexa supports vs. the languages Cortana supports.)

      It’s popular to think of Microsoft as a tech has-been, but that is far from the reality. They are still at the forefront of many current and future technologies like cloud computing, mixed-reality environments, computer vision, real AI, and, yes, natural language processing.

      Microsoft language support:

      https://msdn.microsoft.com/en-us/library/hh378476(v=office.14).aspx

      Catalan – Spain
      Danish – Denmark
      German – Germany
      English – Australia
      English – Canada
      English – Great Britain
      English – India
      English – United States
      Spanish – Spain
      Spanish – Mexico
      Finnish – Finland
      French – Canada
      French – France
      Italian – Italy
      Japanese – Japan
      Korean – Korea
      Norwegian – Norway
      Dutch – Netherlands
      Polish – Poland
      Portuguese – Brazil
      Portuguese – Portugal
      Russian – Russia
      Swedish – Sweden
      Chinese – China
      Chinese – Hong Kong
      Chinese – Taiwan

      Alexa fully supports English, German, and Japanese, and can translate short phrases from Spanish, French, and a few other languages. Amazon aspires to teach Alexa to be a real-time translator, but for now it’s just an aspiration. They are far from the only ones working on that.
      (Microsoft showed a working voice-based English-to-Chinese system back in 2012:
      https://www.technologyreview.com/s/507181/microsoft-brings-star-treks-voice-translator-to-life/)

      The other thing Microsoft brings to the voice recognition game that Dragon et al. don’t have is cloud-assisted computing tools and its proprietary back-end hardware. (Microsoft has designed its own AI chips optimized for pattern recognition and is using them to power Bing, Cortana, and other online services.)

      https://www.wired.com/2016/09/microsoft-bets-future-chip-reprogram-fly/

      Tying the new dictation features to Office 365 will no doubt bring in more online subscription revenue, but it isn’t the only (or even the dominant) reason to do it. Modern back-end cloud services can deliver more specialized computing power than any existing desktop or portable hardware can provide.

      Think of it as the revenge of the mainframe.

      Just as Alexa relies on Amazon datacenters to provide its voice control services, Microsoft and Google have their own datacenters providing a mix of cloud services. Apple, too, but much less, because they don’t do cloud computing.

      The future of computing is mixed: local CPU + local GPU + cloud.

  5. Do any of these things show the dictated words streaming across the page, and allow one to also type? Could I speak, and then provide my own punctuation, or dictate twenty words, then type ten? Highlight eight words, and then replace them with six spoken words?

    • Yes.
      It acts like a second keyboard.

      You ever call up the onscreen keyboard in Windows? Or the handwriting recognition on a touchscreen WinTab? You can switch from one input stream to the other on the fly. It’s one of the things that makes OneNote so useful.

  6. When sending texts, I use dictation on my iPhone. This has led to an embarrassing habit: sometimes when I leave someone a voice mail, I include the punctuation. Examples: “I’m unable to keep our appointment period.” “Do you want to set a different date question mark?”

Comments are closed.