[Athen] Voice recognition on Macs

Sorensen, Neal B neal.sorensen at mnsu.edu
Mon Jul 22 08:33:02 PDT 2019

Interesting developments over at Apple…

Voice Commands aren’t a new thing for Mac. In Mac OS X “Tiger” I could use voice commands, though the recognition was pretty touchy. It’ll be interesting to see how that has improved.

Neural TTS has already been introduced on other platforms (see this article with examples from WaveNet, DeepMind’s speech-synthesis model used by Google<https://www.theverge.com/2018/3/27/17167200/google-ai-speech-tts-cloud-deepmind-wavenet>). It’s used mostly in Google Home and Google Assistant. Amazon’s Polly system also promises to do similar things. The comparisons are quite striking, and the Japanese comparison between a SAPI 5 style voice and WaveNet is especially impressive. The robotic quality of the voice is almost completely gone in the WaveNet files. Apple, as usual, is late to the party, but when they arrive it’s usually with a big entrance and people saying they were the first to get there.

Neal Sorensen
(pronouns: he, him, his)
Accessibility Resources
Minnesota State University, Mankato
132 Memorial Library
Mankato, MN 56001

Phone: (507) 389-5242
Fax: (507) 389-1199



From: athen-list <athen-list-bounces at mailman12.u.washington.edu> On Behalf Of Shelley Haven
Sent: Friday, July 19, 2019 8:10 PM
To: ATHEN <athen-list at u.washington.edu>
Subject: Re: [Athen] Voice recognition on Macs

Hi, Kathy!

At the keynote for Apple’s annual WWDC (Worldwide Developers Conference) on June 3rd, there were several announcements of interest to the AT community, including the Dragon-like voice control you mention. But first, some basic info:

* macOS 10.15, available this fall, will be called Catalina
    * Includes expanded Voice Control (numbered tags, grid)
    * If you want to try out a public beta early, you can sign up for Apple’s Beta Software Program at https://beta.apple.com/sp/betaprogram

* The iOS version that runs on iPads has grown to be sufficiently distinct from that for iPhones (lots of additional iPad-specific features) that its version now gets a new name: iPadOS.

Apple introduces Neural TTS for Siri
Most of you know that modern TTS voices (those of the last 10-15 years) are based on concatenating phonemes lifted from human-recorded speech. The resulting spoken text sounds mostly human, but the cadence and rhythm are… well, kind of jerky. Neural TTS, created with the help of deep learning on neural networks, sounds more human and less synthetic despite being generated entirely by software. The biggest advantage is reduced listening fatigue, which is very important to those listening to long passages or entire books. Listen to the announcement and the new Siri voice from 1:00:20 to 1:01:45 in the keynote video (you can hear the same scientific text spoken in iOS 12, then in the new, improved iOS 13):

Apple introduces Voice Control (available in Catalina)
This is an expanded collection of the current "Dictation Commands" for people with physical motor limitations. Voice Control provides a wide variety of commands for navigating and controlling all aspects of Macs, iPads, and iPhones. Among these are numbered tags for menu items and controls, and screen grids — features reminiscent of those in Dragon for Windows (now that Dragon for Mac is gone). This video shows its capabilities:

Screen Time for Macs
Screen Time<https://support.apple.com/en-us/HT208982>, the iPhone & iPad utility for monitoring usage, managing downtime, and setting limits for access to apps, the Web, and communication contacts, will be available in macOS Catalina this fall.


Shelley Haven ATP, RET
Assistive Technology Consultant

On Jul 19, 2019, at 12:17 PM, Kathleen Cahill <kcahill at MIT.EDU> wrote:

Hi Colleagues,

I’ve written previously to talk about the decision by Nuance to stop supporting Dragon Professional Individual for Mac (formerly Dragon for Mac), which has left a huge gap in the voice recognition arena, especially for users who need to control the mouse with their voice and use custom vocabularies. There is a very in-depth article on the topic at https://tidbits.com/2019/01/21/nuance-has-abandoned-mac-speech-recognition-will-apple-fill-the-void/

As I was reading through the comments, someone mentioned that Apple is planning to add some of this functionality to their built-in dictation with the next version of the Mac OS. “At the recent announcement of the next Mac OS a new feature called Voice control was announced. It sounds like this feature will allow voice-based control of the mouse, via commands and a grid.” That comment was from June 8th of this year.

So, that’s good news. I hope it happens soon.

Have a good weekend,

Kathy Cahill
Associate Dean, Accessibility and Usability
MIT Division of Student Life
77 Mass. Ave. 7-143
Cambridge MA 02139
kcahill at mit.edu
(617) 253-5111

