Apple has been granted a patent (number 11,069,336) that shows the company wants Siri to be better able to pronounce names.
About the patent
It’s dubbed “systems and methods for name pronunciation.” In the patent, Apple notes that name recognition is a particularly difficult aspect of speech recognition. Names can include names of people, businesses, and other entities.
The distribution of names has a long tail: a few names are very common, while an order of magnitude more are very rare. What’s more, the way names are pronounced can be subjective and dependent on the name’s origin.
Apple says that, for a speech recognition system to recognize names, a linguist is typically needed to transcribe all possible pronunciations in a phonetic alphabet supported by the locale or language in which the speech recognition system is deployed. However, the tech giant says that most existing speech recognition and synthesis systems have up to hundreds or thousands of names, while there are likely millions of actual unique names in use today.
Current speech recognition systems typically model name recognition to support tasks such as phone dialing, search and query, reminders, and event scheduling based on a named entry in a contact application of a user device. To recognize or synthesize a name, current systems often use a dictionary or a lexicon, which contains a mapping of names to their possible pronunciations.
However, Apple says that if a name hasn’t been modeled in the speech lexicon, the system must guess the pronunciation. For the purpose of speech synthesis, the system may also need to guess the stress on individual syllables in the name.
For names not modeled explicitly in the lexicon, speech recognition systems typically depend on a pronunciation guesser that uses sophisticated letter-to-sound rules. However, because certain phonetic units are particular to a specific language, the same name may be pronounced differently by different users.
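To make the lexicon-plus-guesser arrangement concrete, here is a minimal Python sketch. It is not Apple's implementation: the lexicon entries, the phonetic spellings, the one-symbol-per-letter rule table, and the function names `guess_pronunciation` and `pronunciations` are all assumptions chosen for illustration. Real letter-to-sound rules are far more sophisticated.

```python
# Toy lexicon mapping a lowercase name to its known pronunciations.
# The entries and phonetic spellings are illustrative assumptions.
LEXICON = {
    "john": ["JH AA N"],
    "maria": ["M AH R IY AH", "M AA R IY AH"],
}

# Toy letter-to-sound rules: one phonetic symbol per letter (hypothetical).
LETTER_TO_SOUND = {
    "a": "AH", "b": "B", "e": "EH", "i": "IH", "n": "N", "o": "OW",
}


def guess_pronunciation(name):
    """Guess a pronunciation letter by letter; unknown letters pass through uppercased."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    return " ".join(LETTER_TO_SOUND.get(ch, ch.upper()) for ch in letters)


def pronunciations(name):
    """Return the lexicon entries for a name, or a single guess if it is not modeled."""
    key = name.lower()
    if key in LEXICON:
        return LEXICON[key]
    return [guess_pronunciation(name)]
```

In this sketch, `pronunciations("Maria")` returns both lexicon entries, while a name missing from the lexicon, such as `"Bo"`, falls through to the guesser. Because the toy rules assume a single language's letter sounds, the guess can easily be wrong for a name from another language, which is the weakness the patent describes.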
For these reasons, Apple says that existing systems aren’t capable of building an adequate pronunciation guesser that models the pronunciation of names from different languages and cultures. In many cases, a foreign name pronunciation may not be guessed properly unless explicit rules are represented within the guesser. Apple wants to improve Siri’s capabilities in these areas.
Summary of the patent
Here’s the abstract of the patent: “Systems and methods are provided for associating a phonetic pronunciation with a name by receiving the name, mapping the name to a plurality of monosyllabic components that are combinable to construct the phonetic pronunciation of the name, receiving a user input to select one or more of the plurality, and combining the selected one or more of the plurality of monosyllabic components to construct the phonetic pronunciation of the name.”
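The flow in the abstract (receive a name, map it to monosyllabic components, let the user select among them, combine the selections) might look roughly like the following Python sketch. The candidate table, the respellings, and the function names are hypothetical. The sketch also assumes the name has already been split into syllables and that the user picks exactly one candidate per syllable, neither of which the abstract specifies.

```python
# Hypothetical table: candidate monosyllabic sounds for each syllable spelling.
CANDIDATES = {
    "an": ["ann", "on"],
    "dre": ["dray", "dree"],
    "a": ["uh", "ah"],
}


def map_to_components(syllables):
    """Return the candidate sounds for each syllable, falling back to its spelling."""
    return [CANDIDATES.get(s.lower(), [s.lower()]) for s in syllables]


def combine(options, choices):
    """Join the candidate chosen at each position; choices are indices from user input."""
    if len(options) != len(choices):
        raise ValueError("need exactly one choice per syllable")
    return "-".join(opts[i] for opts, i in zip(options, choices))


# Example: the user picks "on", "dray", and "uh" for the name "Andrea".
options = map_to_components(["An", "dre", "a"])
print(combine(options, [1, 0, 0]))  # prints "on-dray-uh"
```

Selecting from a short list of familiar-sounding syllables means the user does not need to know a phonetic alphabet, which appears to be the point of building the pronunciation from monosyllabic components.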