
The Apple Vision Pro may be able to generate a CGR environment based on music and lyrics

FIG. 4A illustrates a CGR environment with CGR content generated based on natural language analysis and semantic analysis of an audio file.

The upcoming Apple Vision Pro may be able to create a computer-generated reality (CGR) environment based on audio input such as music.

The US$3,499 (and higher) Spatial Computer is due in early 2024. Reportedly, it will be available only in limited quantities at first.

Apple has been granted a patent (number US 11842729 B1) for a “Method And Device For Presenting A CGR Environment Based On Audio Data And Lyric Data.” 

About the patent

The patent generally relates to computer-generated reality environments and, in particular, to systems, methods, and devices for presenting a computer-generated reality environment based on one or more audio files.

In the patent Apple says that, in order to provide immersive media experiences to a user, computing devices present CGR that intertwines computer-generated media content (e.g., including images, video, audio, smells, haptics, etc.) with real-world stimuli to varying degrees—ranging from wholly synthetic experiences to barely perceptible computer-generated media content superimposed on real-world stimuli. 

To these ends, CGR systems, methods, and devices include mixed reality (MR) and virtual reality (VR) systems, methods, and devices. Further, MR systems, methods, and devices include augmented reality (AR) systems in which computer-generated content is superimposed (e.g., via a transparent display) upon the field-of-view of the user and composited reality (CR) systems in which computer-generated content is composited or merged with an image of the real-world environment. 

According to Apple’s patent, in various implementations, a CGR environment can include elements from a suitable combination of AR, CR, MR, and VR in order to produce any number of desired immersive media experiences. And while music is typically an audio experience, the lyrical content, sound dynamics, or other features lend themselves to a supplemental visual experience. 

Previously available audiovisual experiences, such as music videos and/or algorithmic audio visualizations, aren’t truly immersive and aren’t necessarily tailored to a user environment. Apple is looking to change this with the Vision Pro.

Summary of the patent 

Here’s Apple’s abstract of the patent: “In one implementation, a method of generating CGR content to accompany an audio file including audio data and lyric data based on semantic analysis of the audio data and the lyric data is performed by a device including a processor, non-transitory memory, a speaker, and a display. The method includes obtaining an audio file including audio data and lyric data associated with the audio data. The method includes performing natural language analysis of at least a portion of the lyric data to determine a plurality of candidate meanings of the portion of the lyric data. 

“The method includes performing semantic analysis of the portion of the lyric data to determine a meaning of the portion of the lyric data by selecting, based on a corresponding portion of the audio data, one of the plurality of candidate meanings as the meaning of the portion of the lyric data. The method includes generating CGR content associated with the portion of the lyric data based on the meaning of the portion of the lyric data.”
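The abstract describes a three-step pipeline: natural language analysis produces several candidate meanings for a lyric, the accompanying audio is used to pick one, and CGR content is generated from the chosen meaning. The sketch below illustrates that flow in Python; every name, data structure, and scoring rule here is hypothetical (the patent specifies the steps, not an implementation).

```python
# Hypothetical sketch of the patent's pipeline: lyric -> candidate
# meanings -> audio-informed selection -> CGR content. All lexicons,
# thresholds, and output formats are invented for illustration.

def candidate_meanings(lyric_line):
    """Step 1: natural language analysis yields multiple candidate
    meanings for an ambiguous lyric (here, a hand-built lookup)."""
    lexicon = {
        "blue": [
            {"meaning": "the color blue", "mood": "calm"},
            {"meaning": "sadness", "mood": "melancholy"},
        ],
    }
    for word, candidates in lexicon.items():
        if word in lyric_line.lower():
            return candidates
    return [{"meaning": "literal reading", "mood": "neutral"}]

def select_meaning(candidates, audio_features):
    """Step 2: semantic analysis of the corresponding audio portion
    (tempo as a crude mood proxy) selects one candidate meaning."""
    audio_mood = "melancholy" if audio_features["tempo_bpm"] < 90 else "calm"
    for candidate in candidates:
        if candidate["mood"] == audio_mood:
            return candidate
    return candidates[0]  # fall back to the first candidate

def generate_cgr_content(meaning):
    """Step 3: emit a placeholder CGR directive for the chosen meaning."""
    return {"scene": meaning["meaning"], "lighting": meaning["mood"]}

lyric = "I'm feeling blue tonight"
audio = {"tempo_bpm": 72}  # a slow tempo suggests a sombre mood
content = generate_cgr_content(select_meaning(candidate_meanings(lyric), audio))
print(content)  # the slow tempo disambiguates "blue" toward sadness
```

The key idea the patent claims is the disambiguation step: the same lyric paired with a fast, bright track would resolve to the literal color reading instead.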

Dennis Sellers
Dennis Sellers is the editor/publisher of Apple World Today. He’s been an “Apple journalist” since 1995 (starting with the first big Apple news site, MacCentral). He loves to read, run, play sports, and watch movies.