We experience every moment of every day through listening. Playlists, podcasts, stories and sounds fuel our emotions, shape our identities and keep us entertained. But what happens when those moments become fully seamless, or even screenless?
As voice interaction gets smarter, personalized content scales and immersive VR and AR experiences improve, listening will become the essential driving force behind our everyday interactions with technology. Here’s what to expect from the future of audio.
Alexa and Siri are household names, and they’re just the beginning. At CES in January, voice-enabled devices (TVs, set-top boxes, and even lamps and vacuums) were everywhere, and the Consumer Technology Association’s chief economist declared that voice interactions will replace the traditional computer interface. In some markets, voice-first devices like Amazon Echo (available in the US, UK and Germany) and Google Home (currently just in the US) are already making their way into more living rooms. Last year, 6.5 million of these devices were shipped, up from 1.7 million the year prior. In 2017, shipments are expected to nearly quadruple, to 24.5 million voice-first devices.1 Since people can speak up to four times faster than they can type,2 it’s only natural that they’ll be quick to adopt voice technology as it improves.
As voice devices become more responsive, listening will become even more intimate.
While these home assistants let us cue up podcasts, weather updates and movie times, the biggest reason they appeal to people is music.3 Simply put, devices made to sync with headphones and speaker systems are great for DIY DJ’ing. With an easy command—“Play ‘Bad and Boujee’ by Migos,” or “Play the Mood Booster playlist”—we can curate our listening experience. As voice devices become more responsive to natural language patterns, listening will become even more frictionless and intimate. More conversational commands, like, “I need some music to cheer me up,” or, “What’s that song with Bon Iver and James Blake again?” will be enough to surface the right content.
Voice will direct the in-car experience, too. “We expect to see voice control of audio in the car as intuitive, responsive and accurate as it has become in the home,” says Jonathan Tarlton, auto lead at Spotify. Alexa will be integrated into Ford’s infotainment system later this year. By reducing the need to swipe and click, voice interactions will make driving safer and less distracting. Even little things will be more convenient. Once cars are synced across devices, it will be possible to start the engine or lock the doors from the couch.
“Our cultural understanding of listening and of sound will change,” says David Toop, professor and chair of audio culture and improvisation at the London College of Communication. “We’ll shift from being a visio-centric culture to one more focused on sound and hearing.”
Since streaming helps brands understand people’s moods and moments, the voice assistants of the future will be able to intuitively interact with consumers in the right context. When serving up ads, they might be able to match the mood of the listener and adjust their tone accordingly. Even better, the listener might be able to talk back directly, making the experience of an audio ad more interactive. IBM has already started using this type of ad format with the Watson Ad program, giving listeners the chance to ask questions with their voice.
"In the future, you can talk to any device, tell it what you want to hear, and it’ll start playing."
Troy Carter, Global Head of Creator Services, Spotify
Streaming already gives us insight into who people are, what they’re doing and how they’re feeling in the moment. As audio innovation grows, that consumer understanding will enable deeper personalization than ever before.
The experts we spoke to talked about the potential of “dynamic audio,” or the ability to offer mood-based targeting and creative that can adapt to your real-time context. As devices become more connected, they’ll be able to serve up increasingly relevant content. On Spotify, for example, every listener is logged in with a persistent ID across devices, and since they’re listening all day long, streaming provides deep intelligence about real-time context and emotional state.
“If I know that you’re listening to [a] certain type of music, and I know you happen to be jogging at this time of the day, and I know your age and where you live, and then we have access to dynamic audio, I can change my message to you in a moment’s notice,” says Tony Mennuto, president of Wordsworth & Booth. Imagine an athletic beverage instantly reaching a jogger with a message that intuits how they’re feeling in that moment, perhaps even referencing the fact that they’re speeding up or slowing down.
New technology will also help brands re-engage with listeners after they’ve heard the initial message. You’ll be able to keep the conversation going with visual reminders or sequential messages that add on to your earlier story, rather than repeating it.
This ability to adapt your message in real time will go hand-in-hand with the rise of programmatic audio. New programmatic offerings provide the ability to target listeners not just using demographic data, but also by incorporating playlist data and music taste. In the future, these categories will become more specific (think moods and interests), and the ability for truly intimate one-to-one messaging will rise. These programmatic offerings won’t just be limited to streaming services—podcast producers like NPR and Gimlet are already testing new technology for programmatic and dynamic audio opportunities to provide targeted messaging to listeners.4,5
"There’s a degree of customization in the creative that’s necessary to take advantage of the capabilities inside the programmatic ecosystem."
Tim Sims, VP of Inventory Partnerships, The Trade Desk
With a deeper understanding of people who are tuned in throughout the day, brands will have more opportunities to align with the audio content that their audience is most passionate about. That could mean playlist curation, podcast sponsorship, or even producing original audio content themselves.
“Ten years ago, if you put a radio ad on top of a podcast, people wouldn’t want to listen,” says Karen Pearson, CEO of the UK-based audio production company Folded Wing. “Now, people understand that to listen to the ‘Serial’ podcast they have to listen to what’s read at the top. Audiences are becoming more open to it…[and] advertisers and content producers are finding more imaginative ways of working in advertising.”
Native audio puts brands in the heart of the storytelling, without interrupting the flow.
Sometimes, those imaginative ways include starting a podcast of their own. Companies like Slack, eBay and General Electric have all started their own podcasts to build their brand and get their message out to an audience that’s actively choosing to listen based on their niche interests. For Slack, that audience is people who are passionate about finding meaning in work; for GE, it’s people who love sci-fi; for eBay, it’s anyone with an entrepreneurial spirit. Some industry experts predict that branded podcasting will double in 2017, as more brands learn how to tell their story through audio.
Along with the potential in the podcast space, music also provides endless opportunities to build a brand or communicate a message with original audio. To promote his Hilarity for Charity initiative, Seth Rogen launched a 40-track “Classic Soul” playlist of his favorite songs on Spotify, adding his thoughts and commentary throughout. To pump up a new generation of protesters and showcase the Hamilton Mixtape, Lin-Manuel Miranda put together a “Rise Up Eyes Up Wise Up” playlist of 17 fight songs. To keep their community engaged, Starbucks started an interactive “Top 10” playlist curated by listeners, who can impact the playlist by voting within the Starbucks app. These approaches help put brands in the heart of the storytelling, without intruding on good music or interrupting the flow.
"Should the format be confined to 15 and 30 seconds? Should it interrupt the listener experience as opposed to wrap around it?"
Jay Richman, VP of Product, Spotify
VR. AR. AI. 4K. 3D. Tomorrow’s technology means tomorrow’s audio will be more engaging than ever before. For marketers, that means your native opportunities will also be chances to build truly immersive experiences.
Earbuds will become more than just earbuds, as “hearables” are already in development. Bragi’s Dash earbuds, for instance, allow for physical movements to serve as commands: a head nod lets you pick up a phone call, and a shake lets you decline one. Also in the works: a smart earbud that can translate languages in real time, and another that bills itself as the first AI personal trainer, intuitively guiding you through workouts.
As our headsets get smarter, immersive audio experiences like 3D audio will allow sounds and songs to literally surround us, and high-resolution audio formats will make those sounds clearer than ever. For artists, creators, and brands, this presents an entirely new canvas for storytelling. At a new theatrical show in New York called The Encounter, for instance, headphones equipped with 3D sound were left on every seat, and the storyteller on stage guided the audience through the Amazon jungle using binaural audio technology, which recreates sound the way human ears actually hear it in daily life.6 There’s already a concept in development called PodRift that aims to combine podcasts and the Oculus Rift to transport listeners “as avatars to virtual environments.” As technology improves, storytellers can use sound to bring people inside their story. These tech developments will make the podcasts and albums of the future remarkably immersive and experiential.
It’s no surprise that the tech world has bet big on these virtual and augmented experiences: from 2015 to the beginning of 2016, investments in VR and AR companies grew a whopping 648%.7 The opportunity for audio to take these experiences to new places is limitless. Already, artists like Ray LaMontagne and Dawn Richard have crafted interactive VR videos that truly immerse the listener in the song. The band Massive Attack launched an app experience that augments your reality by creating new remixes based on your physical environment: movement, visuals, time and location. Eventually, these experiences will let you stand front row at your favorite band’s concerts, and bring “visual albums” to a whole new level of immersion.
If earbuds are becoming more than just earbuds, and music is becoming more than just music, then ads can become more than just ads. Brands and marketers can use these new technologies to tell their story through an immersive combination of sight and sound. Retailers, for instance, could use AR technology to unlock exclusive audio content based on visits to their stores. Movies and TV shows could take listeners inside their world, with soundtracks and sound effects serving as a uniquely modern trailer. The possibilities are endless.
"People want deeper immersion into the vision the creator has."
Joy Howard, CMO, Sonos
Sources for Chapter 3:
1,3 The 2017 Voice Report, VoiceLabs (2017)
2 “Amazon and Google fight crucial battle over voice recognition,” The Guardian (2016)
4 “Podcasts Try Dynamic Ad Insertion,” AdExchanger (2016)
5 “Podcast Advertising Pokes Around In Programmatic,” AdExchanger (2016)
6 The Encounter review, The Guardian (2016)
7 Number of Deals and Amount Invested in VR/AR Companies Worldwide, eMarketer (2016)
About the Power of Audio
The Power of Audio explores the role of audio in consumers’ lives and its impact on brands and advertisers. Fourteen experts were interviewed and 46 consumers were tasked with creating audio diaries (four were interviewed in depth in the US, UK, Brazil and Japan) between September and November 2016. Custom panel-based research measuring the effectiveness of audio advertising was conducted in partnership with Nielsen Content Solutions, December 2016 through January 2017. Nielsen’s study was conducted through an online panel with 4,000 respondents aged 15-54, entirely on mobile devices, using a pre/post exposure methodology. Each respondent was exposed to content in a mode dependent on which ad format they were allocated to. The survey covered four key topics: Content Engagement, Brand Perceptions, Brand Behaviors, and Ad Engagement.