Last month, I wrote an op-ed taking a critical eye to the globalization strategy of Bytedance, the Beijing-based parent company of ultra-popular Chinese news aggregation platform Jinri Toutiao. Riding a soaring valuation, Bytedance has stated its ambitions to apply their sophisticated AI-based recommendation engine to platforms aimed at overseas markets. In an ideal situation, this technology produces a win-win-win-win scenario in which readers get tailored content, creators reach the right audiences, advertisers efficiently reach their target market, and Bytedance cashes in as well.

While the platform and those in charge of it were certainly not intentionally creating a place for fake news, the issues with the platform come at a time when sensitivity around news content, and its relationship with the online platforms that distribute it, is particularly high. Western democracies are growing increasingly concerned over foreign governments’ attempts to influence voters through online media. For a Chinese news aggregator to succeed in entering those markets at this time, careful navigation will be required.

That being said, there are areas of the overseas internet in which Bytedance’s aggregation and recommendation competencies are sorely needed, and are not particularly sensitive. They just need to pick their spots. 

Bytedance’s global acquisitions

In the globally focused acquisitions they have made thus far, Bytedance has targeted China-connected firms with overseas popularity. In 2017, they acquired France-based news app News Republic from Chinese company Cheetah Mobile, which had acquired it the previous year. The crown jewel of their shopping spree, however, was undoubtedly Musical.ly, the Shanghai-based video app that took off with Western teenagers in 2016 and 2017, and for which Bytedance reportedly spent anywhere from $800 million to $1 billion. The acquisition makes sense. The videos on the app are mostly short, fun, and light, and since there is little political sensitivity around their subject matter, it could feasibly be a rare app to achieve a large userbase both inside and outside of China.

That being said, teens are fickle, and business models that focus on them can struggle to achieve sustainable success. Take Snap, for example, which, after IPO-ing at 25 dollars per share a year ago, has been trading mostly in the teens since June of 2017. Its stock plummeted again in late February after reality TV star Kylie Jenner declared it “dead” on Twitter.

Bytedance needs audio, and audio needs Bytedance

But where Bytedance may be most needed, and where their greatest overseas opportunity may be, is in audio. In fact, they are possibly better suited than anyone else to entirely revolutionize the podcast industry.

For those of you who don’t know what a podcast is…

Podcasts are downloadable audio files, most of which listeners enjoy on their smartphones. Most last anywhere from 20 minutes to 3 hours. Some focus on storytelling, others on news, but most of the popular ones involve in-depth interviews with experts or public figures.

Since podcasts are quite cheap to produce, just about anyone can start a podcast, on just about any topic. I recently was a guest on marketer Lauren Hallanan’s “China Influencer Marketing Podcast,” an English-language podcast which focuses on influencer marketing on Chinese social media platforms. A niche within a niche within a niche. But for the group of people for whom that topic is important, the information shared on Lauren’s podcast is pure gold.

Podcasts also hold tremendous influence and advertising potential. “From an advertising perspective, you have the listener trapped,” explained Bill Simmons, founder of The Ringer, a Los Angeles-based media platform which, according to Simmons, is achieving profitability primarily from the ad revenue of their robust network of sports and culture podcasts. “If someone is listening to your podcast while they’re exercising or doing dishes, they’re not going to stop what they’re doing in order to fast-forward while you talk about a sponsor for 30 seconds.”

Podcasts have become a central component of just about every major media company in the English-speaking world. From CNN to Vox to The Wall Street Journal, if a media company isn’t producing podcasts, it’s falling behind. For many of these companies, there are fewer rights restrictions on their audio content. For example, while the New York Times website limits me to ten articles per month before asking me to pay for a subscription, I can listen to their daily podcast, aptly named The Daily, for free, without limits. The same principle applies to aggregation platforms. While many media outlets seem to be getting stingier about allowing their articles to be accessed through other platforms, this doesn’t seem to be the case with podcast aggregation platforms.

However, despite the wealth of content that has become accessible through podcasts, there seems to be a consensus that, both as a media format and as a business model, podcasts are far underperforming their potential. This is due to a number of issues, all of which Bytedance is uniquely suited to solve, and profit from.

The podcast industry is chaotically fragmented. It is in desperate need of centralization, and the efficiencies and monetization that can come from that. What Google did for the internet as a whole, what YouTube has done for video, and what Jinri Toutiao has done for digital content in China, someone needs to do for podcasts.

Apple’s lazy pace, and missed opportunities

Apple’s iTunes is by far the world’s most popular podcast player, as well as being the first, offering directories, a rating system, and features which are now standard on most podcast apps. However, since that initial centralization, Apple has done little to capitalize on the opportunities that present themselves in the podcasting space.

In June of last year, they did announce some slight advancements, offering in-episode analytics, allowing podcast producers (and likely advertisers as well) to view what parts of each episode are listened to, including whether or not listeners skip over the ads.

This is certainly an improvement, but really only a drop in the bucket. Apple seems reluctant to fully commit to being the centralized aggregator that the podcasting industry needs, an aggregator that Ben Thompson envisions would look like this:

  • The centralized aggregator would likely offer hosting to podcast creators, not only to secure the user experience and get better analytics (including on downloads through other apps) but also to dynamically insert advertisements. Those advertisements would also be available to smaller podcasts that are currently not worth the effort to advertisers.
  • Advertisers would get their own dashboard for those analytics and, more importantly, the opportunity to buy ads at far greater scale across a large enough audience to make it worth their while. Ideally, at least from their perspective, they would actually be able to target their advertising buys as well.
  • Users would, at least in theory, benefit from a far broader array of content made possible by the growth in revenue for the industry broadly.

Why hasn’t Apple capitalized on this opportunity? It’s hard to say for sure, but the most convincing arguments center on the company’s identity and priorities. After all, Apple is a phone company. They specialize in making user-friendly and stylish hardware and operating systems. A hard shift into end-to-end audio content aggregation and an advertising-based business model would require a fairly dramatic overhaul of their business model, organization, and brand. When they have dipped their toe in the advertising waters, it hasn’t turned out that well, so diving in seems unlikely.

And then there’s Midroll, the podcast advertising network which acquired Stitcher in 2016. They have a few pieces of the puzzle already put together, but likely lack the financial resources or tech capabilities to become an aggregation giant. However, they would make an interesting acquisition target for a company that did…

Bytedance, on the other hand…

There are a few key areas in which centralization can have a dramatic effect. The first is search and recommendation. Since podcasts are in audio form, searching for appropriate podcasts has long been challenging. Most search functions only search through titles or keywords, which makes search results easy for savvy podcasters to manipulate. They also tend not to search specific episodes of podcasts, just the names of the podcast series in general. If you’re looking for specific information, or an interview with a particular guest, finding it has long been tricky.

This is beginning to change with the advent of speech recognition and Natural Language Processing (NLP) algorithms which can automatically transcribe audio into text, allowing for far more precise data on the content of each audio file.

There is one podcast app that has begun integrating audio-to-text technology into their search: an app called Castbox. Founded in 2016, Castbox is already one of the most highly recommended podcast apps on the Apple App Store and Google Play. The brainchild of a former Google engineer, Castbox allows users to search by podcast series title, episode title, and in-audio text. In October, they secured $12.8 million in Series A investment. I wrote a piece on them in January.
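To make the idea of searching in-audio text concrete, here is a minimal sketch of how a search over episode transcripts could work once audio has been converted to text. It is not Castbox’s actual implementation, which I have not seen; the function names (`tokenize`, `build_index`, `search`) and the sample transcripts are hypothetical, and the speech-to-text step itself is assumed to have already happened.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(episodes):
    """Map each word to the set of episode ids whose transcript contains it."""
    index = defaultdict(set)
    for episode_id, transcript in episodes.items():
        for word in tokenize(transcript):
            index[word].add(episode_id)
    return index


def search(index, query):
    """Return the ids of episodes whose transcripts contain every query word."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical transcripts, standing in for the output of a speech-to-text step.
episodes = {
    "ep1": "Today we talk about influencer marketing on Chinese social media",
    "ep2": "An interview about sports, culture and the business of podcasts",
    "ep3": "How brands approach marketing through podcasts",
}
index = build_index(episodes)
print(search(index, "marketing podcasts"))
```

Because the index is built from what is actually said in each episode rather than from titles or keywords, a query like the one above can surface a specific episode even if its title never mentions the topic.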

But Castbox is still a small startup, with limited resources. While it may be the best podcast app out there, it is far from reaching its potential. If Bytedance were to acquire or invest in Castbox and apply their resources and recommendation engine, it could revolutionize how the world consumes audio content.

Consider these scenarios:

Jason likes listening to the news every morning but feels as though the negativity of many news podcasts causes him to start his day in a bad mood. A sophisticated recommendation engine, coupled with the data provided by the speech-to-text algorithm, would be able to comb through the words used in each podcast and identify each one’s ratio of positive words to negative words, recommending a news podcast that is a bit more upbeat.

Lucy speaks English, but it is not her first language. She also has never lived in an English-speaking country and is frustrated when cultural references are used that she doesn’t understand. The recommendation engine evaluates the level of vocabulary and complexity of language used in each podcast, as well as the speed of the speech in it, recommending one that she can easily understand and enjoy.

Janice has an 11-year-old son who enjoys listening to podcasts, but Janice is concerned about whether they are appropriate for children. A powerful AI could detect the subject matter and language used in each podcast, and set different levels of parental controls, so Janice can be confident, knowing that her son is only listening to content suitable for children.
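Jason’s scenario is the easiest of the three to sketch in code. The example below is a rough illustration only, under loud assumptions: the word lists are tiny and invented, the podcast names and transcripts are hypothetical, and I score each transcript by the share of positive words among all sentiment-bearing words (rather than a raw positive-to-negative ratio) simply to avoid dividing by zero. A real recommendation engine would rely on a much larger lexicon or a trained model.

```python
import re

# Tiny illustrative word lists (hypothetical); a real system would use a far
# larger lexicon or a trained sentiment model.
POSITIVE_WORDS = {"good", "great", "hope", "win", "progress", "happy"}
NEGATIVE_WORDS = {"bad", "crisis", "fear", "loss", "war", "angry"}


def positivity_ratio(transcript):
    """Return positive / (positive + negative) word counts, or None if neither appears."""
    words = re.findall(r"[a-z']+", transcript.lower())
    positive = sum(1 for w in words if w in POSITIVE_WORDS)
    negative = sum(1 for w in words if w in NEGATIVE_WORDS)
    total = positive + negative
    if total == 0:
        return None
    return positive / total


def most_upbeat(transcripts):
    """Return the name of the podcast whose transcript scores highest."""
    scored = {name: positivity_ratio(text) for name, text in transcripts.items()}
    scored = {name: score for name, score in scored.items() if score is not None}
    if not scored:
        return None
    return max(scored, key=scored.get)


# Hypothetical transcripts for two imaginary news podcasts.
transcripts = {
    "Morning Gloom": "A crisis deepens as fear of war grows, though there is hope",
    "Bright Side Daily": "Great progress today and a good win, despite one loss",
}
print(most_upbeat(transcripts))
```

Lucy’s and Janice’s scenarios would follow the same pattern, with the score swapped for a vocabulary-difficulty or content-suitability measure computed from the same transcripts.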

Improvements in recommendation could be a world-changer for podcast producers as well, who often have to rely on their own networks, or SEO manipulation tricks to bring attention to their content. “I rely largely on my own networks on social media to get the message out about my podcast,” explains Lauren Hallanan. “An accurate recommendation engine would be very helpful.”

One more thing about Castbox: They’re a Chinese company, based in Beijing. So here is Castbox’s profile: Chinese company, popular overseas, with standout tech that could be exponentially improved by a top-notch recommendation engine. Certainly sounds like Bytedance’s “type,” right? Methinks…

A match made in podcast distribution heaven.

But this is just the start of how Bytedance-orchestrated centralization could revolutionize the podcasting universe. Despite their success as a content medium and potential for advertisers, podcasts currently suffer from crippling fragmentation across their value chain. The media companies that produce them (The Ringer, CNN, etc.) are separate from the platforms that host them (Soundcloud, Podbean, etc.), which are usually separate from the apps that curate and play them (Castbox, iTunes), which are separate from the centralized ad sellers (Midroll). Each player has a piece of the data picture, but without being able to centralize and organize the data, it doesn’t mean much. Without useful data, it’s difficult to build a reliable and targeted ad model, and without a convincing value proposition to advertisers, it’s difficult to monetize content.

To make matters more difficult, since podcasts are downloaded by users all over the world, with relatively small numbers of listeners for each podcast, they are both nearly impossible to survey, and not worth a large-scale ad buy. In the words of tech industry analyst and blogger Ben Thompson, “podcasts suffer from being both too small and too big at the same time.”

As a result, podcast advertising is nearly entirely limited to transaction-initiated subscription-based services, which Thompson explains this way:

The “transaction-initiated” bit means that there is a discrete point at which the customer can indicate where they heard about the product, usually through a special URL, while the “subscription-based” part means these products are evaluating their marketing spend relative to expected lifetime value. In other words, the only products that find podcast advertising worthwhile are those that expect to convert a listener in a measurable way and make a significant amount of money off of them, justifying the hassle.

Regular podcast listeners will likely be very familiar with brands like Harry’s Razors, Blue Apron, and Squarespace, all subscription-based services that offer a discount if the listeners use a special URL. This is because, under the current podcasting system, they are the only products for whom podcast ads offer a tangible ROI.
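The logic behind that ROI can be shown with some back-of-the-envelope arithmetic. The sketch below is purely illustrative: every number in it is hypothetical, and the function name `evaluate_ad_buy` and its parameters are my own invention, not figures or terms from any advertiser. It compares the cost of an ad buy to the expected revenue from the listeners it converts, using each customer’s expected lifetime value.

```python
def evaluate_ad_buy(ad_cost, listeners, conversions_per_1000, lifetime_value):
    """Return (expected_revenue, worthwhile) for a single podcast ad buy.

    conversions_per_1000 is the number of listeners, out of every thousand,
    who are expected to sign up through the sponsor's special URL.
    """
    expected_customers = listeners * conversions_per_1000 / 1000
    expected_revenue = expected_customers * lifetime_value
    return expected_revenue, expected_revenue > ad_cost


# Hypothetical numbers: the same ad slot, audience, and conversion rate,
# sold to two different kinds of advertisers.
subscription = evaluate_ad_buy(
    ad_cost=2000, listeners=20000, conversions_per_1000=5, lifetime_value=300
)
one_off_purchase = evaluate_ad_buy(
    ad_cost=2000, listeners=20000, conversions_per_1000=5, lifetime_value=10
)
print(subscription)
print(one_off_purchase)
```

With identical audiences and conversion rates, only the advertiser with a high lifetime value per customer clears the cost of the ad, and only the special URL makes the conversion count measurable in the first place.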

Bytedance, more than perhaps any other tech company in the world, has content centralization, aggregation, recommendation, and targeted advertising in its very DNA. Take a second, scroll back up, and read Ben Thompson’s description of what an ideal podcast aggregator would look like.


Am I mistaken, or is that not precisely what Jinri Toutiao has done with written content in China? It is hard to imagine a company in the world more suited to revolutionize the audio content industry.

So why not?

It’s hard to know for sure, but here are some possible theories:

  • Podcasts aren’t popular in China. In fact, they seem to be mostly confined to the anglophone world. Even my tech-savvy millennial friends in Beijing seem confused when I mention podcasts. It’s quite possible that, given a lack of familiarity with the medium, Bytedance decision-makers do not feel comfortable in such territory.
  • The monetization paradox. As explained above, podcasts aren’t exactly printing money these days, and therefore attract less investment from those looking to centralize the medium… and because podcasts aren’t centralized, they’re difficult to monetize… and around and around we go…
  • My own bubble. The demographic for whom podcasts are most popular is middle-class white American men in their 20s and 30s. I am a 31-year-old middle-class white American man. I also listen to a ton of podcasts. My proximity to the medium may not provide the necessary perspective.
  • Unknown factors. I don’t pretend to know what’s going on in Bytedance’s leadership meetings, so there could be any number of reasons miles away from my radar or that of those I know.

Regardless, podcasting is poised to be disrupted, and no company is more capable of doing it than Bytedance. As they pursue their global expansion, it may at least be worth a try. After all, there’s far less fake news in audio form…

Elliott Zaagman is a contributor to TechNode. He is also a corporate trainer, executive coach, and writer who splits his time between Bangkok and Beijing. He focuses on Chinese companies and how they relate...
