2017 was an auspicious year for Chinese AI firm iFlytek. In June, MIT Technology Review ranked it the 6th smartest company in the world, just behind Google’s parent company Alphabet, making it the highest-ranked Chinese company on the list. In November, the Chinese government announced plans (in Chinese) to build national AI platforms in partnership with four companies. iFlytek was tasked with building the national voice AI platform; the three other companies chosen were Baidu, Alibaba and Tencent.
But it hasn’t always been smooth sailing for iFlytek. Founded in 1999, the company almost changed tracks to go into real estate after experiencing failed products, low revenues and difficulties in securing funding early on.
“In 2000, we had a meeting that is now famous in our company. At the meeting, someone suggested we go into real estate,” iFlytek co-founder Liu Qingfeng said during a segment of WeTalkTV (in Chinese). “But we made a choice that we would still make today. We said: ‘If you’re not behind voice recognition and speech synthesis technology, please leave’.”
Pioneers in Voice
Liu Qingfeng and iFlytek’s five other co-founders met at the University of Science and Technology of China (USTC) in Hefei, Anhui province. In the 1990s, they were students working on speech recognition and synthesis technology (now categorized as natural language processing) at the human-machine speech communication lab. Liu led a team that won first place at a state-run high-tech competition in 1998. This caught the attention of Kai-fu Lee, now one of the most high-profile investors in Chinese technology firms.
“In 1999, when I worked for Microsoft Research China, I tried to hire a top doctoral student from USTC, Mr. Liu Qingfeng, to work on speech recognition,” Lee wrote in a LinkedIn post that featured iFlytek in 2012. “But he was determined to start his own company. Starting a company in 1999 in China was no easy task, but Liu was determined, and started iFlytek, a speech recognition company.”
iFlytek’s first product was consumer-facing PC software called Changyan 2000 (畅言, or changyan, means “speak freely” in English). It allowed users to give voice commands to the PC and also provided an input method that recognized handwritten script. The software package was priced at RMB 2,000, a significant amount of money even now, and was advertised in over a dozen provinces in China.
It didn’t sell.
“Commercialization was very challenging for us. At the end of the year, there wasn’t even enough money to pay the staff,” Liu said in an interview with MoneyWeek (in Chinese) in 2008. Almost all of the team came from technical backgrounds and had little marketing experience. Other reasons for Changyan 2000’s failure included software piracy and the high operating expenses of after-sales support. Perhaps the biggest reason was that the consumer market was simply not ready for speech recognition tech at the time.
Shift to Enterprise
After learning from its failures, iFlytek decided to go the B2B route. An initial contract to provide speech recognition and synthesis tech for Huawei’s internal platforms took blood, sweat, and tears for the team to complete. But it worked out and turned into a long-term relationship. Other large clients followed, including ZTE and Lenovo. Soon almost anything to do with voice tech in China, such as call centers, voice navigation, and telecommunications services, ran on iFlytek technology. In 2002, iFlytek started to develop AI chips for voice recognition, which are built into home appliances and toys.
“iFlytek did it the hard way – they built the best technologies for speech recognition, found early adopters, and created a market that would otherwise be non-existent,” Kai-fu Lee wrote in his LinkedIn post.
In 2004, iFlytek began to turn a profit. From 2005 to 2007, the company maintained a compounded net profit growth of 135%. In 2008, Liu Qingfeng rang the opening bell at the Shenzhen Stock Exchange. iFlytek became the first company founded by university students, and the first natural language processing company, in China to go public. However, Liu knew that the best time for voice technology was still to come.
“iFlytek probably has to toil for another two to three years,” Liu said in an interview with Yicai (in Chinese) at the time of the IPO in 2008.
All in AI
Fast forward to 2018: iFlytek has grown into a company with almost 10,000 employees, and its AI technology, especially speech recognition and synthesis, is all around us in China. Amap’s popular voice guide, modeled on the voice of Taiwanese model Lin Chi-ling, is generated by iFlytek tech. If you come across a robot in an airport or hotel, that robot is most likely hearing your requests and replying to you thanks to iFlytek. An estimated 500 million people use iFlytek’s voice input method instead of typing on smartphones and computers.
“iFlytek now serves over 60% of the speech recognition and synthesis market in China,” iFlytek AIUI open platform supervisor Ding Rui told TechNode. For robotics, iFlytek estimated that over 80% of service-type robots use iFlytek’s natural language processing technology. “Basically any robotics or AI hardware firm in China will consider our speech technology first.”
They have good reason to do so. The AI company’s technology has won numerous international competitions, including eight wins at the Blizzard Challenge for English text-to-speech, the Google-hosted speech recognition CHiME Challenge in 2016, and the Winograd Schema Challenge, also in 2016.
Here’s a taste of iFlytek’s speech synthesis technology from this “rare footage” of US President Donald Trump speaking fluent Chinese.
Maintaining their Market-leading Position
While iFlytek is currently the market leader in voice AI technology, this position is increasingly contested as the AI race heats up. Baidu, Alibaba, and Tencent ranked lower than iFlytek on the MIT Technology Review’s list of the world’s smartest companies, but they are catching up fast in the natural language processing arena. Smaller players such as Sogou have also entered the race.
The AI company is staying on top of the competition by expanding into every area where AI can be applied. In education, iFlytek’s oral examination assessment technology has helped to assess over 1.7 million students sitting high school English oral exams in over 10 provinces. In the medical field, iFlytek is pushing transcription services that take down doctors’ notes, as well as AI-based diagnostic medical imaging. In courts, iFlytek technology helps transcribe court proceedings, and the company has also worked with the People’s Court to develop Project 206 (in Chinese), an AI system that streamlines the evidence collection process and provides suggestions for judges when assessing a case.
“This platform can determine within a second if the evidence collected for this case is complete or not,” iFlytek VP Jiang Tao said at a recent presentation in Tianjin. “What’s the most similar prior case to the current one? Then it’ll provide 3 suggestions to the judge based on statutes cited in the prior case, such as what is the crime, the length of the sentencing and the amount of the fine.”
iFlytek is also working with partners to provide its speech recognition technology for consumer devices such as Dingdong, Chinese e-commerce platform JD’s version of the Amazon Echo. A startup platform was launched in late 2017 to build an ecosystem of partners that are developing innovative AI applications. Shenzhen robotics firm Ubtech is one of the star companies to have come out of iFlytek’s startup ecosystem. Over 257,000 developers use the iFlytek Open Platform to build myriad products and services based on its natural language processing technology.
If iFlytek had gone into real estate, it would probably have made a ton of money by now, too. But this is a company whose heart is firmly in AI.
“In 1999, when we first started the company, we believed that in the future, every machine, every device, every toy, every car, would be able to ‘hear’ and ‘speak’ like humans do,” iFlytek VP Jiang Tao said at a recent event in Tianjin. “Later, we’ve extended that goal, [to let] every machine ‘hear,’ ‘speak,’ and ‘understand’.”