Microsoft is continuing to integrate and adapt its services in China, executives from the company announced at an AI developers’ conference in Beijing on May 21. To help win more clients in China, Microsoft is tripling its data centers, boosting its cloud services, creating an open platform with leading universities and tailoring software to government needs, while its XiaoIce voice assistant has learnt to convert books into fully dramatized audio shows for Chinese children.

Government integration

Opening the “Microsoft AI Innovate” event of 1,000 developers, Alain Crozier, chairman and CEO of Microsoft Greater China Region, said, “We’re the first global cloud partner to provide a fantastic, compliant and legal cloud with Azure and Office365 in China. With Windows 10 Government Edition we are designing the first ever Windows 10 for government and SOEs.”

Here “global” should be understood as “foreign”, as Chinese competitors are also operating around the world. Microsoft’s Azure cloud platform is in direct competition with fast-growing domestic suppliers such as Tencent Cloud, Baidu Cloud and Alibaba’s Aliyun. It is also subject to the country’s strict data-handling legislation and is now having to adapt to compete.

Alain Crozier opening the event in Beijing. (Image credit: TechNode)

After the US, China is the largest market for Microsoft’s cognitive services, according to Crozier, who said that the company was producing the best solutions for Chinese customers and developers. The company now has more than 110,000 Azure customers in the country. Crozier announced a tripling of data centers in China to meet the growing demands of customers and of the 100,000 Chinese developers working in the Microsoft ecosystem.

Looking more broadly at Microsoft in China, the company is putting significant resources into the country. It opened its first office here, in Beijing, in 1992 and now has over 5,000 employees and works with 17,000 partners. China is its largest research base outside the US.

Partnerships and products

The conference was a chance for Microsoft to talk about its new partnerships and products. Harry Shum, executive vice president of Microsoft’s AI research group, described Azure as the “world’s computer” and also the safest in the world. The aim to “empower every individual and organization” sees Microsoft working with Chinese partners in many different fields.

Harry Shum announcing the launch of Microsoft’s open AI platform with four Chinese universities (Image credit: Microsoft)

The software giant is creating an open AI platform with Peking University, Zhejiang University, Xi’an Jiaotong University and the University of Science and Technology of China. Microsoft will train lecturers and students with online training tools and make data sets available for AI projects.

A speaker from DJI explained how Microsoft helped build an SDK to let developers quickly code for different drone use cases. Data from the drones can be sent directly to Azure, where Microsoft services can make rapid diagnoses of, for example, problems with pipelines. Providing pre-built and customizable solutions such as these is part of Microsoft’s efforts to “popularize and democratize artificial intelligence,” said Shum.

Dr Huang Xuedong, who leads Microsoft’s speech and language research, explained the progress of his team. Its AI translation services are the first to surpass professional human translators in double-blind testing, said Huang (a remark that was translated into English by humans for conference-goers). China Mobile, with its 900 million subscriptions, is partnering with Microsoft for an AI call centre, Huang announced.

Huang stated that his company’s microphone technology is now the most advanced in the world, which is helping it move voice recognition beyond the current need for just one voice at a time, close to the microphone.

One of the outcomes is Microsoft’s world-first AI conference system, built in partnership with Chinese firm Roobo. The product is a slender black cone that is placed in the middle of the meeting room table. It has eight microphones which can detect and follow multiple people. Huang mentioned that even five speakers is no problem. In a live demo, a team of four colleagues had a meeting around the device. As they spoke in perfect Mandarin, a real-time transcript was created, with each person identified when talking and even action points pulled out and put in a separate column.

Video screen enlargement of the live demo conducted off-stage. Actors simulate a meeting around the listening device made in collaboration with Roobo (Image credit: TechNode)

Voice synthesis is another area of intense research for Microsoft. Huang introduced a service which collects samples of a user’s voice in order to synthesize it. This allows users to create customized voices, for example to tell children a story in their mother’s voice, according to Huang. (See below for more on XiaoIce.)

Transcript of the meeting generated by the Roobo device in real time. Participants are recognized and named in the minutes, and actions are listed in the right-hand column (Image credit: TechNode)

Good old Office365 is being updated. Outlook can handle voice recognition and will instantly translate a dictated message into another language. New scanning features will read any attachments and block the sending of any files thought to contain sensitive company information. PowerPoint can now use AI to recognize what is in the images loaded into its slides and offer automatic resizing and optimized layouts. Excel will do more to understand Chinese data and automatically populate columns, such as the province based on an address. It will then instantly render a 3D map data visualization. Despite the other, arguably more exciting developments, it was the Office365 stall that drew the biggest crowd after the speeches.


Microsoft’s AI social chatbot XiaoIce got a great deal of coverage.

The software stands apart from other social chatbots (which have around 200 million users across Asia) due to achieving “full duplex” in its China and Japan versions. “Full duplex” may sound fancy, but it simply means the system can speak and listen at the same time. Walkie-talkie communication is “half duplex”, whereas a phone call is full duplex. XiaoIce can now call users and have a realistic conversation, albeit in her manga-style voice. She’s made over 6 million calls so far.

In the demo she came across as a bit pushy, but the technology worked seamlessly. This full duplex capability will be made available to developers this autumn.

XiaoIce can take the lead in conversations, change topics, and keep up with threads. She has even managed a six-hour conversation, the longest in the world, in which she performed 16 tasks. She can now even detect whether the user seems hungry and use their current location to suggest nearby food options. Li Di, of the XiaoIce team, told the conference how XiaoIce generates coupons for nearby shops, which have a much higher use rate than coupons users receive in other ways. Word of mouth, even from a chatbot, seems to be persuasive.

As well as hosting CCTV1 programs and singing songs (she is being made to sound more human when doing KTV, even taking audible breaths), she’s also been creating commercials and writing poetry.

XiaoIce has also been taking drama lessons from Uncle Kai (凯叔) of Uncle Kai’s Stories (凯叔讲故事), an audio book app. Unlike most students, XiaoIce has also sucked up his voice to tell stories through him. Conference-goers were asked to pick out which of three heartfelt recordings was not by Uncle Kai. Surprise: they were all XiaoIce.

The software is now being tweaked so that it can be given a text file of a children’s story, read and understand it and then script and narrate a full audio book with musical interludes. In 24 minutes. It will be possible to customize the story by telling the software the child’s name so he or she can be included in the narrative. With the customized voice synthesis also being developed by Microsoft, XiaoIce could take on a parent’s voice to tell the child a story.

“It will be able to accompany [培养 also meaning ‘to raise’] children, teaching them good life habits, and through the stories understand manners better and to not be picky when it comes to food,” said Huang Xuedong.

This new way to “accompany children 24/7” will launch in time for Children’s Day, June 1.

Frank Hersey is a Beijing-based tech reporter who's been coming to China since 2001. He tries to go beyond the headlines to explain the context and impact of developments in China's tech sector. Get in...
