Since late February, China has relied on “health code” apps to control re-opening. A green code in your city mini-app gets you into markets, office buildings, and public transport. Yellow or red? You could be barred from your train, or even sent into self-isolation. 

These apps follow the ebb and flow of the virus. Though both may occasionally disappear, whenever China sees a new local outbreak, health code checkpoints are never far behind.

Like most Chinese initiatives, the health code system is a diverse patchwork of local solutions, improvised to manage a crisis. Cities and districts have their own health codes, sometimes creating contradictions: one Financial Times reporter was told on returning to Beijing to ditch the municipal-level app and use their district’s own.

So what are all these codes based on? What data are they collecting? How long will they be around? 

The system’s extremely opaque—and it keeps changing—so answering these questions is hard; there isn’t exactly a CEO of Health Code to help out. But just for you, TechNode dug into the most detailed technical information available—standards issued on April 29 for the national health code system. 

They help us understand what information the system collects, and they link it to a national system that could stick around long after the pandemic is done. We’ve put together a FAQ to walk you through how we got here and what comes next.

Keeping score

First things first: for this article, we’ll use “health code” to refer to individual city-level apps and frameworks, and “health code system” to refer to their collective China-wide use (which, remember, is decentralized). 

What data’s in the health code?

The standards outline a surprisingly modest list: 

  • Travel history (up to the district/county, or qu/xian, level)
  • Directly related health information (e.g., temperature, symptoms)
  • Overall medical test results
  • Overall risk assessment

Personally identifiable information like residence community (shequ) is also required. Beijing’s most recent outbreak used shequ-level risk assessment, as TechNode editor David Cohen experienced, with some shequs deemed medium- and high-risk while others remained low. 
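
For the technically inclined, here is a rough sketch of what a single record built from that list might look like. It is an illustration only: the class and field names below are our own assumptions, not the formats the standards actually define.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Risk(Enum):
    """Overall risk assessment (hypothetical labels)."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class HealthCodeRecord:
    # All field names are hypothetical; the standards define their own formats.
    residence_community: str                                  # shequ
    travel_history: List[str] = field(default_factory=list)   # district/county names
    temperature_c: Optional[float] = None                     # directly related health info
    symptoms: List[str] = field(default_factory=list)
    test_result: Optional[str] = None                         # overall medical test result
    risk: Risk = Risk.LOW                                      # overall risk assessment


# Made-up example values, for illustration only.
record = HealthCodeRecord(
    residence_community="Example Shequ",
    travel_history=["District A", "District B"],
    temperature_c=36.6,
    test_result="negative",
)
```

Note how coarse the record is: travel history stops at the district level, and test results and risk are single summary values rather than full medical files.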

Huh, seems reasonable. But wait—where’s this data come from, then?

According to the standards? Just about anywhere. Potential sources include: 

  • Diagnosed cases
  • Close contact history
  • Nucleic acid tests
  • Antibody tests
  • Self-reported data
  • Temperature data at checkpoints
  • Data from phones remaining in at-risk areas for too long

…and, you know, a couple of others.

This isn’t new. The health code system already cross-checks government data on train and plane tickets, according to a GovInsider interview with Yong Lu, vice president at the Shanghai Data Exchange Corporation. He also says it uses location-based data from telcos.

That seems like an awful lot of data. Is it actually all being used?

Great question. The standards don’t advise on that, and given how complex and varied these data sources are, that decision will probably be handled locally. 

Data from private companies is especially touchy, as these companies are reluctant to hand over customers’ personal data. During the recent Beijing outbreak, Alibaba and Tencent jointly denied a rumor that they were doing so.

It’s also not clear how detailed the location-based data is, and TechNode contributor Dev Lewis notes an absence of high-precision location-based data such as mobile payments (supplemental reading: Lewis has also gone deep on the standards docs). That doesn’t guarantee individual apps aren’t using such data, but Beijing’s recent struggle to trace contacts from the Xinfadi market outbreak suggests that at least some apps aren’t capturing that.

So are health codes permanent?

The standards aren’t definitive, but their foreword references taking a “long-term” perspective, so it sure sounds like they’re thinking about it. As we’ll discuss in a second, China’s trying to centralize government services and medical information, so the system’s likely to stick around.

Post-pandemic, the standards say health codes could even turn into a general-purpose “medical history code,” used in medical treatment, elderly care, and so on. When Hangzhou tried expanding health code use recently, it got quite a bit of flak.

Going national

So then this “national health code system”… if so much is kept local, what’s left to do?

According to the standards, the “national platform” sitting above local apps would be more like a directory or catalog emphasizing interoperability, rather than a national-level database.

According to the standards, the health code system should be organized to achieve mutual recognition between different local apps. (Image credit: Standardization Administration of China; translated by Shaun Ee)

Seemingly, individual regions would operate their own databases, but the national platform would provide a “table of contents” where, for example, one province could look up another province’s information. 

Think about a library system. If you were searching for a book, it’d be really neat to know its location and some basic details. Right now, every “shelf” in China is using a different organization system, so book hunting takes forever. And China’s “library” has grown pretty fast: by March, over 200 cities were using health codes, with limited discussion of compatibility.
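
In code terms, the “catalog” idea might look something like the sketch below. This is a toy model under our own assumptions (the region names, person IDs, and `lookup` function are all hypothetical), not a description of how the national platform is actually built.

```python
from typing import Dict, Optional

# Each region keeps its own database (here, just a dict of person -> code).
regional_dbs: Dict[str, Dict[str, str]] = {
    "province_a": {"person_1": "green"},
    "province_b": {"person_2": "yellow"},
}

# The national "catalog" stores only WHERE a record lives, not the record itself.
catalog: Dict[str, str] = {
    "person_1": "province_a",
    "person_2": "province_b",
}


def lookup(person_id: str) -> Optional[str]:
    """Find which region holds the record, then ask that region for it."""
    region = catalog.get(person_id)
    if region is None:
        return None  # nobody has catalogued this person
    return regional_dbs[region].get(person_id)
```

Here `lookup("person_2")` consults the catalog, learns the record sits with `province_b`, and fetches it from there. The catalog itself never holds any health data.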

But…?

But to our knowledge, that “library catalog” doesn’t fully exist yet. According to Wang Zhong, associate professor at the Beijing Academy of Social Sciences (BASS), China’s current healthcare information sharing platforms are at the prefectural or provincial level and not nationally managed. 

By themselves, the standards can’t create that sort of catalog. What they can do is make sure provinces are collecting the same data, so making that catalog is easier. But we’ll talk more about that later. 

So, just to be clear, this national platform isn’t going to be, like, a single mega-database of all the health information on every Chinese citizen ever?

All of it? Ever? Uh, probably not. It would be hard to unite diverse health information data types, like images and scans, plus there’d be significant privacy and cybersecurity concerns. 

That said, increased national-level data sharing still carries some risks. For example, earlier in May, someone leaked a 640,000-row dataset documenting case updates in over 230 cities to Foreign Policy.

Plus, in the longer run, the standards talk about connecting all local health code systems to an integrated (Yitihua) platform for government services; Dev Lewis has written about Yitihua’s health-related dimension. Yitihua isn’t the same as our “national platform” for health data, though: it’s much bigger.

No, really going national

In full, Yitihua (in Chinese) is the “nationally integrated online government affairs service platform” (quanguo yitihua zaixian zhengwu fuwu pingtai). It’s been around since well before the pandemic. The State Council (in Chinese) has been pushing its development since 2018 (in Chinese), and it actually went up in November 2019, but with relatively little fanfare—it’s still in beta mode. 

In other words, it’s not just for healthcare: the aim is to have a one-stop shop for all government business. That’s not an ambition unique to China—other (smaller) countries like Estonia and Singapore have done it.

The health code system could be a jumping-off point for Yitihua in the health sphere because it is easy to standardize. Wang from BASS told TechNode that the limited range of medical data required for the health code system is easier to collect, and that provinces are already collecting this information in relatively standardized formats.

So how do Yitihua and the health code system fit together?

There actually aren’t many specifics. As early as February, news articles (in Chinese) talked up the importance of Yitihua in epidemic control, but other than the creation of a national health code app (in Chinese), details are scarce about how it might integrate information. That app is far from universally used, and even the standards say it does not replace (in Chinese) existing regional apps.

This diagram from the standards shows how health information in Yitihua relates to the health code system, for example, but the floating boxes don’t much clarify how data sources are plugged in.

An illustration of the framework for the anti-epidemic and health information service system integrated platform, according to the standards. (Image credit: Standardization Administration of China; translated by Shaun Ee)

Still, Yitihua matters because it’s a long-term commitment to national-level data integration, particularly across government departments. In that light, it’s worth seeing the standards not as definitive, but as one step in a long (but determined) journey toward greater integration.

One standard to rule them all

So looks like we’re back to the standards. What’s their role?

To have Yitihua and the health code system, you need uniform data types and interoperable databases: hence, the standards. 

Just to be clear, interoperability doesn’t mean identical implementation. Per the standards, implementation is still very much decentralized: regions get to decide risk levels and how to use them. Conveniently, they’re also in charge of dealing with any complaints.
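
A small sketch of that distinction: two regions read the same record format but apply different rules to it. Everything here is hypothetical, including the field names, the region names, the color outcomes, and the 37.3°C cutoff, which we picked purely for illustration.

```python
from typing import Callable, Dict, TypedDict


class Record(TypedDict):
    # Shared data format (hypothetical): every region reads the same fields.
    visited_high_risk_area: bool
    temperature_c: float


def strict_policy(rec: Record) -> str:
    # Illustrative rule: any exposure or an elevated temperature means red.
    if rec["visited_high_risk_area"] or rec["temperature_c"] >= 37.3:
        return "red"
    return "green"


def lenient_policy(rec: Record) -> str:
    # Illustrative rule: exposure alone only earns a yellow code.
    if rec["visited_high_risk_area"]:
        return "yellow"
    return "green"


# Each region plugs its own policy into the same record format.
policies: Dict[str, Callable[[Record], str]] = {
    "region_a": strict_policy,
    "region_b": lenient_policy,
}

rec: Record = {"visited_high_risk_area": True, "temperature_c": 36.5}
codes = {region: policy(rec) for region, policy in policies.items()}
```

The same record comes out red in `region_a` and yellow in `region_b`: the data is interoperable, but the decision stays local.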

Where did the standards come from?

China’s main standards body, the Standardization Administration of China, released them on April 29. They’re technical papers intended to allow different actors to create compatible systems.

They consist of three documents: a description of how different databases should interact, the basic requirements for an application interface, and data formats to ensure compatibility.

Who put them together?

A mix of private companies and government agencies. Companies involved included Baidu, Alibaba, Tencent, and China Electronics Technology Group, while government agencies included central agencies like the General Office of the State Council, as well as provincial ones from Zhejiang, Guangdong, Shanghai, Hebei, Guizhou, and others.

They took just fourteen days to produce, counting from the initial press release, which is unusually fast. An engineer familiar with national standards (guobiao) told TechNode it normally takes about two years of negotiation to create a draft standard.

So what happens now?

We wait and see. The standards might be out, but that’s no guarantee of how they’ll be applied. 

But more than that, the standards can’t instruct provincial or central organizations to actually build data-sharing frameworks. After all, according to a 2019 paper by Wang, China has talked about central platforms for medical records since 2016, but progress has been stalled by the diversity of local medical systems.

If there ever were a reason to get a move on with that project though, what better one than a pandemic? One day, we may well look back at that January in Wuhan and see it as the moment that injected new urgency into the national platform’s veins.

Shaun Ee

Shaun Ee is a Yenching Scholar working at the intersection of geopolitics, tech, and national security. Before moving to Beijing, he was assistant director at the Atlantic Council’s Cyber Statecraft...