“AI gap” believers make much of China’s advantages in data, especially vis-a-vis the social credit system. Unbound by privacy norms, the story goes, the state is hoovering up data that will allow the world’s most populous nation to train AI on the world’s largest data sets. Beijing has repeatedly vowed to make social credit data available to the public.
But how much of this data is actually making it to users? Not too much, finds this week’s translation—a heavily abridged academic article from the Journal of Library and Information Science. Based on a review of data availability across local governments, the researcher finds that data provision is patchy and poorly standardized.
Like so many aspects of the social credit system, the objects in view may be further away than they appear.
An analysis of open government data and credit service platforms in China
Author: Xiao Diyu
Original Publisher: Journal of Library and Information Science (2019, Volume 7)
Source: Yuandian Credit https://mp.weixin.qq.com/s/xugzyprTafQYRaoP8Ve9bQ
Outline: Accelerating the opening of government data is a priority at all levels of government in China, and is a hot topic at the forefront of e-governance and information management.
The government is the primary data-holder of society and therefore government data is an indispensable part of so-called “big data.”
The government holds more than 90% of the data resources of the entire society, and among the various types of government data, the opening of government credit information has long been a focus of public attention.
Government credit information is usually defined as information such as administrative licenses, penalties, and awards that is generated and recorded by administrative agencies.
Opening up: In recent years, various local governments in China have actively explored numerous ways of opening government credit information. Among them, one of the most popular methods is the provision of government credit information through the credit service section set up in the government open data platform.
Providing data: Business, quality inspection, taxation and housing management departments in various regions are the main information providers, with data that must be differentiated and classified by attributes or specific characteristics.
The paper finds that some systems classify data by themes—e.g. inspection, industry, construction—while others classify by government offices and industrial sectors.
Data update frequency
The frequency of data updates, and the commitment to maintaining it, are important criteria for assessing data timeliness. Following a statistical analysis of the update frequency of credit information on the platforms of various provinces and cities, the study found that:
- Seven of the platforms state how often their credit information is updated; Beijing's does not.
- Jinan and Guangzhou have a relatively high number of data sets that do not clearly indicate the update frequency.
- Looking at the seven data platforms, 68.2% of data is static (including annual, on-demand, irregular updates and not clearly indicated), and 31.8% of the data is dynamic data (including quarterly, monthly, weekly, daily, and real-time updates).
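The static/dynamic split above is simple bookkeeping, and a minimal sketch may make the grouping concrete. The category labels follow the study's own grouping, but the function names and the sample list are hypothetical and purely illustrative; they are not the study's data or method.

```python
from collections import Counter

# Update-frequency labels, grouped as the study groups them.
STATIC = {"annual", "on-demand", "irregular", "not indicated"}
DYNAMIC = {"quarterly", "monthly", "weekly", "daily", "real-time"}


def classify(frequency: str) -> str:
    """Return 'static' or 'dynamic' for a declared update frequency."""
    label = frequency.strip().lower()
    if label in DYNAMIC:
        return "dynamic"
    if label in STATIC:
        return "static"
    raise ValueError(f"unknown update frequency: {frequency!r}")


def shares(frequencies):
    """Return the percentage of data sets in each category, to one decimal."""
    counts = Counter(classify(f) for f in frequencies)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no data sets to classify")
    return {kind: round(100 * counts[kind] / total, 1)
            for kind in ("static", "dynamic")}


# Hypothetical sample of declared frequencies (NOT taken from the study).
sample = ["annual", "monthly", "not indicated", "daily", "irregular"]
print(shares(sample))  # {'static': 60.0, 'dynamic': 40.0}
```

Note that data sets with no stated frequency count as static under this grouping, so a platform that simply omits the field pushes the static share up.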
Usage rights: Usage rights are an important dimension for evaluating the openness of platform data. They mainly cover three aspects: free access to data (“free data”), the right to freely use data, and the free dissemination and sharing of data.
In terms of “free data,” a comparison of the open-data licensing agreements in various regions shows that Beijing, Jinan, Guangzhou, and Shenzhen all state in their agreements that, at present, all users have the right to free access to all government data resources provided by the website. Jiangxi Province and Chengdu state that users who have successfully registered through the platform have the right to access and use the data resources free of charge. Guizhou Province states in its agreement that “all data services provided by the government on the platform are free.”
Shanghai explains what it means by “free data” in more detail: users who have successfully registered and been verified can obtain existing open data free of charge, and have the right to free access to open data upon application.
Second, in terms of the right to freely use data, Beijing, Shanghai, Guizhou, Guangzhou, and Chengdu clearly guarantee the user’s right to use data freely in their platform agreements. Beijing, Jinan, Guangzhou, Shenzhen, and Chengdu all require users to indicate the source of the data in their research results, to report any use of the data on the website in a timely manner, and to cooperate actively with relevant user demand surveys and data resource surveys. Shanghai and Guizhou Province require users to clearly indicate in their results both the source of the data and the date it was downloaded from the platform.
Finally, on the free dissemination and sharing of data, only Jiangxi Province’s agreement offers no explanation. Jinan and Shenzhen require that users pass on data resources acquired on the platform free of charge. Shanghai likewise does not allow users to transfer data for compensation. All the agreements clearly state that users must comply with relevant national laws when disseminating and sharing acquired data.
As China’s social credit system improves, it’s also important to increase the openness of government credit information. Deficiencies still exist, mainly in the following aspects:
- The amount of credit information provided is generally small. Shenzhen has the most in the nation, with 39 data sets, while Jiangxi Province has the fewest, with six.
- The quality and quantity of credit information on platforms generally isn’t ideal. Judging from the information provided, current data sources are mainly concentrated in departments of industry and commerce, quality inspection, and taxation and housing management, while other government departments lag far behind. Many government departments have few or no data sets on the platform.
- The credit information data is not standardized.
- Some provisions limit the legal use of the data. This situation is obviously contrary to the basic principles of government open data.
- The total amount of visits and downloads of credit information is low.
- The quality of the platforms themselves is low. Data visualization is poor, and only a few data sets on a few platforms allow data preview and statistical analysis. The platforms are also inconvenient to use.