Users are unknowingly training WeChat’s realtime image filtering system: researchers


Research published this week has brought to light the novel methods Tencent uses to censor and limit the proliferation of “sensitive” images in realtime on popular messaging app WeChat. The report claims that users are unknowingly contributing to a database of blacklisted images.

Why it matters: Chinese companies are required to police content on their platforms to avoid government censure. The methods these firms use to filter content are largely complex and clandestine.

  • Realtime categorization of images is computationally intensive and more complex than analyzing text sent within a chat.
  • Tencent has found ways to minimize processing times, to the point that WeChat users do not even realize when a photo has failed to be delivered.
  • Some companies employ thousands of content moderators but are also using technology to automate the process.
  • WeChat claims to have more than 1 billion daily active users worldwide.

“Tencent implements realtime, automatic censorship of chat images on WeChat based on text contained in images and on an image’s visual similarity to those on a blacklist.”

—Citizen Lab researchers Xiong Ruohan and Jeffrey Knockel

A Tencent spokesperson refused to comment when reached by TechNode on Wednesday.

Details: The report, published by the University of Toronto’s Citizen Lab, claims that users who send images on the app help to populate a blacklist of sensitive photos, each of which is categorized and given a unique “hash” fingerprint.

  • When a user sends an image, WeChat checks to see if an image’s fingerprint, which is the same for identical images and is easy to compute, has been included on a blacklist. If it has, the image is prevented from reaching the intended recipient, according to the researchers. The process is completed in realtime.
  • If the image is not on the blacklist, it is sent to the recipient. However, the image is then retroactively analyzed for sensitive content.
  • Text in a photo is analyzed using optical character recognition. The image’s likeness is also compared to others on the blacklist for so-called harmful content. This process takes longer to complete.
  • If unwanted content is found, the image’s fingerprint is added to the blacklist.
  • WeChat’s Newsfeed-like feature Moments and group chats are typically more heavily scrutinized than one-on-one conversations, the researchers found.
  • WeChat’s censorship is reactive to big news events, the researchers said, including the arrest of Huawei CFO Meng Wanzhou in Canada earlier this year, China-US trade tensions, and US elections.
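
For technically minded readers, the two-stage flow the researchers describe (a fast fingerprint lookup at send time, followed by slower analysis that feeds the blacklist) can be sketched roughly as follows. This is an illustration only, not Tencent's code: the `ImageFilter` class, its method names, and the `is_sensitive` callback are hypothetical, and MD5 is used merely as a stand-in for a fingerprint that is identical for identical files and cheap to compute.

```python
import hashlib
from typing import Callable, Set


def fingerprint(image_bytes: bytes) -> str:
    # Assumption: MD5 stands in for the "hash" fingerprint; identical
    # files always produce the same value and it is cheap to compute.
    return hashlib.md5(image_bytes).hexdigest()


class ImageFilter:
    """Hypothetical two-stage filter: fast lookup now, slow analysis later."""

    def __init__(self, is_sensitive: Callable[[bytes], bool]) -> None:
        self.blacklist: Set[str] = set()
        # Placeholder for the slow checks (OCR, visual similarity).
        self.is_sensitive = is_sensitive

    def send(self, image_bytes: bytes) -> bool:
        # Realtime path: only a set lookup, so it adds almost no delay.
        # Returns True if the image is delivered, False if it is blocked.
        return fingerprint(image_bytes) not in self.blacklist

    def analyze_later(self, image_bytes: bytes) -> None:
        # Retroactive path: runs after delivery; a hit adds the
        # fingerprint so future identical files are blocked at send time.
        if self.is_sensitive(image_bytes):
            self.blacklist.add(fingerprint(image_bytes))


# Usage: the first copy gets through, later identical copies do not.
image_filter = ImageFilter(is_sensitive=lambda data: b"banned" in data)
image = b"pixels...banned..."
first_delivery = image_filter.send(image)
image_filter.analyze_later(image)
second_delivery = image_filter.send(image)
print(first_delivery, second_delivery)
```

The sketch shows why the first sender of a new image effectively "trains" the filter: the image is delivered once, and only subsequent identical copies are stopped.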

Context: Regulator-imposed cleanup campaigns of the Chinese internet have become more frequent and far-reaching in recent years. Companies that do not comply are held liable through suspensions of their operations and fines.

  • Late last year, WeChat pledged to strengthen its censorship mechanisms in order to crack down on pornographic and vulgar content on social media accounts.
  • The move formed part of a campaign spearheaded by the National Office Against Pornographic and Illegal Publications, which began in April 2018.
  • Administrators of group chats can be held responsible for the content shared in the groups they operate.