Editor's Note:
With the global economy, including China's, facing challenges, some Western politicians and media outlets have stepped up smear campaigns against the world's second-largest economy. They selectively highlight information and distort facts to promote their narratives like "Peak China," ignoring the nation's resilience and growth potential. The Global Times' multimedia project offers in-depth analyses and balanced perspectives on the Chinese economy. This is the 18th installment in the series.
At Ningxia Data Labeling Industrial Base in Wuzhong, in Northwest China's Ningxia Hui Autonomous Region, on December 8, 2024, young annotators are busy identifying specific words in text or speech, outlining objects in images or videos, and tagging them on their computers. Photo: Chen Tao/GT
The rapid development of artificial intelligence (AI) and large language models has attracted a growing number of young people to the emerging industry.
In 2020, the market size of the data annotation sector in China reached 3.1 billion yuan ($425 million). By 2025, the figure is expected to reach 10.5 billion yuan, according to research services provider Newsijie.com.
The Global Times recently visited the Ningxia Artificial Intelligence Industrial Park in Wuzhong, Northwest China's Ningxia Hui Autonomous Region, to explore the development of big data and AI as new quality productive forces, which are expected to shore up China's economic growth.
Data annotation involves attributing, tagging, and labeling images, audio, text, and video in order to train machine learning algorithms to understand and classify information more accurately.
To meet the demand for high-quality and precise data labeling, the National Data Administration in April announced the establishment of seven data labeling bases in seven Chinese cities.
The move is expected to direct professional labeling resources toward high-quality, large-scale data labeling, improving the quality of the data used to train AI models. It could also help explore new data labeling methods, driving the overall advancement of the AI industry, experts said.
Data annotation
As the backbone of AI growth, data annotation has emerged as a promising profession, with over 90 percent of the workforce in this sector being data annotators.
Inside a former shopping mall in Wuzhong city, young people were seen meticulously tagging and processing images, videos, and text. Liu Renming, a veteran in AI, runs a company that employs some 200 annotators.
Founded in 2016, DreamDate, an AI data provider, launched its fourth data annotation base in Wuzhong in June. The company had previously set up data labeling bases in Jiangxi, Jiangsu and Anhui provinces.
At Ningxia Artificial Intelligence Industrial Park, where DreamDate is located, Liu Yue, a 26-year-old data annotator, shared her story about how she learned skills like text recognition and point cloud annotation.
"I used to be a preschool teacher in Beijing, but now I work as a trainee project manager at a data annotation company in my hometown, with a stable income of more than 4,000 yuan per month," Liu Yue told the Global Times.
"At first, I had little knowledge about data annotation. Terms like text recognition, bounding box selection, and fitting were fresh to me. Tasks, such as collecting point cloud data to identify and annotate vehicles, pedestrians, and traffic signs to train AI models used in autonomous driving, are areas I had never encountered before," Liu Yue said, adding that the job requires a high level of sensitivity, focus, and physical endurance.
Fan Min, a 34-year-old mother with an accounting background, echoed Liu Yue. Driven by curiosity about AI, Fan chose data annotation as her first job after re-entering the workforce.
"For beginners, data annotation work is relatively simple and easy to start. However, each project has unique rules that must be quickly understood and mastered, according to the data providers' requirements," she noted.
AI is empowering various industries at an accelerated pace in China and has created numerous job opportunities for Wuzhong. "A majority of data annotators at the Wuzhong base are locals who once worked in the service industry, such as delivery, hospitality, and catering, with some being jobless," Liang Kun, an official from the Ningxia Artificial Intelligence Industrial Park, told the Global Times.
The industrial park in Wuzhong currently employs 600 locals, with 62 percent aged 16-24, 29 percent aged 24-30, and notably, over 90 percent holding a college degree or higher, according to Liang.
The Future of Jobs Report 2023 released by the World Economic Forum said that nearly one-quarter of all current jobs in the world are likely to change by 2027, with 69 million new jobs to be created. The fastest-growing jobs include AI engineers and machine learning specialists.
Surging demand for talent
As data becomes a new type of indispensable input, playing an increasingly vital role in bolstering industrial digitalization, data labeling is essential for advancing the AI industry and driving economic growth, Pan Helin, a member of the Expert Committee for the Information and Communication Economy under the Ministry of Industry and Information Technology, told the Global Times on Tuesday.
As companies have rushed to build large AI models, the demand for data annotation, also known as data labeling, has increased. The data annotation industry faces a talent gap of nearly 30 million, Liu Renming told the Global Times.
Data annotators play a vital role in converting raw data into training data for machine learning, which is essential for optimizing AI systems. The daily tasks of a data annotator encompass a range of operations, including image recognition, voice transcription, and text classification.
The annotators need to meticulously label images, videos, or audio content according to project requirements from their clients, most of which are tech giants such as Huawei, Baidu, and TikTok, as well as autonomous driving players like Li Auto, Nio, and BYD.
Their work underpins sectors like autonomous driving, health care, and image recognition. For instance, in autonomous driving, annotators must label image and video data recorded while the vehicle is being driven, including identifying and positioning road boundaries, traffic signs, obstacles, and other critical information.
Annotators' starting salaries range from 1,000 to 2,000 yuan during the first six months. After this initial period, their earnings may rise to between 3,000 and 4,000 yuan. With a year of experience, the average salary for annotators is expected to surpass 4,000 yuan, with project supervisors earning up to 10,000 yuan.
According to Liu Renming, the data annotation industry has seen significant changes in the last decade. "Emerged from the collection of big data, data annotation was originally handled by software engineers and programmers, and has progressively transformed into what we see today," she added.
However, data annotation used to be underestimated and misunderstood. Data annotators were seen as "assembly line workers" in the age of AI and were often labeled as workers at the bottom of the AI industry, performing exhausting, physically demanding tasks rather than cognitive work.
The industry has faced challenges such as a shortage of skilled workers, a youthful workforce, and high employee turnover, Liu Renming said.
Looking ahead, companies are increasingly searching for annotators armed with specialized expertise, which may put those with more general skills at risk of being sidelined. "This trend is expected to result in a transformation of the talent landscape within the data annotation sector, ultimately raising professional standards and fostering high-quality growth within the industry," Liu Renming said.