Singaporean writers object to IMDA using their works to train a large language model

Source: Straits Times
Article Date: 10 Apr 2024
Author: Charmaine Lim

One of the concerns is that AI software will start assimilating material that would otherwise be copyrighted, and adversely impact the livelihood of existing authors and publishers.

The local writing community is objecting to the Infocomm Media Development Authority’s (IMDA) plans to build a South-east Asia-focused large language model (LLM).

The National Multimodal LLM Programme (NMLP), “a base model with regional context that can understand Singapore’s and the region’s unique linguistic characteristics and multilingual environment”, was announced in December 2023.

But Singapore writers whose works would have to be used to train the LLM recently voiced their displeasure about the project.

An LLM is an artificial intelligence (AI) model that can understand and generate text responses after being trained on a data set of written materials. Claims that writers’ works were illegally copied into the data used to train LLMs, such as Meta’s Llama and the models behind OpenAI’s ChatGPT, have triggered lawsuits in the United States.

The IMDA sent out a survey on March 28 through Sing Lit Station (SLS) to gather writers’ responses on using their works to train the NMLP. The original April 7 deadline was extended to April 15, and the form will remain online indefinitely “to gather a full range of views”, with subsequent responses to be shared with IMDA on a rolling basis.

An IMDA spokesman said the SEA-LION (South-East Asian Languages In One Network) and NMLP were an effort to address how LLMs “may not fully represent our local language and context”.

He added: “This is a research effort to advance understanding of how we can achieve this. The intent therefore was to consult the broader community on how we might approach this research effort. Sing Lit was approached and we understand that they are in preliminary stages of gathering inputs.”

But writers are frustrated at the short timeframe and lack of clarity about usage and payment terms.

Ethos publisher Ng Kah Gay told ST: “If implemented without due consideration and safeguards, AI software will start assimilating material that would otherwise be copyrighted, and adversely impact the livelihood of existing authors and publishers.”

New York City-based transnational literary organisation Singapore Unbound raised concerns in an Instagram post on April 9, saying: “The survey does not state anywhere that IMDA recognises that the writings it is seeking are the intellectual property of the authors, or of the publishers to which the authors have sold the rights to their work.”

Boston-based Singaporean author Ally Chua told ST that while she appreciates the preliminary survey to engage authors, “the gist of the survey was entirely centred on sharing work to train LLMs – as if it was a foregone conclusion that usage of such written material is a go-ahead, and the survey and further discussion are just a matter of negotiation”.

“Our written works are not just blank ‘data’ that one can feed into a computer. This topic (using copyrighted works to train AI) is a contentious one across the world, and more conversation about what this means for intellectual property, commodification of unique Asean works, et cetera, should be done before jumping straight to ‘what can convince you to share your written materials with us?’”

A spokesperson for SLS told ST that it was in the preliminary stages of discussion with IMDA. “Our discussion focused on exploring the merits of including Singapore literary works in regional language research, and considering ways for IMDA to work with interested partners while protecting their rights. For now, we are still collating responses from the community.”

Mr Ng said he has asked IMDA to consult different stakeholders such as writers, translators, editors and publishers. “Without such consultation, we will not be able to forecast the impact of such LLM training on the practice and income of these stakeholders.”

Source: Straits Times © SPH Media Limited. Permission required for reproduction.
