November 29, 2025

Latin America off the AI map: why its own language model is urgent

By Liliana Acosta/Latinoamérica21

ChatGPT has become the fastest-growing platform in the history of the internet, reaching one million users in five days and one hundred million in just two months. Its success is due to the novelty of the system, its ease of use and its free access. Today, people use it to work, study or solve everyday tasks, mistakenly treating it as a new search engine. Despite these benefits, we continue to fear artificial intelligence, especially when it comes to the future of work. However, there are other, equally worrying challenges in the Ibero-American context that should already be part of the public agenda.

Models like ChatGPT, Gemini, BERT or Claude are not a single technology in themselves; rather, they integrate different systems. One of these is the Large Language Model (LLM), which is trained on enormous volumes of data so that machines can process and generate text with surprising naturalness.

ChatGPT is the interface (the face) of the LLM and, according to OpenAI, the company that develops ChatGPT, it was trained with “public and free” information available on the Internet, such as web pages, blogs, forums, Wikipedia, articles and academic documents. At first glance this gives a sense of diversity of information, but in practice it means that more than 70% of the data used for training is in English. And that is precisely where our problems begin: in the disparity in the origin of the data.

The linguistic bias

When ChatGPT generates responses in Spanish, these are not a product of data in this language, but the result of automatic translations. The result is a strong Anglo-Saxon cultural influence that can distort nuances and expressions typical of Hispanic language and thought. Using ChatGPT or any other language model is, in a way, like watching an English-language movie with Spanish subtitles.

The really disturbing thing is that, although Spanish is the second most used language on the Internet, its digital content represents only 6% of the web, compared to 49% for English. On platforms like Netflix, barely a third of films are non-Anglophone, and that third is distributed among about thirty languages. So, although the digital world may seem diverse, since we are sold the idea that we can all generate content, the truth is that most of what we see, read and hear has a North American accent.

Another problem, among the many that we have with technology, is epistemological: it has been appropriating terms traditionally reserved for humans, such as intelligence, reasoning and analysis, and we, unfamiliar with the technological concepts, have equated them with their human meaning. So, when it is said that an LLM uses “natural language”, this does not refer to the language that people speak. It means that, thanks to mathematical and statistical models, it is able to decipher how we use words.

Therefore, the ethical question is: does ChatGPT truly understand the cultural diversity of the world or does it simply reflect the cultural limitations of its training data? I think we all know the answer.

Algorithmic colonialism

The problem worsens when these limitations translate literally into invisibility, because Ibero-American cultural representation in these data is minimal. That is serious, considering that Spanish is spoken in 21 countries on three continents and that there are more than 635 million Spanish speakers. And it is not simply a question of including words: it is a question of identity. We do not speak the same way in Colombia as in Spain: we are united by language, but we are differentiated by history, miscegenation, geography, the tropics, accents and even mosquitoes. If these nuances do not exist in the data with which LLMs are trained, then our voices will be ignored, not to mention those of historically marginalized groups such as women, indigenous people and people of African descent.

Today, when there is so much talk about colonialism, perhaps we should look towards a new, more subtle and perverse form, which is algorithmic colonialism, where Anglo-Saxon values and ways of thinking dominate digital discourse. And meanwhile, we continue using ChatGPT to “improve” our texts… without noticing that, little by little, the algorithm is redefining even the way we communicate.

And this new form of cultural invisibility is already taking its toll on us. A study by the Complutense University points out that our communication via email, social networks and WhatsApp is changing: we now use short phrases and a more artificial tone. The texts generated by ChatGPT in Spanish are usually literal translations from English, which eliminates nuances and turns of phrase, simplifying the expressive richness of the language and fragmenting the traditional Spanish paragraph. Not to mention how we have stopped using punctuation marks, or how we now put a comma after the greeting, as is done in English, when the rule in Spanish establishes that a colon should be used.

The need for an LLM in Spanish

We Latin Americans, who are always looking for the differences between us, should start thinking about the urgency of having an LLM in Spanish. A project like this would not only allow inclusion in the digital world; it would also give us a shared purpose in pursuit of the common good, one that would generate jobs, knowledge, resources, alliances (among universities, governments and companies) and the possibility of appearing on the world map of artificial intelligence.

Our region needs to find spaces that allow the integration, not of a single identity, but of a group that speaks the second most used language on the internet. We have a moral obligation to build local datasets with linguistic and cultural diversity, because if we as a region don’t do it, who is going to do it?

Now, it is not about “imitating” Silicon Valley, but about thinking of an ethics situated in Latin America, one that responds to our contexts, that understands who we are, and that generates added cultural, social and technological value for the region. It is about moving from having our programmers work for first world countries, as cheap technological labor for others, to programming and producing technology by and for ourselves, technology that can, moreover, be exported to the world.

Because getting on the artificial intelligence train does not mean that all Spanish speakers use ChatGPT, but that we create the local conditions to build our own technology. The real leap is not to speak to a machine in Spanish, but to teach it to think from Spanish, with our values, our voices and our way of understanding the world.

Only then will artificial intelligence stop translating us and finally recognize us.


*This text is part of the collaboration between the Organization of Ibero-American States for Education, Science and Culture (OEI) and Latinoamérica21 for the dissemination of the Voices of Ibero-American Women platform.

This article was published in Latinoamérica21 and is reproduced with the express permission of its publishers.

Liliana Acosta is a philosopher specialized in ethics applied to technology. She is the founder of Thinker Soul, a consultancy focused on the digitalization of companies and innovation, and specializes in reflecting on and communicating about artificial intelligence (AI).
