Speaking Icelandic when you are an American AI: Multilingual capabilities of LLMs

This article tests how well leading AI models speak Icelandic, focusing on Google’s Gemini and Microsoft’s Copilot.

Introduction

As the author of a tech blog, one must constantly seek new or noteworthy material to write about. Scarcity of ideas is not really the problem; rather, the ideas are not always good. The idea behind this article, namely to compare AI models in terms of their capacity to speak Icelandic (or any non-English language, for that matter), would have been dismissed as ridiculous had I thought of it when I launched this website and blog in 2020. That year, I also started an independent IT practice and began my PhD studies on AI and e-government at Reykjavik University.

At the time, generative AI was still new and apps such as ChatGPT did not exist. Most tech bloggers were not interested in AI’s linguistic potential; those curious about AI emphasized machine learning and big data capabilities instead. Moreover, AI applications such as machine translation (read: Google Translate) have historically been terrible at translating small languages, particularly Icelandic.

Large language models (LLMs), such as the GPT series, are enabled by freely available big data on the internet and powerful cloud-based infrastructure, and they have given rise to virtual agents like ChatGPT. Such AI systems are called generative AI (GenAI) applications and can generate text in response to a prompt or other input from users (Hjaltalin, forthcoming).

Icelandic is a micro language, spoken by only approximately 300,000 people, most of whom live in Iceland. Icelanders ought not to take for granted that it will be supported by major LLMs. AI researchers in Iceland have worked hard to create an Icelandic language model (not to be confused with LLMs, however). The aim is to ensure that AI models, including popular LLMs (e.g., GPT-4), can speak and understand Icelandic (see Heimisdottir, 2024).

The method

Against this (albeit brief) background, let’s delve into the specifics of the analysis. I chose Gemini (Google’s LLM) and Copilot (powered by OpenAI’s LLM, i.e., GPT-4) as my GenAI sample. While OpenAI’s GPT-4 represents the first LLM to learn Icelandic (“Preserving languages for the future”, 2023), Gemini has more recently learned it as well (see below). Both LLMs were presented with the following two questions:

  1. Talar þú íslensku? / Do you speak Icelandic?
  2. Hvernig myndir þú lýsa Íslandi? / How would you describe Iceland?

The evaluation of the results is based not on the length or quality of the content but on grammatical soundness. For instance, I check whether the models use appropriate words and the correct forms (e.g., tense and case). In particular, fallbeyging, or declension (the inflection of nouns and pronouns by case), is a difficult part of the grammar that foreigners (and some Icelanders) struggle with when learning the language. If a model declines nouns and pronouns correctly, this indicates superb Icelandic skills.
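To make the fallbeyging (declension) criterion concrete, here is a minimal Python sketch of what checking a noun form against its expected case could look like, together with a simple tally of errors per model and category. It is only an illustration under stated assumptions and not the tooling behind this article: the names `PARADIGMS`, `is_correct_form`, `ErrorNote`, and `tally` are hypothetical, and the only linguistic data used is the standard singular paradigm of hestur (horse).

```python
from collections import Counter
from dataclasses import dataclass

# Singular, indefinite forms of "hestur" (horse) in the four Icelandic cases.
PARADIGMS = {
    "hestur": {
        "nominative": "hestur",
        "accusative": "hest",
        "dative": "hesti",
        "genitive": "hests",
    },
}


def is_correct_form(lemma: str, case: str, form: str) -> bool:
    """Return True if `form` is the expected form of `lemma` in `case`."""
    return PARADIGMS[lemma][case] == form


@dataclass(frozen=True)
class ErrorNote:
    """One manually spotted error (hypothetical record format)."""
    model: str
    category: str  # e.g. "declension", "pronoun", "adjective agreement"
    note: str


def tally(notes: list[ErrorNote]) -> Counter:
    """Count errors per (model, category) pair."""
    return Counter((n.model, n.category) for n in notes)


# The preposition "um" (about) governs the accusative case.
print(is_correct_form("hestur", "accusative", "hest"))    # "um hest" -> True
print(is_correct_form("hestur", "accusative", "hestur"))  # "um hestur" -> False
```

In practice, each error spotted while reading a model's answer would become one `ErrorNote`, and `tally` would then give a per-model count for each grammar category.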

Analysis and results

In this section, I present my evaluation of the performance of the two models. For the record, I am not a linguist or an Icelandic teacher, but I have a good understanding of my native tongue, and I feel confident that my analysis is sound.

Results indicate that both models speak (or write) Icelandic. Excellent!

This covers the first part of my analysis. In the second part, we will explore how well or poorly they perform in their spoken (or written) Icelandic, particularly focusing on grammar rules (see above).

Gemini’s Icelandic skills are surprisingly good. The text is remarkably eloquent and creative in terms of vocabulary and descriptions. On the other hand, as presented in Figure 1, there were minor errors in Gemini’s response to my question.

Figure 1: Gemini’s (G) answer to question 2: How would you describe Iceland?

Specifically, I found three errors involving pronouns and the agreement of adjectives with the gender of the noun they describe.

Copilot’s Icelandic skills were decent, although I had hoped for a more creative response. While its response is clearly not as eloquent and rich as Gemini’s, this is not the determining factor in the evaluation. Indeed, Copilot performed quite well in terms of grammar and even outperformed Gemini in some areas, such as the use of pronouns (see Figure 2).

Figure 2: Copilot’s (C) answer to question 2: How would you describe Iceland?

Gemini is stronger in fallbeyging, or declension; I did not detect any errors in its declension of nouns. Copilot, however, uses the incorrect case of Geysir, as illustrated in Figure 2. On this measure, Gemini’s Icelandic skills are superior to Copilot’s.

Conclusion

Most (if not all) Icelandic linguists would agree that the declension of nouns is the “holy grail” of Icelandic proficiency. If you have mastered declension (or fallbeyging), you have reached an advanced level of proficiency in the Icelandic language.

Overall, I was impressed with the proficiency of both AI models in Icelandic. I am curious to know how well they perform in other languages. Of course, the benchmark used here (i.e., declension) does not apply equally to all languages, so it would need to be adapted in future research. Please share your thoughts in the comments section below.

References

Heimisdottir, L. (2024, September). The Icelandic Approach: Preserving and Revitalizing Linguistic and Cultural Diversity in AI.

Preserving languages for the future. (2023, March).

By illugit

I am a writer, PhD researcher, and business consultant interested in various topics, such as AI use in the public sector and managing digital transformation in organizations.