Decoding Albanian: Testing 140+ Words Across 4 LLMs

Our experience of translating over 140 Albanian words with the newest LLMs. Uncover insights on translation nuances and AI performance.

Our team at Patoko, based in Tirana, wanted to see how four state-of-the-art language models handle common Albanian words.

We selected over 140 words, aiming to cover a broad range of everyday usage. For each word, we sent the same prompt to every model through its API provider. We first asked for a direct translation of the word, then asked the model to write a sentence using that word. The results highlight where each model excels and where it falls short.
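The procedure described above can be sketched as a small test harness. This is a minimal sketch under stated assumptions, not the exact code behind our results: the prompt wording, the translation direction, the `ModelFn` signature, and the `echo_model` stub are all hypothetical. In a real run, each stub would be replaced by a function that calls the provider's API.

```python
from typing import Callable, Dict, List

# A "model" here is any function that takes a prompt and returns text.
# In a real run it would wrap a provider's API client (assumption).
ModelFn = Callable[[str], str]

# Hypothetical prompt wording; the exact prompts are not shown in this post.
TRANSLATE_PROMPT = (
    "Translate the Albanian word '{word}' into English. "
    "Reply with the translation only."
)
SENTENCE_PROMPT = "Write one short Albanian sentence that uses the word '{word}'."


def run_word_test(words: List[str], models: Dict[str, ModelFn]) -> List[dict]:
    """Send the same two prompts for every word to every model."""
    rows = []
    for word in words:
        for name, ask in models.items():
            rows.append({
                "word": word,
                "model": name,
                "translation": ask(TRANSLATE_PROMPT.format(word=word)).strip(),
                "sentence": ask(SENTENCE_PROMPT.format(word=word)).strip(),
            })
    return rows


def echo_model(prompt: str) -> str:
    """Stub model so the sketch runs offline; it only echoes the prompt."""
    return f"[stub reply to: {prompt}]"


if __name__ == "__main__":
    stub_models = {"stub-a": echo_model, "stub-b": echo_model}
    for row in run_word_test(["faleminderit", "mirë"], stub_models):
        print(row["model"], row["word"], row["translation"])
```

Keeping the prompts as module-level constants makes it easy to confirm that every model receives identical wording, which is the point of the comparison.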

Why Focus on Albanian?

Albanian is spoken by millions but is often underrepresented in AI datasets. Its grammatical rules and cultural nuances offer a strong test for any language model. By focusing on Albanian, we get a clearer sense of how these models might work (or struggle) in real-world tasks that require more inclusive language support. Consider this version 1 of our LLM leaderboard.

Models We Tested on Shqip

  1. Flash 1.5 (Gemini)
    Released on May 14, 2024, Gemini 1.5 Flash is designed for speed and coverage across tasks, balancing quick responses with reasonable accuracy.
  2. DeepSeek V3
    Launched on December 26, 2024, DeepSeek V3 uses a Mixture-of-Experts architecture with 671 billion total parameters, of which 37 billion are activated for each token. It performs well on coding and math tasks.
  3. Llama 3.3
    Developed by Meta, Llama 3.3 is a 70 billion parameter model released on December 6, 2024. It aims for high performance and efficiency in a range of language tasks.
  4. GPT-4o Mini
    A cost-efficient small model from OpenAI, GPT-4o Mini was released on July 18, 2024. It has a 128,000 token context window and can generate up to 16,384 tokens per request. It offers competitive performance at a lower price.

How They Performed

Here’s a table that breaks down how each word performed in context with each model’s translation and sentence creation. We encourage you to explore this table to see the detailed differences:

Category | Words
Greetings and Courtesies | thank you, please, sir
Common Verbs | is, be, have, do, think, go, see, look, come, wait, says
Pronouns | I, you, me, it, they, we, she, who
Adjectives | good, bad, fair, ready, fast
Adverbs | really, just, well, beautifully, never
Prepositions | by, from, with, in, outside, after, before
Conjunctions | and, or, because, thus
Time-related Words (Great for Taxis) | today, day, times, then, last
Quantifiers and Determiners | all, any, more, less, lot, some
Question Words | what, whether
Interjections and Expressions | damn, come on, heck, forgive
Miscellaneous Nouns | people, man, country, church, house, sage, lord, providence
Other Common Words | this, that, there, here, up, down, front, still, maybe, only, everything, nothing, something

  • Translation Accuracy varies across models. Each has moments of precise translations and moments where words are slightly off.
  • Context Sensitivity was another measure. Some models handled everyday usage better, while others did well with technical or formal usage.
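One way to turn these two measures into comparable numbers is to grade each answer by hand and then compute the share of correct answers per model and per word category. The sketch below assumes a simple correct/incorrect judgment per answer. The function name, the tuple layout, and the sample rows are hypothetical placeholders, not our results.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_category(
    graded: Iterable[Tuple[str, str, bool]]
) -> Dict[str, Dict[str, float]]:
    """Return the share of correct answers per model and category.

    `graded` holds (model, category, is_correct) tuples from human review.
    """
    totals = defaultdict(lambda: defaultdict(int))
    correct = defaultdict(lambda: defaultdict(int))
    for model, category, is_correct in graded:
        totals[model][category] += 1
        if is_correct:
            correct[model][category] += 1
    return {
        model: {cat: correct[model][cat] / n for cat, n in cats.items()}
        for model, cats in totals.items()
    }


# Placeholder judgments for illustration only; these are not measured results.
sample = [
    ("model_a", "Pronouns", True),
    ("model_a", "Pronouns", False),
    ("model_a", "Adverbs", True),
    ("model_b", "Pronouns", True),
]
scores = accuracy_by_category(sample)
print(scores)
```

Grouping by category rather than reporting a single score per model keeps the differences between everyday and formal vocabulary visible.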

Why This Matters

Accurate handling of Albanian by language models matters for connectivity. It helps in education, translation services, and daily life. We’re working to close the gap between languages for visitors to Tirana, along with many other places in the world.

If you plan to rely on one of these models longer term, weigh our test results against your own translation needs. If you translate business documents, look at how each model handled formal vocabulary and complex sentences. For social media content, check their performance with colloquial terms and idioms. Consider your budget alongside your accuracy needs: while GPT-4o Mini is fairly cheap per million tokens, DeepSeek V3 shows stronger results with technical terms at a higher cost. For basic conversations, a model that excels in common vocabulary might work well, even if it struggles with specialized terms.
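To weigh budget against accuracy, it helps to estimate what a workload would cost at a given per-million-token price. The sketch below shows only the arithmetic. The function name and the prices are placeholders, not any provider's actual rates, so check each provider's current pricing page before relying on the numbers.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate the cost of a workload billed per million tokens."""
    return (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000


# Placeholder prices (USD per million tokens); NOT real provider rates.
cost = estimate_cost_usd(
    input_tokens=2_000_000,
    output_tokens=500_000,
    input_price_per_million=1.00,
    output_price_per_million=2.00,
)
print(f"${cost:.2f}")
```

Running the same workload figures through each candidate model's real prices gives a like-for-like cost to set beside its accuracy.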

What’s Next

In future work, we plan to add more state-of-the-art models such as Claude Sonnet, OpenAI’s o1, Perplexity’s custom models, and others. As more LLMs enter the field, we aim to expand this testing and keep an up-to-date overview of how AI handles languages like Albanian. Want to work with us on this? Don’t hesitate to reach out.