
Arabic Localization - Our Experience

Discussion in 'General Discussion' started by VivienS, Sep 15, 2015.

  1. VivienS

    VivienS

    Joined:
    Mar 25, 2010
    Posts:
    24
    Arabic Localization of a Game – Our Experience

    Although there are lots of comprehensive articles and other information about Arabic translation, we were missing ones that explain the challenges for developers doing a localization. So we decided to create our own summary to help other devs quickly grasp the most important challenges. Feedback, corrections and additions are welcome!


    Last week, we were asked to localize our recent game “Lern Deutsch”, a German language learning game for beginners by the Goethe Institut, to Arabic. The client would handle the translation of the texts; our job was to incorporate the language and typeface into the game and display the text correctly. The cross-platform mobile game is made with Unity (NGUI), and we already had a localization system in place. Apart from that, we had no knowledge of the Arabic language or typography, nor did we have native speakers on the project. Despite that, we managed to set up our system for Arabic support within roughly 3 person-days.

    So here is what we learned:

    Some facts about Arabic script that you should know before starting
    Arabic and Arabic Script
    Just as the Latin alphabet is used to write many different languages and has sets of extended characters, so is the Arabic alphabet. “Arabic” usually means Modern Standard Arabic. Other languages using the Arabic script include Farsi, Pashto and Urdu. Some require additional character subsets.

    Which form to use?
    There are different styles of the Arabic script, used for different purposes. The most widely used one, and the one you most likely want to incorporate into your program, is called Naskh.




    image credit

    Characters
    Arabic in its basic form has 28 letters. Words are separated by spaces. There is punctuation, although some of the marks differ from the Latin ones. There is no distinction between capital and small letters. Instead, letters are modified depending on their context: a character will look different depending on its position in a word. Most characters therefore have 4 different contextual forms: isolated (default), final (at the end of a word, joined to the preceding letter), initial (at the beginning of a word, joined to the following letter) and medial (between two characters, joined on both sides).
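
    To make the four forms concrete, here is a small Python sketch (not part of our Unity project) that asks the Unicode character database which form a glyph is and which base letter it belongs to. The letter BEH is only an example we picked, and `describe` is a hypothetical helper name.

```python
import unicodedata

# The four contextual forms of the letter BEH (base code point U+0628),
# as encoded in the Arabic Presentation Forms-B block.
BEH_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]


def describe(ch):
    """Return (form name, base letter) for a single-letter presentation form."""
    tag, base_hex = unicodedata.decomposition(ch).split()
    return tag.strip("<>"), chr(int(base_hex, 16))


if __name__ == "__main__":
    for ch in BEH_FORMS:
        form, base = describe(ch)
        print(f"U+{ord(ch):04X} is the {form} form of U+{ord(base):04X}")
```

    All four code points map back to the same base letter, which is the one that ends up in a typed string.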

    Here are the Unicode spans of characters you definitely need for Arabic, Farsi, Pashto & Urdu. The first one contains the basic characters and numbers, the next two all the contextual variants.
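
    As a rough sketch, the three Unicode blocks that fit this description are Arabic (U+0600 to U+06FF), Arabic Presentation Forms-A (U+FB50 to U+FDFF) and Arabic Presentation Forms-B (U+FE70 to U+FEFF). The block boundaries come from the Unicode standard; that these are exactly the spans to export into a bitmap font is an assumption, and `block_of` is a hypothetical helper for checking which block a character of your translation falls into.

```python
# Block boundaries as defined by the Unicode standard.
ARABIC_BLOCKS = {
    "Arabic": (0x0600, 0x06FF),                       # basic letters and digits
    "Arabic Presentation Forms-A": (0xFB50, 0xFDFF),  # contextual forms, ligatures
    "Arabic Presentation Forms-B": (0xFE70, 0xFEFF),  # contextual forms
}


def block_of(ch):
    """Return the name of the Arabic block containing ch, or None."""
    cp = ord(ch)
    for name, (first, last) in ARABIC_BLOCKS.items():
        if first <= cp <= last:
            return name
    return None


def missing_from_font(text, font_chars):
    """List Arabic-script characters in text that the font does not contain."""
    return sorted({c for c in text if block_of(c) and c not in font_chars})
```

    Running `missing_from_font` over all processed strings before building the font texture is one way to catch glyphs that were left out of the export.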

    RTL (Right-to-left) support
    Languages that use the Arabic script are written from right to left, top to bottom.

    A text label with RTL support renders the characters of a string from right to left, breaks overflowing text on the left side of a line and is right-aligned by default.



    Though these rules might sound simple, implementing correct RTL support for a label is far from trivial. Luckily, you can fake it. But more about this later.

    Doing the localization
    The font
    We downloaded the Arabic fonts from Google Fonts. They are open source and “you can use them in every way you want, privately or commercially — in print, on your computer, or in your websites.” - Google

    Here are the other tools we used:

    Since we use bitmap fonts in our project, we first merged the new Arabic font into our default font using FontForge. Then we re-created our font texture with BMFont. Afterwards, our font texture was almost twice as big and we had to shift some assets around to make it fit in our atlas again.

    The strings
    The nature of Arabic strings took us a while to understand. To explain, I’ll start from the beginning: the input system. This is what an Arabic keyboard looks like:


    image credit

    Notice how there are the 28 characters mentioned above, but not the contextualized variants. They are not part of the input, because contextualization is handled by the software. However, this usually does not change the string in the background. There is usually only dynamic contextualization at render time.

    To explain this more clearly: An Arabic string will usually contain only the isolated character variants that are part of the “Arabic” Unicode set. To display it correctly, the software needs to take care of the RTL line direction and the character replacement.

    Here’s another example: In the first line, you can see a correctly rendered string (screenshot taken from Google Chrome/Google Translate). In the second line, the same string is rendered in a label that has no Arabic support (in this case Unity).


    That said, most modern programs have Arabic language support. So keep in mind that if you look at a dynamic string of Arabic text rendered somewhere, chances are very high that it has already been interpreted and the characters are displayed correctly.

    How to fake it
    Luckily, we didn’t have to re-write NGUI labels for correct RTL and Arabic support. Instead we found this great Arabic Support package by Abdullah Konash that took most of the work off our hands.

    The idea is this: Since RTL labels effectively only render characters from back to front and place the line breaks differently, you can process a string in a way that even if displayed with an LTR label it will still look like RTL.
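
    The idea can be sketched in a few lines of Python. This is not the code of the Arabic Support package, only a minimal illustration under simplifying assumptions: it handles single letters only (no ligatures such as lam-alef, no diacritics, no embedded Latin text or digits), and all function names are made up for this example. The contextual forms are read from Python's `unicodedata` module instead of a hand-written table.

```python
import unicodedata


def build_forms_table():
    """Map each base letter to its contextual forms (Presentation Forms-B).

    Only entries that decompose to a single base letter are kept, so
    ligatures and spacing diacritics are skipped.
    """
    table = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) == 2 and parts[0].startswith("<"):
            form = parts[0].strip("<>")  # isolated / final / initial / medial
            base = chr(int(parts[1], 16))
            table.setdefault(base, {})[form] = chr(cp)
    return table


FORMS = build_forms_table()


def joins_next(ch):
    """A letter can connect to the following letter if it has an initial form."""
    return "initial" in FORMS.get(ch, {})


def joins_prev(ch):
    """A letter can connect to the preceding letter if it has a final form."""
    return "final" in FORMS.get(ch, {})


def shape(text):
    """Replace each base letter with the contextual form its neighbours require."""
    out = []
    for i, ch in enumerate(text):
        forms = FORMS.get(ch)
        if not forms:
            out.append(ch)  # spaces, punctuation, anything unknown
            continue
        prev_ok = i > 0 and joins_next(text[i - 1]) and joins_prev(ch)
        next_ok = i + 1 < len(text) and joins_prev(text[i + 1]) and joins_next(ch)
        if prev_ok and next_ok:
            key = "medial"
        elif prev_ok:
            key = "final"
        elif next_ok:
            key = "initial"
        else:
            key = "isolated"
        out.append(forms.get(key, ch))
    return "".join(out)


def fake_rtl(text):
    """Shape a single line, then reverse it so an LTR label draws it like RTL."""
    return shape(text)[::-1]
```

    For a three-letter word such as `"\u0628\u064A\u062A"`, `shape` picks the initial, medial and final forms in that order, and `fake_rtl` returns them back to front so that the LTR label places the first letter at the right edge.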

    Processing the string from the above example with the ArabicSupport.ArabicFixer resulted in the following string (try not to get confused now ;))



    So the new string now renders correctly in the default label. All done right?

    Unfortunately not. Although the ArabicSupport package does handle the order of characters, character contextualization and existing line breaks, there was one last issue that it could not tackle, and that was dynamic line breaks:

    Because some of our labels are shorter than the lines of text from the localization system, the text is wrapped. But as we saw before, LTR and RTL line breaks are not compatible, so the dynamic line breaks messed up the strings again. And this is where we finally had to get our hands dirty and re-write parts of the NGUI label code to add correct dynamic line breaks to the processed Arabic strings. If someone is interested in the code, we’d be happy to share.
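
    To illustrate why the order of the two steps matters, here is a simplified Python sketch, not the NGUI change itself. It assumes a label width measured in characters (a real label measures glyph widths) and uses Latin placeholder words so the effect is easy to read; both function names are hypothetical.

```python
def wrap(text, max_chars):
    """Greedy word wrap: start a new line when the next word would not fit."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if current and len(candidate) > max_chars:
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


def reverse_then_wrap(text, max_chars):
    """What happens when an LTR label wraps an already reversed string."""
    return wrap(text[::-1], max_chars)


def wrap_then_reverse(text, max_chars):
    """Wrap the string in logical order first, then reverse each line."""
    return [line[::-1] for line in wrap(text, max_chars)]


if __name__ == "__main__":
    text = "ab cd ef"
    print(reverse_then_wrap(text, 5))  # the end of the text lands on line one
    print(wrap_then_reverse(text, 5))  # line one holds the start of the text
```

    With the reversed string, the label fills the first line with what is logically the end of the sentence. Wrapping in logical order and reversing each line afterwards keeps the reading order from top to bottom.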

    Conclusion
    Since our project does not feature input or editing of Arabic text, the “faked RTL labels” are more than sufficient and in the end saved a lot of localization development time. And since we process the strings at import time rather than at runtime, we save a (small) amount of runtime calculation.

    For all who are interested in implementing RTL labels for real: since the new Unity GUI does not have Arabic or RTL support either, there would still be demand for that in Unity 5.
     
  2. Stephan-B

    Stephan-B

    Joined:
    Feb 23, 2011
    Posts:
    2,269
    I started implementing RTL support in TextMesh Pro. Here is a link which includes some of that information and example. This proof of concept (partial implementation) is included in the latest release. Unfortunately, I got tied up with a few other features and getting ready for Unity 5.2 but once the dust settles, I'll resume working on RTL support.

    Thank you for this detailed post. It will certainly be helpful in this process. It is hard to add support for a language you can't read or write.
     
  3. Schneider21

    Schneider21

    Joined:
    Feb 6, 2014
    Posts:
    3,512
    I tried learning a bit of Arabic before my deployment to Iraq, and this post did a better job explaining the writing than a 40 hour course with the Army. Thanks for the write up!
     
  4. sicga123

    sicga123

    Joined:
    Jan 26, 2011
    Posts:
    782
    This information is much appreciated, thank you for sharing.
     
  5. goat

    goat

    Joined:
    Aug 24, 2009
    Posts:
    5,182
    Very nicely written.
     
  6. FayyadSufyan

    FayyadSufyan

    Joined:
    Feb 25, 2016
    Posts:
    33
    For everyone who wants Arabic in their projects, here is a new, accurate Arabic support tool that does not even require coding. Easy Alphabet Arabic is a simple editor window that provides support for several Arabic-script languages (Modern Arabic, Farsi, Urdu), with more languages to come in the next versions.
     
  7. nzakharchenko

    nzakharchenko

    Joined:
    Aug 1, 2016
    Posts:
    1
    VivienS, thank you for sharing! Great post! I had all these issues and have already fixed them, but one is still left. Can you tell us more about adding correct dynamic line breaks to the processed Arabic strings?
     
  8. naseem_amjad

    naseem_amjad

    Joined:
    Sep 10, 2014
    Posts:
    1
    You can also use "Urdu Nigar Classic" to type Urdu, Arabic or Farsi (Persian): simply copy the text from Urdu Nigar and paste it into Unity. Don't forget to set the font to "ALKATIB1".
     
  9. biosman22

    biosman22

    Joined:
    May 30, 2015
    Posts:
    4
    That's awesome. I need this code.
    Recently I've been developing a project to preserve the old Christian Coptic ritual, because the resources are scattered and there aren't many people interested in Arabic.
    Young people find it difficult to go to the library,
    so we have to deal with that.
     
  10. pKallv

    pKallv

    Joined:
    Mar 2, 2014
    Posts:
    1,191
    ما مدى دقة ترجمة غوغل في ترجمة اللغة العربية؟

    How accurate is Google Translate in translating Arabic?
     
  11. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    10,160
    As accurate as it is with anything else, which is to say "not at all and native speakers will laugh at you for even trying it."
     
  12. pKallv

    pKallv

    Joined:
    Mar 2, 2014
    Posts:
    1,191
    So the conclusion is, do not use Google Translate? :)
     
  13. Murgilod

    Murgilod

    Joined:
    Nov 12, 2013
    Posts:
    10,160
    Absolutely don't. If you need a project translated, hire a translator.
     
  14. biosman22

    biosman22

    Joined:
    May 30, 2015
    Posts:
    4
    Awful.
    Assuming that you know the Arabic syntax rules,
    I recommend this app: Screenshot_2017-03-31-14-37-32.png
     
  15. biosman22

    biosman22

    Joined:
    May 30, 2015
    Posts:
    4
    Arabic in Unity

    Unity is only designed to deal with left-to-right languages,

    and Arabic is a right-to-left language.

    So here is an example of the same sentence written in 2 different languages.
    God1.jpg
    Arabic in the real world:


    God2.jpg


    And this is what happens when we try to use Arabic in Unity:

    God3.jpg

    How can we write Arabic in Unity?


    In other words, how can we write Arabic left to right?
    Compare the last two images. We get from one to the other by:

    1. reversing the character order.

    2. connecting the characters.


    And that’s exactly what the Arabic Support plugin (by Konash) does.

    But there is a little problem with line breaks and horizontal wrap:
    the paragraph is shown upside down.

    God4.jpg



    I think the way to avoid the automatic line break error in the Arabic Support plugin is

    1. to rotate the object’s y by 180

    screen222.png

    2. reverse the character order

    3. use a reversed font inside Unity.
    //I didn't finish
     
    Last edited: Apr 3, 2017
  16. biosman22

    biosman22

    Joined:
    May 30, 2015
    Posts:
    4
  17. PalmGroveSoftware

    PalmGroveSoftware

    Joined:
    Mar 24, 2014
    Posts:
    17
    Neat post !

    I am from Morocco and have been working on Arabic issues for a while now (an Arabic app back in 2012 on Android < 2.2, where it wasn't handled properly on devices either). The pack from Abdullah rocks.
    The new Unity3D GUI still has the same issues, and I'm curious to know more about how you dealt with line breaks. I still have that to fix. Thanks for any help on that front!
     
  18. vx4

    vx4

    Joined:
    Dec 11, 2012
    Posts:
    181
    Thanks
     
  19. andrewBez

    andrewBez

    Joined:
    Mar 11, 2019
    Posts:
    6
    @VivienS I would be interested in the code change you made to implement correct wrapping of RTL text. My main issue with our game is that we translate at runtime and use best fit to resize the text to fit the textbox. Getting this to work correctly with an RTL language is proving very difficult. Does anyone know if there are any plans for official support of Arabic from Unity?
     
  20. andrefbr21

    andrefbr21

    Joined:
    Jan 27, 2013
    Posts:
    10
  21. andrefbr21

    andrefbr21

    Joined:
    Jan 27, 2013
    Posts:
    10
    I have managed to download FontForge and flip the Arabic glyphs! :D