Djihed Afifi

Archive for the 'Arabisation' Category

Gnome 2.22 Arabic Translation

10th January 2008

Over the last few weeks we have kept the GNOME 2.22 translation to Arabic in a healthy state, thanks to Anas Husseini, Abou Manal, Ahmad Farghal, Osama Khayat and Khaled Hosny. We are currently 95% done and in third place. Detailed statistics are listed below, and here is a link to the official GNOME statistics:

http://l10n.gnome.org/releases/gnome-2-22
The new modules for the next release have been settled, and the strings are more or less finalised, except for a few changes in the future. So, time to switch our focus to this release and get it done as soon as possible.

I have been quite busy in the last few weeks fixing various RTL and Arabic bugs in GNOME. Expect a few fixes in the next release. I will continue this work, so unfortunately I won’t have much time for translation.

It would be very good if we could complete this release as soon as possible. I would like to dedicate the last 2 weeks before the release to strict Quality Assurance and Translation Revision. I have built the whole new GNOME from sources and so I expect to test and review most translations.

Please feel free to assign yourself any of the uncompleted packages in this list, and let me know what you have taken.

* Translated    39311 95.42%
* Fuzzy          1271 3.09%
* Untranslated    615 1.49%

* Total         41197
* To be done     1886 4.58%
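
For anyone who wants to recompute these figures from raw counts, here is a minimal Python sketch. The function name `translation_stats` and the dictionary keys are made up for illustration; this is not how the GNOME statistics pages compute their numbers.

```python
def translation_stats(translated, fuzzy, untranslated):
    """Return totals and percentages for a set of message counts."""
    total = translated + fuzzy + untranslated
    todo = fuzzy + untranslated  # fuzzy strings still need a translator's review

    def pct(n):
        # Percentage of the total, rounded to two decimal places
        return round(100.0 * n / total, 2)

    return {
        "total": total,
        "todo": todo,
        "translated_pct": pct(translated),
        "fuzzy_pct": pct(fuzzy),
        "untranslated_pct": pct(untranslated),
        "todo_pct": pct(todo),
    }


if __name__ == "__main__":
    # Counts taken from the list above
    for key, value in translation_stats(39311, 1271, 615).items():
        print(key, value)
```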

Incomplete Packages
--------------------
Package                 Translated Fuzzy Untranslated
eel                       30         0    1
libgnomekbd               49         0    1
nautilus                1161         1    0
metacity                 514         1    2
gnome-applets-locations 4355         3    0
gnome-terminal           483         3    0
gnome-desktop             65         0    5
gnome-applets            939         5    0
epiphany                 909         5    1
gnome-session            122         6    1
gtk+-properties         1501         6    1
gnome-volume-manager     196         8    0
evince                   287         3    6
gtk-engines               32         4    6
cheese                    45         5    5
file-roller              249         7    3
sound-juicer             156         8    6
gnome-build              110         7   10
gnome-system-tools       231        11    6
evolution-data-server   1026        14    3
gtk                      908        16    1
gnome-utils              723        10    9
gnome-system-monitor     210         8   12
vino                      84         8   12
ekiga                    632        11   10
gconf                    453        24    0
eog                      245         8   17
gnome-keyring             59        16   15
totem                    426        22    9
tomboy                   355        24    9
gnome-power-manager      443        36    3
deskbar-applet           186        16   24
libgnome                 215        40    0
seahorse                 727        37    6
empathy                  266         7   37
gnome-control-center     829        41    9
gimmie                   122        40   13
yelp                     289        29   30
gdm                       66        51   15
gtksourceview            273        62   17
gdl                       17        48   33
gnome-games             1684        49   37
gcalctool                303        72   27
anjuta                  1853        80   32
orca                     924        63   52
glib                     215        56   68
gvfs                       0       149   29
evolution               4664       151   32

Posted in Arabisation, Gnome | 1 Comment »

Interview with Leonardo Fontenelle

27th November 2007

A while back I was interviewed by Leonardo Fontenelle (an active Free Software l10n contributor from Brazil), and the interview is worth mentioning here. In it, I described many aspects of our work in Arabic translation and at Arabeyes. Here I quote some passages:

On the open nature of the translation process:

I guess I don’t like some particular phase per se, but all in all, I very much adore the open nature of it. Right from the source code to the actual compiled message catalogues. You can’t really get much more open than this. This openness really pays off when translating weird messages, when trying to view translations live, when comparing with translations of other packages, when viewing translations of different languages, etc.

On the Technical Dictionary:

The technical dictionary is basically an English-Arabic dictionary for computing terms. At first, we started making it with .po files, but that created many problems with versioning and discussing the terms. So I had the idea of uploading the terms to our Wiki. I used some scripts to convert the .po files to Wiki xml input. The wiki, being open, allows people to edit as they see fit, discuss terms, suggest alternatives, etc. Then finally, there are some scripts that take the wiki pages and convert them back to .po files, as well as .pdf suitable for printing/reading. The experience was very rewarding to us.

On contributors:

For Arabeyes, we are forever in need of contributors. We do think of lots of ideas, but we always hit the shortage-of-manpower wall. We’d like to see Arabic support addressed in all popular OSS applications. We’d also like to develop a free Arabic OCR application and an automatic translator. This is short term, but the long term list is a big one.

Please read the interview here. It has also been translated into Portuguese, thanks to Leonardo.

Part 1.

Part 2.

Posted in Linux, Arabisation, Gnome | 1 Comment »

Downloads of the Technical Dictionary

18th April 2007

About 1 month ago I wrote some scripts to get the technical dictionary contents from the Wiki to a pdf.

At the time I wondered how many people would download it, so I did not spend much time on formatting it neatly. The resulting pdf was not very good.

Today, however, I decided to check whether people are actually downloading it. Doing some data mining on the Apache logs, I was quite surprised to see 387 downloads, 197 of them from unique IP addresses. The breakdown of unique downloaders by country is shown below.

Encouraging. Time to go back and make it look good.

On another front, parsing the referers (which page directed people to the pdf) showed that about 70% came from the Arabic page and 30% from the English page. This highlights the importance of having pages in both languages for Arabic and English speakers.
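
The kind of log counting described above can be sketched in a few lines of Python. This is only an illustration, not the script that produced the numbers in this post. It assumes the logs are in Apache's "combined" format; the function name `summarise_downloads` and the file suffix passed to it are hypothetical. The country breakdown needs a GeoIP database and is left out.

```python
import re
from collections import Counter

# One line of Apache "combined" log format (assumed):
# ip ident user [time] "METHOD path PROTO" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)


def summarise_downloads(lines, target_suffix):
    """Count successful requests for paths ending in target_suffix.

    Returns (total downloads, number of unique IPs, Counter of referers).
    """
    total = 0
    ips = set()
    referers = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or match.group("status") != "200":
            continue  # skip unparsable lines and non-successful requests
        if not match.group("path").endswith(target_suffix):
            continue
        total += 1
        ips.add(match.group("ip"))
        referers[match.group("referer")] += 1
    return total, len(ips), referers
```

To use it, open the access log and pass the file object as `lines`, for example `summarise_downloads(open("access.log"), ".pdf")`, where the log file name is again only a placeholder.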

Breakdown of unique Technical Dictionary downloads by country:

38 : EG, Egypt
22 : US, United States
12 : SA, Saudi Arabia
10 : GB, United Kingdom
9 : PS, Palestine
9 : DZ, Algeria
8 : TR, Turkey
8 : AE, United Arab Emirates
7 : MA, Morocco
6 : JO, Jordan
6 : DE, Germany
5 : IL, Israel
3 : TN, Tunisia
3 : QA, Qatar
3 : OM, Oman
3 : --, N/A
3 : IT, Italy
3 : FR, France
3 : CZ, Czech Republic
2 : SD, Sudan
2 : LY, Libyan Arab Jamahiriya
2 : KW, Kuwait
2 : IN, India
2 : FI, Finland
2 : CN, China
2 : BH, Bahrain
2 : A2, Satellite Provider
1 : ZA, South Africa
1 : UA, Ukraine
1 : TH, Thailand
1 : SY, Syrian Arab Republic
1 : RU, Russian Federation
1 : PK, Pakistan
1 : NZ, New Zealand
1 : NO, Norway
1 : MG, Madagascar
1 : LB, Lebanon
1 : HK, Hong Kong
1 : ES, Spain
1 : BG, Bulgaria
1 : AU, Australia
1 : AL, Albania

Posted in Arabisation | 9 Comments »

Statistical analysis of strings in popular Open Source Projects

3rd April 2007

At Arabeyes we have several Open Source Projects for translation, totaling more than 300000 strings. Our biggest challenge is preserving consistency and correctness across all of these projects. From experience, while some of the words seem obvious in English, their counterparts in another language (such as Arabic) can spark heated debates. A while back, in an effort to tackle this, we introduced the Arabic Technical Computing Dictionary, and we hosted it on a Wiki for open discussion by any translator. A few scripts extract the messages every week into various neat formats for translators, including .csv, .po and even a .pdf that is suitable for printing (I still need to fix some issues with the latter).

However, we still had problems prioritising discussions: Which words should we discuss first? Which need immediate attention? Are we missing any important words? Are we over-analysing words that are not important? I believe these are important questions that every translation project to any language should consider. They are especially important for languages that do not yet have a concise and established list of terminology translations.

The solution seems quite obvious: analysis of existing projects. While computers are quite bad at translation with human level accuracy, they are extremely good at statistics and counting. So why not exploit that?

So I put together a number of scripts that analyse .po files and output statistical data that can help us answer the previous questions. I operated on the biggest four open source projects we have: KDE, GNOME, OpenOffice.org and Mozilla (including Firefox and Thunderbird). The string pool had nearly 300000 strings. Reading the .po files, the scripts count the number of occurrences of each word. The top 10 most used words* are:

  1. 4734 file
  2. 3002 name
  3. 2538 error
  4. 2268 text
  5. 2110 use
  6. 1946 list
  7. 1931 window
  8. 1869 select
  9. 1826 open
  10. 1825 show
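
For readers who want to try this on their own language, here is a minimal Python sketch of the counting step. It is not the set of scripts used for the numbers above. It uses a deliberately simplified .po reader that only looks at `msgid` entries and their continuation lines, and the names `iter_msgids`, `count_words` and `count_words_in_tree` are made up for this example.

```python
import re
from collections import Counter
from pathlib import Path

WORD = re.compile(r"[a-z]+")


def iter_msgids(po_text):
    """Yield each msgid string found in the text of a .po file.

    Simplified: ignores msgid_plural and does not unescape sequences.
    """
    current = None
    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            current = [line[len("msgid "):]]
        elif line.startswith('"') and current is not None:
            current.append(line)  # continuation line of a multi-line msgid
        else:
            if current is not None:
                yield "".join(part[1:-1] for part in current)
            current = None
    if current is not None:
        yield "".join(part[1:-1] for part in current)


def count_words(po_texts):
    """Count lower-cased English words across the msgids of many .po texts."""
    counts = Counter()
    for text in po_texts:
        for msgid in iter_msgids(text):
            # Treat an escaped newline as a separator, not as the letter "n"
            cleaned = msgid.replace("\\n", " ").lower()
            counts.update(WORD.findall(cleaned))
    return counts


def count_words_in_tree(root):
    """Walk a directory tree and count words in every .po file below it."""
    paths = Path(root).rglob("*.po")
    return count_words(p.read_text(encoding="utf-8", errors="replace") for p in paths)
```

Calling `count_words_in_tree("some/checkout").most_common(10)` would give a top 10 list like the one above (the path is a placeholder). Walking the tree inside the script, rather than expanding a shell glob, also means the file names never have to be passed as command-line arguments.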

Again, this list may seem obvious, but a word like “select” has a few equivalents in Arabic, and we struggled to agree on one term. The complete list is available in this file. A .pot [0.5 MB] template is also available, but beware that it contains a lot of rubbish, and there are nearly 20000 entries so I can’t clean it all. If you clean it, I’d be interested in having a copy.

This only gives us the most popular words. We also want the most popular technical dictionary entries (including combinations such as “system administrator”; the previous list contains only single words). The most important technical dictionary entries are in this list.
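
Counting dictionary entries, including multi-word ones, could look something like the following Python sketch. It is an assumption about one reasonable way to do it, not the actual Arabeyes script, and the function name `count_dictionary_entries` is hypothetical. Each term is matched as a whole word or phrase, case-insensitively, inside the source strings.

```python
import re
from collections import Counter


def count_dictionary_entries(msgids, terms):
    """Count occurrences of each dictionary term (single or multi-word)."""
    # \b keeps "tab" from matching inside "table"
    patterns = {
        term: re.compile(r"\b" + re.escape(term.lower()) + r"\b")
        for term in terms
    }
    counts = Counter()
    for msgid in msgids:
        text = msgid.lower()
        for term, pattern in patterns.items():
            counts[term] += len(pattern.findall(text))
    return counts
```

Sorting the result with `counts.most_common()` gives the entries in order of how often translators will meet them.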

The difference between the complete list and the technical dictionary gives us the list of words that are not in the technical dictionary. Many of them are very important; I was honestly surprised to see words like “toolbar” and “tab” not being in the wiki.
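
The difference itself is a plain set operation. A minimal Python sketch, assuming the word counts are held in a dictionary and the technical dictionary is a list of English terms (the function name `missing_from_dictionary` and the `min_count` parameter are made up for illustration):

```python
def missing_from_dictionary(word_counts, dictionary_terms, min_count=1):
    """Return (word, count) pairs for words absent from the dictionary.

    The result is sorted by descending count so the most frequent
    missing words come first.
    """
    known = {term.lower() for term in dictionary_terms}
    missing = [
        (word, count)
        for word, count in word_counts.items()
        if word not in known and count >= min_count
    ]
    return sorted(missing, key=lambda pair: (-pair[1], pair[0]))
```

Raising `min_count` is one way to keep rare words and leftover rubbish out of the list that goes up for discussion.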

Analysis of individual projects is also available. Here are the most popular words for KDE, GNOME, Mozilla and OpenOffice.org.

The complete set of scripts and results resides in Arabeyes CVS. Feel free to make use of them. The scripts are GPL but the data follows the license of the individual projects**. If you have a different way of analysis, or have another set of words from your language, I would be very interested in hearing from you.

Special thanks to Chahibi for helping me with some ideas.

* KDE was excluded because bash complained of too many files (arguments). If you know of a way to increase the limit please let me know.
** I believe they are compatible with the GPL. If you disagree, please send me an email (no need to yell).

Posted in Arabisation | 2 Comments »

Arabic Gnome 2.16 Completed

31st December 2006

Finally, Gnome 2.16 has been fully translated to Arabic, thanks to a dedicated team of translators.

See statistics!

We will put more emphasis on quality and correctness for Gnome 2.18, since most of the job is already done. Some details are available in this Wiki page.

Posted in Arabisation, Gnome | No Comments »

Technical Dictionary on the Wiki

10th December 2006

The technical dictionary aims to translate and standardise technical terms that are used in software. It is an effort to unify the terms used across all projects, to present the user with consistent and understandable interfaces.

For some time we have been trying to discuss the terms on the mailing list. This created many problems: discussions are forgotten, people discuss terms over and over, and there is no single point of reference for all terms.

To solve these, we have recently imported the dictionary to the wiki. You are welcome to contribute, whether you are a native English or bilingual speaker. Proficiency is not needed. Ordinary users are also welcome, since the work is being done for them: you can comment on whether a term is understandable to you.

The dictionary is available here (currently only words starting with A are there):

http://wiki.arabeyes.org/Technical_Dictionary

Leave a comment on the discussion page if you would like guidance on where to contribute.

Posted in Arabisation | 2 Comments »

Arabic Gnome Making Progress

11th November 2006

The Gnome Arabic Team is making serious progress to complete Gnome 2.16 translation. We are a bit late due to lack of contributors, but many joined the team recently, so better late than never.

Arabic is a beautiful language, as I’m sure you will agree by looking at these screenshots of translated applications. Thanks to all those who helped: Khaled Hosny, Youcef Raffah, Youssef Chahibi, Mohamed Magdy, and all those who translated previous versions: Arafat Medini, Bayazidi, and a lot more. I would like to take this chance to extend my invitation to all past contributors and new members: you are always welcome, and there is a lot you can do to help the effort. A lot of people are waiting for the release. If you would like to help, please have a look at this roadmap, then email me at djihed at djihed.com

Now, here are the screenshots, click to enlarge:

Gedit: now the best Arabic text editor on Linux:

Arabic Gedit

Epiphany: the gorgeous Gnome Internet browser:

Arabic Epiphany

Ekiga: the Internet telephony software:

Arabic Ekiga

Eye of Gnome: the image viewer:

Arabic Eye of Gnome

And last but not least, file-roller, the archive manager:

Arabic File Roller

Posted in Linux, Arabisation, Gnome | 3 Comments »

ArabicOpenCD 0.1

8th October 2006

ArabicOpenCD, a project similar to Canonical’s OpenCD at opencd.org, has just been developed and released. The ArabicOpenCD aims to maintain the most complete collection of open source software on a single CD for Windows operating systems. The software is of high quality and provides a suitable alternative to often pirated software in third world countries including Arab countries.

If you own or work at a CD shop, library, computer repair shop, OEM shop or pretty much any organisation, then you could download the CD, burn it a few times and offer it to the public, or you could contact me and see what we can do. It is also really good for those who do not have a reliable, fast connection to the internet and would rather have a big collection of software offline, or for those who would like to learn how to program by example, as the software is open source.

Finally, if you think you can contribute by translating software, well, yes you can. It doesn’t require much, and you can really make a difference. Much of the software has been translated to Arabic by various translators at Arabeyes; the rest needs more contributors :~) Please head to www.arabeyes.org, or if you have difficulty you can contact me.

Oh yeah, the ArabicOpenCD, you can get it here: www.arabicopencd.org . Many thanks to Bashar Al-Abdulhadi for buying the domain and setting up the hosting.

Posted in Arabisation, cdmaftooh | 2 Comments »

Lexicons

4th September 2006

Arabising all terms from English to Arabic individually is cumbersome: it’s wasteful in terms of resources and it opens the door wide to repetitions and mistranslation. A better solution would be to translate terms collectively: collecting terms that relate to a particular subject by brainstorming and observation, thus forming lexicons, or probably more accurately mini-lexicons. After making a lexicon, we could “mass” translate it; in other words, collectively translate the words by relating the meaning of each set to the most suitable counterparts, while observing small differences in the meaning of each word. This, amongst various other techniques, can help us to write more accurate and understandable translations.

According to Chambers, here is the definition of a lexicon:

1 a dictionary, especially one for Arabic, Greek, Hebrew or Syriac. 2 the vocabulary of terms as used in a particular branch of knowledge The word…

This is comparable to مُعْجَم in Arabic, albeit a small one for each of the topics we want. An example would be the one I used in the previous article, the lexicon of exiting an application or a process, comprising: close, quit, exit, kill, terminate, end, finish, go out, stop, shut down, leave, discontinue, cancel, refuse, skip, break, abandon, give up, suspend, stand by, hibernate, crash. The list can go on and is still open.

There will be a coherent list of lexicons in the wiki.

Posted in Arabisation | No Comments »

Standardising Arabisation

30th August 2006

In translation work at Arabeyes.org we often run into problems of standardisation. There are loads of terms being introduced every day, and loads of old terms still incorrectly translated to Arabic. Analysing each of the English words and choosing an appropriate Arabic term is not an easy task. For example, consider the differences between [close, quit, exit, kill, terminate, end, finish] etc. What is the most appropriate for each among [L-, -L, -T-, -LL, , -T] etc.? It’s all terminology relating to exiting an application. The Arabic equivalents need to be carefully chosen to properly match their English counterparts and convey the right meaning. We can group a number of terms together under one topic to see their differences and analyse them separately and collectively.

First, over the coming articles, I will try to find ways that enable us to reliably skim all the relevant references, including Arabic and English dictionaries and past uses in other software. In other words, I will try to standardise our standardisation process.

Posted in Arabisation | No Comments »