The Road to Localization in an Open Source Project
When I decided to go down the localization road in my open source (OS) project, I had to learn a lot and overcome some problems. In this blog post I would like to describe exactly that.
There are a lot of issues that make localizations difficult:
That's a challenge even for companies. And now I'm standing in front of my project as a hobbyist open source developer, thinking about its localization. For open source projects (especially ones as short on manpower as mine) the hurdle is even higher due to lack of capacity. That's unfortunate, because many open source applications could be relevant to people who can't afford commercial alternatives and also don't know English or one of the other commonly localized languages. That is, where the benefit seems greatest, the hurdle is highest. To illustrate this, I would like to take you on a short journey through my experiences with localizing my OS project.
First of all, a little disclaimer. I was born in Russia and came to Germany when I was six. That means I speak German, English and very, very bad Russian (sorted by skill).
So I started my OS project, which is intended for private accounting (unfortunately it's still at "not ready enough" for release). It is supposed to help people get their finances under control. The people who need it most are the ones with money problems, so it has to be free! It should also reach as many people as possible, including those who can't speak English. "Localizations, yes or no?" is thus answered with a clear "F*** yeah!".
At first I decided to localize only in English and German, because these languages were good enough for my needs. I didn't dare to do Russian, because I never went to school in Russia and was therefore far from mastering the terminology. This alone shows that when you work on a project all by yourself, you can only handle localization as far as your own skills go. And mine are unfortunately very limited. Too limited! Still, this forced me to get the code into a state where it is localizable, so that for further languages "only" the existing texts have to be translated.
For further languages I needed help, and I went looking for it. I found it with my aunt. She was also born in Russia and works in the field of finance. Perfect! In addition, I had a Hungarian girlfriend at that time, who thankfully agreed to help (Szia, Magyarország. Hogy vagy?). I had their agreement, but I was still hesitant to take advantage of it. I wanted the program to be in a state that was worth their effort first. That is another realization: I did not want these helpers, who are dear to me, to work for me regularly, unnecessarily and completely free of charge. I just had a bad feeling about that.
But I used the help once, and then my hobby project was translated into four languages (out of 7,111; source: Wikipedia). Uh yeah. After that, consciously or subconsciously, I procrastinated on features that needed new texts. I preferred to deal with other things that didn't require new texts, which is not bad per se, but it held back the development of the project's real purpose. When I started on new features again, I put the English values into the Russian and Hungarian localizations as placeholders for the new texts. That was the state of the project until a few weeks ago, when I developed a new solution for myself. I wasn't eager to ask my aunt, and the relationship with my Hungarian girlfriend had ended in the meantime.
I solved (most of) the problems I had with localizations with the help of two projects. They are very specialized and designed for the C#/WPF/DeepL tech stack. Those who also work with C#/WPF/DeepL are welcome to try them; I am looking forward to feedback. I go into the technical details in the wikis of the respective projects (MrMeeseeks.ResXTranslationCombinator and MrMeeseeks.ResXToViewModelGenerator). Feel free to have a look there if you are interested. However, the concepts should be transferable to other tech stacks as well, and those concepts are the topic here.
Unfortunately I won't be able to spare you a short dry definition of the terminology, so let's get it over with quickly!
My projects distinguish between four categories of localization files:
Now a rough description of the two projects and what they do:
I'll be happy to torture you with more conceptual details and my personal experiences with these projects.
The TranslationCombinator is implemented as a GitHub Action step. If you follow the workflow documented and recommended in the repository, the step runs as soon as changes are made to localization files. It then uses the translation service (I chose DeepL; be aware that you need an account to access the DeepL API) to create or supplement the automatic files. After that, the combined files are created or supplemented, taking any overriding files into account. Last but not least, a pull request is created if the process resulted in changes to the files.
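To make the flow from default file to automatic file to combined file more tangible, here is a minimal, language-agnostic sketch in Python. It is not the actual implementation of the TranslationCombinator: the function names, the dictionary representation of a localization file and the `translate` callback (a stand-in for a call to a translation service such as DeepL) are all assumptions made for illustration. It also assumes that already translated keys are kept and only missing ones are sent to the translator.

```python
from typing import Callable, Dict

# Hypothetical in-memory form of one localization file: key -> text.
Texts = Dict[str, str]


def supplement_automatic(
    default: Texts,
    automatic: Texts,
    translate: Callable[[str], str],
) -> Texts:
    """Return the automatic file with every key of the default file filled in.

    Assumption: existing automatic translations are reused, and only keys
    that are missing are passed to the translation callback.
    """
    result: Texts = {}
    for key, source_text in default.items():
        if key in automatic:
            result[key] = automatic[key]
        else:
            result[key] = translate(source_text)
    return result


def combine(automatic: Texts, overrides: Texts) -> Texts:
    """Return the combined file: automatic texts, replaced by overrides.

    Assumption: overrides for keys that do not exist in the automatic
    file are ignored.
    """
    combined = dict(automatic)
    for key, text in overrides.items():
        if key in combined:
            combined[key] = text
    return combined


if __name__ == "__main__":
    # Fake translator so the sketch runs without any account or network.
    def fake_translate(text: str) -> str:
        return "[de] " + text

    default_file = {"Greeting": "Hello", "Farewell": "Goodbye"}
    automatic_de = {"Greeting": "Hallo"}   # "Farewell" is still missing
    override_de = {"Farewell": "Servus"}   # provided by a language expert

    automatic_de = supplement_automatic(default_file, automatic_de, fake_translate)
    combined_de = combine(automatic_de, override_de)
    print(automatic_de)
    print(combined_de)
```

The point of keeping the automatic and the overriding texts in separate files is that a new machine translation run never touches what a language expert has written by hand.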
Ideally, the developers only need to change the default file whenever they need new texts, and the language experts don't need to become active at all. However, they can provide overrides to the localizations when the need arises. Everything else is done by the TranslationCombinator. This also allows for completely asynchronous collaboration between developers and language experts.
Some conceptual choices:
The localization files have an XML format which is not designed for direct use in MVVM projects. This is where the ViewModelGenerator helps out. It takes the default file and the combined files and generates a set of ViewModel interfaces and classes from them. These can then be read directly from elements in the View and ViewModel layers. They also provide a convenient and performant way to switch languages at runtime.
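The idea behind switching languages at runtime can be sketched roughly like this, again in Python instead of C#. This is not the code the ViewModelGenerator emits. `Texts`, `CurrentTexts`, `subscribe` and `switch_to` are hypothetical names, and the listener list is only a simplified stand-in for the change notification a WPF binding would rely on.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Texts:
    """Hypothetical typed view of one language: one field per localization key."""
    greeting: str
    farewell: str


class CurrentTexts:
    """Holds the active language and notifies listeners when it changes."""

    def __init__(self, available: Dict[str, Texts], initial: str) -> None:
        self._available = available
        self._current = available[initial]
        self._listeners: List[Callable[[Texts], None]] = []

    @property
    def current(self) -> Texts:
        return self._current

    def subscribe(self, listener: Callable[[Texts], None]) -> None:
        # Simplified stand-in for a property-changed subscription.
        self._listeners.append(listener)

    def switch_to(self, culture: str) -> None:
        # Switching is only a reference swap plus a notification,
        # no file is parsed at this point.
        self._current = self._available[culture]
        for listener in self._listeners:
            listener(self._current)


if __name__ == "__main__":
    texts = CurrentTexts(
        {
            "en": Texts(greeting="Hello", farewell="Goodbye"),
            "de": Texts(greeting="Hallo", farewell="Auf Wiedersehen"),
        },
        initial="en",
    )
    texts.subscribe(lambda t: print("UI refresh:", t.greeting))
    texts.switch_to("de")
```

Because each language is a ready-made object with one member per key, a misspelled key shows up as an error in the code instead of as a missing text in the running application.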
I have created a third repository. This one is just a sample project using the other two projects in combination. If you want to see a complete example in action, feel free to check it out. Please note that I focused on the localization process there. This means that the rest of the code is not up to my usual standards. Here is a small animation of the result (I go through all languages once "slowly" and then a few times at full speed):

You can also have a look at the new localization workflow in the previously mentioned accounting hobby project (Project BFF).
Are all localization problems solved now? No, certainly not. But this is two steps closer to the goal of minimizing human effort. Now I don't have to burden my aunt and my ex-girlfriend with more work. Uh yeah. The great thing is that you automatically get a base localization generated and language experts are always welcome to contribute if they want. There is no pressure on anyone to take care of the localizations, but anyone who wants to has the opportunity to become active. In my opinion, this is worth its weight in gold for an open source project.