Mudlet 4.0 internationalization roadmap

Post by **Vadi** » Mon Apr 03, 2017 6:57 pm

We've had requests since the earliest days of Mudlet for Chinese, Russian, Polish, Spanish, and other languages. Mudlet 4.0 will be all about making Mudlet work with them.

There are many aspects to internationalization (i18n for short), so I'd like to lay out a reasonable roadmap for this work. I think we should do this step by step and make minor (3.x) releases as we add support for each step. Tackling everything at once with our limited resources would be impractical and would take another 4 years to get a final version out.

I think a reasonable roadmap is the following:

1) support for displaying other languages in the main window
2) support for working with text in other languages in Lua
3) translating the Mudlet user interface itself
4) translating the documentation
5) translating the API functionality

The roadmap is available as tickets on GitHub.

#1 is the most vital because without it, Mudlet is just impossible to use. Once #1 is in, you can at least use the client - this is in fact what the Spanish (source) fork of Mudlet does. Speaking of forks, there are at least Russian and Chinese versions of Mudlet as well.

I invite all (future) users of Mudlet to post in this thread with their experiences with translated software, wishlists for Mudlet, and feedback on the proposed roadmap. If you've already been working on adding i18n support to Mudlet, please join us on GitHub so we can combine our efforts.

PS. Broken English accepted here, feel free to use Google Translate if need be.

Post by **Vadi** » Mon Apr 03, 2017 7:00 pm

My 2c is that I'm not sure about point #5 (making send() be enviar() or wysłać() or 發送() for example). We've had some discussions on this and as far as I can see it's not a good idea, but I'm open to opinions. I see Microsoft has tried this with translated VBA functions, and it just makes it impossible for users of the translated API to search for help online because all functions are named differently. Non-English programmers I speak to are against the idea as well. In general, translated programming languages throughout the world have failed time and again to take off - interested in other opinions on this though.

Post by **Jor'Mox** » Mon Apr 03, 2017 9:59 pm

So long as users are able to name variables in their native languages, at least the last part would be more or less trivial: it would amount to nothing more than writing a module that declares new variable names storing all of the original functions. And while you make a good point about it being potentially problematic, especially for more serious coders, it has the potential to make Mudlet easier to use for the more typical user, who is less skilled in the programming arena.
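The aliasing idea described above can be sketched outside of Lua as well. The following is only an illustration in C++, assuming a hypothetical `FunctionTable` class that stands in for the scripting environment; none of these names exist in Mudlet. A translated name is simply a second key pointing at the same function.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scripting environment: a table of named functions.
class FunctionTable {
public:
    using Fn = std::function<void(const std::string&)>;

    // Store a function under its original (English) name.
    void define(const std::string& name, Fn fn) { table_[name] = std::move(fn); }

    // Register newName as a second name for whatever `original` currently holds.
    // Throws std::out_of_range if `original` is not defined.
    void alias(const std::string& newName, const std::string& original) {
        table_[newName] = table_.at(original);
    }

    // Call a function by any of its names.
    void call(const std::string& name, const std::string& arg) const {
        table_.at(name)(arg);
    }

private:
    std::map<std::string, Fn> table_;
};

// Build a table in which the English name and two translated names from the
// thread share one function. `sent` records every argument that was passed.
inline FunctionTable makeTable(std::vector<std::string>& sent) {
    FunctionTable t;
    t.define("send", [&sent](const std::string& text) { sent.push_back(text); });
    t.alias("enviar", "send");  // Spanish name mentioned earlier in the thread
    t.alias("wysłać", "send");  // Polish name mentioned earlier in the thread
    return t;
}
```

Calling `t.call("enviar", "north")` and `t.call("send", "north")` on such a table would run the same function, which is why the aliasing itself is cheap. The open question in the thread is whether it helps users, not whether it is hard to build.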

Post by **Vadi** » Tue Apr 04, 2017 7:54 am

Would it, though? Everyone who is starting out relies on better programmers and documentation that's out there to help them along. If nobody can help them because even their functions are named differently, then there's not much hope of the person succeeding, and that's not a positive outcome.

Post by **ftpd** » Tue Apr 04, 2017 11:17 am

As I wrote previously in one of the help topics, I have spare time to help with the Polish version, at least with the user interface.

And a big +1 for not translating API commands. No one does that; users are used to working with 'commands' in English.

Post by **Jor'Mox** » Tue Apr 04, 2017 11:48 am

I learned everything from the wiki and experimentation until I got things working the way I wanted (plus some Lua tutorials for more foundational things). At a minimum, there should be an effort to fill out the few parts of the wiki that are lacking, and translate the content into the target languages. If there isn't a demand for translating the actual function names, then don't do it, though given the way the functions are named, I think that there should be a translation next to them in the wiki at least, to facilitate finding the desired function.

Post by **Vadi** » Tue Apr 04, 2017 4:58 pm

That seems like a good tradeoff. Code would still be workable and newbies would be able to find what they need more easily.

Post by **SlySven** » Tue Apr 04, 2017 5:01 pm

There is more to consider than whether function names should be translated or not: even if they are not, should the error messages be translated? I would propose two levels of translation:

1) Retain the existing function names for the Mudlet-provided (core C++) Lua functions, but translate the UI text that is produced on errors / bad values. {This should allow for a common Lua codebase, but may benefit from additionally returning a standardised "error number" - like what POSIX tries to do even on disparate *nixes - so that scripts never decide how to behave based on returned "strings" but only on returned numbers.}
2) Provide language-specific function names for the Mudlet-provided functions. This could be more "user-friendly", as the names could then actually mean something to the user. It would not help with the intrinsic Lua commands/functions, though, and it would tend to make scripts language specific unless we also provided a function to convert between the NLS function names and the "standard" Americanish names.
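The "error number" idea from the first level can be sketched as follows. This is a minimal illustration only: `ScriptError`, `CallResult`, `describe()` and `selectRoom()` are hypothetical names invented for the example, not Mudlet functions, and the Spanish text is just sample data. The point is that the code stays the same across languages while the message changes.

```cpp
#include <map>
#include <string>

// Hypothetical error codes; names and values are illustrative only.
enum class ScriptError : int {
    None = 0,
    BadArgument = 1,
    NotFound = 2,
};

// Result of an API call: scripts branch on `code`, users read `message`.
struct CallResult {
    ScriptError code;
    std::string message;
};

// Look up the user-facing text for a code in the given language,
// falling back to English when no translation exists yet.
inline std::string describe(ScriptError code, const std::string& lang) {
    static const std::map<std::string, std::map<ScriptError, std::string>> texts = {
        {"en", {{ScriptError::None, "ok"},
                {ScriptError::BadArgument, "bad argument"},
                {ScriptError::NotFound, "not found"}}},
        {"es", {{ScriptError::BadArgument, "argumento no válido"}}},
    };
    auto langIt = texts.find(lang);
    if (langIt != texts.end()) {
        auto msgIt = langIt->second.find(code);
        if (msgIt != langIt->second.end()) {
            return msgIt->second;
        }
    }
    return texts.at("en").at(code);
}

// Hypothetical API function: rejects non-positive room ids.
inline CallResult selectRoom(int roomId, const std::string& lang) {
    ScriptError code = roomId > 0 ? ScriptError::None : ScriptError::BadArgument;
    return {code, describe(code, lang)};
}
```

A script that tests `result.code == ScriptError::BadArgument` behaves identically whichever language the user has chosen, whereas a script that compares `result.message` against an English string would break as soon as the UI language changes.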

We would also want a function to provide information about the language in use (and conceivably to change it, though I would be reluctant to provide that functionality, to prevent a script countermanding a user's choice).

In case anyone is wondering, I intend that we make use of the translation system that Qt provides. That means identifying the strings that the user will see, which need to be translated as appropriate, and separating them from strings that are not translated: those mainly for internal use by the coders, and possibly those used for file/directory name processing (which may be subject to platform-specific rather than language-specific modifications). Those willing to work on providing translations will want to familiarise themselves with Qt Linguist, which takes a list of translatable QStrings (those provided as:

QString message = QObject::tr( "Base text string with arguments like %1 that are replaced with variables supplied as \"arg(...)\" to the method and possibly with a single %n which is the integer supplied as a third argument and controls how plurals and quantity dependent phrases are handled when translated to avoid such language kludges that do not work for non-English cases such as 'x room(s)'!", "Context for translation, at a base level the class in which a string occurs but may need further disambiguation such as for when the same English word has different translations in a different language", 42 ).arg( "Something that changes at run-time, normally a variable" );

) and allows them to enter what the replacement should be (if one is necessary; sometimes it isn't) and, if relevant, the alternatives for the (up to three) different ranges of the %n argument. By default, once the translation system is in place, strings that have been translated will be used automatically, and a leading '#' will appear on strings that do not yet have a translation in place.
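To show why a language may need more than two alternatives for %n, here is a small sketch in plain C++ that does not use Qt. The helper names are made up for the example, the Polish rule is the commonly cited three-form rule, the word forms are illustrative, and counts are assumed to be non-negative. In a real Qt application this selection is done by the translation system rather than by hand.

```cpp
#include <array>
#include <string>

// Plural-form index for English: one form for 1, another for everything else.
inline int pluralIndexEn(int n) { return n == 1 ? 0 : 1; }

// Plural-form index for Polish (three forms), assuming n >= 0.
inline int pluralIndexPl(int n) {
    if (n == 1) return 0;
    const int lastDigit = n % 10;
    const int lastTwo = n % 100;
    if (lastDigit >= 2 && lastDigit <= 4 && (lastTwo < 12 || lastTwo > 14)) return 1;
    return 2;
}

// Replace the first "%n" in the pattern with the number itself.
inline std::string withCount(std::string pattern, int n) {
    const auto pos = pattern.find("%n");
    if (pos != std::string::npos) {
        pattern.replace(pos, 2, std::to_string(n));
    }
    return pattern;
}

// English needs two whole phrases, not "room(s)".
inline std::string roomsEn(int n) {
    static const std::array<std::string, 2> forms = {"%n room", "%n rooms"};
    return withCount(forms[pluralIndexEn(n)], n);
}

// Polish needs three whole phrases for the same message.
inline std::string roomsPl(int n) {
    static const std::array<std::string, 3> forms = {"%n pokój", "%n pokoje", "%n pokoi"};
    return withCount(forms[pluralIndexPl(n)], n);
}
```

A pattern such as 'x room(s)' cannot express the third Polish form at all, which is the kind of kludge the %n mechanism is meant to avoid.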

Going forward the indication that translations are in place will be the appearance of ".ts" files which are the XML "source" files containing the translation data that has been extracted from the C++ code for QtLinguist to work with and ".qm" files that contain the (smaller) compiled binary files that are placed alongside the Mudlet application and that it will use (as selected by the user choosing what language to use) to render text in the chosen language.

Note that translators will not have to be coders, and that it is not necessary to have a complete "translation" in place to see it in action - it can be done in an incremental fashion. However, if/when the source code changes, any new text strings will need to be added to the list of things to be translated in all languages. It is very likely that some existing coding patterns - especially those that assemble a long message from a series of shorter phrases - will need to be redone to have the whole pattern in one master string with "spaces" for the bits that vary. This is because a translation will likely not be built from the pieces in the same order. This will be easier for coders than for non-coders to spot and deal with IMHO, as they are more likely to recognise that something cannot be translated into a different language using the "chunks" of fixed and variable data available in a particular case.
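The difference between assembling chunks and using one master pattern can be sketched like this. `fill()` is a hypothetical, simplified stand-in for what `QString::arg()` does with numbered placeholders (it handles %1 to %9 only), and the reordered pattern is a made-up stand-in for a translation rather than real translated text.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replace %1, %2, ... in `pattern` with args[0], args[1], ...
// Placeholders without a matching argument are left untouched.
inline std::string fill(const std::string& pattern, const std::vector<std::string>& args) {
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const bool isPlaceholder = pattern[i] == '%' && i + 1 < pattern.size() &&
                                   pattern[i + 1] >= '1' && pattern[i + 1] <= '9';
        if (isPlaceholder) {
            const std::size_t index = static_cast<std::size_t>(pattern[i + 1] - '1');
            if (index < args.size()) {
                out += args[index];
                ++i;  // skip the digit that followed '%'
                continue;
            }
        }
        out += pattern[i];
    }
    return out;
}

// Chunk assembly: the word order is fixed in the code, so a translator
// can only swap individual pieces and cannot reorder them.
inline std::string connectedChunks(const std::string& host, const std::string& port) {
    return "Connected to " + host + " on port " + port;
}

// Whole pattern: the translator receives one string and may move %1 and %2.
inline std::string connectedPattern(const std::string& pattern,
                                    const std::string& host, const std::string& port) {
    return fill(pattern, {host, port});
}
```

With the pattern "Connected to %1 on port %2" both approaches give the same English text, but only the second one still works when a translation needs something shaped like "Port %2, host %1".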

Post by **Jor'Mox** » Tue Apr 04, 2017 5:21 pm

As someone who has worked as a linguist, I can tell you that creating an accurate translation of a full message by assembling translations of chunks of text in a programmatic fashion is extremely problematic, though the degree to which that is an issue varies depending on the language in question.

Post by **Vadi** » Tue Apr 04, 2017 7:15 pm

I'd like to propose that we use an existing translation service - Transifex or something else - and only translate complete strings. You can't assemble text for any language when you rely on an English basis for grammar.
