Mudlet and Unicode (+GitHub repository)

Olostan
Posts: 16
Joined: Sun May 08, 2011 2:18 pm

Post by Olostan »

Hello everyone.

I'm not an experienced Qt/C++ developer and have only been looking at Mudlet for two days, but I've made several changes to add Unicode support.
You can review them at https://github.com/olostan/Mudlet/branc ... et-unicode (I forked vadi2's repository on GitHub).

I've tested aliases and triggers, and they seem to work. Regexps like "letters: (\pL+)" also work for matching letters (both Latin and non-Latin).

I hope these changes can be used to give Mudlet Unicode support (it already works for me).

Vadi
Posts: 5042
Joined: Sat Mar 14, 2009 3:13 pm

Post by Vadi »

Very cool! We'll check it out :)

Post by Vadi »

I think the only problem now is using Lua functions on the Unicode text... i.e. does string.match(matches[2], "russian string") work properly?

Post by Olostan »

Oops... it seems that because I initially cloned SF's repository, I didn't get the mapping changes (methods like "TLuaInterpreter::loadMap" were removed).

I'll restore them and make another pull request.

Post by Olostan »

Vadi wrote: I think the only problem now is using Lua functions on the Unicode text... i.e. does string.match(matches[2], "russian string") work properly?
Yep:
Image

Post by Vadi »

try:

if string.find(matches[2], "работает") then echo"ok" end

in an alias?

Post by Olostan »

like a charm:
---
ytest работает
ok
---
original regexp was: "^ytest (.*)$"

As for (\pL+), I just changed TAlias.cpp (added a PCRE option) and made another commit.
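
The ytest result above fits how UTF-8 behaves under a byte-wise search: a plain substring search can succeed without knowing anything about characters, because a valid UTF-8 needle can only match at a character boundary. Here is a minimal sketch in Python, used only to model the bytes. It assumes the text reaches Lua as UTF-8, which the thread does not state, and `byte_find` is a hypothetical helper, not a Mudlet or Lua function.

```python
def byte_find(haystack: str, needle: str) -> int:
    """Return the 0-based byte offset of needle in haystack, or -1.

    Hypothetical helper: both strings are encoded to UTF-8 (an assumption)
    and the raw bytes are searched, which is roughly the view a
    byte-oriented string library has of the text.
    """
    return haystack.encode("utf-8").find(needle.encode("utf-8"))


line = "ytest работает"
print(byte_find(line, "работает"))  # offset in bytes, not in characters
print(byte_find(line, "привет"))    # -1, the word is not in the line
```

Substring search is the friendly case. Anything that counts, indexes, or changes case per character sees bytes rather than letters, so passing this test does not show that every string function is safe.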

Post by Vadi »

Ok! Heiko will look over this and let you know if anything needs adjusting

Post by Vadi »

Can you please test this on lusternia.com, port 23? It seems broken.

Your branch:
[Attachment: Selection_019.png]
My system isn't working quite properly (I'll figure out why later), but for a start, the \n that the game prepends to unsolicited lines is transformed into an unknown character instead of a newline.

I have the "fix unnecessary linebreaks on GA servers" main display option enabled, and all special options are off.

Heiko
Site Admin
Posts: 1548
Joined: Wed Mar 11, 2009 6:26 pm

Post by Heiko »

Nice that you're working on this.

1. However, you need to base your repository on the main git repo on sf.net, otherwise I cannot commit anything when it's ready.

2. From my cursory glance, you've only touched the tip of the iceberg. The problem starts when the Unicode characters fall outside the ASCII-representable code pages. Then Lua string functions no longer work as expected, because Lua cannot handle Unicode. To solve this problem, you can look into Lua Unicode support libraries (there is at least one).

3. For Chinese, we have the additional difficulty that some characters need to be printed as 2 chars on the screen while the buffer representation is a single Unicode char. I solved this earlier, as you can see here: http://forums.mudlet.org/viewtopic.php? ... it=Chinese and I could backport some of the necessary code if the project matures. However, the main difficulty is that the Chinese servers expect you to print some of the Unicode characters as single screen chars, while the majority are expected to be represented as 2 screen chars for 1 Unicode buffer char. This requires major support from the Chinese community.
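
Points 2 and 3 can be illustrated with a short sketch. Python is used here only to model the behavior: encoding to `bytes` stands in for the byte-oriented view that Lua's string functions have, and the standard library's `unicodedata.east_asian_width` stands in for a width table. The helper names `byte_len`, `char_len`, and `display_width` are hypothetical, the text is assumed to be UTF-8, and none of this is taken from Mudlet's code or from the earlier Chinese patch.

```python
import unicodedata


def byte_len(text: str) -> int:
    """Length as a byte-oriented string function would report it (UTF-8 assumed)."""
    return len(text.encode("utf-8"))


def char_len(text: str) -> int:
    """Length in Unicode code points."""
    return len(text)


def display_width(text: str) -> int:
    """Screen columns, counting East Asian Wide/Fullwidth code points as 2.

    Simplification: every other code point counts as 1 column, so combining
    marks, ambiguous-width characters, and server-specific expectations are
    not handled.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


word = "работает"
print(byte_len(word), char_len(word))      # 16 bytes, 8 code points

# Cutting after 3 bytes splits the second letter in half; decoding the
# fragment leaves a replacement character where the broken letter was.
broken = word.encode("utf-8")[:3].decode("utf-8", errors="replace")
print(repr(broken))

cjk = "中文ab"
print(char_len(cjk), display_width(cjk))   # 4 code points, 6 columns
```

A fixed rule like this covers the common case only. Servers that expect particular characters at a different width would need a per-server or per-character override on top of it, which the sketch does not attempt.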
