Working with non ASCII strings in Lua scripts (KOI8-R example)

ILIUS
Posts: 7
Joined: Mon Sep 11, 2017 7:12 am


Post by ILIUS »

I've been trying to set up triggers for MUDs that use Cyrillic. I'm new to Lua and Mudlet. People explained that non-ASCII support is not in place yet; possibly in version 4.0. So I tried a workaround. I noticed that I can get code points for incoming strings with utf8.codes(), for example. Those codes look strange: something like what you'd get if you took KOI8-R text, encoded it as ASCII, and then read the UTF-8 codes. Or maybe I'm not even close. Encodings have always been beyond my understanding :mrgreen: Why didn't everyone use one from the beginning?? :mrgreen:

Well, I managed to make a translation module from those codes. Please give me advice on optimization and code structure! Or just say that I'm doing it all wrong :D
What I did:
First, I got all the character codes I wanted to translate with this simple inline Lua:
Code:
lua for p, c in utf8.codes("абвгдежзийклмнопрстуфхцчшщьыъэюяАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯ") do echo("\n"..c) end
Then I put them all in a table like this:
Code:
-- Maps the code points reported by utf8.codes() to the intended Cyrillic letters.
koi8rCodeTable = {
[224] = "а",
[225] = "б",
[226] = "в",
[227] = "г",
[228] = "д",
[229] = "е",
[230] = "ж",
[231] = "з",
[232] = "и",
[233] = "й",
[234] = "к",
[235] = "л",
[236] = "м",
[237] = "н",
[238] = "о",
[239] = "п",
[240] = "р",
[241] = "с",
[242] = "т",
[243] = "у",
[244] = "ф",
[245] = "х",
[246] = "ц",
[247] = "ч",
[248] = "ш",
[249] = "щ",
[252] = "ь",
[251] = "ы",
[250] = "ъ",
[253] = "э",
[254] = "ю",
[255] = "я",
[192] = "А",
[193] = "Б",
[194] = "В",
[195] = "Г",
[196] = "Д",
[197] = "Е",
[198] = "Ж",
[199] = "З",
[200] = "И",
[201] = "Й",
[202] = "К",
[203] = "Л",
[204] = "М",
[205] = "Н",
[206] = "О",
[207] = "П",
[208] = "Р",
[209] = "С",
[210] = "Т",
[211] = "У",
[212] = "Ф",
[213] = "Х",
[214] = "Ц",
[215] = "Ч",
[216] = "Ш",
[217] = "Щ",
[220] = "Ь",
[219] = "Ы",
[218] = "Ъ",
[221] = "Э",
[222] = "Ю",
[223] = "Я"}
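The 64 entries above follow a regular pattern: keys 192 to 255 map, in order, to the consecutive code points U+0410 (А) through U+044F (я). A minimal sketch, assuming that pattern is all that's needed, could generate the table in a loop instead of typing it out. The name buildCyrillicTable is hypothetical, and the `utf8 or require("lua-utf8")` line is only an assumed fallback for running outside Mudlet:

```lua
-- Mudlet exposes `utf8` as a global; the require is an assumed fallback elsewhere.
local utf8 = utf8 or require("lua-utf8")

-- Hypothetical helper: builds a table equivalent to the hand-written one above.
function buildCyrillicTable()
  local t = {}
  local offset = 0x0410 - 192 -- distance from key 192 to "А" (U+0410)
  for code = 192, 255 do
    t[code] = utf8.char(code + offset)
  end
  return t
end

-- Print a few entries to compare against the hand-written table.
local generated = buildCyrillicTable()
print(generated[192], generated[224], generated[255])
```

Like the hand-written table, this does not cover ё and Ё, which sit outside the U+0410 to U+044F range.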
Then I wrote a small module with one function:
Code:
nonASCII = nonASCII or {}

nonASCII.currentCodeTable = koi8rCodeTable

function nonASCII:translate(nonASCIIString)
	local ASCIICharsTable = {}
	for p, code in utf8.codes(nonASCIIString) do
		if self.currentCodeTable[code] ~= nil then
			table.insert(ASCIICharsTable, self.currentCodeTable[code])
		else
			table.insert(ASCIICharsTable, utf8.char(code))
		end
	end
	return table.concat(ASCIICharsTable)
end
Then I tested it with a trigger on the prompt line (unfortunately there is no GMCP in the MUD I play). This is the regex (the last group captures the exits):

Code:

(\d+)H (\d+)M (\d+)V (\d+)X (\d+)C Вых:(.+)>
And finally the test code, which echoes the translated exits:
Code:
-- matches[1] holds the whole match, so the sixth capture group (the exits) is matches[7]
promptExits = nonASCII:translate(matches[7])
echo(promptExits)
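To see why the exits end up at index 7, here is a rough sketch that imitates the layout of Mudlet's matches table outside Mudlet. It is only an approximation: Mudlet triggers use PCRE-style regex, while this uses Lua patterns (%d instead of \d) as a stand-in. The helper name buildMatches and the sample prompt line are both invented for illustration:

```lua
-- Hypothetical helper: returns a table shaped like Mudlet's `matches`,
-- with the whole match at index 1 and capture groups from index 2 on.
function buildMatches(line, pattern)
  local first, last = line:find(pattern)
  if not first then
    return nil
  end
  local m = { line:sub(first, last) }
  for _, capture in ipairs({ line:match(pattern) }) do
    m[#m + 1] = capture
  end
  return m
end

-- Lua-pattern version of the prompt regex above (an approximation of the PCRE one).
promptPattern = "(%d+)H (%d+)M (%d+)V (%d+)X (%d+)C Вых:(.+)>"

-- Invented sample prompt line.
local m = buildMatches("100H 50M 80V 1200X 30C Вых:СВЮ>", promptPattern)
for i, value in ipairs(m) do
  print(i, value)
end
```

Five numeric groups plus the exits group give six captures, and the whole match takes index 1, so the exits land at index 7.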

Vadi
Posts: 5035
Joined: Sat Mar 14, 2009 3:13 pm

Re: Working with non ASCII strings in Lua scripts (KOI8-R example)

Post by Vadi »

Nice work!

Some Lua-specific optimisations to your functions:
Code:
nonASCII = nonASCII or {}

nonASCII.currentCodeTable = koi8rCodeTable

function nonASCII:translate(nonASCIIString)
	local ASCIICharsTable = {}
	-- cache the lookup table and utf8.char in locals so the loop skips repeated field lookups
	local codetable, char = self.currentCodeTable, utf8.char

	for p, code in utf8.codes(nonASCIIString) do
		if codetable[code] ~= nil then
			ASCIICharsTable[#ASCIICharsTable+1] = codetable[code]
		else
			ASCIICharsTable[#ASCIICharsTable+1] = char(code)
		end
	end
	return table.concat(ASCIICharsTable)
end
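A quick way to check the function is to feed it a string built from the same code points the table uses as keys. The sketch below is self-contained so it can run outside Mudlet: demoCodeTable and translateWith are hypothetical stand-ins for koi8rCodeTable and nonASCII:translate, the input string is made up, and the `utf8 or require("lua-utf8")` line is an assumed fallback:

```lua
-- Mudlet exposes `utf8` as a global; the require is an assumed fallback elsewhere.
local utf8 = utf8 or require("lua-utf8")

-- Hypothetical stand-in for koi8rCodeTable: keys 192..255 -> "А".."я".
demoCodeTable = {}
for code = 192, 255 do
  demoCodeTable[code] = utf8.char(code + 0x0410 - 192)
end

-- Same logic as translate() above, as a plain function taking the table explicitly.
function translateWith(codeTable, s)
  local out = {}
  for _, code in utf8.codes(s) do
    -- use the mapped letter if there is one, otherwise keep the character as-is
    out[#out + 1] = codeTable[code] or utf8.char(code)
  end
  return table.concat(out)
end

-- Simulated incoming text: the code points for "Привет" as they appear in the table keys.
local raw = utf8.char(207, 240, 232, 226, 229, 242) .. " 123"
print(translateWith(demoCodeTable, raw))
```

Characters without a table entry, such as the space and digits here, pass through unchanged.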
