getMudletHomeDir() and encoding
Re: getMudletHomeDir() and encoding
Just make a thread here in the help forum that is clearly named and that spells out exactly what the problem is.
Re: getMudletHomeDir() and encoding
Or submit via Launchpad at: https://bugs.launchpad.net/mudlet/+filebug
- SlySven
- Posts: 1023
- Joined: Mon Mar 04, 2013 3:40 pm
- Location: Deepest Wiltshire, UK
- Discord: SlySven#2703
Re: getMudletHomeDir() and encoding
Ah, just a quick thought, but the Lua sub-system shipped with Mudlet may not handle, or may not be receiving, non-ASCII characters from the core. By rights we ought to be passing UTF-8 data between the two (UTF-8 is a superset of ASCII that leaves pure ASCII unchanged), but I think that in some cases a from/toLatin1() conversion is done, which may be causing this issue.
Please do flag this up via Launchpad, but in the meantime the safest workaround is to use ONLY ASCII for path and file names. There are additional Lua modules that will work with UTF-8 encoded data, but at present we do not include them or, I think, present data in a form that they could use.
Disclaimer: note that this is my initial musings and may not accurately reflect the official position/situation...!
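To illustrate why a Latin-1 conversion would matter here, the following is a minimal Python sketch, not Mudlet code. It assumes the conversion behaves like a lossy encode that substitutes "?" for anything outside Latin-1; the helper names `via_latin1` and `is_ascii_only` are made up for this example.

```python
def via_latin1(text: str) -> str:
    """Simulate passing text through a Latin-1 byte string and back.

    Assumption: unrepresentable characters are replaced with "?",
    which is what errors="replace" does in Python.
    """
    return text.encode("latin-1", errors="replace").decode("latin-1")


def is_ascii_only(text: str) -> bool:
    """Return True if every character is plain 7-bit ASCII."""
    return all(ord(ch) < 128 for ch in text)


name = "\u00e1\u010d"  # "á" followed by "č"

# "á" (U+00E1) exists in Latin-1 and survives; "č" (U+010D) does not.
print(via_latin1(name))

# A path made only of ASCII characters is unaffected by either encoding.
print(is_ascii_only("mudlet/profiles"), is_ascii_only(name))
```

The point of the sketch is only that a name can be damaged in part: one accented letter passes through while the other is lost, so the resulting path no longer matches the real directory.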
Re: getMudletHomeDir() and encoding
Referring to the example you give, the user name has two accented characters: 'á' U+00E1 {Latin small letter a with acute} and 'č' U+010D {Latin small letter c with caron}. The first, although strictly a non-ASCII character, is within the Unicode Latin-1 Supplement block and so fits within the range of an unsigned character variable (i.e. a single byte) in Latin-1; as UTF-8 it would be represented as the two bytes 0xc3 0xa1, and internal to the Mudlet core it would be stored as the (UTF-16) QChar 0x00e1. The second is within the Unicode Latin Extended-A block, so it has no single-byte Latin-1 form; as UTF-8 it would be encoded as 0xc4 0x8d, and internally it would be stored as a QChar of value 0x010d. These sequences assume that the characters have been (re)normalised (decomposed and recomposed into standardised sequences) and are not presented as, for the latter case say, a Latin small letter c followed by a combining caron ...!
That might make the situation clear to the casual viewer.
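The byte sequences and the normalisation caveat can be checked with a short Python sketch. It uses only the standard `unicodedata` module; the helper name `utf8_hex` is made up for this example and has nothing to do with Mudlet itself.

```python
import unicodedata


def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated hex values."""
    return " ".join(f"0x{b:02x}" for b in text.encode("utf-8"))


a_acute = "\u00e1"  # Latin small letter a with acute
c_caron = "\u010d"  # Latin small letter c with caron

print(utf8_hex(a_acute))  # 0xc3 0xa1
print(utf8_hex(c_caron))  # 0xc4 0x8d

# Decomposed form: plain "c" followed by U+030C (combining caron).
decomposed = unicodedata.normalize("NFD", c_caron)
print([f"U+{ord(ch):04X}" for ch in decomposed])  # ['U+0063', 'U+030C']
print(utf8_hex(decomposed))  # 0x63 0xcc 0x8c

# Recomposing (NFC) gives back the single precomposed character.
print(unicodedata.normalize("NFC", decomposed) == c_caron)  # True
```

The decomposed and precomposed forms look identical on screen but are different byte sequences, which is why the normalisation assumption matters when comparing path names.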
Re: getMudletHomeDir() and encoding
Thanks, I reported it through Launchpad.
Re: getMudletHomeDir() and encoding
I've proposed a one-line fix on Launchpad that might solve the problem, but can you modify, compile and test it? I can't really duplicate the environment: I found a bug in a Debian system utility while trying to create a user with that name on my system, but it isn't Windows, so it isn't really a valid test environment for your issue.
Re: getMudletHomeDir() and encoding
Erm, I'm not sure I would know how to do that. I'm not really a coder.
Re: getMudletHomeDir() and encoding
You should be able to fix this by creating a symbolic link to a path not containing a special character.
https://code.google.com/p/symlinker/ seems to be a decent tool for Windows.
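For anyone who would rather script the link than use a GUI tool, here is a rough Python sketch of the same idea. The function name `link_ascii_alias` and the example paths are hypothetical, and on Windows creating symbolic links normally needs administrator rights or Developer Mode, so treat this as an illustration rather than a tested fix for Mudlet.

```python
import os
from pathlib import Path


def link_ascii_alias(real_dir: Path, alias: Path) -> Path:
    """Create a symbolic link at an ASCII-only path pointing to real_dir.

    real_dir may contain accented characters; alias must not.
    """
    if not str(alias).isascii():
        raise ValueError(f"alias path must be ASCII only: {alias}")
    if not alias.exists():
        # target_is_directory only matters on Windows; ignored elsewhere.
        os.symlink(real_dir, alias, target_is_directory=True)
    return alias


# Hypothetical usage (paths are placeholders, not real Mudlet locations):
# link_ascii_alias(Path(r"C:\Users\<accented name>\mudlet-data"),
#                  Path(r"C:\mudlet-data"))
```

Files reached through the alias are the same files as in the real directory, so anything that only sees the ASCII path avoids the encoding problem.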
Re: getMudletHomeDir() and encoding
I mean I will gladly test it for you in the next build or if you give me a test version, but I have never compiled anything in my life.