Getting raw data

Bloodweiser
Posts: 6
Joined: Thu Aug 18, 2016 12:34 am

Getting raw data

Post by Bloodweiser »

When I do a packet capture, I see quite a bit of data that never shows up on the client's front end. Some of it is color formatting, but most of it is simply being ignored. How can I capture the raw data for scripting? I tried changing the echo function for the trigger input, but it seems the string is already formatted by the time it gets to that point.

What I'm getting on the display and as a result of the regex trigger:
[]\ []mmmm []. . . . []. . . . | . . . . [][][]---- A
SioreilaA> R spear L shield Hits
: 40/40 Hits Taken : 0 Stamina : 10 Experience : 2392 ->

What I would like to script with (wireshark capture):
[2J[1;[3;6H [3;8H [3;10H[33m[][0m[3;12H\ [3;14H[33m[][0m[3;16H[35mmm[0m[3;18H[35mmm[0m[4;6H [4;8H [4;10H[33m[][0m[4;12H. [4;14H. [4;16H. [4;18H. [5;6H [5;8H [5;10H[33m[][0m[5;12H. [5;14H. [5;16H. [5;18H. [6;6H [6;8H [6;10H| [6;12H. [6;14H. [6;16H. [6;18H. [7;6H [7;8H [7;10H[33m[][0m[7;12H[33m[][0m[7;14H[33m[][0m[7;16H--[7;18H--[8;6H [8;8H [8;10H [8;12H [8;14H [8;16H [8;18H [9;6H [9;8H [9;10H [9;12H [9;14H [9;16H [9;18H [2;24H A Sioreila[4;14HA[6;12H>[23;1H [23;1H [1;32mR spear[0m[24;1H [24;1H [1;32mL shield[0m[23;42H [23;30H[35mHits : 40/40 [0m[23;72H [23;60H[35mHits Taken : 0[0m[25;42H [25;30H[35mStamina : 10[0m[24;42H [24;30H[35mExperience : 2392 [0m[11;1H[21;1H ->

Vadi
Posts: 5035
Joined: Sat Mar 14, 2009 3:13 pm

Re: Getting raw data

Post by Vadi »

You can't access the raw data, but you can access the formatting and so on. What exactly are you trying to do? I may be able to point you the right way.

Bloodweiser
Posts: 6
Joined: Thu Aug 18, 2016 12:34 am

Re: Getting raw data

Post by Bloodweiser »

Thanks. Much of the following data is coordinate data posing as formatting:

[2J[1;[3;6H [3;8H [3;10H[33m[][0m[3;12H\ [3;14H[33m[][0m[3;16H[35mmm[0m[3;18H[35mmm[0m[4;6H [4;8H [4;10H[33m[][0m[4;12H. [4;14H. [4;16H. [4;18H. [5;6H [5;8H [5;10H[33m[][0m[5;12H. [5;14H. [5;16H. [5;18H. [6;6H [6;8H [6;10H| [6;12H. [6;14H. [6;16H. [6;18H. [7;6H [7;8H [7;10H[33m[][0m[7;12H[33m[][0m[7;14H[33m[][0m[7;16H--[7;18H--[8;6H [8;8H [8;10H [8;12H [8;14H [8;16H [8;18H [9;6H [9;8H [9;10H [9;12H [9;14H [9;16H [9;18H [2;24H A Sioreila[4;14HA[6;12H>[23;1H [23;1H [1;32mR spear[0m[24;1H [24;1H [1;32mL shield[0m[23;42H [23;30H[35mHits : 40/40 [0m[23;72H [23;60H[35mHits Taken : 0[0m[25;42H [25;30H[35mStamina : 10[0m[24;42H [24;30H[35mExperience : 2392 [0m[11;1H[21;1H

The items in blue in the format [#;#H are really coordinates posing as formatting tags; they say where the "A Sioreila" character is supposed to go on the map. I'm not sure yet what all of the other codes are for; some of them are color formatting, which I can ignore. If I can access those coordinates as a formatting token, then that's fine.

I'm converting everything before the first > into a map, which is a 7x7 grid of 2 characters per cell, followed by all the NPCs/PCs on the grid with their coordinates.

SlySven
Posts: 1019
Joined: Mon Mar 04, 2013 3:40 pm
Location: Deepest Wiltshire, UK
Discord: SlySven#2703

Re: Getting raw data

Post by SlySven »

Oh, those are CSI CUP (cursor position) commands:
Wikipedia wrote:
CSI n ; m H
CUP – Cursor Position
Moves the cursor to row n, column m. The values are 1-based, and default to 1 (top left corner) if omitted. A sequence such as CSI ;5H is a synonym for CSI 1;5H as well as CSI 17;H is the same as CSI 17H and CSI 17;1H
For further information see the relevant Wikipedia page.
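
To make the defaulting rules in that quote concrete, here is a minimal Python sketch that reads the row and column out of a single CUP sequence. The names CUP_RE and parse_cup are made up for this illustration and are not part of Mudlet; the sketch assumes the ESC byte (0x1B) is still present in the input, which is not the case in the text pasted into this thread.

```python
import re

# CSI is ESC followed by '['; a CUP sequence ends in 'H'.
# Both numeric parameters are optional.
CUP_RE = re.compile(r"\x1b\[(\d*)(?:;(\d*))?H")


def parse_cup(seq):
    """Return (row, column) for a CUP sequence, or None if seq is not CUP."""
    m = CUP_RE.fullmatch(seq)
    if m is None:
        return None
    # A missing or empty parameter defaults to 1 (values are 1-based).
    row = int(m.group(1) or 1)
    col = int(m.group(2) or 1)
    return row, col


for seq in ("\x1b[3;6H", "\x1b[;5H", "\x1b[17;H", "\x1b[17H", "\x1b[H"):
    print(repr(seq), parse_cup(seq))
```

An SGR sequence such as ESC [33m does not end in 'H', so parse_cup returns None for it.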

AFAICT we only handle some of the ones ending in "m", which are SGR (Select Graphic Rendition) commands.

Mind you, the incoming stream parser (in, I think, cTelnet.cpp) which processes this really, really needs an overhaul, mainly because it does not handle incoming UTF-8 streams, which it must do if we want proper support for non-ASCII MUDs (i.e. those outside of the USA).

I think getting CUP to work will be interesting ;) because working out exactly where line 1 is on a scrolling, buffered page will be difficult unless we have previously found the spot where the MUD server thinks it has cleared the screen (with a different code) or reset the cursor to the top left corner, perhaps with a CSI H (no parameters)...

Bloodweiser
Posts: 6
Joined: Thu Aug 18, 2016 12:34 am

Re: Getting raw data

Post by Bloodweiser »

I looked at TBuffer.cpp and it looks like the [X;XH values will be ignored. I didn't look to see how the tags get purged, but the triggers are kicked off from the translate function, so there doesn't look to be a good way to get the CUP data. Even if CUP were to be implemented, I think parsing it would be problematic due to the stacking of NPCs, unless there's a way to show a lot of symbols on the same space.

I noticed that the telnet client generates a replay log. Would it be possible to stream-read the replay log when the trigger detects the translated map? I would need a way to match the specific log message to the current trigger time; the last log entry may work for the current message in a live session, but wouldn't work for replays.


I didn't quite look through enough of the code to determine if this would work, I'll probably try it out, but if someone knows if it would/wouldn't work it might save me some time.

SlySven
Posts: 1019
Joined: Mon Mar 04, 2013 3:40 pm
Location: Deepest Wiltshire, UK
Discord: SlySven#2703

Re: Getting raw data

Post by SlySven »

Yeah, the SGR codes are parsed and removed from the incoming data stream in TBuffer::translateToPlainText(std::string & data) which is called from TConsole::printOnDisplay(std::string & data) and the data for that comes from cTelnet::postData() for data coming from the MUD server OR from TLuaInterpreter::feedTriggers(...) .

There are two logging systems. The one you refer to is what is called a "Replay": it does indeed contain a recording of what was sent FROM the MUD server and can be replayed (at various speed-ups if so wished; slow down/pause/loop are future plans) to test users' script systems (it is most useful if you are NOT connected to the server during a replay!). It does retain the ANSI control sequences BUT it also contains time stamps and "chunk"-size information, and it is not a format that is intended for end-user usage.

The OTHER log is either a plain text or an HTML file that records the main console text, and (IMHO) the latter makes a reasonable job of reproducing in a browser window what appears on screen. The same engine is also used when the user does a "copy" or "copy as HTML" operation on any of the profile's consoles. Yours truly did some work to improve that in the recent past, and I do have some stuff on a back burner to further improve it by "caching" the HTML/CSS stuff to make the files more compact...

Bloodweiser
Posts: 6
Joined: Thu Aug 18, 2016 12:34 am

Re: Getting raw data

Post by Bloodweiser »

Looks like this is becoming less feasible. The replay log file doesn't flush, so I can't get at the logs. Would there be opposition to adding a rawline variable in Lua that could be accessed the same way as line? What's the approval process for contributions?

Vadi
Posts: 5035
Joined: Sat Mar 14, 2009 3:13 pm

Re: Getting raw data

Post by Vadi »

Can you see first if you can make something work with http://wiki.mudlet.org/w/Manual:Technic ... t_protocol ?

Bloodweiser
Posts: 6
Joined: Thu Aug 18, 2016 12:34 am

Re: Getting raw data

Post by Bloodweiser »

Thanks, that looked like a promising path, but I'm not receiving any telnet control events from the server.

I registered a few different events without issue, but could not get a sysTelnetEvent to trigger. When I took a look at the code that raises the event, I noticed that all of the command set codes are 0xf?, whereas the server I'm connecting to is just using escape codes to provide the metadata.

from wireshark, my dataline is:
1B 5B 32 4A 1B 5B 31 3B 31 66 1B 5B 33 3B 36 48
1B is the escape character
followed by [2J (clear the screen)
followed by [1;1f (HVP: move cursor to row 1, column 1, absolute)
followed by [3;6H (CUP: move cursor to row 3, column 6, also absolute)

So these are ANSI escape sequences, not Telnet control commands.
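
For anyone who wants to repeat that decoding on a longer capture, here is a small Python sketch that splits the hex bytes above into CSI sequences. The helper name decode_csi and the lookup table are made up for this illustration; the table only lists the final bytes that appear in this thread, with their usual ANSI names.

```python
import re

# Hex bytes copied from the Wireshark capture quoted above.
HEX_DUMP = "1B 5B 32 4A 1B 5B 31 3B 31 66 1B 5B 33 3B 36 48"

# Final bytes seen in this thread and their usual ANSI names.
FINAL_BYTE_NAMES = {
    "J": "ED (erase in display)",
    "f": "HVP (horizontal and vertical position, absolute)",
    "H": "CUP (cursor position, absolute)",
    "m": "SGR (select graphic rendition)",
}

# ESC '[', then parameter bytes (digits and ';'), then one final byte '@'..'~'.
CSI_RE = re.compile(r"\x1b\[([0-9;]*)([@-~])")


def decode_csi(hex_dump):
    """Return a list of (parameters, final_byte) for each CSI sequence."""
    data = bytes.fromhex(hex_dump).decode("latin-1")
    return [(m.group(1), m.group(2)) for m in CSI_RE.finditer(data)]


for params, final in decode_csi(HEX_DUMP):
    print("CSI", params, final, "->", FINAL_BYTE_NAMES.get(final, "unknown"))
```

None of these bytes is 0xFF (Telnet IAC), which fits with no Telnet event being raised for them.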

SlySven
Posts: 1019
Joined: Mon Mar 04, 2013 3:40 pm
Location: Deepest Wiltshire, UK
Discord: SlySven#2703

Re: Getting raw data

Post by SlySven »

Handling that data is going to be a problem: it is predicated on a VT100 model with a fixed screen that the MUD server can "clear" and then reposition a cursor within the next 24 lines, or however many lines it thinks the screen has. Conceivably we might be able to code a handler that, upon detecting the Clear Screen, allocates a screen-height number of lines and places characters as directed by those cursor movement codes, but it is not clear when that mode of operation finishes and we can go back to just appending lines at the end of the region so that the image scrolls up and off the screen. To be honest, this issue would go away if the MUD server identified a GUI rather than a CLI type client and just rendered the map itself before sending it out as a series of lines from top to bottom. The bottom line is that the MUD server is sending codes that it assumes the client can handle without checking first (which is a bit naughty IMHO :? ); the only ones it may assume are those in the base NVT (Network Virtual Terminal), which is the bare-bones, minimum specification for something at one end of a Telnet link!
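
As a rough illustration of the handler idea described above (and not of anything Mudlet actually does), here is a Python sketch that replays clear-screen and cursor-position codes onto a fixed grid. The function name render, the 25x80 default size and the sample string are assumptions made for the example; it sidesteps the open question of when "screen mode" ends by simply rendering one chunk at a time.

```python
import re

# A token is either a CSI sequence (ESC '[' params final-byte) or one character.
TOKEN_RE = re.compile(r"\x1b\[([0-9;]*)([@-~])|(.)", re.DOTALL)


def render(stream, rows=25, cols=80):
    """Replay ED 2 and CUP/HVP codes onto a fixed grid; return its lines."""
    screen = [[" "] * cols for _ in range(rows)]
    row = col = 0  # zero-based internally; the wire parameters are 1-based
    for m in TOKEN_RE.finditer(stream):
        params, final, char = m.groups()
        if char is not None:
            # Printable character: place it and advance; drop control bytes
            # and anything that would fall outside the grid.
            if char >= " " and row < rows and col < cols:
                screen[row][col] = char
                col += 1
        elif final == "J" and params == "2":
            # ED 2: clear the whole screen.
            screen = [[" "] * cols for _ in range(rows)]
        elif final in ("H", "f"):
            # CUP / HVP: absolute move; missing or zero parameters mean 1.
            parts = (params.split(";") + [""])[:2]
            row = max(int(parts[0] or 1), 1) - 1
            col = max(int(parts[1] or 1), 1) - 1
        # Any other sequence (for example SGR colours ending in 'm') is ignored.
    return ["".join(line).rstrip() for line in screen]


sample = (
    "\x1b[2J\x1b[2;24H A Sioreila\x1b[4;14HA\x1b[6;12H>"
    "\x1b[23;30H\x1b[35mHits : 40/40\x1b[0m"
)
for number, line in enumerate(render(sample), start=1):
    if line:
        print(number, line)
```

Because every write goes to an explicit (row, column), a later write to the same cell replaces the earlier one, so stacked NPCs on one map square would need separate handling.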

Yeah, you need to see the raw data stream, which is unfortunately sanitised/munged before you see it. Adding a new Telnet protocol handler is a different subject, I think, so Vadi's reference "Adding telnet support for a new protocol" is not relevant here. The trouble is that providing the raw data for user processing will, I believe, introduce a performance hit, and allowing the user to modify it before processing could break things (MXP specifically says that some stuff should not be made available to the end user IIRC, e.g. see the section "MXP Line Tags" in this).

Also, looking at the Mudlet Telnet handling reference material that Vadi indicated revealed something that sounded a bit iffy to me: it says that each incoming line ends with a new line. In fact, under certain circumstances that is not the case, and appending one when it is missing (which I think we do) can break things, for instance when the MUD server subsequently sends backspace characters and then more text to "animate" things (some DikuMuds and derivatives do this to make a spinning "wait" indicator: they send one of '\', '|', '/' and '-', then a backspace, then the next one in the series).

tl;dr:

It might be possible to code for a tap or "tee" that makes the incoming stream available on a read-only basis using the Event system (an event gets raised with the raw data) but there are issues with that:
  • a Lua script may well want to encode ASCII control characters as literals (to pick out those CSI codes, i.e. ASCII '␛' followed by '['), but the Lua scripts and other stuff are stored as XML 1.0, and THAT prohibits ASCII control codes (those below the space character, 0x20) except for '␉', '␊' and '␍'; XML 1.0 does not accept the other codes even as a &#XX; character reference, although XML 1.1 does (except 0x00) ...
  • the processing time could slow things down
I'm not minded to place this issue high on my stack of things to address at the moment, but which MUD server is sending these codes at you, just in case this does deserve further attention?
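
To illustrate the XML 1.0 restriction from the first bullet, here is a Python sketch that checks characters against the XML 1.0 "Char" production and, as one possible workaround, rewrites forbidden control characters as three-digit decimal escapes of the kind Lua accepts inside string literals (\027 for ESC). Both helper names are made up for the example, and it assumes the offending characters only ever appear inside Lua string literals; it is not how Mudlet stores scripts.

```python
def is_xml10_char(ch):
    """True if ch is allowed by the XML 1.0 'Char' production."""
    cp = ord(ch)
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def xml_safe_source(script):
    """Replace characters XML 1.0 forbids with decimal escapes such as \\027.

    Assumption: the forbidden characters sit inside Lua string literals,
    where a backslash followed by three decimal digits denotes that byte.
    """
    return "".join(
        ch if is_xml10_char(ch) else "\\%03d" % ord(ch) for ch in script
    )


raw_script = 'pattern = "\x1b[2J"'   # contains a literal ESC byte
print([hex(ord(c)) for c in raw_script if not is_xml10_char(c)])
print(xml_safe_source(raw_script))
```

Tab, line feed and carriage return pass through unchanged, since XML 1.0 allows them.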
