(This won't really capture your first example, but I'm sure this was simply a mistake of forgetting the quotation marks there.)
Anyway, just a hint on making regex patterns slightly more efficient: it normally isn't a problem in Mudlet, since matching will still be very fast, but as a general principle I try to avoid using more than a single .* in a pattern, since it can slow down regex matching quite a bit.
Consider how the computer will do the matching if it gets the second line of your example:
You say (Elvish) 'this is a really, really, really, really, really, really, really,
really, really, really, really, really, really, really, really, really, really, really,
long line.'
It will first attempt to match "^You say ", which matches fine on the received "You say", ok, no problem. Next comes the first ".*" of the pattern. This will match anything, and by default it will match as -much- as possible. So the first .* will match the whole rest of the line, (Elvish) 'this is a really, really [...] long line.', AT FIRST. Then it will proceed to try to match the '.*'$ of your pattern. The ' doesn't match, however, since we already used up the whole received line in the first ".*". So the regex parser realizes that it can't match like this and BACKTRACKS, meaning that it removes the last character from the initial .* match and tries again to make the remainder of the line match the rest of your pattern. Again this fails, and the parser again backtracks one step. This continues over and over, until it has finally backtracked to making the first .* match only "(Elvish) ", after which the rest will match just fine.
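To make the backtracking a bit more tangible, here is a rough sketch in Python (not Mudlet's own engine, and not how a real regex engine is implemented). The helper name simulate_leading_greedy is made up for this illustration, and the example line is shortened to a single line. It just tries every possible length for the first .*, longest first, and counts how many tries it takes until the rest of the pattern fits:

```python
import re

def simulate_leading_greedy(line, prefix, rest_pattern):
    """Hypothetical helper: mimic a greedy leading '.*' by trying the
    longest candidate first and giving back one character per attempt.
    Returns (attempts, captured_text)."""
    if not line.startswith(prefix):
        return 0, None
    tail = line[len(prefix):]
    rest = re.compile(rest_pattern)
    attempts = 0
    # end = number of characters currently assigned to the first '.*'
    for end in range(len(tail), -1, -1):
        attempts += 1
        # Does the rest of the pattern match everything after position end?
        if rest.fullmatch(tail, end):
            return attempts, tail[:end]
    return attempts, None

line = "You say (Elvish) 'this is a really, really, really long line.'"
attempts, captured = simulate_leading_greedy(line, "You say ", r"'.*'")
print(attempts, repr(captured))
```

The count grows with the length of the spoken text, which is the point: the longer the line, the more steps the first .* has to give back before it ends up at "(Elvish) ".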
You see, this involves a lot of steps that could have been avoided by simply making the pattern:
^You say [^']*'.*'$
(I.e. we replaced the first .* with [^']*, so instead of "match anything" it is now "match anything except single quotation marks/apostrophes".) In that case, the pattern will never assign the whole rest of the line to that first wildcard; the [^']* simply stops right before the first single quote. This makes it no longer necessary to backtrack all the way, and speeds up matching a lot.
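If you want to convince yourself that the two variants pick out the same pieces on a line like this, a quick check along these lines should do. This is again Python's re module rather than Mudlet, and the capture groups are my own addition for the comparison, they are not part of the patterns above:

```python
import re

line = "You say (Elvish) 'this is a really, really, really long line.'"

# Variant with a greedy wildcard in front of the first quote
# (capture groups added here only to compare the results).
greedy = re.compile(r"^You say (.*)'(.*)'$")
# Variant with the negated character class instead.
negated = re.compile(r"^You say ([^']*)'(.*)'$")

groups_greedy = greedy.match(line).groups()
groups_negated = negated.match(line).groups()

print(groups_greedy)
print(groups_negated)
print(groups_greedy == groups_negated)
```

Both should end up with the language tag in the first group and the spoken text in the second; the difference is only in how much work the engine does to get there.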
As a general principle, having several wildcards that can match a huge variety of things in the same pattern can slow things down, ESPECIALLY if those wildcards appear early in the pattern. In particular, starting a pattern with a .* or .+ very often results in a lot of backtracking, so I'd avoid it whenever possible.
Unless the pattern is, say, just: ^.*$
In that case, it's no problem and there will never be any backtracking involved, since the pattern will match any line just fine, no matter how it ends. (I guess it might be slightly more efficient to just use an empty pattern for "match-anything" though. But I'm not regex-savvy enough to say that for sure.)
P.S. Er, sorry for the long derail. I got a bit carried away there...