Gammon Forum
Plain Text Trigger Performance
It is now over 60 days since the last post. This thread is closed.
Posted by: Candido, USA (78 posts)
Date: Fri 03 Jun 2011 08:14 AM (UTC)
Message:
I've been thinking about this for a long time, but I want to check that I'm not totally off track before I start on something that's going to be a pretty big undertaking.
Assume a sample size of around 500-1000 simple triggers that require no wildcards, just plain matches. In theory, which of these is fastest?
1. Make each an individual trigger with anchored regex (^ and $).
2. Make each an individual plain text trigger, no regex.
3. Create a catchall trigger ^(.+)$ which passes its match to a function. The function uses the plain text of the match as a key into a hash table whose keys are the plain-text matches of all the 'triggers', and whose values are functions doing what each one should do when matched (what would otherwise normally be the Send To of the trigger).
As you can probably guess, I'm here because I think 3 would be the fastest. A hash table lookup is, when well implemented, a constant-time operation no matter the size of the table, and I have no reason to believe Lua is lacking in this respect.
The underlying question then is the performance of ^(.+)$. Is it slow because it can be anything, or is it fast because it can be anything? My instinct is that even if it's slow (observation seems to confirm this), it isn't so slow that its matching time plus a constant-time hash table lookup is greater than the time needed to evaluate 500-1000 separate triggers.
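Roughly, what I have in mind for option 3 is something like the following minimal Lua sketch (the line texts and commands here are just made-up examples):

handlers = {
  ["You are hungry."] = function () Send ("eat bread") end,
  ["You are thirsty."] = function () Send ("drink waterskin") end,
  -- ... and so on, one entry per plain-text 'trigger'
} -- handlers

-- called from the catchall ^(.+)$ trigger, passing the matched text
function dispatch (matched_text)
  local action = handlers [matched_text]  -- constant-time lookup, however big the table
  if action then
    action ()
  end -- if
end -- dispatch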
Opinions?
Posted by: Candido, USA (78 posts)
Date: Reply #1 on Fri 03 Jun 2011 08:18 AM (UTC)
Message:
The forum is logging me out whenever I try to edit, but by Send To above I just meant Send.
Posted by: Nick Gammon, Australia (23,042 posts), Forum Administrator
Date: Reply #2 on Fri 03 Jun 2011 08:31 AM (UTC), amended on Fri 03 Jun 2011 08:32 AM (UTC) by Nick Gammon
Message:
First, options 1 and 2 will be the same: internally, all triggers are compiled to regular expressions.
As for option 3, I'm inclined to agree with you. For a large number of triggers, exercising the trigger evaluation code (plus the associated overhead inside the client) 1000 times per line sounds like it will take longer than finding a single match and then doing a lookup.
Especially since Lua table lookups are pretty efficient.
As for the slowness of the "catch all" trigger, I tried a bit of test code:
re = rex.new ("^.+$")
-- re = rex.new ("^.")   -- the alternative idea: match just a single character

local str = "While traversing the busy, cobbled streets of Darkhaven, you notice"
local s, e, t

start = utils.timer ()
for i = 1, 1e6 do
  s, e, t = re:exec (str)
end -- for
finish = utils.timer ()

print ("time taken = ", finish - start)
I tried two different ideas here, because for a "catch all" trigger you don't really need to make the regexp handler go to the end of the line, do you? After all, triggers always return the matching line. So all you really want is something that always triggers. Hence matching on "." should do it. The regexp engine will find a single character, matching anything, and quickly exit.
Then use the second argument to the trigger (the matching line) and do your table lookup.
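As a rough sketch, assuming a catch-all trigger that matches "." and sends to a script function, and a handlers table along the lines sketched earlier:

-- MUSHclient passes the trigger name, the whole matching line, and the wildcards;
-- only the line (the second argument) is needed for the lookup
function catch_all (name, line, wildcards)
  local action = handlers [line]
  if action then
    action ()
  end -- if
end -- catch_all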
My testing above seems to indicate that matching a single "." indeed runs faster than trying to match "^.+$" but not by a very large amount.
To put it into context, I got the million matches from the script in around 0.89 seconds using "." but 1.10 seconds using "^.+$".
So either way, I wouldn't really call a million matches a second "slow".
- Nick Gammon
www.gammon.com.au, www.mushclient.com
Posted by: Candido, USA (78 posts)
Date: Reply #3 on Fri 03 Jun 2011 09:07 AM (UTC)
Message:
Awesome, that's what I was hoping to hear, and at the same time not hoping to hear :P
Also, you might consider implementing this directly in MUSHclient. Not that I'm trying to offload work that can be done on the user side, but I think a lot of people with big, bloated worlds on ludicrously spammy MUDs (Iron Realms, anyone?) could benefit from it.
This is an oversimplification no doubt, but I imagine all you'd have to do is make it so that when it comes time to evaluate a plain-text trigger, it goes to a table lookup instead of the usual MUSHclient trigger engine. A trigger counts as plain text if, interpreted as regex, its only special characters are ^, $, and \; this can be flagged in the trigger itself to avoid having to determine it at run time.
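As a very rough illustration of the kind of check I mean (this is only a sketch, not how MUSHclient actually classifies triggers):

-- returns true if a regexp pattern is effectively plain text: only a leading ^,
-- a trailing $, and backslash-escaped metacharacters are allowed
function is_plain_text (pattern)
  local body = pattern
  body = body:gsub ("^%^", "")  -- drop a leading anchor
  body = body:gsub ("%$$", "")  -- drop a trailing anchor
  body = body:gsub ("\\[%^%$%.%[%]%(%)%*%+%?%{%}|\\]", "")  -- escaped metacharacters are literals
  -- anything special left over (including sequences like \d) makes it a real regexp
  return not body:find ("[%^%$%.%[%]%(%)%*%+%?%{%}|\\]")
end -- is_plain_text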
Imagine the freedom that would come from knowing you can add any number of plain-text triggers and not incur the slightest performance decrease, provided of course you don't add so many that you run out of memory...
It's something to consider. Also, I totally did not know we had a time function with granularity finer than a millisecond. Here I am doing all my speed testing with os.clock().
Posted by: Nick Gammon, Australia (23,042 posts), Forum Administrator
Date: Reply #4 on Fri 03 Jun 2011 10:21 AM (UTC), amended on Mon 25 Nov 2013 08:33 PM (UTC) by Nick Gammon
Message:
Well, I understand what you are saying. But triggers right now are self-contained: not only might they match on plain text, but they might all do different things. Yours seems to me a fairly specialized case, which can currently be handled quite quickly.
It sounds like an interesting system you are working on. You could almost consider doing a SQLite3 database lookup. That might be slower than a table lookup, but the database would probably cache things and be quite fast.
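As a rough sketch of what such a lookup could look like, using MUSHclient's Database functions and assuming a database already opened as "db" (the table and column names here are hypothetical):

-- look up the action for a matched line in a SQLite table
function lookup_action (line)
  local quoted = "'" .. line:gsub ("'", "''") .. "'"  -- escape single quotes for SQL
  DatabasePrepare ("db", "SELECT action FROM triggers WHERE match_text = " .. quoted)
  local action
  if DatabaseStep ("db") == sqlite3.ROW then
    action = DatabaseColumnValue ("db", 1)  -- first column of the result row
  end -- if
  DatabaseFinalize ("db")
  return action
end -- lookup_action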
For an idea of the speed, look at the mapper: it reads a database to draw its rooms. Admittedly it caches the room information in memory once it is read, but it seems quite fast, and each box on the screen is a room.
- Nick Gammon
www.gammon.com.au, www.mushclient.com
Posted by: Candido, USA (78 posts)
Date: Reply #5 on Sun 05 Jun 2011 06:54 AM (UTC)
Message:
That's a really cool mapper, though I suspect it's helped a bit by Aardwolf being so grid-like.
SQLite is an interesting idea, but after all I am trying to go for maximum speed here. Also, my experience with SQLite in MUSHclient has not been the best to begin with. It could very well just be improper use or foolish expectations on my part, but I once made a script that did about 10,000 INSERTs by putting DatabaseExec in a for loop, and it froze so badly I had to close MUSHclient. I have a hunch that using the for loop to build one massive SQL string and then running it in a single DatabaseExec would help things. Also, once things are built, just using SELECT might be a lot faster, but I haven't timed it or tested it extensively.
Posted by: Nick Gammon, Australia (23,042 posts), Forum Administrator
Date: Reply #6 on Sun 05 Jun 2011 07:46 AM (UTC)
Message:
You should have used a transaction. That is MUCH faster. Try this code:
DatabaseOpen ("db", GetInfo (66) .. "mytestdb.sqlite", 6)

DatabaseExec ("db", [[
  DROP TABLE IF EXISTS weapons;
  CREATE TABLE weapons(
    weapon_id INTEGER NOT NULL PRIMARY KEY autoincrement,
    name TEXT NOT NULL,
    damage INT default 10,
    weight REAL
    );
]])

start = utils.timer ()

DatabaseExec ("db", "BEGIN TRANSACTION")  -- wrap all the inserts in a single transaction
for i = 1, 10000 do
  DatabaseExec ("db", "INSERT INTO weapons (name, damage) VALUES ('sword', 42)")
end -- for
DatabaseExec ("db", "COMMIT")  -- commit them all at once

finish = utils.timer ()

print ("time taken = ", finish - start)
print ("changes = ", DatabaseTotalChanges ("db"))

DatabaseClose ("db")  -- close it
That inserts 10000 records, and printed this:
time taken = 0.76090487452166
changes = 10000
That's 10000 inserts in under a second!
And to read all 10000 back again ...
DatabaseOpen ("db", GetInfo (66) .. "mytestdb.sqlite", 6)

start = utils.timer ()

local count = 0
DatabasePrepare ("db", "SELECT * from weapons")

-- execute to get the first row
rc = DatabaseStep ("db")  -- read first row

-- now loop, counting each row and getting the next one
while rc == sqlite3.ROW do
  count = count + 1
  local id = DatabaseColumnValue ("db", 1)  -- get some data
  rc = DatabaseStep ("db")  -- read next row
end -- while loop

-- done with statement
DatabaseFinalize ("db")

finish = utils.timer ()

print ("time taken = ", finish - start)
print ("read = ", count)

DatabaseClose ("db")  -- close it
That printed:
time taken = 0.022610129890381
read = 10000
So we read 10000 records in 22 milliseconds. Again, not too bad.
- Nick Gammon
www.gammon.com.au, www.mushclient.com
Posted by: Candido, USA (78 posts)
Date: Reply #7 on Mon 06 Jun 2011 06:07 AM (UTC)
Message:
Thanks. I knew it was something simple.