| Posted by
| David Haley
USA (3,881 posts) Bio
|
| Message
| OK... the basic idea between string hashing is to save memory.
For instance, you might want to have 30 mobs, each with the name "a goblin".
Now, without string hashing, you are storing those 9 bytes (8 letters + null space) 30 times. Not good...
The clever thing with string hashing is that you store it only once, and you keep a reference count on how many times that string is actually used. When the reference count hits 0, you deallocate the memory - but never before.
How the hashing works is a slightly different subject, but basically it keeps a list of string/count structures. When you use STRALLOC, it looks up in the table if the string already exists; if so, it returns the pointer to that and increments the reference count; if not, it allocates new memory, sets the reference count to one, and then returns that pointer.
Hash tables are actually a slightly more complicated concept; I don't think it's necessary to understand the concepts of hashing algorithms, buckets, hash collisions, etc., to understand string hashing.
The QUICKLINK macro basically means that you know that the string you're linking to is in the hash table, so you know where to increment the count. Personally I think that the way quicklink is implemented is dangerous, but that's another story. :)
Anyways, let's consider now that you have a hashed string. Because of how the implemented it (no type safety or anything) your string is basically a char *. This is good because it means you can use it exactly as you would a normal, non-hashed string, but bad because of the freeing.
What happens if two mobs share the same hashed name? There is only one memory allocation, with a reference count of two. If you use the wrong macro - DISPOSE for instance - you do not properly manage the hash table. So, it goes and erases the memory, and poof, the other mob is now pointing to invalid memory.
What STRFREE does is to actually go into the hash table, decrement the count, and only deallocate if there is nothing left.
What I think is very poor in SMAUG's implementation is that the only way to tell if a string is hashed is to look up its allocation, and that can be tedious. It would have been as simple as putting an h_ in front of the variable name, such as mob->h_name... and better yet, which is the solution I am adopting, create your own data structure that provides nicer type safety. In their defense, however, such a solution is much easier to do in C++ where you can have much stricter compile-time typing than in C. Still, they could have done a better job. |
David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone
http://david.the-haleys.org | | Top |
|