
Gammon Forum

Line drawing characters, a la terminals & co

It is now over 60 days since the last post. This thread is closed.


Posted by Worstje   Netherlands  (899 posts)  Bio
Date Fri 27 Jul 2007 11:28 AM (UTC)
Message
Would special linedrawing facilities be an option at some point?

Not that I mind doodling with ascii art, but for simple tables, I am kind of getting peeved with the fact all I can use is +, - and |, all of which leave little gaps in between the different characters. And + for corners.. well, it just makes me shiver.

Back in the olden days of DOS, those were the times... double lines, single lines, fat lines... those of you familiar with the ancient WordPerfect versions will remember its funky line-drawing capabilities.

[/reminiscing]

Right, probably it will never go in.. but I can still daydream about it. ^_^

Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #1 on Fri 27 Jul 2007 09:23 PM (UTC)
Message
I *think* some of the Unicode code points are designed for that. If you are planning it just for internal use (i.e. at the client end), a suitable font, with UTF-8 enabled, might do it for you.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Worstje   Netherlands  (899 posts)  Bio
Date Reply #2 on Fri 27 Jul 2007 09:35 PM (UTC)
Message
Ooh, I'll have a look at it. But I don't think my favorite font supports Unicode. It would definitely be for client-side usage.

Thanks for the tip. :)

Posted by Onoitsu2   USA  (248 posts)  Bio
Date Reply #3 on Fri 27 Jul 2007 09:48 PM (UTC)
Message
www.torasin.com/~venificius/Aardwolf_Colored_ConsiderLUA.xml

Install that into a world, then type concolors to see the sort of table I've made. I COULD have made it all solid white lines, but chose not to, as this way was still very easy to read and understand.

Hope that might inspire you a little :)

Onoitsu2 (Venificius on Aardwolf)

Posted by Shaun Biggs   USA  (644 posts)  Bio
Date Reply #4 on Fri 27 Jul 2007 11:12 PM (UTC)

Amended on Fri 27 Jul 2007 11:15 PM (UTC) by Shaun Biggs

Message
That's exactly what Worstje is doing, if you look at the first post. Also, there is one problem with this table: each row doesn't fit on one line if a person has their client set to the standard wrap-at-80 setting. I just downloaded it again and found that this is the same for the current version, so I did what I had done before and neatened it up by removing one space from the left side data and putting a "+" at the corners between any "-" and "|", as Worstje mentioned doing. Been running this plugin like that for quite a while. One of my favourites that I've seen you write.

It is much easier to fight for one's ideals than to live up to them.

Posted by Shadowfyr   USA  (1,788 posts)  Bio
Date Reply #5 on Sat 28 Jul 2007 01:51 AM (UTC)
Message
I actually created a font with some characters rearranged to "fix" where they didn't match in the high end letters (over 127) with the fonts everyone else used. The problem was that it was only one size, didn't show up with the right point size info, and doesn't do you any good if other people are not using it.

Strictly speaking, only *some* fonts, like Lucida Console, have the correct Unicode characters in them at all to do what he wants. I don't remember what the Unicode range is (i.e. what first byte needs to be used to say, "use the console characters", to get the correct offset for them). They may even be in one of the offsets not available for utf8, which would be anything from roughly 0x0000-0x7FFF. In other words, any unicode sequence that has its first byte with the high bit *off*. The utf8 compatible range is anything from 0x8000 to 0xFFFF.

And to make matters worse, most fonts are not fixed width, so even if they did have the characters, they wouldn't display them in the correct grid pattern to form boxes anyway. Sometimes fonts are a pain in the rear...

Posted by Shaun Biggs   USA  (644 posts)  Bio
Date Reply #6 on Sat 28 Jul 2007 05:36 AM (UTC)
Message
If you find a font that defaults to being able to use the upper ASCII codes correctly, though, it shouldn't be hard to manage at all. Just make sure that people are only using the accepted fonts by checking GetInfo( 20 ) before trying to use whatever extra symbols you want.

It is much easier to fight for one's ideals than to live up to them.

Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #7 on Sat 28 Jul 2007 05:53 AM (UTC)
Message
Quote:

They may even be in one of the offsets not available for utf8, which would be anything from roughly 0x0000-0x7FFF. In other words, any unicode sequence that has its first byte with the high bit *off*. The utf8 compatible range is anything from 0x8000 to 0xFFFF.


I don't quite understand that. UTF-8 is a way of encoding Unicode characters, including 0x0000 to 0xFFFF. However characters in the range 0x00 to 0x7F are identical to non-UTF-8 ones. That is, they encode the same way. Once you get to 0x80 and onwards it looks different. For example, if you press Ctrl+Shift+F12 in MUSHclient, click Insert Unicode, and type in '80', you get \C2\80. If you type '7F' you get \7F.

The problem for MUDs is that, if you want to send characters in the range 0x80 to 0xFF, it is unclear whether they are to be interpreted as UTF-8 or simply as special characters in a particular character set.

For example, if you want the character 0x95 and you are *not* using UTF-8, you simply send \95. However in UTF-8 it is \C2\95 (two bytes for one glyph on the screen).
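
For anyone who wants to check these byte sequences outside MUSHclient, here is a minimal Python sketch. Python is used purely for illustration and is not part of MUSHclient; Latin-1 is assumed as the stand-in for a legacy single-byte character set, and the helper name escaped is made up for this example.

```python
def escaped(data: bytes) -> str:
    """Format bytes as backslash-hex pairs, e.g. \\C2\\80 (helper name is hypothetical)."""
    return "".join(f"\\{b:02X}" for b in data)

# Compare the UTF-8 bytes with the single-byte (Latin-1, assumed) form.
for code_point in (0x7F, 0x80, 0x95):
    ch = chr(code_point)
    print(f"U+{code_point:04X}  "
          f"UTF-8: {escaped(ch.encode('utf-8'))}  "
          f"Latin-1: {escaped(ch.encode('latin-1'))}")
```

Everything up to 0x7F comes out as the same single byte either way; from 0x80 upwards the UTF-8 form becomes two bytes.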




- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Shadowfyr   USA  (1,788 posts)  Bio
Date Reply #8 on Sun 29 Jul 2007 02:08 AM (UTC)
Message
Hmm. Ok, see, I assumed that, in the simplest terms, utf8 just made everything in 0x00-0x7f a normal character, then supported everything else as 0x80-0xff + <actual character>. Seems they decided to go more complicated than that: "UTF-8 encoded characters may theoretically be up to six bytes long, however 16-bit BMP characters are only up to three bytes long." Yikes!!

So, in other words.. If a font had a copyright symbol, I would have to take the *real* unicode which is normally, "0x00A9", I.e., block 0, character A9, and instead encode it as, "0xE289A0", according to the page I read... Ok, I can see where that is useful in an OS, where specific bytes have special meanings, but for text transmission, it just bloats the data stream by anywhere from 1-4 extra characters for "everything" that isn't basic text. Hardly what I would call efficient. And for that matter, hardly what I would call human readable either, which one would think would be a big thing, given all the open document format hoo-ha everyone is on about right now, to make sure things *stay* readable for the foreseeable future. lol

And here I thought the concept was so simple you could just string stuff together by hand and understand what you were doing. :p

Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #9 on Sun 29 Jul 2007 03:52 AM (UTC)
Message
Quote:

I would have to take the *real* unicode which is normally, "0x00A9", I.e., block 0, character A9, and instead encode it as, "0xE289A0", according to the page I read.


No, where did that idea come from? Click on Insert Unicode in the MUSHclient "debug simulated world input" box, and type in a9, and you get: \C2\A9
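
The bit layout behind that result can be worked through by hand. The following is a rough Python sketch of the UTF-8 rules for code points up to U+FFFF only; the function name utf8_encode is hypothetical, surrogates are not rejected, and in real code the built-in str.encode("utf-8") does the same job.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point (up to U+FFFF) as UTF-8 by hand. Illustrative only."""
    if code_point < 0x80:          # 1 byte:  0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:         # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:       # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    raise ValueError("code points above U+FFFF need four bytes (not handled here)")

# The copyright sign, U+00A9, needs only two bytes.
print(utf8_encode(0xA9).hex(" "))

# Decoding the three bytes E2 89 A0 shows which code point they really belong to.
print(f"U+{ord(bytes([0xE2, 0x89, 0xA0]).decode('utf-8')):04X}")
```

Decoding E2 89 A0 gives U+2260 (the "not equal to" sign), not the copyright sign, which suggests the page was describing a different character.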

The whole point of UTF-8 is to be reasonably friendly to "legacy" applications that still use 8-bit character encoding. Bear in mind that for many C applications, hex 0x00 (ie. zero) is a string terminator. Thus, the string 0x00A9 will either be an empty string (string terminator followed by A9) or A9 followed by the string terminator, depending on the endianness of the CPU. So either way you couldn't embed 0x00A9 into the middle of a document. Imagine also the Unicode character 0x010A - the 2nd byte looks like a linefeed (0x0A).

UTF-8 is specifically designed so that you don't get 0x00 bytes or indeed anything that looks like a "control" code in the text stream.
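
A short Python sketch makes the difference visible. It uses UTF-16 as the concrete example of "straight" 16-bit Unicode (an assumption for illustration), and the helper has_control_bytes is a made-up name.

```python
copyright_sign = "\u00A9"
c_with_dot = "\u010A"   # LATIN CAPITAL LETTER C WITH DOT ABOVE

for label, text in (("U+00A9", copyright_sign), ("U+010A", c_with_dot)):
    print(label,
          "| UTF-16BE:", text.encode("utf-16-be").hex(" "),
          "| UTF-16LE:", text.encode("utf-16-le").hex(" "),
          "| UTF-8:", text.encode("utf-8").hex(" "))

def has_control_bytes(data: bytes) -> bool:
    """True if any byte is below 0x20 (NUL, linefeed, and so on)."""
    return any(b < 0x20 for b in data)

print(has_control_bytes(copyright_sign.encode("utf-16-be")))
print(has_control_bytes((copyright_sign + c_with_dot).encode("utf-8")))
```

The 16-bit forms contain a 00 or 0A byte in one position or the other depending on byte order, while every byte of the UTF-8 forms has the high bit set.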

To use "straight" Unicode (say, for downloading HTML documents, or talking to a MUD), both ends would have to agree that each character on the screen needed 2 bytes (or maybe 3 or 4), so you knew how many bytes represented a "character".

Quote:

Ok, I can see where that is useful in an OS, where specific bytes have special meanings, but for text transmission, it just bloats the data stream by anywhere from 1-4 extra characters for "everything" that isn't basic text. Hardly what I would call efficient.


Yes, up to a point. However if you decide on 2 bytes for everything, then you are already taking up one of those extra bytes, so you are hardly better off, and you are worse off if a lot of the text is normal English text (eg. program code).

Probably if all of your text was in Unicode characters that required 3 bytes in UTF-8, but only 2 in 16-bit encoding (for example, Japanese), then you are better off - at least as far as space goes - using 16-bit encoding.
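
The trade-off is easy to measure. This is a small Python sketch with two arbitrary sample strings chosen for illustration, again using UTF-16 as the assumed 16-bit encoding:

```python
samples = {
    "English": "The quick brown fox jumps over the lazy dog",
    "Japanese": "こんにちは世界",
}

for name, text in samples.items():
    utf8_len = len(text.encode("utf-8"))
    utf16_len = len(text.encode("utf-16-le"))  # little-endian, no byte-order mark
    print(f"{name}: {len(text)} characters, "
          f"{utf8_len} bytes as UTF-8, {utf16_len} bytes as UTF-16")
```

Plain English text takes one byte per character in UTF-8 and two in UTF-16, while these Japanese characters take three bytes each in UTF-8 and two in UTF-16.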


There is a lot of detail in Unicode, try looking at:

http://www.unicode.org/

They give heaps of explanation there.


- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Shaun Biggs   USA  (644 posts)  Bio
Date Reply #10 on Sun 29 Jul 2007 06:14 AM (UTC)
Message
Quote:
it just bloats the data stream by anywhere from 1-4 extra characters for "everything" that isn't basic text. Hardly what I would call efficient. And for that matter, hardly what I would call human readable either, which one would think would be a big thing, given all the open document format hoo-ha everyone is on about right now, to make sure things *stay* readable for the foreseeable future.

No one ever claimed that Unicode is terribly efficient when dealing with small sets of characters. The whole point of Unicode is that you can display any type of language in a nice standard format. There isn't really a nice way to keep piling languages onto a character set and keep it legible to the human eye for terribly long, since the complexity just grows. Especially with languages that have characters which are modified by what characters are around them.

The actual benefit of Unicode is that it will display exactly what you want it to through any program that supports Unicode, on any platform, and for whatever language you are using. If you are terribly worried about bandwidth or storage size, it compresses very well when you are using only a tiny set of the characters, since not terribly much will be unique.
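
The compression point can be tried with a quick Python sketch. It assumes zlib as the compressor and uses a made-up, highly repetitive box-drawn table as input, so the exact numbers will differ for real text.

```python
import zlib

# A box-drawn table uses only a handful of distinct characters,
# each of which takes three bytes in UTF-8.
row = "\u2502" + " cell " + "\u2502" + "\n"
rule = "\u251C" + "\u2500" * 6 + "\u2524" + "\n"
text = (rule + row) * 200

raw = text.encode("utf-8")
packed = zlib.compress(raw)
print(f"{len(raw)} bytes raw, {len(packed)} bytes compressed")
```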

It is much easier to fight for one's ideals than to live up to them.

Posted by Worstje   Netherlands  (899 posts)  Bio
Date Reply #11 on Sun 29 Jul 2007 08:43 AM (UTC)
Message
Sadly, my font does NOT support most Unicode gimmicks, but is monospaced. Does anyone know of a funky 'supplement_monospaced_fonts_with_box_drawing_characters.exe'? :) Or maybe an updated Monaco font that does have these characters?

Oh, right. The characters are in the U+2500-U+257F range (the Unicode "Box Drawing" block), for anyone looking for them.
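
As an illustration of what those code points give you, here is a rough Python sketch that builds a simple table from the single-line box-drawing characters. The function name draw_table and the sample data are invented for this example, and it only lines up when shown in a monospaced font that actually contains these glyphs.

```python
# Single-line box-drawing characters from the U+2500 block.
H, V = "\u2500", "\u2502"
TL, TR, BL, BR = "\u250C", "\u2510", "\u2514", "\u2518"
T_DOWN, T_UP, T_RIGHT, T_LEFT, CROSS = "\u252C", "\u2534", "\u251C", "\u2524", "\u253C"

def draw_table(rows):
    """Return a box-drawn table for a list of equal-length rows of strings."""
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]

    def rule(left, mid, right):
        # Horizontal rule with the given corner and junction characters.
        return left + mid.join(H * (w + 2) for w in widths) + right

    def line(row):
        # One row of cells, each padded to its column width.
        cells = (f" {cell.ljust(w)} " for cell, w in zip(row, widths))
        return V + V.join(cells) + V

    out = [rule(TL, T_DOWN, TR)]
    for i, row in enumerate(rows):
        if i:
            out.append(rule(T_RIGHT, CROSS, T_LEFT))
        out.append(line(row))
    out.append(rule(BL, T_UP, BR))
    return "\n".join(out)

print(draw_table([["Mob", "Level"], ["rat", "1"], ["dragon", "200"]]))
```

Unlike +, - and |, the corner and junction characters join up with their neighbours, so the lines have no gaps.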

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).


Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.