Chinese trigger not loaded
|
It is now over 60 days since the last post. This thread is closed.
| Posted by
| Flow
(5 posts) Bio
|
| Date
| Wed 13 Jun 2012 04:10 AM (UTC) |
| Message
| Hi everyone,
I am new to MUSHclient and have been investigating an issue with Chinese triggers that use regular expressions.
The MUD uses Big5, so I cannot check the UTF-8 box.
When a trigger contains certain Chinese characters (e.g. "架", "跋", "崙"), it is not loaded when I open the world.
MUSHclient gives an error message saying "Failed: missing terminating ] for character class".
I ran those characters through an encoder, and every one of them contains "%5B", which decodes to "[".
I think that is the cause of the problem.
Is there any workaround that makes the regex treat the whole sentence as one literal string rather than checking it byte by byte?
Please help.
Thanks | | Top |
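The problem described above can be reproduced outside MUSHclient. The following is a minimal Python sketch, not MUSHclient code: it assumes Python's built-in "big5" codec matches the MUD's encoding and uses Python's `re` module as a stand-in for PCRE. The helper names are made up for illustration.

```python
import re

def big5_bytes(text):
    """Encode text the way a Big5 MUD would send it (assumes Python's big5 codec)."""
    return text.encode("big5")

def compiles_as_byte_regex(pattern):
    """Return True if the raw bytes are accepted as a regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

for word in ("架", "跋", "崙"):
    raw = big5_bytes(word)
    # Print the hex bytes, whether a literal "[" (0x5B) is present,
    # and whether the bytes compile as a pattern on their own.
    print(raw.hex(), b"[" in raw, compiles_as_byte_regex(raw))
```

When the regex engine works byte by byte, the trailing 0x5B byte opens a character class that is never closed, which is consistent with the "missing terminating ]" message.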
|
| Posted by
| Nick Gammon
Australia (23,173 posts) Bio
Forum Administrator |
| Date
| Reply #1 on Wed 13 Jun 2012 06:15 AM (UTC) |
| Message
| What is Big5?
Quote:
I cannot check the utf-8 box.
Why not? |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | | Top |
|
| Posted by
| Flow
(5 posts) Bio
|
| Date
| Reply #2 on Wed 13 Jun 2012 06:43 AM (UTC) |
| Message
| Hi Nick,
Big5 is a character encoding for Traditional Chinese.
Because the MUD uses Big5, the text gets corrupted if I check the UTF-8 box.
The second byte of the Chinese character is interpreted as "[", which makes the trigger fail.
Thanks.
| | Top |
|
| Posted by
| Nick Gammon
Australia (23,173 posts) Bio
Forum Administrator |
| Date
| Reply #3 on Wed 13 Jun 2012 07:47 AM (UTC) |
| Message
| I see. Well, I suggest making a plugin that converts incoming packets from Big5 to UTF-8; then you can check the UTF-8 box and the trigger should work. |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | | Top |
|
| Posted by
| Nick Gammon
Australia (23,173 posts) Bio
Forum Administrator |
| Date
| Reply #4 on Wed 13 Jun 2012 07:48 AM (UTC) |
| Message
| I don't know enough about Big5 to be much more specific, but check out this:
http://www.gammon.com.au/scripts/doc.php?general=plugin_callbacks
In particular:
OnPluginPacketReceived
You should be able to do a simple Lua global replace that converts Big5 to UTF-8 using a lookup table. |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | | Top |
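The conversion step suggested above can be sketched as follows. This is Python rather than the Lua a real MUSHclient plugin would use, and it only illustrates the transcoding logic; `make_packet_converter` is a hypothetical name, and telnet negotiation bytes are not handled. An incremental decoder is used on the assumption that a two-byte character may be split across two packets.

```python
import codecs

def make_packet_converter(source_encoding="big5"):
    """Return a function that turns Big5 packets into UTF-8 bytes.

    The incremental decoder keeps a dangling lead byte until the
    next packet arrives, so split characters are not corrupted.
    """
    decoder = codecs.getincrementaldecoder(source_encoding)(errors="replace")

    def convert(packet):
        # Decode whatever is complete so far, then re-encode as UTF-8.
        return decoder.decode(packet).encode("utf-8")

    return convert

convert = make_packet_converter()
line = "架".encode("big5")
# Feed the two bytes of one character in separate "packets".
utf8 = convert(line[:1]) + convert(line[1:])
```

In a plugin, the same idea would sit inside the OnPluginPacketReceived callback and return the converted text, with the lookup table or conversion routine supplied in Lua.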
|
| Posted by
| Flow
(5 posts) Bio
|
| Date
| Reply #5 on Wed 13 Jun 2012 07:52 AM (UTC) |
| Message
| Nick,
Thank you very much.
I will try that out first.
By the way, is there any way to make PCRE work better with Chinese?
| | Top |
|
| Posted by
| Nick Gammon
Australia (23,173 posts) Bio
Forum Administrator |
| Date
| Reply #6 on Wed 13 Jun 2012 07:53 AM (UTC) |
| Message
| |
| Posted by
| Nick Gammon
Australia (23,173 posts) Bio
Forum Administrator |
| Date
| Reply #7 on Wed 13 Jun 2012 07:55 AM (UTC) |
| Message
|
Flow said:
By the way, is there any way to make PCRE work better with Chinese?
Turn UTF-8 on. Otherwise PCRE cannot know that those bytes do not have their usual meanings.
Although for triggers it *might* just work to put a backslash before it.
For example, instead of matching on 架 match on \架 |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | | Top |
|
| Posted by
| Flow
(5 posts) Bio
|
| Date
| Reply #8 on Wed 13 Jun 2012 08:20 AM (UTC) |
| Message
|
Nick Gammon said:
For example, instead of matching on 架 match on \架
This does not work, because each Chinese character is 2 bytes.
The problem is that the last byte is a special character,
and there is no way to insert a \ between those 2 bytes.
I am looking for a way to group the characters and ignore any special characters inside the group.
There seems to be no such method. | | Top |
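The byte-level reason a leading backslash does not help can be sketched in Python, again using the `re` module as a stand-in for PCRE and assuming Python's "big5" codec matches the MUD. The variable and function names are illustrative only.

```python
import re

word = "架".encode("big5")   # two bytes; the second one is 0x5B, i.e. "["

def compiles(pattern):
    """Return True if the bytes are accepted as a regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Typing \架 puts the backslash before the FIRST byte only,
# so the trailing "[" is still unescaped.
escaped_in_front = b"\\" + word

# Escaping the second byte would work, but an editor that shows the
# two bytes as one character gives no way to type this.
escaped_between = word[:1] + b"\\" + word[1:]

print(compiles(escaped_in_front), compiles(escaped_between))
```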
|
| Posted by
| Flow
(5 posts) Bio
|
| Date
| Reply #9 on Wed 13 Jun 2012 09:45 AM (UTC) |
| Message
| Finally found a way to solve this.
Quote the word with \Q...\E.
This treats the enclosed characters as literals and ignores all regex syntax.
Thanks, all.
| | Top |
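The same idea of quoting the whole word as a literal can be sketched in Python. Python's `re` module does not support \Q...\E, so `re.escape` is used here as the nearest equivalent; this is an analogy to the PCRE fix, not the MUSHclient trigger itself, and `literal_trigger` is a hypothetical helper.

```python
import re

def literal_trigger(text, encoding="big5"):
    """Compile text as a literal byte pattern.

    re.escape backslash-escapes every regex metacharacter in the
    encoded bytes, including a trailing 0x5B ("["), which is the
    effect \\Q...\\E has in PCRE.
    """
    return re.compile(re.escape(text.encode(encoding)))

trigger = literal_trigger("架")
incoming = "書架上".encode("big5")   # a line as the MUD might send it
print(trigger.search(incoming) is not None)
```

In the MUSHclient trigger itself, the equivalent is to wrap the Chinese text as \Q架\E in the match field.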
|
The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).
29,492 views.