Outputting an Autotranslate Message

#1 Nifim

Member

Members
13 posts

Location: Washington

Posted 27 April 2016 - 04:14 AM

OK, so I see there is a grand total of one post about auto-translate that comes up when you search for it. I hope someone has some guidance on how to output an auto-translate message from Lua code.

Simply put, when I start a spell chain on SCH, I want to output {Fusion} blah blah to the party. But it's much more annoying than I thought it would be to actually write auto-translate into the output string. I have been able to catch an auto-translate phrase, store it in a global, and place it back into the output, fulfilling my hopes and dreams of my LS not whining that my "macros" are hard to notice. However, it's really annoying to have to feed each skillchain into my code manually in game every time I load the addon.

I tried putting the auto-translate phrase into the settings, but I am never able to feed it back into the output, as it ends up as either a bunch of digits or a funky arrow. I am at a bit of a loss and feel there must be something simple I have overlooked.

#2 Iryoku

Advanced Member

Windower Staff
488 posts

Posted 05 May 2016 - 10:11 AM

To use auto-translate strings from Lua you need to understand a little bit about character encodings. And to understand character encodings you need to know about Unicode. The first part of this post is just going to give a very brief explanation of these concepts, but if you don't care and would just like to cut to the chase, feel free to skip to the end. Also, just be aware that this is a huge topic, and the rabbit hole is unimaginably deep if you choose to explore it further.

So, what is Unicode?

Well Unicode is an international standard that assigns these abstract things called "characters" to numbers called "code points". For example, in English we have a thing that we call "Upper Case A" and we have another thing that we call "Lower Case A"; Unicode calls both of these things "characters", and it assigns them to the code points 65 and 97, respectively. As of Unicode 8.0 (released in June 2015) there are 260,319 assigned code points from nearly every language in the world, with 853,793 left unassigned.

So, what about Encodings?

The thing about code points is, they're just numbers, and it turns out there are lots of ways to store numbers on computers. That's where encodings come in. Encodings provide standard ways to map a sequence of code points into a sequence of bytes that can be stored or used by computers. One of the simplest encodings is called UTF-32, which just stores each code point as 4 consecutive bytes (4 bytes * 8 bits per byte = 32 bits, hence the name). A few other popular encodings are UTF-16 and UTF-8, which are a little bit more complicated. FFXI uses a customized variant of an encoding called Shift_JIS, which is designed specifically to store Japanese text.
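To make the UTF-32 description concrete, here is a minimal sketch in plain Lua. The helper name `utf32be` is made up for illustration, and big-endian byte order is an assumption (UTF-32 can also be stored little-endian). It is not part of any Windower API.

```lua
-- Encode a list of Unicode code points as big-endian UTF-32:
-- every code point becomes exactly 4 bytes.
-- `utf32be` is a hypothetical helper, not a Windower function.
local function utf32be(code_points)
    local out = {}
    for i, cp in ipairs(code_points) do
        -- Split the code point into four bytes, most significant first.
        local b1 = math.floor(cp / 0x1000000) % 0x100
        local b2 = math.floor(cp / 0x10000) % 0x100
        local b3 = math.floor(cp / 0x100) % 0x100
        local b4 = cp % 0x100
        out[i] = string.char(b1, b2, b3, b4)
    end
    return table.concat(out)
end

local encoded = utf32be({65, 97})  -- "A" and "a"
print(#encoded)                    -- 8 (two code points, 4 bytes each)
```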

I don't want to get too deep into the specifics of how Shift_JIS works, but you do need to know that it is variable length. This means that not every character is stored with the same number of bytes. A byte in the range 00-7F or A1-DF indicates a "half-width" character which is stored in 1 byte. A byte in the range 81-9F or E0-EF indicates a "full-width" character, stored in 2 bytes. The second byte is always in the range 40-7E or 80-FC. You might have noticed that there's a lot of unused space. The values 80, A0, and F0-FF are never used for either "half-width" characters or as the first byte of a "full-width" character. This is where SE's modifications come in. SE decided to use this unused space to store some special things for FFXI. The elemental icons are in this unused region of Shift_JIS for instance, and so are the markers for auto-translate.
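The lead-byte rules above can be sketched as a small classifier. This is only an illustration: `sjis_char_length` is a made-up name, and it treats E0-EF as the second lead-byte range, following the standard Shift_JIS layout. It knows nothing about SE's extensions beyond reporting that a byte falls in the unused space.

```lua
-- Given the first byte of a character, return how many bytes that
-- character occupies in standard Shift_JIS, or nil if the byte lies in
-- the space the standard leaves unused (80, A0, F0-FF).
-- `sjis_char_length` is a hypothetical helper, not a Windower function.
local function sjis_char_length(lead)
    if lead <= 0x7F or (lead >= 0xA1 and lead <= 0xDF) then
        return 1  -- "half-width" character
    elseif (lead >= 0x81 and lead <= 0x9F) or (lead >= 0xE0 and lead <= 0xEF) then
        return 2  -- "full-width" character; a second byte follows
    end
    return nil    -- unused in standard Shift_JIS
end

print(sjis_char_length(0x41))  -- 1
print(sjis_char_length(0x82))  -- 2
print(sjis_char_length(0xFD))  -- nil (the auto-translate marker)
```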

Finally auto-translate!

Auto-translate phrases are marked in Shift_JIS encoded text with the unused byte FD. After this are 4 bytes that identify the type of phrase, the language of the client used to input it, and the phrase itself. It's very important to note that none of these 4 bytes can be 00; if one of them is 00 it will corrupt the chat log and potentially crash the game. I won't be going into any more detail about how these 4 bytes are encoded, because it's somewhat complicated. Finally, after these, there's another FD byte to mark the end of the encoded phrase. So a complete auto-translate phrase is a sequence of 6 bytes: FD XX XX XX XX FD. You can generate such a string in Lua like this:

local at_phrase = string.char(0xFD, a, b, c, d, 0xFD)

Where a, b, c, and d are the XX bytes from before.
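Since a 00 byte can corrupt the chat log, it may be worth wrapping that call in a small guard. The following is a sketch only: `make_at_phrase` is a hypothetical name, and the example bytes are the ones Nifim reports for {Fusion} later in this thread.

```lua
-- Build the 6-byte auto-translate sequence FD a b c d FD, refusing any
-- payload byte outside 01-FF (a 00 byte would corrupt the chat log).
-- `make_at_phrase` is a hypothetical helper, not a Windower function.
local function make_at_phrase(a, b, c, d)
    for _, byte in ipairs({a, b, c, d}) do
        if byte < 0x01 or byte > 0xFF then
            error('auto-translate bytes must be in the range 01-FF')
        end
    end
    return string.char(0xFD, a, b, c, d, 0xFD)
end

local fusion = make_at_phrase(0x02, 0x02, 0x1E, 0xC1)
print(#fusion)  -- 6
```

The result is still a raw Shift_JIS byte string, so it should only be passed to functions that expect Shift_JIS.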

So, why did I go through all the trouble of explaining character encodings? Well, you can only use this for strings that are in Shift_JIS encoding. Most of the Windower API functions that interact directly with the FFXI chat log expect strings in Shift_JIS, but some others expect strings in UTF-8. Sending a Shift_JIS encoded string to a function that expects UTF-8 or vice versa will result in garbage.

#3 Nifim

Member

Members
13 posts

Location: Washington

Posted 06 May 2016 - 01:37 AM

Awesome, informative answer. Thank you very much.

#4 sdahlka

Advanced Member

Members
324 posts

Posted 06 May 2016 - 04:00 AM

I wonder if there is a full list of every one of the sequences.

#5 Nifim

Member

Members
13 posts

Location: Washington

Posted 06 May 2016 - 05:38 AM

Here are the skillchains, which is what I was after, so I thought I would share:

at_Fusion = string.char(0xFD, 0x02, 0x02, 0x1E, 0xC1, 0xFD )

at_Distortion = string.char(0xFD, 0x02, 0x02, 0x1E, 0xC0, 0xFD )

at_Gravitation = string.char(0xFD, 0x02, 0x02, 0x1E, 0xBE, 0xFD )

at_Fragmentation = string.char(0xFD, 0x02, 0x02, 0x1E, 0xBF, 0xFD )

at_Reverberation = string.char(0xFD, 0x02, 0x02, 0x1E, 0xC5, 0xFD )

at_Liquefaction = string.char(0xFD, 0x02, 0x02, 0x1E, 0xC3, 0xFD )

at_Compression = string.char(0xFD, 0x02, 0x02, 0x1E, 0xC2, 0xFD )

at_Transfixion = string.char(0xFD, 0x02, 0x02, 0x1E, 0xC6, 0xFD )

at_Induration = string.char(0xFD, 0x02, 0x02, 0x1E, 0xC4, 0xFD )

at_Detonation = string.char(0xFD, 0x02, 0x02, 0x1E, 0xC8, 0xFD )

at_Impaction = string.char(0xFD, 0x02, 0x02, 0x1E, 0xC9, 0xFD )

at_Scission = string.char(0xFD, 0x02, 0x02, 0x1E, 0xC7, 0xFD )

#6 Iryoku

Advanced Member

Windower Staff
488 posts

Posted 07 May 2016 - 09:14 PM

There isn't a list of the byte sequences, but there is a list of all phrases in the standard dictionary along with their IDs in the Windower resources. The ID in the resources file goes in the 4th and 5th bytes of the sequence (big endian order). Bytes 2 and 3 depend on the client language. Note that any phrases in the list that have an ID with the low 8 bits equal to zero are category names and can't be used in chat (doing so would require using a 00 byte, which would corrupt the log). Also keep in mind that this is only a tiny fraction of all possible auto-translate phrases; every item and key item in the game can also be encoded into an auto-translate phrase, but doing so isn't quite as simple. Items and key items can have 00 bytes in their IDs, and there's a special mechanism for encoding them into an auto-translate sequence that deals with that problem.
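As a rough sketch of that mapping, the function below splits a dictionary ID into bytes 4 and 5 in big-endian order. The name `at_from_id` is made up, and the default of 02 02 for bytes 2 and 3 is an assumption copied from Nifim's skillchain list above; other client languages would need different values. Looking the ID up in the Windower resources is left out. Item and key item phrases are not covered.

```lua
-- Turn a standard-dictionary phrase ID into an auto-translate sequence.
-- Bytes 2 and 3 default to 0x02, 0x02 (an ASSUMPTION taken from the
-- skillchain examples in this thread; they depend on client language).
-- `at_from_id` is a hypothetical helper, not a Windower function.
local function at_from_id(id, lang_a, lang_b)
    lang_a = lang_a or 0x02
    lang_b = lang_b or 0x02
    if id < 0 or id > 0xFFFF then
        error('phrase ID must fit in two bytes')
    end
    local high = math.floor(id / 0x100)  -- 4th byte of the sequence
    local low = id % 0x100               -- 5th byte of the sequence
    if low == 0 then
        error('ID is a category name and cannot be used in chat')
    end
    if high == 0 then
        error('ID would need a 00 byte, which would corrupt the log')
    end
    return string.char(0xFD, lang_a, lang_b, high, low, 0xFD)
end

-- 0x1EC1 reproduces the bytes given above for {Fusion}.
local fusion = at_from_id(0x1EC1)
print(fusion == string.char(0xFD, 0x02, 0x02, 0x1E, 0xC1, 0xFD))  -- true
```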
