You're doing an amazing job. I just wanted to ask: after you get done with this, do you plan on translating Fuuka & Desco-hen Hajime Mashita too? It's the add-on disc that was recently released.
I am not too sure if it shares the same eboot/sfo version as the main Japanese version of the game. It should, but I am not sure yet. In fact, I don't know whether it has been released online or not, so I really wouldn't know. Here is the box art for it.
(Fuuka and Desuko-hen... I didn't know this existed. Good to know, I guess!)
Some news regarding the 2-letters-per-glyph hack I suggested.
I've compiled (and improved) the scripts I used for my tests into a single multipurpose script -- see attached file. It can be used to convert "talk.dat" to and from an editable text format, generate glyph pairs from the strings it contains, and generate a modified "font.lzs" from previously obtained pairs.
Also, it can be used to convert ASCII strings into pseudo-SJIS strings using a previously obtained "font.lzs". Finally, for those who frown upon this whole "letter pair" hack, it can be used to substitute ASCII letters with full-width SJIS romaji, as Tidusnake666 suggested. (Please note that the "talk.dat" conversion part required quite a bit of research. You may want to read the source regarding that aspect if you're curious.) You'll need Perl to run this script.
First and foremost, it seems the "talk.dat" from the US version can simply be used instead of the Japanese one, once its strings are converted. This is a bit surprising, since the number of conversation entry points differs between the two files. So I'll suppose that you have the US version of the game as well, and that you have extracted "talk.dat" from "start.dat" and renamed it "talk-us.dat".
Here are a few recipes. If you simply want to substitute all the ASCII letters with 2-byte SJIS equivalents (a.k.a. the Tidus method), just run:
(That's it, the generated "talk.dat" is ready for use with the Japanese version.)
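For the curious, here is roughly what the full-width substitution amounts to at the byte level. This is only a minimal sketch in Python, not the Perl script itself, and the function name `to_fullwidth_sjis` is made up for illustration. It handles letters, digits and spaces only; punctuation is left out to keep it short.

```python
def to_fullwidth_sjis(text: str) -> bytes:
    """Encode ASCII letters, digits and spaces as 2-byte Shift-JIS
    full-width characters (hypothetical helper, for illustration only)."""
    out = bytearray()
    for ch in text:
        code = ord(ch)
        if ch == " ":
            out += b"\x81\x40"  # full-width space
        elif "0" <= ch <= "9":
            # Full-width digits start at 0x824F in Shift-JIS.
            out += (0x824F + code - ord("0")).to_bytes(2, "big")
        elif "A" <= ch <= "Z":
            # Full-width uppercase letters start at 0x8260.
            out += (0x8260 + code - ord("A")).to_bytes(2, "big")
        elif "a" <= ch <= "z":
            # Full-width lowercase letters start at 0x8281.
            out += (0x8281 + code - ord("a")).to_bytes(2, "big")
        else:
            raise ValueError(f"character {ch!r} is not handled by this sketch")
    return bytes(out)
```

Every character doubles in size, which is why this method gives wide, spaced-out text compared to the letter-pair approach.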
If you want to try this "paired letters" thing, extract EXTRA_STRINGS and PRESERVED_GLYPHS from d4tool.zip as well, and prepare the "font.lzs" and "font.ffm" from the Japanese version. I'll suppose you've renamed them "font-jp.lzs" and "font-jp.ffm" for clarity's sake. Just run:
(If everything went well, you should have a new "font.lzs" and "talk.dat" in the directory you've run those commands in. Simply put those two in "start.dat" with the other files. Please note also that the PRESERVED_GLYPHS is only useful to avoid overwriting some specific kanjis, and EXTRA_STRINGS to allow some interesting letter pairs used in some ".dat" files but unused in "talk.dat". Feel free not to use them if you don't care.)
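To give an idea of the pairing principle, here is a much simplified sketch in Python. It is an assumption about the general idea, not a port of the script: strings are cut into consecutive 2-letter chunks, and every distinct chunk gets one glyph slot that is not in the preserved set. All names (`split_pairs`, `assign_glyphs`, `encode`) are hypothetical, and the slot numbers are toy values.

```python
def split_pairs(text):
    """Cut a string into consecutive 2-letter chunks,
    padding an odd-length tail with a space."""
    if len(text) % 2:
        text += " "
    return [text[i:i + 2] for i in range(0, len(text), 2)]


def assign_glyphs(strings, free_slots):
    """Give every distinct pair its own glyph slot, in first-seen order.
    free_slots should already exclude the preserved glyphs."""
    table = {}
    slots = iter(free_slots)
    for s in strings:
        for pair in split_pairs(s):
            if pair not in table:
                try:
                    table[pair] = next(slots)
                except StopIteration:
                    # More distinct pairs than glyphs left to overwrite.
                    raise RuntimeError("not enough glyphs") from None
    return table


def encode(text, table):
    """Replace each 2-letter chunk with its glyph slot."""
    return [table[p] for p in split_pairs(text)]


# Toy example: 8 glyph slots, two of them preserved.
preserved = {0, 1}
free = [g for g in range(8) if g not in preserved]
table = assign_glyphs(["Hello", "Help"], free)
print(encode("Help", table))
```

The more glyphs are preserved, the fewer slots remain for pairs, so a large preserved set combined with many distinct pairs can exhaust the font.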
Once you've generated "font.pairs" and "font.lzs", it is also interesting to substitute the ASCII strings with SJIS 2-letter-per-glyph strings in the other ".dat" files. This avoids double spacing between words, and allows the use of apostrophes, slashes, and other characters that didn't work previously. It is possible to perform this substitution automatically if the original and patched files have the same size. For instance, this can be used on "zukan.dat" (supposing "zukan-jp.dat" is the original and "zukan-translated.dat" is the version from the patch):
(The generated "zukan.dat" is ready for use with the corresponding "font.lzs".)
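The reason the same-size requirement matters is that the translated strings can then be located by a plain byte-by-byte comparison. A minimal Python sketch of that comparison, under the assumption that a "diff" is simply a list of changed byte runs (the function name `changed_runs` is made up, and the script's real diff format may differ):

```python
def changed_runs(original: bytes, patched: bytes):
    """Return (offset, patched_bytes) for each run of bytes that differs
    between two files of identical size."""
    if len(original) != len(patched):
        raise ValueError("files must have the same size")
    runs = []
    start = None
    for i, (a, b) in enumerate(zip(original, patched)):
        if a != b:
            if start is None:
                start = i  # a new differing run begins here
        elif start is not None:
            runs.append((start, patched[start:i]))
            start = None
    if start is not None:
        # The last run extends to the end of the file.
        runs.append((start, patched[start:]))
    return runs
```

Note that a translated byte that happens to equal the original byte at the same offset would split one string into two runs, so a real tool needs to be a bit smarter than this.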
The ".dat" files for which this trick can be used are: char.dat, charhelp.dat, charPersonal.dat, comb.dat, committee.dat, GE.dat, HABIT.dat, magic.dat, mitem.dat, mskill.dat, music.dat, MapEditMap.dat, MapEditShop.dat, MinistryMap.dat, nameplate.dat, pirate.dat, RelatedChart.dat, senator.dat, ShipParts.dat, THIEF.dat, Torture.dat, TortureNegotiation.dat, WISH.dat, and of course zukan.dat. Since the original and patched "name.dat" have different sizes, this does not work for them.
Warning: it is a bad idea to transcode the strings that will be stored in the save file, because they will become garbage should the font.pairs/font.lzs change (for one reason or another, such as preserving more kanjis). It is in particular the case of item names. Therefore, I recommend excluding the item names (but not the descriptions) from "mitem.diff". This can be either done by hand, or using the "-l 24" argument (since all item names have fewer than 24 bytes, and all descriptions have more than 24). In summary, "mitem.diff" should be generated as follows:
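The command line aside, the length cutoff itself is just a filter. As a rough Python illustration (not the script's code; I am assuming the diff is a list of (offset, bytes) entries and that the cutoff means "minimum length", and `drop_short_entries` is a made-up name):

```python
def drop_short_entries(entries, min_len=24):
    """Keep only (offset, data) entries whose data is at least
    min_len bytes long, so short strings such as item names are skipped."""
    return [(off, data) for off, data in entries if len(data) >= min_len]
```

With item names under 24 bytes and descriptions over 24, only the descriptions survive the filter.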
If you've used the provided PRESERVED_GLYPHS file during the letter pairing, most of the kanjis used in the original Japanese EBOOT.bin should have been left untouched. This means that the kanjis in the yet untranslated parts of the patched EBOOT.bin are untouched as well! What's more, with EXTRA_STRINGS, buggy parts of item descriptions (e.g., "HP/SP", apostrophes, percent symbols) should have been fixed, assuming you patched the corresponding files to use letter pairs.
Finally, if you're feeling adventurous, you can try:
zoidran, you're a man! You've beaten me by a couple of days! I had drafted an algorithm, but wasn't able to implement it in the end due to a lot of other work and catching a little cold. Great thing you did it!
Really, it's awesome having you around! Tremendous thanks for your help!
If you don't mind, may I incorporate your program in the all-in-one patcher (D4Trans, released earlier)? Or if you want privacy, we can just release patched files (maybe that would be even simpler).
Also, a question to everyone on the team: should we repatch the already-made files (char, mitem, mskill...)?
On the other hand, we have everything tested and running fine.
Or maybe include both versions: one with the fonts untouched, talk.dat replaced using SJIS characters, and the other files unchanged; the other with the fonts patched (glyph method), talk.dat glyph-patched, and the other files patched as well (or not).
Would like to hear more opinions.
And one more thing, what about patching SOUNDS? Should it be dropped, now that we have a proper talk.dat translation? It's out of sync and quite a large file (400 MB) to distribute.
Also, waiting for Alexmagno's update on his EBOOT bin translation progress.
EDIT: I'm getting an error using the glyph method.
The error pops up at the stage where the pairs are created:
Adding preserved glyphs... 796 glyphs added
Error: not enough glyphs!
Somehow, I just keep receiving this error about not enough glyphs... I don't know what's causing this. Can you try again? Does it work for you? Can I kindly ask you to check the MD5? If it works for you, then it means there's a problem on my side.
Zoidran: Your script is really impressive. The translation of talk.dat is perfect - and all the "problem letters" are there too :3
We should retranslate all possible *.dat files. Especially magic.dat, which has some letter problems that can't be fixed by hand. Maybe your script will fix them. Why didn't you show up earlier ...
Maybe we should skip the sound files. For the intro it works quite well, but later in the game it's just distracting. On the other hand, 400 MB is not that big. I'm sure there are some people who want the English sounds even if they're not in sync.
Then we should replace all special characters with normal ones in all files (again). I think it's possible via auto-pasting, because I used SJIS spaces (81 40) to remove double spaces. Those spaces can be used as markers.
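One caveat when hunting for the 81 40 markers: a plain byte search can hit the second byte of one character followed by an ASCII "@" (40). A small Python sketch of a safer scan that walks character by character, assuming the input is a clean SJIS string starting on a character boundary (the names `is_lead_byte` and `find_sjis_spaces` are made up for illustration):

```python
def is_lead_byte(b: int) -> bool:
    """True if b is the first byte of a 2-byte Shift-JIS character."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def find_sjis_spaces(data: bytes):
    """Return the offsets of full-width spaces (81 40), stepping over
    2-byte characters so that trail bytes are never mistaken for leads."""
    offsets = []
    i = 0
    while i < len(data):
        if is_lead_byte(data[i]) and i + 1 < len(data):
            if data[i] == 0x81 and data[i + 1] == 0x40:
                offsets.append(i)
            i += 2  # skip both bytes of the 2-byte character
        else:
            i += 1  # single-byte character (ASCII or half-width kana)
    return offsets
```

For example, the bytes 82 81 40 are the full-width "a" followed by "@", so the scan reports no marker there, while a naive search for 81 40 would report one at offset 1.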