Topic: [solved] languages special characters [use UTF8 without BOM]

Offline Pony

  • Squaddie
  • *
  • Posts: 7
    • View Profile
[solved] languages special characters [use UTF8 without BOM]
« on: October 01, 2018, 07:48:15 am »
Hello,

When I use extraStrings to add new Ufopedia strings in different languages, I run into a problem with special characters from certain languages, like é â è in French. They display correctly when they come from the vanilla fr.yml file, but when I put them in a custom .rul using extraStrings, they show up in game as ? ? ?. If I write a custom .yml myself in my mod, it doesn't work either.

Obviously, the font image files and font.dat do include them; the game just doesn't use them when I write them in any of my mod files, but it does when they're in the regular language .yml file.

My code:
Languages.rul

extraStrings:
   - type: fr
     strings:
        STR_WHATEVER_UFOPEDIA: "my text here"

I did it exactly as it is written in the Equal Terms mod: https://github.com/KingMob4313/Equal-Terms-Mods/blob/master/v1.0/data/Ruleset/EqualTerms_Weapons_v103.rul
If you have run into this issue before, thank you for your help!
« Last Edit: October 05, 2018, 02:02:42 pm by Pony »

Offline SupSuper

  • Lazy Developer
  • Administrator
  • Commander
  • *****
  • Posts: 2162
    • View Profile
Re: languages special characters
« Reply #1 on: October 01, 2018, 08:20:41 am »
Make sure your files are set to UTF-8: [screenshot: Notepad++ encoding setting]

Offline Pony

Re: languages special characters
« Reply #2 on: October 01, 2018, 10:03:37 am »
YES !

That was it, my files were in ANSI. I used regular notepad to re-save as UTF8, and it works.
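
For context: in ANSI (assumed here to be Windows-1252), é is the single byte E9, which is not valid UTF-8 on its own, and that would explain the ? in game. For anyone who prefers a script to an editor, here is a minimal Python sketch of the same re-save, written without a BOM. The function name `ansi_to_utf8` and the `cp1252` default are assumptions for illustration, not anything from OpenXcom.

```python
from pathlib import Path


def ansi_to_utf8(path, source_encoding="cp1252"):
    """Re-save a legacy single-byte ("ANSI") text file as UTF-8 without a BOM.

    Returns True if the file was rewritten, False if it was left alone.
    """
    p = Path(path)
    raw = p.read_bytes()

    # If the bytes already decode as UTF-8, converting again would
    # double-encode the accents, so leave the file untouched.
    try:
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        pass

    # Assumption: the file was saved in Windows-1252 (Western European ANSI).
    text = raw.decode(source_encoding)

    # Python's "utf-8" codec never writes a BOM (unlike "utf-8-sig").
    p.write_bytes(text.encode("utf-8"))
    return True
```

Calling `ansi_to_utf8("Languages.rul")` on a copy of the file first is the safer way to try it, since the file is overwritten in place.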

Thanks for your help, and sorry to have bothered you with something so simple that I couldn't figure out by myself.

By the way, thanks for all the work you've done (with others) to remake this awesome game.  I used to play ufo:eu when I was a kid, and replayed it from time to time for 20 years. This summer, I reinstalled it to play it with my 12 yo nephew... he loved it and now I'm hooked again, and with OpenXcom I can play without bugs and mod the shit out of it.

[edit]
« Last Edit: October 05, 2018, 02:03:20 pm by Pony »

Offline Solarius Scorch

  • Global Moderator
  • Commander
  • *****
  • Posts: 11732
  • WE MUST DISSENT
    • View Profile
    • Nocturmal Productions modding studio website
Re: [solved] languages special characters [use UTF8]
« Reply #3 on: October 03, 2018, 01:51:11 pm »
Nitpicking. Saving UTF8 with BOM (what Windows Notepad does, AFAIK) works fine too.

Maybe it works on the user's end, but if you for example try to upload such a file to Transifex, it won't work. It may also wreak havoc on some OSes.

Offline tkzv

  • Commander
  • *****
  • Posts: 583
    • View Profile
Re: [solved] languages special characters [use UTF8]
« Reply #4 on: October 03, 2018, 10:23:12 pm »

Maybe it was the other way around? ???
The point is, one variant works and the other doesn't.
Rechecked. BOM is present in the majority of Ruleset/*.rul files, which don't have any non-ASCII characters, but absent from Language/*.yml files, which do have Unicode texts.
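
A check like this can be repeated with a short Python sketch that lists which files under a folder start with the UTF-8 BOM bytes (EF BB BF). The function name `files_with_bom` and the default patterns are assumptions chosen for illustration.

```python
import codecs
from pathlib import Path


def files_with_bom(folder, patterns=("*.rul", "*.yml")):
    """Return the files under `folder` whose first three bytes are the UTF-8 BOM."""
    found = []
    for pattern in patterns:
        for p in sorted(Path(folder).rglob(pattern)):
            if not p.is_file():
                continue
            # Only the first three bytes matter, so avoid reading whole files.
            with p.open("rb") as fh:
                if fh.read(3) == codecs.BOM_UTF8:
                    found.append(p)
    return found
```

Pointing it at a mod folder, e.g. `files_with_bom("my_mod")` (a hypothetical path), gives the list of files that would need the BOM removed.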

Removed incorrect statements.
« Last Edit: October 03, 2018, 10:24:49 pm by tkzv »

Offline Solarius Scorch

Re: [solved] languages special characters [use UTF8]
« Reply #5 on: October 03, 2018, 10:26:35 pm »
Removed incorrect statements.

I removed mine too, because it looked weird.
...okay, who am I kidding. Now it was a double post, and I won't suffer this. 8)

Offline Stoddard

  • Colonel
  • ****
  • Posts: 485
  • in a fey mood
    • View Profile
    • Linux builds & stuff
Re: [solved] languages special characters [use UTF8]
« Reply #6 on: October 04, 2018, 01:28:15 am »
People, let's just forget that BOM and UTF-16/32 ever existed and live in peace for centuries thereafter.

Offline Pony

Re: [solved] languages special characters [use UTF8]
« Reply #7 on: October 04, 2018, 05:41:47 pm »
Hi !

Yeah, sorry, I had never heard of BOM before seeing it in SupSuper's screenshot (then I Googled it to get a basic understanding of what it was). I was able to replicate what he showed me, but that just configures how the file is displayed in Notepad++. Then I thought of using regular Notepad to re-save the file while changing the setting from ANSI to UTF8, because I knew it could do that from something I did years ago. But there was no no-BOM option; that's why I mentioned it in my post, thinking "someone will say something if it's really a problem", as it seemed to work anyway with or without BOM. With or without BOM, I can't live with or without BOM.

You react days later, but your posts are so edited, I have no idea... is it fine or not? How can I transform from UTF8 into UTF8 without BOM,  if it's a problem ?
« Last Edit: October 05, 2018, 02:04:23 pm by Pony »

Offline Stoddard

Re: [solved] languages special characters [use UTF8]
« Reply #8 on: October 04, 2018, 06:28:30 pm »
You react days later, but your posts are so edited, I have no idea... is it fine or not? How can I transform from UTF8 into UTF8 without BOM,  if it's a problem ?

BOM is not fine because it's useless, doesn't work with Transifex, and only creates confusion elsewhere.

Rulesets, translations and everything else text-based should be in UTF-8 without BOM.

As for how to get rid of it, that depends on the editor in question. For Notepad++, I think the setting SupSuper showed would work: set it, edit the file a bit (e.g. add and delete a space), and save it.
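
If no editor is at hand, the BOM can also be dropped with a few lines of Python, since it is just the three bytes EF BB BF at the very start of the file. This is a minimal sketch; `strip_utf8_bom` is a made-up name, and the file is rewritten in place, so try it on a copy first.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM (EF BB BF) if present.

    Returns True if the file was changed, False if it had no BOM.
    """
    p = Path(path)
    data = p.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    # Keep everything after the three BOM bytes, byte for byte.
    p.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Working on raw bytes means line endings and the rest of the content stay exactly as they were.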

I use SciTE; it's small, fast, and it syntax-highlights and folds YAML nicely.

Offline Pony

Re: [solved] languages special characters [use UTF8]
« Reply #9 on: October 05, 2018, 02:01:48 pm »
OK, I ran a few tests. I'm still not 100% sure, but I'll put the results here in case they help someone in the future:

*Switching from ANSI to UTF8 without BOM in Notepad++ (edit and save the file) doesn't seem to work: the file is back to ANSI the next time I load it in Notepad++.
*Opening the file in regular Notepad and saving it as UTF8 does convert it from ANSI to UTF8, but adds a BOM.
*Opening a UTF8-with-BOM file in Notepad++, switching to UTF8 without BOM, then editing and saving does seem to definitely convert the file to UTF8 without BOM (checked with a hex editor).
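
The hex-editor check can be approximated with a small Python sketch. It also hints at one possible explanation for the first result: a file that contains only ASCII characters has exactly the same bytes in ANSI and in UTF-8 without BOM, so an editor has nothing to tell them apart by. The function name `describe_encoding` and the wording of the labels are assumptions for illustration.

```python
import codecs


def describe_encoding(data: bytes) -> str:
    """Classify raw file bytes the way a quick look in a hex editor would."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 with BOM"
    if data.isascii():
        # No byte above 0x7F: ANSI and BOM-less UTF-8 are indistinguishable.
        return "pure ASCII (same bytes in ANSI and UTF-8)"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Accented letters stored as single bytes, e.g. E9 for é in Windows-1252.
        return "not UTF-8 (probably ANSI)"
    return "UTF-8 without BOM"
```

To use it on a file, read the file in binary mode and pass the bytes in, for example `describe_encoding(open("Languages.rul", "rb").read())`.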


BOM BOM BOOOOOM
BOM BOM BOOOOOOOOOOM
Bom-bom-bom Bom-bom-bom Bom-bom-bom