aliens

Author Topic: Converting OXCE to rapidyaml  (Read 3099 times)

Offline Yankes

  • Moderator
  • Commander
  • ***
  • Posts: 3331
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #15 on: October 24, 2024, 04:06:50 pm »
Close, I mean dummy struct like:

Code: [Select]
strcut YamlString
{
    std::string yaml;
};

Because `typedef` help with readability but not prevent typos, like you try send `_spawnedSoldier` to `getSolderType(_spawnedSoldier)` and it will compile fine.

You can add some helpers to this `YamlString` to easier get data from it like:
Code: [Select]
YAML::YamlNodeReader reader(_spawnedSoldier, true); //create `ryml::ConstNodeRef` on the fly and other boilerplate to make `YamlNodeReader` wrok

Offline Delian

  • Colonel
  • ****
  • Posts: 470
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #16 on: October 24, 2024, 09:37:32 pm »
Ok.

Next thing I'm wondering about is... should all the save and load functions pass around readers/writers instead of nodes, e.g.:

instead of
Code: [Select]
void BattleUnit::load(const ryml::ConstNodeRef& node, const Mod *mod, const ScriptGlobal *shared)
{
YAML::YamlNodeReader reader(node);
reader.useIndex(true);
reader.tryRead("id", _id);
_statistics->load(reader.getChild("tempUnitStatistics"));
...

void BattleUnit::save(ryml::NodeRef node, const ScriptGlobal *shared) const
{
node |= ryml::MAP;
YAML::YamlNodeWriter writer(node);
writer.write("id", _id);
_statistics->save(writer.writeKey("tempUnitStatistics"));
...

is it better to use
Code: [Select]
void BattleUnit::load(const YAML::YamlNodeReader reader, const Mod *mod, const ScriptGlobal *shared)
{
reader.tryRead("id", _id);
_statistics->load(reader.getChild("tempUnitStatistics"))
...

void BattleUnit::save(YAML::YamlNodeWriter writer, const ScriptGlobal *shared) const
{
writer |= ryml::MAP;
writer.write("id", _id);
_statistics->save(writer.writeKey("tempUnitStatistics"));
...

?

Offline Yankes

  • Moderator
  • Commander
  • ***
  • Posts: 3331
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #17 on: October 25, 2024, 12:41:54 am »
Second is better in my opinion, less boilerplate and look more similar to old code.

btw you do not use `&`? what copy sematic you assign to this new types?

Offline Delian

  • Colonel
  • ****
  • Posts: 470
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #18 on: October 25, 2024, 01:28:04 am »
I only started learning C++ two weeks ago, I don't know much about all the const and & stuff :(
fucking... [&&] const &function(const int* const& &name) const!

getChild finds a node with a key and wraps it into a new YamlNodeReader instance and returns it
writeKey creates a new child node with a key and wraps it into a new YamlNodeWriter instance and returns it

I don't really know what you mean with "copy semantics".
« Last Edit: October 25, 2024, 01:33:14 am by Delian »

Offline Yankes

  • Moderator
  • Commander
  • ***
  • Posts: 3331
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #19 on: October 25, 2024, 02:07:18 am »
In our case is do this class behave as `int*` or like `std::vector<int>`, both can refer to list of integers but copy work different.
Code: [Select]
int* p = new int[]{1, 2, 3};
int* b = p;
b[0] = 42;
assert(p[0] == 42); // both change same memory localization

where for `std::vector`:

Code: [Select]
std::vector<int> p{1,2,3};
std::vector<int> b = p;
b[0] = 42;
assert(p[0] == 1); // each one have unique memory

As rapid do try avoid copy memory is possible that you have first case, know as "shallow copy" and refer as "not owing memory",
where vector have "deep copy" and "own memory".
Then if you have type with "deep copy" to avoid coping lot of memory every time you try send it to function you
use reference `&` to send object effective as pointer but with better syntax (`.` instead of `->` and `*`) and not-null guarantee.
In most cases this is used as `const T& arg` and this mean "I only want quick look on this object and I promise I do not change anything".

Offline Delian

  • Colonel
  • ****
  • Posts: 470
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #20 on: October 25, 2024, 02:26:15 am »
Ok, I'll use & as much as possible then.
In some cases I was having issues, for instance
Code: [Select]
_blabla.save(node.append_child("key"));

If the save function uses &node, then this would give me some lvalue error or whatever. But if I remove the &, then there's no error.
alternatively, I could use
Code: [Select]
Node child = node.append_child("key");
_blabla.save(child);
But this is more annoying to code with, because of having to create a variable every time before calling the save/load function.
This was before, when I was passing around ConstNodeRef. With YamlNodeReader there doesn't seem to be an issue with using the 1st option.
« Last Edit: October 25, 2024, 02:29:07 am by Delian »

Offline Yankes

  • Moderator
  • Commander
  • ***
  • Posts: 3331
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #21 on: October 25, 2024, 11:13:13 am »
I not asking you to add `&`, in some cases passing as value (without any `&`) is preferred, especially if type do not allocate memory or need to share same object in multiple places.
Have you this code available in github? probably better would be look on code itself as sometimes small things could affect answer to questions like this.

Offline Delian

  • Colonel
  • ****
  • Posts: 470
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #22 on: October 27, 2024, 05:00:16 pm »
Been working on it for the past two days, but until it's all done, the main project is not gonna compile.

I've whipped up a separate test project, which does compile. You can check it out and tell me all the things I'm doing wrong lol

Today I wasted at least 2 hours trying to figure out why I was getting linker errors. Because I put the template function implementations in the .cpp file...

Offline Delian

  • Colonel
  • ****
  • Posts: 470
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #23 on: October 27, 2024, 06:35:51 pm »
I found out that there are a couple of "bugs" in rapidyaml, but I think they're fixable.

First is the fact that rapidyaml author didn't have the time to add any output formatting options for yaml. By default, rapidyaml will output flow style nodes delimited with comma instead of comma+space ([1,2,3] vs [1, 2, 3]), like we currently do. So, I fixed this by editing rapidyaml.hpp single header file...
In Emitter<Writer>::_do_visit_flow_sl(), line 26107, I changed:
Code: [Select]
            this->Writer::_do_write(',');

to
Code: [Select]
            this->Writer::_do_write(", ");


I'm not sure if it's better for us to include rapidyaml as a single header file, or as separate files but, since the library author is afk, modifying those files is basically the only option I think. Hmm. I suppose it would also be possible to do a partial yaml emit of a flow style node and use string replace, but that might cause other bugs, plus it wouldn't be efficient (because we use flow style a lot).

The second bug I already reported at https://github.com/biojppm/rapidyaml/issues/475
That one is a bit more critical, because it affects some mods (for instance XPZ), and causes parsing failure. So either I fix it in the library, or the mod authors have to run a regex [ ]+$ through their .rul files to remove the offending trailing whitespace. Since there's technically nothing wrong with trailing whitespace, I think the fixing it in the library is the correct choice.

Edit: I fixed the 2nd bug in the library as well. There were 5 places where the parser forgot to skip whitespace.
« Last Edit: October 27, 2024, 10:16:12 pm by Delian »

Offline Delian

  • Colonel
  • ****
  • Posts: 470
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #24 on: October 30, 2024, 02:17:37 am »
Some random stuff I encountered today...
- I noticed that with proper buffer management (by setting large enough initial buffers), I can easily get another 5% parsing speed increase.

- Converting base64 binTiles encoding to rapidyaml was a chore...

- I came across some funny code in SaveGame::load()
Code: [Select]
// Finish loading crafts after bases (more specifically after all crafts) are loaded, because of references between crafts (i.e. friendly escorts)
{
for (YAML::const_iterator i = doc["bases"].begin(); i != doc["bases"].end(); ++i)
{
// Bases don't have IDs and names are not unique, so need to consider lon/lat too
double lon = (*i)["lon"].as<double>(0.0);
double lat = (*i)["lat"].as<double>(0.0);
std::string baseName = "";
if (const YAML::Node &name = (*i)["name"])
{
baseName = name.as<std::string>();
}

Base *base = 0;
for (auto* xbase : _bases)
{
if (AreSame(lon, xbase->getLongitude()) && AreSame(lat, xbase->getLatitude()) && xbase->getName() == baseName)
{
base = xbase;
break;
}
}
if (base)
{
base->finishLoading(*i, this);
}
}
}

I changed it into
Code: [Select]
for (int i = 0; i < _bases.size(); ++i)
_bases[i]->finishLoading(reader["bases"][i], this);
it was using some convoluted way to compare bases, but since the _bases vector has the same order as the bases in the save file, it's much simpler to use this order to match them.

Offline Meridian

  • Global Moderator
  • Commander
  • ***
  • Posts: 9061
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #25 on: October 30, 2024, 10:59:40 am »
I found out that there are a couple of "bugs" in rapidyaml, but I think they're fixable.

First is the fact that rapidyaml author didn't have the time to add any output formatting options for yaml. By default, rapidyaml will output flow style nodes delimited with comma instead of comma+space ([1,2,3] vs [1, 2, 3]), like we currently do. So, I fixed this by editing rapidyaml.hpp single header file...
In Emitter<Writer>::_do_visit_flow_sl(), line 26107, I changed:
Code: [Select]
            this->Writer::_do_write(',');

to
Code: [Select]
            this->Writer::_do_write(", ");


I'm not sure if it's better for us to include rapidyaml as a single header file, or as separate files but, since the library author is afk, modifying those files is basically the only option I think. Hmm. I suppose it would also be possible to do a partial yaml emit of a flow style node and use string replace, but that might cause other bugs, plus it wouldn't be efficient (because we use flow style a lot).

The second bug I already reported at https://github.com/biojppm/rapidyaml/issues/475
That one is a bit more critical, because it affects some mods (for instance XPZ), and causes parsing failure. So either I fix it in the library, or the mod authors have to run a regex [ ]+$ through their .rul files to remove the offending trailing whitespace. Since there's technically nothing wrong with trailing whitespace, I think the fixing it in the library is the correct choice.

Edit: I fixed the 2nd bug in the library as well. There were 5 places where the parser forgot to skip whitespace.

I don't want any custom bugfixes.... unless ABSOLUTELY necessary.

1/ Output formatting I don't care about... if I understand correctly it is just a cosmetic thing, not functional, right?

2/ Trailing whitespaces... on the frontpage he writes something about known limitations (https://github.com/biojppm/rapidyaml?tab=readme-ov-file#known-limitations), there are trailing tabs mentioned there... don't know if it's the same or not... but if yes, then I'd rather ask modders to remove trailing whitespaces instead of losing 10% performance

3/ Regarding including rapidyaml, I'll need to discuss with Yankes and SupSuper, but my personal preference would be:
a/ "as is" (just copypaste source snapshot one to one into our repo)
b/ as single hpp file
c/ as static library
d/ as dynamic library -- this I really don't like, after all we've been through over the years with yamlcpp

Offline Delian

  • Colonel
  • ****
  • Posts: 470
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #26 on: October 30, 2024, 01:21:44 pm »
1. Yes, it's just cosmetic. I'll leave that change out then.

2. The known limitations are about tabs, not spaces. The parser crashes with extra " " spaces at the end of flow-style lines. The standard says that those should simply be ignored, so it's either a bug, or a limitation.
I've tested the performance difference with and without a custom fix with the extra whitespace checking. It's <0.5% performance difference, so that shouldn't be a consideration. But I agree that, custom bugfixes are a nasty solution because they make updating libraries a nightmare. Going with the "tell mod authors to remove trailing spaces" solution, I'll have to make sure that when the game encounters such a parser error, that it's correctly communicated to the user.

3. Ok, I'll try to produce a test project that includes a 1:1 copy of rapidyaml/src/ instead of the single header file.

Anyway, another formatting difference is, sequences don't start new lines:
Before:
Code: [Select]
moduleMap:
  -
    -
      - 1
      - 1
    -
      - 2
      - 2
    -
      - 3
      - 3
  -
    -
      - 1
      - 1
    -
      - 2
      - 2
    -
      - 3
      - 3
  -
    -
      - 1
      - 1
    -
      - 2
      - 2
    -
      - 3
      - 3
After:
Code: [Select]
moduleMap:
  - - - 1
      - 1
    - - 2
      - 2
    - - 3
      - 3
  - - - 1
      - 1
    - - 2
      - 2
    - - 3
      - 3
  - - - 1
      - 1
    - - 2
      - 2
    - - 3
      - 3

Lastly, the yaml standard says that certain non-printable characters should be escaped, as in, ascii code 00 should produce "\x00" in the yaml. It also states that it's not mandatory. Unlike yaml-cpp, rapidyaml does not respect this suggestion hehe. It simply outputs a control code 00 character raw. Subsequent parsing of these non-escaped characters works just fine however. It even works fine in yaml-cpp, so it's perfectly backwards compatible. As long as a string is in double quotes, everything goes.
« Last Edit: October 30, 2024, 07:51:07 pm by Delian »

Offline Yankes

  • Moderator
  • Commander
  • ***
  • Posts: 3331
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #27 on: October 31, 2024, 01:15:07 pm »
I don't want any custom bugfixes.... unless ABSOLUTELY necessary.

1/ Output formatting I don't care about... if I understand correctly it is just a cosmetic thing, not functional, right?

2/ Trailing whitespaces... on the frontpage he writes something about known limitations (https://github.com/biojppm/rapidyaml?tab=readme-ov-file#known-limitations), there are trailing tabs mentioned there... don't know if it's the same or not... but if yes, then I'd rather ask modders to remove trailing whitespaces instead of losing 10% performance

3/ Regarding including rapidyaml, I'll need to discuss with Yankes and SupSuper, but my personal preference would be:
a/ "as is" (just copypaste source snapshot one to one into our repo)
b/ as single hpp file
c/ as static library
d/ as dynamic library -- this I really don't like, after all we've been through over the years with yamlcpp
e) use submodule, only drawback that after `git clone` will require additional operations to fetch all submodules.
f) fast google show that cmake can use `ExternalProject_Add` that allow grab some git repo

both have one benefits that we can point to fork at first, but when base lib get updated we can redirect both to official version.

For speed, with rapid we have lot of "free" speed budget grained after ditching yaml-cpp, this means we can spend it on making code more compatible with old version and still have big grain in speed after that.

Offline Delian

  • Colonel
  • ****
  • Posts: 470
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #28 on: October 31, 2024, 08:40:14 pm »
Here's another simple test project for the current iteration of our wrapper that also includes a performance test.

I've removed the single header file and included src files from the rapidyaml. Note that rapidyaml has git folder references to other projects, so I had to get src files from those projects as well. Basically, get src files from:
rapidyaml/src/
rapidyaml/ext/ -> c4core/src/c4/
rapidyaml/ext/ -> c4code/src/c4/ext/ -> debugbreak/debugbreak.h

I had to add all of them to the project file, otherwise it wouldn't compile... I wonder if there's a way to build VS projects with external cpp files.

e) use submodule

rapidyaml author says to use:
Code: [Select]
git clone --recursive https://github.com/biojppm/rapidyaml

In the README he includes all the instructions about how to use it as a submodule as well. Ways that use CMake and stuff which I have no clue about lol

Btw, compiling with src files instead of a single header file takes waaay longer.
« Last Edit: October 31, 2024, 08:43:34 pm by Delian »

Offline Delian

  • Colonel
  • ****
  • Posts: 470
    • View Profile
Re: Converting OXCE to rapidyaml
« Reply #29 on: November 01, 2024, 12:47:26 pm »
I'm done converting everything in /src/Savegame/. Now for the rest of the folders...

Btw, I think I found a small bug. When we're saving soldiers to yaml, if they're on a craft, we store the "craft unique id", and when we're loading soldiers from yaml, we set the soldier's _craft to the one that's in the base with equal "craft unique id". The problem is that we are storing a "target id" instead of "craft unique id". This is because Craft is missing an override of Target's virtual saveId() function (for instance, Ufo provides the saveId() override, but Craft does not). The consequence of this bug is that two redundant nodes are stored hehe.
Code: [Select]
        rank: 1
        craft:
          lon: 1.803321073590076
          lat: -0.46181407437014516
          type: STR_PIRANHA_AIRBIKE
          id: 1
        gender: 1
versus
Code: [Select]
        rank: 1
        craft:
          type: STR_PIRANHA_AIRBIKE
          id: 1
        gender: 1
The lon and lat are unnecessary; they're not read when loading a save file. Should I add a saveId() override to Craft to fix this?