So what I did was: I ran OXCE under WinDbg, broke at a certain point in the Mod.cpp code (before the AfterLoad calls, to avoid too much recursion), and then used WinDbg's command line to dump the Mod object. I did a 7-level recursive traversal (with Natvis, because the internal data structures of std::map and std::vector are otherwise too deep to dump) and wrote the output to a file, which came out as a 1.8 GB, 18-million-line text file. I did this with both yaml-cpp and rapidyaml, which took a couple of hours each. I then edited the output files to censor pointer addresses and structure offsets, and compared the results. I found the following differences:
_mod->_scriptGlobal->_refList[xx].value.data
_Val and _Pad are different. Are these just pointers that randomly change on each startup?
_mod->_scriptGlobal->_events[xx][yy]._proc
Some bytes change from one run to the other. Is this because the compiled scripts contain pointers, and these changed bytes are just pointers that always change on each startup?
_mod->_globe._textures[xx].second._terrain[xx]
Some values (of type double) are different... well, this is complicated.
As we know, our rules allow mods to add on top of one another. Two mods can define the same rule, where properties of the 2nd definition add to, or overwrite properties of the 1st definition.
rapidyaml, coincidentally, happens to match this behavior. If we deserialize a YAML sequence into an existing std::vector, each sequence element is deserialized into the existing vector element at the same position, updating its fields.
yaml-cpp, however, clears the std::vector before deserializing into it.
So this difference (probably a bug) is caused by:
- Dio not deleting existing globe textures before updating them with his own.
- rapidyaml deserializing in place into the existing std::vector.
I'm not sure which behavior is correct. For now I've disabled rapidyaml's built-in support for std::map and std::vector and copied the implementation into our Yaml.h, where I modified it to clear the vector or map before deserializing into it. This replicates yaml-cpp's behavior, and while it is now backwards-compatible, one could argue that the behavior is incorrect.
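To make the difference concrete, here is a minimal sketch of the two strategies that uses neither library. FakeNode, FakeTerrain, loadVectorInPlace and loadVectorCleared are all hypothetical names invented for this illustration; they are not the actual code in Yaml.h or in rapidyaml, and the resize-then-load loop is only my assumption about how an in-place reader behaves.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one parsed YAML mapping: key -> numeric value.
using FakeNode = std::map<std::string, double>;

// Hypothetical rule element with two fields, both defaulting to zero.
struct FakeTerrain
{
    double weight = 0.0;
    double lonMin = 0.0;

    // Only keys present in the node overwrite the current value.
    void load(const FakeNode& node)
    {
        auto w = node.find("weight");
        if (w != node.end()) weight = w->second;
        auto l = node.find("lonMin");
        if (l != node.end()) lonMin = l->second;
    }
};

// In-place strategy (assumed): element i of the sequence is loaded into the
// existing element i, so fields missing from the node keep their old values.
void loadVectorInPlace(std::vector<FakeTerrain>& out, const std::vector<FakeNode>& seq)
{
    out.resize(seq.size());
    for (std::size_t i = 0; i < seq.size(); ++i)
        out[i].load(seq[i]);
}

// Clear-first strategy: old elements are discarded, so every element
// starts from its defaults before the node is applied.
void loadVectorCleared(std::vector<FakeTerrain>& out, const std::vector<FakeNode>& seq)
{
    out.clear();
    loadVectorInPlace(out, seq);
}
```

If an earlier mod left an element with weight 10 and lonMin 45, and a later mod supplies only weight 20, the in-place version ends up with lonMin 45 while the clear-first version ends up with lonMin 0. That is the kind of leftover value that would show up as a differing double in the dump.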
_mod->_transparencies[xx].second[yy]
This one was my fault. In the old Mod.cpp, line 2543, the for loop was incrementing two counters at the same time and I missed the 2nd one. Fixed, tested, and now it works.
_mod->_invs[xx].second._costs[xx].second
A similar problem to the one with std::vector, except this time it's a std::map. But it's worse. If rapidyaml deserializes into an existing std::map that isn't empty, deserialization of the elements with matching keys quietly fails. The previous solution of clearing the map before deserializing into it solved the problem. I've also reported the inconsistency between vector and map to the library author.
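One way such a quiet failure can happen, sketched here with the standard library only: std::map::emplace leaves an existing key untouched and reports nothing unless the return value is checked. Whether this is the actual mechanism inside rapidyaml is my assumption, and Entries, loadMapInsertOnly and loadMapCleared are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the key/value pairs parsed from one YAML mapping.
using Entries = std::vector<std::pair<std::string, int>>;

// Insert-only load: std::map::emplace does nothing when the key already
// exists, so values for matching keys silently keep the earlier definition.
void loadMapInsertOnly(std::map<std::string, int>& out, const Entries& entries)
{
    for (const auto& e : entries)
        out.emplace(e.first, e.second);
}

// Clear-first load: the map is emptied, so every parsed pair is inserted
// and the result contains exactly what the latest definition provides.
void loadMapCleared(std::map<std::string, int>& out, const Entries& entries)
{
    out.clear();
    loadMapInsertOnly(out, entries);
}
```

With an existing entry STR_HAND = 4 and a new definition of STR_HAND = 8, the insert-only version keeps 4 while the clear-first version stores 8. Note that clearing also drops any keys from the earlier definition that the new one does not repeat, which is the trade-off behind the "arguably incorrect" remark above.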
_mod->_armors[xx].second._moveCostFlyWalk.TimePercent
This one was also my fault. In ArmorMoveCost::load() I had put in UInt8 as the deserialization type. I don't know how that happened. I changed it to int and the problem is fixed.
YAML::Node changed to YAML::YamlString
As expected, this causes some changes in the memory layout.
There was also another problem related to deserializing into a map, but I fixed it in the custom map deserialization method.
I pushed the fixes to the remote repo.