Rapidyaml doesn't use list<node>. Internally it already uses vector<NodeData> (well, a raw NodeData* buffer, same thing; NodeData is a 72-byte struct), and a "NodeId" is just an index into that array.
The cache locality of this array is already great, and it would be very difficult to improve. You can try, but I suspect constructing your vector would cost more time than the theoretically improved cache locality would gain back. Sorting... that's worse than hashing, isn't it? That's why std::map practically never beats std::unordered_map for lookups.
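To make that concrete, here's a deliberately simplified sketch of index-based node storage. The type and field names are illustrative only, not rapidyaml's actual definitions (the real NodeData stores string views into the source buffer and has more fields):

```cpp
// Minimal sketch (NOT rapidyaml's actual layout): all nodes live in one contiguous
// buffer, a "NodeId" is just an index into it, and walking children follows sibling
// indices instead of chasing heap pointers -- hence the good cache locality.
#include <cstddef>
#include <string>
#include <vector>

using NodeId = std::size_t;
constexpr NodeId NONE = static_cast<NodeId>(-1);

struct NodeData                 // hypothetical; the real struct is ~72 bytes with more fields
{
    std::string key;            // illustrative; rapidyaml keeps views into the parsed buffer
    std::string val;
    NodeId first_child = NONE;
    NodeId next_sibling = NONE;
};

struct Tree
{
    std::vector<NodeData> nodes;    // rapidyaml uses a raw allocated buffer, same idea

    // Linear scan over a node's children -- the "unindexed" search case below.
    NodeId find_child(NodeId parent, const std::string& key) const
    {
        for (NodeId c = nodes[parent].first_child; c != NONE; c = nodes[c].next_sibling)
            if (nodes[c].key == key)
                return c;
        return NONE;
    }
};
```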
I can give you some current timing data. Let's say game startup (debug build with XPZ; Options::mute = true) takes 5000ms.
Of these 5000ms, operations related to rapidyaml take:
- 1300ms to parse raw yaml into node trees
- 333ms searching for node keys
- 263ms deserializing node values
- 87+35ms building+destroying unordered_map node key indexes
- 32+13ms building+destroying YamlNodeReader.children() vectors
- 20ms building+destroying YamlNodeReader objects
- other negligible stuff
Of the 333ms of node key searching, there are:
- 220ms (10ms in release build) of indexed searching (see the sketch below)
- 72ms (4ms in release build) of unindexed searching (with the stored child iterator optimization)
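For reference, this is roughly what I mean by "indexed" searching, reusing the toy Tree/NodeId types from the sketch above. The actual YamlNodeReader index is structured differently; this only illustrates the idea of paying an unordered_map build cost once to get O(1) repeated lookups:

```cpp
// Hypothetical sketch of the indexed search path: build an unordered_map over a node's
// children once, then answer repeated key lookups without rescanning the child list.
// Illustration only -- not the actual YamlNodeReader implementation.
#include <string>
#include <unordered_map>

struct ChildIndex
{
    std::unordered_map<std::string, NodeId> byKey;

    ChildIndex(const Tree& tree, NodeId parent)
    {
        // One pass over the children: this is the "building unordered_map node key indexes" cost.
        for (NodeId c = tree.nodes[parent].first_child; c != NONE; c = tree.nodes[c].next_sibling)
            byKey.emplace(tree.nodes[c].key, c);
    }

    NodeId find(const std::string& key) const
    {
        auto it = byKey.find(key);
        return it == byKey.end() ? NONE : it->second;
    }
};
```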
What about game loading? In the release build, loading a 12MB sav file, all the unindexed searching together takes <2ms.
If your proposed optimization targets unindexed key searching, well, there's very little left to gain there. The majority of search calls (3 to 1) are already indexed, at least during game startup. For game loading, the majority is unindexed, but that's mostly (80%) inside BattleUnitKills::load(), where the stored child iterator optimization (sketched below) already makes it effectively O(1).
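The stored child iterator trick, again as a rough sketch on top of the same toy Tree (the real reader code is organized differently and these names are made up): when keys are read in roughly document order, remember where the last search stopped and resume scanning from there, so each sequential lookup costs O(1) amortized.

```cpp
// Hypothetical sketch of the "stored child iterator" optimization used during loading.
struct SequentialChildFinder
{
    const Tree& tree;
    NodeId parent;
    NodeId cursor;   // child to try first on the next lookup

    SequentialChildFinder(const Tree& t, NodeId p)
        : tree(t), parent(p), cursor(t.nodes[p].first_child) {}

    NodeId find(const std::string& key)
    {
        NodeId start = cursor;
        // Scan forward from the cursor; if keys arrive in document order this hits immediately.
        for (NodeId c = start; c != NONE; c = tree.nodes[c].next_sibling)
            if (tree.nodes[c].key == key) { cursor = tree.nodes[c].next_sibling; return c; }
        // Wrap around once so out-of-order keys are still found.
        for (NodeId c = tree.nodes[parent].first_child; c != start; c = tree.nodes[c].next_sibling)
            if (tree.nodes[c].key == key) { cursor = tree.nodes[c].next_sibling; return c; }
        return NONE;
    }
};
```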
Btw, which places did you find where there's too much iteration?