
Topics - hmaon

Open Feedback / screenshot of OXC in a news feed on Steam
« on: May 09, 2013, 02:36:08 am »
at the bottom:


Programming / alien AI work, now with test builds
« on: March 02, 2013, 11:23:19 am »
Latest un-merged build you can try if you're bold: n/a, everything's merged

So, I've been working on some AI changes. It'll probably be a while before the giant pull request is merged, especially since it keeps getting amended, but if anyone wants to help test the changes, I'm making builds available.


Never mind, it's merged! Just use the git builds.

You might notice aliens trying to hide after shooting you. When they're hidden, they should also reliably face toward where they expect XCom to appear. Then there's a bunch of stuff that might not be noticeable, and melee units should also be properly vicious. Maybe. Unless they're broken again.

Attached is a screenshot of some sectoids trying to set up an ambush.

Programming / HQX and scale2x filters edit: Now with OpenGL shaders
« on: February 09, 2013, 03:53:24 am »
After getting all the FPS I could for now out of the game, I've taken steps to make it slower instead. That is, HQX and Scale2x filters are now an option with this code:

The results are a little weird but kind of nice at times. Screenshots attached!

edit: OpenGL stuff in a reply below somewhere

Work In Progress / Capture aliens and brainwash them to join the XCom cause
« on: February 05, 2013, 07:32:37 am »
This code is kind of in its infancy. Also, yes, I know it's a stupid idea. I like stupid ideas.

Here's the branch:

Right now you just have to have an alien mind-controlled at the end of the mission and he's yours. Future plans include some kind of special brainwashing facility. Then perhaps a cloning facility. After that, chryssalids break out and try to overrun your base, I guess.

TODO: Perhaps don't let the player take off the alien's skin and replace it with XCom armor via the armor menu? Perhaps mutons can still wear armor, though? What's the point of having aliens on your side? Maybe there should be some new special mission requiring an alien infiltrator? I dunno, it's kinda dumb, like I said.

This branch is also the testing ground for on-demand loading of .SPK files. I'm using it to load inventory sprites for aliens. (Also works if they're just regularly mind controlled!) I got the sprites from another mod, the inventory screens here: So, thanks for that!

A couple of screenshots are attached.

Programming / Profiling OXC (and optimizing some code a little)
« on: February 05, 2013, 07:21:03 am »
Hellope. I did some profiling on OpenXcom and I've been working on trying to speed up some critical sections.

I used callgrind (a valgrind tool) to do the profiling. It seems the easiest approach even though the actual game runs excruciatingly slowly under valgrind's emulation. Expect single-digit FPS. Look at kcachegrind's pretty output, though:

This is a human turn plus an AI turn of a base assault:
The method individually using the most CPU in the battlescape is obviously the shader. Then, curiously there's SavedBattleGame::getTile() and then _zoomSurfaceY().

I bet there's something that could be done to speed up the shader code but I actually don't understand it yet. I moved on to the other functions.

getTile() seemed to cry out to be inlined so that's what I did. I then inlined getTileIndex() along with it so the whole procedure can avoid a call. As you can see, getTile() gets called a lot. In this run, it was called over 68 million times. _zoomSurfaceY() gets called once per frame, I think; that makes 68513755 / 4031.0 = 16996.7 getTile() calls per frame on average. It's hard to say whether that's actually a lot; it's 1/4 of the pixels of a 320x200 window, though? It's not quite 5% of the CPU load. Then again, 5% CPU time in a single getter function, really?
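For readers who haven't seen the change, here is a minimal sketch of what inlining the two getters can look like. It is not the code from the branch: `SavedBattleGameSketch`, `Position`, `Tile` and the row-major index formula are stand-ins assumed for illustration, and the real OpenXcom classes carry far more state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the real OpenXcom types.
struct Position { int x, y, z; };
struct Tile { int id = 0; };

class SavedBattleGameSketch
{
    int _width, _length, _height;
    std::vector<Tile> _tiles;
public:
    SavedBattleGameSketch(int width, int length, int height)
        : _width(width), _length(length), _height(height),
          _tiles(static_cast<std::size_t>(width) * length * height)
    {
        // Tag each tile with its own index so lookups are easy to check.
        for (std::size_t i = 0; i < _tiles.size(); ++i)
            _tiles[i].id = static_cast<int>(i);
    }

    // Member functions defined inside the class body are implicitly inline,
    // so the compiler may expand them at each call site.
    int getTileIndex(const Position &pos) const
    {
        // Assumed layout: z-major, then y, then x.
        return pos.z * _length * _width + pos.y * _width + pos.x;
    }

    Tile *getTile(const Position &pos)
    {
        if (pos.x < 0 || pos.y < 0 || pos.z < 0 ||
            pos.x >= _width || pos.y >= _length || pos.z >= _height)
            return nullptr; // out of bounds
        return &_tiles[getTileIndex(pos)];
    }
};
```

Whether the compiler actually inlines a given call is still its decision; putting the definitions in the header just makes it possible across translation units.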

Anyway, next I looked at _zoomSurfaceY(). It's responsible for stretching the 320x200 native resolution window to the display resolution (e.g., 640x400 or my preference of 1280x800). It's written to be a very general function to scale the image correctly to any arbitrary resolution given any pixel format. That leaves a lot of room for optimization in the special cases of x2 or x4 scale at 8bpp, which seem like the most common use cases. I wrote two rescaling functions to read data as a 64-bit int and write it back as 64-bit ints (and then 32-bit versions of the same). The result seems to have been an FPS increase anywhere from +10% to +100%. At 1280x800 on my particular laptop, the game went from ~70 fps to ~140 fps. Curiously, the 32-bit versions of the zoom function are only slower by a couple of FPS. I'm not sure why -- write combining maybe? Could be the register spill I'm noticing in the assembly output on the 64-bit version? -- if anyone has some experience in this sort of analysis, please take a look.
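To illustrate the idea of a special-cased x2 zoom at 8bpp, here is a rough sketch. It is a simplified variant, not the functions from the branch: it reads four pixels as one 32-bit word and writes eight bytes as one 64-bit word. The names `doublePixels` and `zoom2x8bpp` are made up, and it assumes the source width is a multiple of 4 and that rows have no padding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Expand 4 packed 8-bit pixels into 8 bytes, each pixel repeated twice.
static inline std::uint64_t doublePixels(std::uint32_t four)
{
    std::uint64_t out = 0;
    for (int i = 0; i < 4; ++i)
    {
        const std::uint64_t p = (four >> (8 * i)) & 0xFFu; // pixel in byte lane i
        out |= (p | (p << 8)) << (16 * i);                  // lanes 2i and 2i+1
    }
    return out;
}

// Hypothetical helper: scale an 8bpp image by exactly 2x in both directions.
// Assumes srcWidth % 4 == 0 and pitch == width for both buffers.
void zoom2x8bpp(const std::uint8_t *src, std::uint8_t *dst,
                int srcWidth, int srcHeight)
{
    const int dstWidth = srcWidth * 2;
    for (int y = 0; y < srcHeight; ++y)
    {
        const std::uint8_t *in = src + y * srcWidth;
        std::uint8_t *out0 = dst + (2 * y) * dstWidth; // first copy of the row
        std::uint8_t *out1 = out0 + dstWidth;          // second copy of the row
        for (int x = 0; x < srcWidth; x += 4)
        {
            std::uint32_t four;
            std::memcpy(&four, in + x, sizeof four);   // alignment-safe load
            const std::uint64_t eight = doublePixels(four);
            std::memcpy(out0 + 2 * x, &eight, sizeof eight);
            std::memcpy(out1 + 2 * x, &eight, sizeof eight);
        }
    }
}
```

Because the load and the store both use the machine's native byte order and the lane doubling is symmetric, this particular sketch should give the same pixel order on little- and big-endian machines. That says nothing about the code in the branch, which may pack pixels differently.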

Incidentally, there's probably some opportunity to insert other filter functions here, perhaps copied from any of the many console emulators out there.

Finally, here's the profiler's output after my changes:
As you can see, getTile() is gone from the results and its most frequent callers from the TileEngine are the next in line. Also, _zoomSurfaceY() has fallen below TileEngine code in CPU use! From 4.68% CPU to 3.07% CPU seems like a nice change.

Of course, those figures are hardly scientific. I made hardly any effort to keep the two runs identical. There's also no demo that I could run the game through to help me repeat similar runs. I have to actually play the game at ~0 fps in valgrind's virtual CPU.

Oh yeah, _michal asked on IRC for a write-up of my profiling and optimization attempts.

The branch with my optimizations is here:
I've submitted a pull request for whenever SupSuper is done working on actual important stuff.

Suggested points for discussion:
1) What is up with the Shader code? How does it work? Anyone? How can it be sped up?
2) What's the deal with my coding style? Why is it such a mess?
3) How about some optimizations that I missed?
4) Can those TileEngine methods be improved somehow?
5) Shouldn't we just use OpenGL to scale and filter the output? (Perhaps?)
6) Does ANYONE have a working PowerPC Mac? I bet my code is broken on big-endian systems right now but I have no computer to test on!
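
On point 6, here is a small sketch of the kind of byte-order problem that can bite word-at-a-time pixel code. It is illustrative only: `isLittleEndian` and `packPair` are hypothetical helpers, not functions from the branch or from SDL. Packing two pixels into a word with shifts puts them in memory in a different order depending on the CPU, so the shift direction has to follow the byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Runtime byte-order probe: on little-endian CPUs the low byte comes first.
bool isLittleEndian()
{
    const std::uint16_t probe = 1;
    std::uint8_t first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Pack two 8-bit pixels into a 16-bit word so that, once the word is stored,
// `first` sits at the lower address regardless of byte order.
std::uint16_t packPair(std::uint8_t first, std::uint8_t second)
{
    if (isLittleEndian())
        return static_cast<std::uint16_t>(first | (second << 8));
    return static_cast<std::uint16_t>((first << 8) | second);
}
```

Code that hard-codes only the little-endian branch would swap neighbouring pixels on a big-endian machine, which is the sort of breakage a PowerPC test would reveal.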

tl;dr: I made the FPS number go up a little; maybe someone porting to really underpowered hardware (or running debug builds) will care.
