Nothing entered.
[http://sourceforge.net/p/ufoai/bugs/2080 Item 2080] imported from sourceforge.net tracker on 2013-01-28 19:18:54
I have been able to nail it down to revision 23100.
In rev 23099 the LEVEL-pass took 2 seconds, one rev later it takes 19s :(
rev 23100 seems to be about the Mem_* functions. As the LEVEL-pass rushes towards 90%, then waits, I'd suspect the Mem_Free(). Just a guess.
Needless to say that this is quite annoying for both compiling all maps AND testing pathfinding.
===== Comments Ported from Sourceforge =====
====== tlh2000 (2009-03-22 20:57:42) ======
we've introduced the mempool management to fix the 40-50 mb memory leaks per map. so yes, it might be a little bit slower - but leaking memory is no option. maybe we can use some stack vars instead of mallocing stuff all over the place to neither malloc nor free the stuff.
====== aduke1 (2009-03-22 22:00:34) ======
I agree that management takes longer ;) Maybe 50% or even 100% longer.
But 10 times longer definitely sounds like a bug to me.
====== aduke1 (2009-03-23 19:15:08) ======
I noticed that Mem_Free() walks through the list in reverse order to find the memory block to free. What's the design idea behind this ?
If I imagine a part of the app that builds a large array of pointers to allocated memory, then does something with it and then frees it in the same order that it was build up, this design is rather inefficient.
I agree that the memory management is very useful to find leaks. But once we have found all or most of the leaks, there should be a way to short-circuit it. e.g. for a release version. Is that planned ?
====== aduke1 (2009-03-23 23:35:52) ======
Okay, I looked into _Mem_Alloc(). So forget about my bickering about reverse order...
However, the request for a short-circuit still remains.
====== tlh2000 (2009-03-25 07:47:19) ======
i've recompiled all the maps in trunk in 35 minutes. so i have to say that you two already did a quite nice job. it took the whole night before (before the threading though)
please describe a little bit more what should be done in your opinion - then i'll see what i can do.
====== aduke1 (2009-03-27 22:21:10) ======
The LEVEL pass is forced to be single-threaded, so I removed the mutex from Mem_Free, but it merely brought the time down from 23 to 20s.
gprof says that ufo2map (fighter_crash) spends some 25% in _Mem_Free() which is called 1.4 million times. And it spends that time inside the function itself, not in the called children !
So it's rather clear to me that the search for the mem* in the list must be the problem.
Mem_Alloc builds that list in reverse oder of 'appearance'. FreeTree() frees memory by recursing the tree. It seems that in the process of building the tree (with all that splitting of brushes) the original order is shuffled quite a lot. So on average Mem_Free has to sequentially search half-way through the list to find the mem*.
So what options do we have ?
1. splitting the generic pool.
Unfortunately 1.1 of the 1.4 Mio calls to Mem_Free come from FreeWindings(), so there is no easy split.
2. improving the search speed in Mem_Free
We could switch from a simple list to something sorted/structured, maybe a hashtable.
3. reduce the # of Mem_Alloc / Alloc bigger chunks
We would need somebody who *really* understands the bsptree and it's subcomponents.
4. ... ?
====== aduke1 (2009-03-28 20:11:02) ======
make mem.c use a hashtable
====== aduke1 (2009-03-28 20:14:33) ======
I went ahead and made mem.c use a little 'hashtable of lists'.
That brings us down from 23 to 8s. Further increasing of MEM_HASH doesn't seem to speed things up.
====== aduke1 (2009-03-29 22:03:27) ======
Maybe I should have mentioned that I also uploaded a patch here for your approval...
====== tlh2000 (2009-03-30 06:04:45) ======
hehe - indeed - i've missed the patch. will have a look after work
====== tlh2000 (2009-03-30 17:18:43) ======
new version (64bit + coding guidelines)
====== tlh2000 (2009-03-30 17:19:09) ======
attached a new patch - but it segfaults (and yours didn't apply cleanly to trunk - have i missed something?)
====== aduke1 (2009-03-30 19:30:58) ======
fixed _Mem_AllocatedInPool()
====== aduke1 (2009-03-30 19:34:35) ======
Somebody removed an unused var after I created the patch. I got a merge conflict too.
There was a crash bug in _Mem_AllocatedInPool(). I applied a fix and created and uploaded a new patch.
Have fun !
====== aduke1 (2009-03-30 22:27:45) ======
Addition:
In _Mem_CheckPoolIntegrity(), the ", blocks = 0, size = 0" need to be move to the for(j...)-loop !
====== aduke1 (2009-03-31 23:06:57) ======
Thx for committing the patch :)
Tested it again; works nicely here.
But we are still 4x slower than before rev 23100 :(
gprof for ufo.exe says that _Mem_Free is only called some 85.000 times (for the same map).
So just *loading* the bspTree is not really a problem.
Comparing that to the 1.4 Mio in ufo2map.exe I suspect that the *splitting* of brushes while building the bspTree is the problem.
Investigating...
====== aduke1 (2009-04-05 21:36:39) ======
The culprit is CheckPlaneAgainstVolume(). It's accountable for some 70% of the calls to Mem_Free.
The function uses SplitBrush() to figure out whether a given plane *would* split the brush and then deletes/frees the two resulting brushes.
A possible solution would be to create a function DoesPlaneSplitBrush(). I would rather leave that to someone who is more familiar to the code around brushes than I am ...
====== tlh2000 (2009-04-06 07:22:44) ======
to tell the truth i don't even think that we have to split brushes at all ;)
we are not using any pvs (potential visible set) or any of the other bsp features that quake was using to speed up rendering. we can think about hardware culling (hiding invisible stuff via hardware) and remove most of the bsp stuff from the code.
====== aduke1 (2009-08-05 18:22:08) ======
Now that I learned that the LEVEL-pass is by far not the most time-consuming part of the map-compile, I think we can lower the priority a bit ;)
====== tlh2000 (2009-11-09 07:08:44) ======
is this really still relevant duke? you increased the compile speed that much that i'm able to compile all maps in less than 30 minutes...
====== aduke1 (2009-11-09 20:07:56) ======
CheckPlaneAgainstVolume() is still a terrible waste of cpu. Let's keep this item as a 'reserve' in case the compile time increases again (or if I get bored after the release of 2.3).
====== aduke1 (2010-12-19 20:55:41) ======
Today I introduced DoesPlaneSplitBrush().
It saves some 30-40%.
====== tlh2000 (2010-12-19 21:35:42) ======
would be cool if could add this to the commit message the next time, as there is no link between the commit and the tracker item.
it saves 30-40% for what? the bsping steps?
====== tlh2000 (2010-12-19 21:36:29) ======
uh oh - you are bored?
[quote](or if I get bored after the release of 2.3).[/quote]
====== aduke1 (2010-12-20 16:24:48) ======
1. np. What exactly would you like to see in the commit msg ? SF-Tracker-item # ?
2. 30% of the LEVEL-pass as mentioned above
3. not bored, just cleaning up unfinished business...
====== aduke1 (2010-12-23 22:04:10) ======
damn, it doesn't work for all maps...eg. city_train.
=>reopened
====== alextishin (2011-04-24 14:14:27) ======
Changed (in fact, simplified) algo in 8b84135d54385965d8b69e1f2fab1ab9529aaabd, works for me now. Set as default algo in map compiler.
====== sf-robot (2011-05-08 14:20:12) ======
This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).