project-navigation
Personal tools

Author Topic: Village mission crash (now with patch)  (Read 5297 times)

ubequitz

  • Guest
Village mission crash (now with patch)
« on: August 28, 2006, 07:08:15 pm »
Hi all,

Using latest SVN, on my system the mission "village07d" crashes at the start after I first click the next turn button (although it happens only about 1 in 4 times). This could be a bug specific to my system (linux ATi drivers are suspect), or it could be more widespread. If you are interested in testing it is possible to jump directly to this mission and test it. I would greatly appreciate knowing if the problem is not just specific to my box.

To test:
edit base/ufos/campaign.ufo and change the line in the intro section that reads:
    missions  "farm"
to:
    missions  "village"

Then run ufo, start a new campaign, setup a base near the village mission (middle of the USA). Exit the base menu, increase time and wait for the mission. Then send your ship to the mission when it is _daylight_ (night missions don't give me the crash as far as I can tell). Then just hit the next turn button straight away. If it doesn't crash exit and repeat a few more times.

I've managed to get a backtrace of my crash for those interested. See:
http://pastebin.com/777996

ubequitz

  • Guest
Village mission crash (now with patch)
« Reply #1 on: August 28, 2006, 11:03:57 pm »
Update: made some small progress.

 I added the debug line:
Com_Printf("CM_TransformedBoxTrace: tile no is %d , cnt is %d, max is %d\n", tile, numTiles, MAX_MAPTILES);

to CM_TransformedBoxTrace [cmodel.c]

I now get something like this right b4 the crash: CM_TransformedBoxTrace: tile no is -1113735969 , cnt is 1, max is 512
The exact tile no does change between runs though.

Something to investigate further....

ubequitz

  • Guest
Village mission crash (now with patch)
« Reply #2 on: August 29, 2006, 01:54:09 am »
I've further narrowed it down a bit:

In  SV_SpanServer  [server/sv_init.c] there is the following code:
Code: [Select]

/* fix this! */
    for (i = 1; i <= CM_NumInlineModels(); i++)
        sv.models[i] = CM_InlineModel(va("*%i", i));


When I get a crash on the village map I've worked out that this code block assigns sv.model[1] with a model which is later accessed by SV_HullForEntity (via SV_ClipMoveToEntities) [server/sv_world.c]. It looks like this sv.model[1]->tile is unitialized - it is just some random integer.

I don't really understand the SV_SpanServer code entirely. In cmodel.c similar code exists but it filters out models with names beginning with "*" which is the opposite of what the above code does.

ubequitz

  • Guest
Village mission crash (now with patch)
« Reply #3 on: August 29, 2006, 06:21:01 am »
Ok this bug is turning out trickier than I thought. As far as I can tell that suspect code I referenced above is not the problem. Indeed all models start with *, I was misreading some code...

I've checked that the initial assignment of the model in the above code is fine and the model->tile variable is fine initially.

That means sometime between initialization and the end turn that variable (the tile field of sv.models[1]) becomes corrupted.

The saga continues....

ubequitz

  • Guest
Village mission crash (now with patch)
« Reply #4 on: August 29, 2006, 08:01:26 am »
Ok, I've narrowed it right down now.

in CM_LoadMap [qcommon/cmodel.c] there is some code:
Code: [Select]

/* free old stuff */
 for (i = 0; i < numTiles; i++)
        CM_FreeTile(&mapTiles[i]);


Which frees some tiles. But I've discovered one of these tiles is referenced by sv.models[1]. For example before the frees I can access sv.models[1]->tile but immediately after attempt to access this same variable always crashes. (Note direct access isn't possible in some places so I wrote a function in server/sv_world.c like:
Code: [Select]

void SV_test_tile(void) {
   if (sv.models[1]) {
      Com_Printf("SV_hmm: sv.model->tile %d\n",  sv.models[1]->tile);
   }
}

to do the testing in various places.

Hopefully somone can look into what the correct behavior here should be.

ubequitz

  • Guest
Village mission crash (now with patch)
« Reply #5 on: August 29, 2006, 08:24:10 am »
Dang... I take that back... CM_FreeTile only frees tile->extraData

 my only explanation is extraData is written in the same memory region as sv.models[1]

so it is even more complex... back to square 2...

ubequitz

  • Guest
Village mission crash (now with patch)
« Reply #6 on: August 29, 2006, 08:50:41 am »
Ok, this _might_ be related. Something here looks strange:

in CMod_LoadSubmodels [qcommon/cmodel.c]:
Code: [Select]

out->tile = curTile - mapTiles;

where out has type cmodel_t, so looking in [game/q_shared.h] cmodel_t->tile we find this has type int. However curTile and mapTile_t are of type *mapTile_t

Looks odd indeed

ubequitz

  • Guest
Village mission crash (now with patch)
« Reply #7 on: August 29, 2006, 09:31:46 am »
Well if I comment out the /* free stuff */ block in qcommon/cmodel.c I can not reproduce this bug (tried 26 times in a row, previous record was 7). So that freeing I would say is causing my system issues, but I don't understand how or why...

ubequitz

  • Guest
Village mission crash (now with patch)
« Reply #8 on: August 29, 2006, 12:37:42 pm »
OH boy, I think I have finally figured it out

sv.models[1] is mapped to &mapTiles[1].cmodels[258] via qcommon/cmodel.c : CM_InlineModel

&mapTiles[1].cmodels[258] exists in shared memory since it is allocated by qcommon/cmodel.c : CMod_LoadSubmodels

this shared memory is then freed by CM_LoadMap [qcommon/cmodel.c]

but there is still a function that tries to use sv.models[1] after the free (namely SV_HullForEntity [server/sv_world.c]).

obviously no guarantee that that memory hasn't been overwritten since a whole chunk of previously allocated shared memory where that data was is now unmapped. Hence might work some of the time, and not work at other times....

Not sure why it is happening for me on the Village map only though.

PS: this is a log of allocations around the critical section:

==== InitGame ====
------- Server Initialization -------
CM_AddMapTile: curTile->extraData = Hunk_Begin(0x400000)
Hunk_Begin called with maxsize: 4194304
CMod_LoadSurfaces: out = Hunk_Alloc
Hunk_Alloc called with size: 141632, curhunksize: 141632
CMod_LoadLeafs: out = Hunk_Alloc
Hunk_Alloc called with size: 89344, curhunksize: 230976
CMod_LoadLeafBrushes: out = Hunk_Alloc
Hunk_Alloc called with size: 8000, curhunksize: 238976
CMod_LoadPlanes: out = Hunk_Alloc
Hunk_Alloc called with size: 102688, curhunksize: 341664
CMod_LoadBrushes: out = Hunk_Alloc
Hunk_Alloc called with size: 29920, curhunksize: 371584
CMod_LoadBrushSides: out = Hunk_Alloc
Hunk_Alloc called with size: 203072, curhunksize: 574656
CMod_LoadSubmodels: out = Hunk_Alloc
Hunk_Alloc called with size: 11424, curhunksize: 586080
CMod_LoadNodes: out = Hunk_Alloc
Hunk_Alloc called with size: 446464, curhunksize: 1032544
CM_MakeNodes: curTile->tnodes = Hunk_Alloc
Hunk_Alloc called with size: 356992, curhunksize: 1389536
CM_AddMapTile: curTile->extraData = Hunk_End()
Hunk_End called. Final curhunksize: 1389536
ED_CallSpawn: NULL classname
Created AI player (team 0)
Created AI player (team 7)
-------------------------------------
0.0.0.0:0: client_connect


^]^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^_

^Bvillage07d
CL_Precache_f
CM_FreeTile: freeing tile->extradata
Hunk_Free called. curhunksize b4 free: 1389540
CM_AddMapTile: curTile->extraData = Hunk_Begin(0x400000)
Hunk_Begin called with maxsize: 4194304
CMod_LoadSurfaces: out = Hunk_Alloc
Hunk_Alloc called with size: 141632, curhunksize: 141632
CMod_LoadLeafs: out = Hunk_Alloc
Hunk_Alloc called with size: 89344, curhunksize: 230976
CMod_LoadLeafBrushes: out = Hunk_Alloc
Hunk_Alloc called with size: 8000, curhunksize: 238976
CMod_LoadPlanes: out = Hunk_Alloc
Hunk_Alloc called with size: 102688, curhunksize: 341664
CMod_LoadBrushes: out = Hunk_Alloc
Hunk_Alloc called with size: 29920, curhunksize: 371584
CMod_LoadBrushSides: out = Hunk_Alloc
Hunk_Alloc called with size: 203072, curhunksize: 574656
CMod_LoadSubmodels: out = Hunk_Alloc
Hunk_Alloc called with size: 11424, curhunksize: 586080
CMod_LoadNodes: out = Hunk_Alloc
Hunk_Alloc called with size: 446464, curhunksize: 1032544
CM_MakeNodes: curTile->tnodes = Hunk_Alloc
Hunk_Alloc called with size: 356992, curhunksize: 1389536
CM_AddMapTile: curTile->extraData = Hunk_End()
Hunk_End called. Final curhunksize: 1389536
LoadMap
Map: village07d

Notice the pattern repeats, it is separated by a Hunk_Free call... actually the data loaded after the Hunk_Free is identical to the earlier data as far as I can tell... the Hunk_Free just corrupts our sv.model[1] data then restores the original data back in a new memory location.... I would have to double check this though.

Offline Bandobras

  • Captain
  • *****
  • Posts: 586
    • View Profile
Village mission crash (now with patch)
« Reply #9 on: August 29, 2006, 01:03:54 pm »
Congrats on tracing down the Hunk_Free issue! Keep up the good work!

Quote from: "ubequitz"
Code: [Select]

out->tile = curTile - mapTiles;

where out has type cmodel_t, so looking in [game/q_shared.h] cmodel_t->tile we find this has type int. However curTile and mapTile_t are of type *mapTile_t

Looks odd indeed


Such oddities are common in C, AFAICT. This should work OK...

ubequitz

  • Guest
Village mission crash (now with patch)
« Reply #10 on: August 29, 2006, 10:22:40 pm »
Well I thought I had a fix for this bug. It was going to be very simple.

In CL_Precache_f [client/cl_main.c] I tried commenting out the call to CM_LoadMap. I didn't think it was needed since SV_SpawnServer [server/sv_init.c] calls it first so we avoid loading things twice. It also avoids the bug where we free memory that the server has a pointer to (see the earlier postings I made).

However in a Multiplayer game both calls to CM_LoadMap are needed (one for the server and one for the client).

So the fix should be... in CL_Precache_f test to see if the server has loaded the map already and if has, don't load it. I'm not yet sure of a clean way of implementing this test though.

ubequitz

  • Guest
Village mission crash (now with patch)
« Reply #11 on: August 30, 2006, 06:12:54 am »
Here is a patch that corrects the problem. It isn't as absolutely clean as I would have hoped but I've tested it relatively thoroughly and it works.

Code: [Select]

Index: src/qcommon/cmodel.c
===================================================================
--- src/qcommon/cmodel.c        (revision 3093)
+++ src/qcommon/cmodel.c        (working copy)
@@ -176,6 +176,8 @@
 static byte tfList[HEIGHT][WIDTH][WIDTH];
 static byte tf;

+static qboolean mapLoadedByServer = qfalse;
+
 void CM_MakeTnodes(void);
 void CM_InitBoxHull(void);

@@ -1092,7 +1104,7 @@
  * @brief Loads in the map and all submodels
  * @sa CM_AddMapTile
  */
-void CM_LoadMap(char *tiles, char *pos)
+void CM_LoadMap(char *tiles, char *pos, qboolean serverCall)
 {
        char *token;
        char name[MAX_VAR];
@@ -1100,6 +1112,14 @@
        int sh[3];
        int i;

+        /* return if the server has already loaded the map */
+        if (mapLoadedByServer && !serverCall) {
+               mapLoadedByServer = qfalse;
+               return;
+        }
+
+        mapLoadedByServer = serverCall;
+
        /* free old stuff */
        for (i = 0; i < numTiles; i++)
                CM_FreeTile(&mapTiles[i]);
Index: src/qcommon/cmodel.h
===================================================================
--- src/qcommon/cmodel.h        (revision 3093)
+++ src/qcommon/cmodel.h        (working copy)
@@ -11,7 +11,7 @@

 extern vec3_t map_min, map_max;

-void CM_LoadMap(char *tiles, char *pos);
+void CM_LoadMap(char *tiles, char *pos, qboolean serverCall);
 int CheckBSPFile(char *filename);
 cmodel_t *CM_InlineModel(char *name);  /* *0, *1, *2, etc */

Index: src/server/sv_init.c
===================================================================
--- src/server/sv_init.c        (revision 3093)
+++ src/server/sv_init.c        (working copy)
@@ -769,7 +769,7 @@
                else
                        sv.configstrings[CS_POSITIONS][0] = 0;

-               CM_LoadMap(map, pos);
+               CM_LoadMap(map, pos, qtrue);
        } else {
                /* fix this! what here? */
        }
Index: src/client/cl_main.c
===================================================================
--- src/client/cl_main.c        (revision 3093)
+++ src/client/cl_main.c        (working copy)
@@ -980,7 +980,7 @@
        S_StopAllSounds();
        MN_PopMenu(qtrue);

-       CM_LoadMap(cl.configstrings[CS_TILES], cl.configstrings[CS_POSITIONS]);
+       CM_LoadMap(cl.configstrings[CS_TILES], cl.configstrings[CS_POSITIONS], qfalse);

        Com_Printf("LoadMap\n");
        CL_RegisterSounds();


Basically, CM_LoadMap is only called in two places (once when the server loads a mission map and sometimes by the client). So this patch simply checks to see if the last call to CM_LoadMap was by the server, if it was it doesn't load the map. Otherwise it does (thus if server doesn't load the map - i.e. in multiplayer when you are running the client only, then you still get the map).

PS: I hope this applies ok. In some of those files I have some printf style debugs around the place in my working copy so the line numbers might be slightly out compared to SVN.