
Crash in RakNet.UpdateNetworkLoop

Discussion in 'Linux' started by Alloc, Nov 17, 2016.

  1. Alloc

    Alloc

    Joined:
    Jun 5, 2013
    Posts:
    241
    Hi,

    I just confirmed this by debugging a Linux server belonging to one of our users. Since Linux server operators report similar random crashes every now and then, I think this might be more common than a single case.

    On the one system I looked at, the game crashes as soon as RakNet is initialized. The core dump gives this trace:
    Code (csharp):
    #0 0x00006584bd06d067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
    #1 0x00006584bd06e448 in __GI_abort () at abort.c:89
    #2 0x00006584bc8b9976 in ?? () from <....>_Data/Mono/x86_64/libmono.so
    #3 0x00006584bc85c8a5 in ?? () from <....>_Data/Mono/x86_64/libmono.so
    #4 <signal handler called>
    #5 0x00000000013454c4 in UpdateNetworkLoop(void*) ()
    #6 0x00006584bde150a4 in start_thread (arg=0x65843439d700) at pthread_create.c:309
    #7 0x00006584bd12062d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
    I suppose frames #0 to #4 can be ignored as that's just the signal handling, but #5 is what makes me believe it's due to RakNet. This was further confirmed by disabling the RakNet protocol so that only UNET and Steam Networking were left active: the game no longer crashed.
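
A minimal sketch of that reading of the trace, assuming a gdb-style backtrace in which the `<signal handler called>` marker separates the handler frames from the code that actually faulted. The helper name `faulting_frame` is hypothetical and only illustrates the idea; it is not part of gdb, Mono, or Unity.

```python
import re

# Trace copied from the core dump above (paths elided as in the original).
TRACE = """\
#0 0x00006584bd06d067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00006584bd06e448 in __GI_abort () at abort.c:89
#2 0x00006584bc8b9976 in ?? () from <....>_Data/Mono/x86_64/libmono.so
#3 0x00006584bc85c8a5 in ?? () from <....>_Data/Mono/x86_64/libmono.so
#4 <signal handler called>
#5 0x00000000013454c4 in UpdateNetworkLoop(void*) ()
#6 0x00006584bde150a4 in start_thread (arg=0x65843439d700) at pthread_create.c:309
#7 0x00006584bd12062d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
"""

# "#<n> [0x<addr> in ]<rest of frame>"
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(.*)$")


def faulting_frame(trace):
    """Return (frame_number, text) of the first frame after the
    signal-handler marker, or None if the trace has no such marker."""
    seen_marker = False
    for line in trace.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue
        number, text = int(match.group(1)), match.group(2)
        if seen_marker:
            return number, text
        if text == "<signal handler called>":
            seen_marker = True
    return None


if __name__ == "__main__":
    print(faulting_frame(TRACE))
```

Applied to the trace above, this picks frame #5, the `UpdateNetworkLoop(void*)` frame. Everything above the marker only shows how the signal was handled, not where the fault occurred.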

    This was on a Debian 8 64 bit installation. No port conflicts.

    Is there anything that we can look for on that specific system that might be causing the issue? Or is there some generic error within RakNet?

    Due to the issues with UNET on high-traffic connections we can't rely on it: it causes timeouts/disconnects for a *lot* of people, as early as logging in but sometimes also later on. RakNet has been very stable for us, so we would like to be able to keep using it until UNET becomes as stable :)

    Regards,
    Chris
     
  2. Tak

    Tak

    Joined:
    Mar 8, 2010
    Posts:
    1,001
    Yes, it looks like a segmentation fault in UpdateNetworkLoop, followed by mono checking whether it's a managed NRE (NullReferenceException) or similar, followed by abort (because it isn't).
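
A rough simulation of that sequence, as a sketch only: it assumes a POSIX system, sends SIGSEGV to itself instead of performing a real bad memory access, and uses a Python signal handler as a stand-in for the runtime's check. It does not reproduce mono's actual handler. It only shows why the process ends with signal 6 (SIGABRT), as in frame #0 of the trace, even though the first event was a segmentation fault.

```python
import signal
import subprocess
import sys

# Child script: a handler that gives up on the fault and aborts.
CHILD = r"""
import os, signal, time

def on_segv(signum, frame):
    # Stand-in for the runtime deciding the fault is not a managed exception.
    os.abort()

signal.signal(signal.SIGSEGV, on_segv)
os.kill(os.getpid(), signal.SIGSEGV)  # simulated fault, not a real bad access
time.sleep(1)                         # leave time for the handler to run
"""


def run_child():
    """Run the child and return the number of the signal that terminated it
    (0 if it exited normally)."""
    proc = subprocess.run([sys.executable, "-c", CHILD])
    return -proc.returncode if proc.returncode < 0 else 0


if __name__ == "__main__":
    # True when the child was terminated by SIGABRT rather than SIGSEGV.
    print(run_child() == signal.SIGABRT)
```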

    UpdateNetworkLoop is part of RakNet. The code looks pretty straightforward; the only way I can see it crashing in that frame is if the peer is null (or has been stomped by a bad memory access somewhere else). If you can provide more info, we can investigate further, but nothing's jumping out at me just from this.
     
  3. larus

    larus

    Unity Technologies

    Joined:
    Oct 12, 2007
    Posts:
    280
    We have an updated transport in the UNET implementation. It is shipping in 5.6, but we have released a 5.4-based build with this update so people can try it on their projects and give us feedback. Hopefully this will fix the issues you have had. Please let me know if you are able to try it, and also if it doesn't fix your issues, because then we'll need to look into them in the new implementation. The old system/RakNet will go away eventually; the release in which this will happen has not been decided yet, but it's getting closer.
     
  4. Alloc

    Alloc

    Joined:
    Jun 5, 2013
    Posts:
    241
    Thanks for your replies :)

    A bit more information on what's happening during that startup phase. RakNet is started like this:
    Code (csharp):
    NetworkConnectionError res = Network.InitializeServer (128, 26910, false);
    The result is NoError. After that we only initialize the Steam server API (different ports), and then we receive the last log message 7 ms after starting RakNet. Nothing else happens with RakNet from our end during this time. Not much code can run between the last message we receive and the actual crash, as more log messages would normally follow shortly after. The only things we do in between are updating some internal array values for the game settings and loading a Tex2D.

    The server is running Debian 8 64-bit, kernel 3.14, on a Xeon E5-1620 v2 with 64 GiB RAM.

    The same issue also seems to affect the server of another Unity-based game that this user is running, which also still uses RakNet.

    Is there anything else that would help? I have no idea what else I could provide to help find the source of this issue.



    Yeah, we're aware of that update. It sounds really promising. Unfortunately, though, there's no backport to the 5.3 series (I wonder why, as it sounded like strongly separated code), and we currently can't (and don't want to) switch to Unity 5.4 yet.

    Regards,
    Chris
     
  5. Alloc

    Alloc

    Joined:
    Jun 5, 2013
    Posts:
    241
    Bump :)
     
  6. Alloc

    Alloc

    Joined:
    Jun 5, 2013
    Posts:
    241
    So is there nothing that can be done about this, or anything that would help figure out what might be causing this issue?