Using fragmented channel sometimes fails with UNET transport errors

Discussion in 'UNet' started by ttRevan, Oct 23, 2015.

  1. ttRevan

    Joined:
    Feb 24, 2014
    Posts:
    31
    Version: 5.0 - 5.2.0p1
    Platforms:
    • windows, linux - server;
    • windows, osx - client
    We have a client-server game where the server sends some configuration data to the client at start. This data is a 20K JSON-serialized string plus some metadata. That's a pretty large chunk, so we're using the ReliableFragmented QoS. The string is always the same, so its content is not the cause of the issue. JFYI, the length is 20959; that's the length written by NetworkConnection.SendByChannel (little confession: I did some reverse engineering of the IL code, sorry).

    For communication we're using the HLAPI, but only for connecting and sending custom messages by channel, i.e. all spawning and state management is handled by our custom protocol on top of Unity. The flow is as follows:
    1. the client sends "ready to receive config"
    2. the server receives that and sends the config
    3. the client receives the config and sends "ready for game"
    4. the server receives that and starts the game
    So, most of the time this approach works as expected, but on rare occasions, after the server has sent the config message (stage 2), the game just... hangs there. It does not freeze or become unresponsive; it just does nothing, since the client waits for the config and the server waits for the client to confirm. No "message received" on the client, no "connection lost", no crash on the server. The only sign of an error we get is "Event has already in the list" on the server, which comes with no stack trace because it happens (I think) in the C++ network library.
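
    The four-step flow above can be sketched as a tiny in-process simulation. This is only an illustration in Python, not our Unity code: the names Msg, Server, Client and run_handshake are hypothetical, and the drop parameter simply pretends that one message never reaches its handler. It shows why losing the config delivery leaves both sides waiting forever without any error.

```python
from enum import Enum, auto


class Msg(Enum):
    READY_FOR_CONFIG = auto()
    CONFIG = auto()
    READY_FOR_GAME = auto()


class Server:
    def __init__(self, config):
        self.config = config
        self.game_started = False

    def on_message(self, msg, payload=None):
        # Step 2: answer the client's request with the config.
        if msg is Msg.READY_FOR_CONFIG:
            return (Msg.CONFIG, self.config)
        # Step 4: the client confirmed, so the game can start.
        if msg is Msg.READY_FOR_GAME:
            self.game_started = True
        return None


class Client:
    def __init__(self):
        self.config = None

    def start(self):
        # Step 1: tell the server we are ready to receive the config.
        return (Msg.READY_FOR_CONFIG, None)

    def on_message(self, msg, payload=None):
        # Step 3: store the config and confirm.
        if msg is Msg.CONFIG:
            self.config = payload
            return (Msg.READY_FOR_GAME, None)
        return None


def run_handshake(server, client, drop=None):
    """Pump messages between both sides; `drop` names one message type to lose."""
    outgoing, receiver, sender = client.start(), server, client
    while outgoing is not None:
        msg, payload = outgoing
        if msg is drop:
            break  # the message never reaches its handler, so nobody replies
        outgoing = receiver.on_message(msg, payload)
        receiver, sender = sender, receiver
    return server.game_started
```

    With drop=Msg.CONFIG the function returns False and the client's config stays None, which matches the hang described above: neither side has anything left to send.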

    Searching for this string on the internet gives exactly zero results. I did some packet tracing and discovered that the message is sent by the server and, moreover, is received by the client machine. By this I mean that the client machine registers those UDP packets, but the game client does not call the appropriate NetworkMessageDelegate. I tried setting LogFilter.currentLogLevel to zero, but no additional logs were printed.

    It looks like I need to file a bug report, but so far I have been unlucky in getting any Unity guys' attention in the bug tracker :). I'm afraid Unity users cannot help me unless they have faced and resolved the same issue.

    Dear Unity developers, I am really looking forward to your assistance with this issue. To help you find the cause, I can provide you with:
    • *.etl files with packet traces from both client and server.
    • a process dump made with Windows Task Manager at the very moment of the error.
    • a running server on Windows which is currently in the erroneous state, so you can give me instructions to perform on it, or I can even give you a client build to connect with.
    • the project source code, although the issue is really hard to reproduce.
    Thanks in advance and sorry for a large post!
     
  2. ttRevan

    Joined:
    Feb 24, 2014
    Posts:
    31
    Today I updated to 5.2.2p1 because I noticed some Fragmented channel fixes in the release notes, but the issue is still there. Here is what happens in the Editor on the server:
    upload_2015-10-24_20-40-36.png
    So I have additional errors:
    • !evnt.IsInList()
    • if it is happened something wrong with initial parameters, check disconnect timeout and ack delay timeout
    • val >= 0
    Again, no stack trace and no results on the internet. I feel like Matt Damon in his latest movie :)
    One of the messages says my initial parameters are wrong, but I didn't change them from the defaults except for adding some channels and setting MaxSentMessageQueueSize to 1024 on the server and 512 on the client.
    From the val >= 0 error message I can guess that Unity expects some value to be negative, but I have no idea what that could be for a network library.
     
  3. ttRevan

    Joined:
    Feb 24, 2014
    Posts:
    31
    Update: I tried to implement my own simple fragmentation algorithm on top of a Reliable (not even sequenced) channel. I just chop that big message into a bunch of small (~500 bytes) numbered ones and send them all at once. And it works... you guessed it! Most of the time :) Sometimes it fails with the exact same sequence of error messages. By "fails" I mean it drops some of the small messages and double-delivers others.

    I suspect that I'm doing something wrong, maybe not cleaning up properly somewhere, but with those mysterious errors I don't have a chance to figure out where exactly.
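
    For reference, the chop-and-number idea described in this post can be sketched roughly as follows. This is a Python illustration, not the actual Unity code: the 4-byte header layout (fragment index plus total count) and the names split_message and Reassembler are assumptions made for the example. The receiving side counts duplicates and reports missing indices, which makes the "dropped some, double-delivered others" symptom directly observable.

```python
import struct

# Hypothetical header: fragment index and total fragment count, 2 bytes each.
HEADER = struct.Struct("!HH")


def split_message(data, chunk_size=500):
    """Chop `data` into numbered packets of at most `chunk_size` payload bytes."""
    total = max(1, -(-len(data) // chunk_size))  # ceiling division
    return [
        HEADER.pack(i, total) + data[i * chunk_size:(i + 1) * chunk_size]
        for i in range(total)
    ]


class Reassembler:
    """Collects fragments in any order, ignoring repeats."""

    def __init__(self):
        self.parts = {}
        self.total = None
        self.duplicates = 0

    def add(self, packet):
        """Store one packet; return the full message once every index is present."""
        index, total = HEADER.unpack_from(packet)
        self.total = total
        if index in self.parts:
            self.duplicates += 1  # double delivery: count it and ignore it
            return None
        self.parts[index] = packet[HEADER.size:]
        if len(self.parts) == total:
            return b"".join(self.parts[i] for i in range(total))
        return None

    def missing(self):
        """Indices that have not arrived yet (empty before the first packet)."""
        if self.total is None:
            return []
        return [i for i in range(self.total) if i not in self.parts]
```

    With a 20959-byte payload and 500-byte chunks this yields 42 packets. Logging missing() and duplicates on the client after a failed transfer would at least show which fragments were lost or repeated.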
     
  4. emrys90

    Joined:
    Oct 14, 2013
    Posts:
    755
    Did you ever figure any of this out? Occasionally, my server crashes. I have not found a way to reproduce it yet, so I'm not sure what the exact cause is. Right before it crashes, I get the same error of "event has already in the list". As you found, googling it turns up absolutely nothing.
     
  5. lifeFo

    Joined:
    Apr 14, 2017
    Posts:
    15
    me too