
Unity collaboration forum analytics

Discussion in 'General Discussion' started by Deequation, Jul 9, 2015.

  1. Deequation

    Deequation

    Joined:
    Jan 2, 2014
    Posts:
    16


    The following thread is strictly for entertainment and giggles, so please do not take anything here too seriously or too literally. Some very rough corners have been cut in a mathematical sense, and a few things have been altered slightly to give a somewhat more concise representation. There are a few important things to keep in mind when evaluating the graphs. The data is a sampled set from about the last thousand threads in the collaboration forum only, and when referring to the users, it refers to the thread starters in that set only. These threads include things like closed, edited and moved threads as well. All data has been normalized, so you will see percentages of the sample set rather than absolute numbers. Note that since there is no direct metric for determining the success of a collaboration thread, all of the following information is compiled purely out of curiosity, to show trends and averages. Given this lack of a success measure, the data will not support any direct conclusions, although indirect conclusions will be drawn occasionally.







    If we start by looking at threads that actually use images to advertise their projects against those that do not, we find that on average each thread contains a stunning 0.46 images. Now this is not a bad number at all when you consider that some threads might just be people looking to join a project, who in that regard have nothing to post. It also doesn’t include any images buried under external links. We chose to exclude these, as getting members for your project is about instantly catching attention, and a directly posted image will instantly portray your concept and ideas, while a linked image does not. It should be noted that the maximum number of images you are allowed to post is twenty. Now let us take a look at how many images the different threads use, and at the distribution between threads with none, those with a few and those with many:




    But hold on, what is happening here? We clearly had nearly half an image per thread on average, and yet a vast majority of threads contain no images at all. Clearly the average is getting pulled up quite a lot by threads with many images. Now we could bore you with some statistical numbers like the variance, but the meaning of the number would be lost on the majority of readers, so instead let’s look at a binary distribution of threads that have at least one image against threads that have none. This surprisingly shows that nearly 80% of threads have no images at all.
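    To make the gap between the average and the binary split concrete, here is a minimal sketch using made-up image counts (these are not the scraped numbers; the list `image_counts` is purely hypothetical). It shows how a couple of image-heavy threads can lift the mean while most threads still have no images.

```python
from statistics import mean, median

# Hypothetical image counts for ten threads (NOT the scraped data):
# eight threads without images, two threads with several.
image_counts = [0, 0, 0, 0, 0, 0, 0, 0, 2, 3]

average = mean(image_counts)      # pulled up by the two image-heavy threads
typical = median(image_counts)    # what the middle of the sample looks like
share_without = sum(1 for c in image_counts if c == 0) / len(image_counts)

print(f"mean images per thread:   {average}")
print(f"median images per thread: {typical}")
print(f"threads with no images:   {share_without:.0%}")
```

    In this toy sample the mean is half an image per thread even though four out of five threads contain none, which is the same shape as the result described above.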










    Now before we take a closer look at the actual content of these threads, let us take a moment and just look at who the people posting the threads actually are. If we look at how active the users are in terms of their post count (excluding posts on Questions-&-Answers, as that is counted separately from the forum post count), we find that on average each user has 68.4 posts, measured at the time of the data collection. The high rollers in this regard reach over 3000 other posts. As the first graph shows, an overwhelming number of users have very few other posts, and the distribution has a very quick falloff. Compared to a normal forum post distribution, this has a much steeper falloff, nearing the falloff rate of question-related forums. This was not totally unexpected, and it is quite hard to read anything into the data, even when focusing only on the first month on its own. However, the data shows that about 20% of thread starters have no posts other than their initial collaboration entry. Now we ask the question of how many users are actually active posters, and do a binary distribution of users with five or fewer posts against those with more than five. This gives the depressing view that nearly 45% of the thread starters are inactive forum users.
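    The five-post cut-off can be sketched as a simple threshold split. The helper `share_at_or_below` and the sample `post_counts` are hypothetical, and whether a single-post user shows a count of 0 or 1 is an assumption here (the sketch assumes 1).

```python
def share_at_or_below(post_counts, threshold=5):
    """Return the fraction of users whose post count is <= threshold."""
    if not post_counts:
        raise ValueError("post_counts must not be empty")
    return sum(1 for p in post_counts if p <= threshold) / len(post_counts)

# Hypothetical post counts of ten thread starters (NOT the scraped data).
post_counts = [1, 1, 3, 5, 8, 12, 40, 75, 300, 3000]

inactive_share = share_at_or_below(post_counts)                  # five or fewer posts
single_post_share = share_at_or_below(post_counts, threshold=1)  # assumed: count of 1 = only the thread itself

print(f"five or fewer posts: {inactive_share:.0%}")
print(f"single-posters:      {single_post_share:.0%}")
```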





    Now a user might well be someone who registered a long time ago and has been using the forum as a source of information, or just never had a reason to post anything in the forum. It is a little unfair to label these as ‘people who just register to post a single thread then leave again’, so let us take a look at the time, measured in months, from when a user initially registered on the forum to when they made their collaboration thread. This tells us how long a user has potentially been a member of the community before trying to create a project. The light shines through, as we see people who have been members for years upon years before posting, as seen in the first graph here. Still, our assumption from the post count before, that people are likely to just register, post and then leave, looks justified, as nearly 20% of users create their collaboration thread within a week of joining the forum. An interesting thing that we did not look at was how many of these 'insta-posters' were also single-posters, as there might be a fair overlap. However, we choose to believe that the fact that 80% of users have at least a week on the forum also means they don’t just register to post, although the first graph shows that most threads are still posted within a month of joining.
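    As a rough illustration of the "time on the forum before posting" metric, the sketch below computes the number of days between a join date and a thread date and then the share of 'insta-posters'. The dates, the `days_on_forum` helper and the seven-day cut-off as coded are all assumptions for the example.

```python
from datetime import date

def days_on_forum(joined: date, posted: date) -> int:
    """Days between registering and starting the collaboration thread."""
    return (posted - joined).days

# Hypothetical (join date, thread date) pairs, NOT taken from the sample.
users = [
    (date(2014, 1, 2), date(2014, 1, 5)),
    (date(2011, 7, 25), date(2015, 3, 1)),
    (date(2015, 5, 6), date(2015, 5, 20)),
    (date(2013, 12, 5), date(2013, 12, 9)),
    (date(2012, 9, 17), date(2014, 9, 17)),
]

waits = [days_on_forum(joined, posted) for joined, posted in users]
within_week = sum(1 for d in waits if d <= 7) / len(waits)

print(waits)
print(f"posted within a week of joining: {within_week:.0%}")
```

    Cross-checking the same list against post counts would answer the open question of how many insta-posters are also single-posters.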











    Now with no better place to put this, let’s just go through it here. How active is the collaboration forum, is it increasing or decreasing, and when does it peak? Note that the data set was scraped backwards in time starting from today, so we might miss early trends. But if we look at the number of new threads per month, measured in months back in time, we can see that (we intentionally did not include any regression analysis here) the activity has been pretty much stable over the last 18 months, averaging between 100 and 200 new threads per month, with what could loosely be called an increasing trend.
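    Counting threads by "months back in time" can be sketched like this. The reference date, the thread dates and the `months_back` helper are hypothetical; the bucket is a plain calendar-month difference, which is one reasonable reading of the description above.

```python
from collections import Counter
from datetime import date

def months_back(thread_date: date, reference: date) -> int:
    """Calendar-month difference between the reference date and a thread."""
    return (reference.year - thread_date.year) * 12 + (reference.month - thread_date.month)

reference = date(2015, 7, 9)  # assumed scrape date for this example
thread_dates = [              # hypothetical thread start dates
    date(2015, 7, 1),
    date(2015, 6, 28),
    date(2015, 6, 3),
    date(2015, 5, 15),
    date(2014, 7, 30),
]

per_month = Counter(months_back(d, reference) for d in thread_dates)
for offset in sorted(per_month):
    print(f"{offset:>2} months back: {per_month[offset]} thread(s)")
```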










    Now this segment is pretty vague and non-descriptive, as again there is no way to link the values and averages to what makes a good post. Nonetheless, it was a metric that was really easy to include, even if it doesn’t give any notable information. The following are graphs of thread title lengths and the word count of the thread content, measured to scale. Some of the very low counts are threads that have been altered to say "close" or "filled" by the users, and have not been deleted by moderators.











    The following is a series of different groupings of words from the thread contents, and their relative frequencies of use. There’s a lot to be said about this form of data analysis. People use different spellings, abbreviations, synonyms and misspellings, so a major step towards a somewhat accurate overview was to merge many of these into the same groups. This means that words such as ‘C#’ and ‘Csharp’ have been compiled into the same group, and 'mmorpg' has been added to 'mmo', as two examples. There are endless groupings you could do, and we chose a few that seemed reasonable and fun to show. The data from these is not normalized, so the same word being used multiple times in a thread will count multiple times. Considering the large data set that we ran this on, there were lots of similar words that would fit into the group categories but simply did not have enough counts to warrant adding them (someone actually did use the words lisp, fortran and cobol). Remember that word frequencies are very hard (impossible) to get perfect, so do not read too much into them. The sizes in the graphs are the relative frequencies between the words shown. If anyone is interested in the full word list with counts, just contact me.
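    For anyone curious how such grouping might look in practice, here is a minimal sketch. The `ALIASES` table, the tokenizer pattern and the sample posts are assumptions for illustration, not the mapping actually used for the graphs. As described above, repeated words in one thread count every time.

```python
import re
from collections import Counter

# Hypothetical alias table: maps variant spellings to one group name.
ALIASES = {"csharp": "c#", "mmorpg": "mmo"}

def grouped_counts(texts):
    """Count words across all texts, merging aliases into their group."""
    counts = Counter()
    for text in texts:
        # Keep '#' and '+' so tokens like 'c#' and 'c++' survive.
        for token in re.findall(r"[a-z0-9#+]+", text.lower()):
            counts[ALIASES.get(token, token)] += 1
    return counts

# Made-up thread contents, NOT from the sample set.
posts = [
    "Looking for a C# programmer for our MMORPG",
    "CSharp dev wanted, MMO project, MMO experience a plus",
]

counts = grouped_counts(posts)
print(counts["c#"], counts["mmo"])
```

    A real alias table would need many more entries (misspellings in particular), which is where most of the manual effort goes.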









    Now for some closing words: we lied, they won’t be about unicorns. If you have any comments or feedback, or metrics you want us to add to this, just send me a message and we will look into adding them. Depending on how well this is received, we might do the same for the questions and answers forum, where we can actually measure the success as well. If you liked this thread, if you hated it, or if you just thought it was cool that someone wanted to use their time to contribute to an amazing community, then give the post a like or send us your thoughts and, most of all, enjoy using Unity!



     
    Last edited: Jul 22, 2015
  2. Tenebrous

    Tenebrous

    Volunteer Moderator

    Joined:
    Jul 25, 2011
    Posts:
    102
    Interesting stuff! What other info did you / could you gather?
     
  3. landon912

    landon912

    Joined:
    Nov 8, 2011
    Posts:
    1,579
    Very cool!
     
  4. Eric5h5

    Eric5h5

    Volunteer Moderator Moderator

    Joined:
    Jul 19, 2006
    Posts:
    32,401
    In the future, please post text as text. We have some vision-impaired users on the forums, who won't be able to read that easily. Also, it's impossible to search for, which is a shame since it looks like interesting info.

    --Eric
     
    angrypenguin likes this.
  5. Kiwasi

    Kiwasi

    Joined:
    Dec 5, 2013
    Posts:
    16,860
    What were you attempting to discover? While there are benefits to unstructured data analysis from time to time, it's often better to ask specific questions of your data.

    Examples:
    • Do images increase the activity of a post?
    • Do active users get more activity?
    • What can new posters do to get more attention?
    • Are there any early indicators of a project's success/failure?
    All that said, it was interesting data. And most depressing is the high frequency of the term MMO.
     
  6. Deequation

    Deequation

    Joined:
    Jan 2, 2014
    Posts:
    16
    In terms of other things we could gather from the collaboration forum, well, we did collect all available metrics, and this gives us quite a lot of other things that we potentially could compute, but chose not to. This includes things like correlations that make little sense, such as title lengths against views, as this does not factor in time on the forum and thus doesn't give an accurate representation. There were also things that seemed interesting, like view count against time on the forum, but the graphs weren't consistent enough to conclude any trends. There are a couple of things we wanted to show.

    I absolutely agree, this is very unstructured with no clear intention to discover anything. We just wanted to take a look at what trends were happening and what roles were in demand; the other metrics were a side effect that required little extra effort to compute, and if anyone finds it fun to look at, then the purpose is fulfilled as far as we're concerned. As for the things you are listing, it is true that we could graph out things like image count against view count, but as previously explained, view count increases steadily over time, which would make the correlation somewhat inaccurate. Having graphed this out already, it doesn't show that images actually matter (although this is based on view count only, as we have no way of measuring the success of the post). I think overall the conclusions would be pretty self-explanatory, like 'use some images in your post to get people's attention' and 'stay active on the forum' - and the irony would be that the ones who post these (arguably) S*** posts are most likely not the ones who would ever read this post.
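    One possible way to take thread age out of the view count is to compare views per day instead of raw views. This is only a sketch of that idea with made-up numbers; it is not something that was done for the graphs above, and the `views_per_day` helper and field names are hypothetical.

```python
def views_per_day(views: int, age_days: int) -> float:
    """Average daily views; threads younger than a day count as one day old."""
    return views / max(age_days, 1)

# Hypothetical threads, NOT from the sample set.
threads = [
    {"images": 0, "views": 900, "age_days": 300},  # old thread, many raw views
    {"images": 3, "views": 200, "age_days": 20},   # young thread, fewer raw views
]

rates = [views_per_day(t["views"], t["age_days"]) for t in threads]
for thread, rate in zip(threads, rates):
    print(f"{thread['images']} image(s): {rate:.1f} views/day")
```

    In this toy case the older thread wins on raw views but loses on views per day, which is the distortion the post is worried about. It still says nothing about whether a thread actually found collaborators.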
     
    Last edited: Aug 1, 2015
  7. PassivePicasso

    PassivePicasso

    Joined:
    Sep 17, 2012
    Posts:
    100
    Perhaps the fact of the high desirability of creating an MMO shouldn't be depressing.
    Obviously everyone wants to be able to have massive multiplayer capabilities in their games. Games are highly social experiences.
    So perhaps we should take a step back and try to convince Unity Technologies to work on more multiplayer capabilities to simplify some of the more difficult tasks associated with creating an MMO.
     
  8. tedthebug

    tedthebug

    Joined:
    May 6, 2015
    Posts:
    2,570
    Maybe Unity could make an MMO tutorial?

    Maybe the stats could be cross-referenced with the showcase forum (or any other one that allows users to show what they have made) to see if anyone posting in this analysed forum ever managed to post in the ones for completed projects?
     
  9. MattMcg-GameDev

    MattMcg-GameDev

    Joined:
    Jan 8, 2014
    Posts:
    8
    I just got linked this in a discussion. Very interesting statistics and insight. Reminds me of the kind of topics you see on the TIGSource forums. Those with lots of pictures and active users tend to be the most popular, whilst the dev logs by people who only post in their own topic (or not at all) and have no pictures barely get any views or comments.
     
  10. Deequation

    Deequation

    Joined:
    Jan 2, 2014
    Posts:
    16
    # casual bumping #
     
  11. Storyteller

    Storyteller

    Joined:
    May 15, 2012
    Posts:
    23
    Fascinating, I'm surprised there isn't greater discussion.
    I'd be interested in the data on which threads gained collaborators and which ones resulted in a finished product.
     
  12. Kiwasi

    Kiwasi

    Joined:
    Dec 5, 2013
    Posts:
    16,860
    Both of these would require a detailed survey of the users. But my gut feel would be <10% for the first and <1% for the second.