uSpeak Voice Chat - work in progress

Discussion in 'Works In Progress - Archive' started by PhobicGunner, Jul 25, 2012.

  1. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Today I'm proud to introduce an upcoming voice chat system for Unity - uSpeak.
    uSpeak will enable your game with efficient low-bandwidth communication, instantly.

    NOTE: THE FEATURE SET AND DEMO VIDEOS BELOW ARE OUTDATED.
    SINCE THE BETA, USPEAK HAS UNDERGONE MASSIVE IMPROVEMENTS
    PLEASE REFER TO THE OFFICIAL RELEASE THREAD:
    http://forum.unity3d.com/threads/15...hat-Voice-Chat-For-All-Your-Multiplayer-Games


    Features

    • Uses a combination of the Speex codec and Deflate compression, for minimal bandwidth usage
    • Plugs into any network backend - if you can serialize bytes and ints, you're good to go!
    • Supports four kinds of voice trigger - push to talk, toggle talk, volume activated, and always on.
    • Spatial audio support - simply tick a checkbox and your players' voices will pan in realtime 3D
    • Supports all major publishing platforms - browser, standalone, and mobile, and all licenses - Indie, Pro, and Mobile!
    • Runtime microphone detection - no need to restart the game
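    To illustrate the "serialize bytes and ints" point from the feature list, here is a minimal, language-neutral sketch in Python. The wire layout (channel int, length int, Deflate-compressed payload) and the function names are hypothetical and are not uSpeak's actual packet format; the audio bytes are assumed to be already encoded by the codec.

```python
import struct
import zlib
from typing import Tuple

# Hypothetical layout (an assumption, not uSpeak's real format):
#   int32 channel | int32 payload length | Deflate-compressed payload
HEADER = struct.Struct("<ii")


def pack_voice_packet(channel: int, encoded_audio: bytes) -> bytes:
    """Deflate the codec output and prefix it with two little-endian ints."""
    payload = zlib.compress(encoded_audio)
    return HEADER.pack(channel, len(payload)) + payload


def unpack_voice_packet(packet: bytes) -> Tuple[int, bytes]:
    """Reverse of pack_voice_packet: returns (channel, encoded_audio)."""
    channel, length = HEADER.unpack_from(packet, 0)
    payload = packet[HEADER.size:HEADER.size + length]
    return channel, zlib.decompress(payload)
```

    Any transport that can move a byte array (or two ints plus a byte array) could carry a packet like this.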
    I plan on releasing this package in the asset store soon for the low price of $30

    The beta period is now over. Beta testers can keep and use their downloads free of charge, and exactly one week from today I will release uSpeak on the asset store.

    EDIT: The fluttering effect seen below and mentioned by wccrawford has now been fixed. Barring internet latency, uSpeak now has flawless playback.

    Check out a video of uSpeak in action!


     
    Last edited: Oct 26, 2012
  2. wccrawford

    wccrawford

    Joined:
    Sep 30, 2011
    Posts:
    2,039
    I can hear some flutter to their voices. I usually associate that with running out of bandwidth... Could be an artifact of Speex compression at the lower settings? The audio of the stream and their guns is crystal clear, but his voice flutters a lot.

    Any idea what price you're planning to sell this for, and an ETA on release?
     
  3. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Well, as for ETA, the package itself is mostly complete. I'm going to try dealing with some of that flutter, add some extra features, polish off the documentation, etc. It shouldn't take too long (a couple of days at most).
    As for that flutter, I think it's actually a result of packet bunching. Occasionally, the voice sounds perfect, but every so often the flutter happens, and I think it's because currently my system immediately decompresses and plays back a packet when it's received (and if a bunch of packets come in at the same time, they are played at the same time). I'm going to add a buffer, which should mitigate those issues. Also, I do notice the audio seems to stutter more when I use Speex (if I turn off Speex, I don't hear much stutter), so I think it's partially an artifact of the algorithm.
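    A rough sketch of the buffering idea, written in Python for brevity. The class name, the `prefill` parameter, and the underrun behaviour are assumptions for illustration, not uSpeak's implementation: frames are queued as they arrive and released one per playback tick, so a burst of packets no longer plays all at once.

```python
from collections import deque


class JitterBuffer:
    """Queue decoded frames and release at most one per playback tick."""

    def __init__(self, prefill: int = 3):
        self.prefill = prefill   # frames to collect before playback starts
        self.frames = deque()
        self.playing = False

    def push(self, frame):
        """Called whenever a packet has been received and decoded."""
        self.frames.append(frame)
        if len(self.frames) >= self.prefill:
            self.playing = True

    def pop(self):
        """Called once per frame interval; returns a frame, or None for silence."""
        if not self.playing:
            return None
        if not self.frames:
            self.playing = False  # underrun: wait until the buffer refills
            return None
        return self.frames.popleft()
```

    A larger `prefill` trades extra latency for more tolerance to bunched packets.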
    As for price, I'm thinking something around $20 or so.

    EDIT: Actually, as for ETA I think I'm going to wait a week before submitting this thing. It should be plenty of time for me to catch any bugs that might surface. I want this thing to be rock solid, if possible.
     
    Last edited: Jul 26, 2012
  4. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
  5. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Aha!
    In the last couple of days I've been trying to deal with the fluttering issue. It's been driving me crazy!
    I finally found out what was causing it. It turns out, this is actually an age-old problem in Unity, and it's the fact that Unity doesn't provide any good way to accurately time audio sequences. It wasn't anything to do with Speex at all.
    Something that definitely seems to improve it is setting DSP Buffer Size to Best Latency.
     
  6. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    New video demonstrating audio reverb zones (also demonstrates just how much better it sounds with the DSP Buffer Size setting)
     
  7. HeadClot88

    HeadClot88

    Joined:
    Jul 3, 2012
    Posts:
    736
    Hey,

    A few quick questions -

    First being - How much does this (Plan to) Sell for on the asset store?

    Second being - Can one team hear another team if they get too close?

    Thanks

    Ben
     
  8. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    First - I plan to sell this for right around $20.
    Second - No, there's no chance of that. The voice channel property actually filters out voice data, so if the local player receives voice data on a different channel than the local uSpeaker, it's completely discarded. Of course, if you wanted to you could leave the voice channel at -1 (which is a global channel) and code your own solution (so, if you actually wanted to have teams hear each other if they are near each other, it's entirely possible to do so)
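    The channel filter and the "code your own solution" idea can be sketched roughly as follows (Python, for illustration only). The function names are hypothetical, and treating -1 as "accept everything" on either side is an assumption about how the global channel behaves.

```python
import math

GLOBAL_CHANNEL = -1  # the global channel value mentioned above


def should_play(packet_channel: int, local_channel: int) -> bool:
    """Discard voice data from other channels; -1 is assumed to match anything."""
    if packet_channel == GLOBAL_CHANNEL or local_channel == GLOBAL_CHANNEL:
        return True
    return packet_channel == local_channel


def can_hear(listener_team, speaker_team, listener_pos, speaker_pos, radius):
    """Custom rule on top of the global channel: team-mates are always
    audible, enemies only when they are within `radius` of the listener."""
    if listener_team == speaker_team:
        return True
    return math.dist(listener_pos, speaker_pos) <= radius
```

    With everyone left on the global channel, a rule like `can_hear` would decide per packet whether an enemy is close enough to be overheard.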
     
  9. HeadClot88

    HeadClot88

    Joined:
    Jul 3, 2012
    Posts:
    736
    @PhobicGunner

    Thank you for the VERY quick reply :)

    The price point seems fair for what is being offered. Let me know when you add this to the asset store.

    Well at least it can be programmed in. :)
     
  10. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
  11. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Argh, I have a competitor.
    Normally I wouldn't care so much, except the competitor is offering it for $40, and on the Asset Store guidelines it says:
    I think selling mine for half price counts as grossly undercutting a similar product.
    I think I'm going to try to get away with $30 instead, especially since his does support one more codec that mine doesn't, which could account for the $10 price difference. If that doesn't work, I'll try $35.
     
  12. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    OK, one person has requested a beta package.
    There are still 19 left! Completely free of charge!
     
  13. SevenBits

    SevenBits

    Joined:
    Dec 26, 2011
    Posts:
    1,953
    I'll take a beta package. Happy to help!
     
  14. wmgcata

    wmgcata

    Joined:
    Jul 20, 2012
    Posts:
    169
    I'd like a beta copy. I think I'm still in the first 20 lucky ones? ;-)
     
  15. orb

    orb

    Joined:
    Nov 24, 2010
    Posts:
    3,037
    Does this require Pro? I'm assuming it does, unless you ported Speex, but asking to be sure :)
     
  16. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Actually, I'm using a .NET port of Speex, so it is entirely Indie compatible :) In fact, I developed it on an Indie license
     
  17. orb

    orb

    Joined:
    Nov 24, 2010
    Posts:
    3,037
    Nice! I'd love to try it, then :)

    Side-project #389 could really use voice communication ;)
     
  18. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    There are now 16 of these left... they're going much faster than I anticipated :D
     
  19. DaneC020

    DaneC020

    Joined:
    Mar 19, 2010
    Posts:
    191
    I would be interested in trying out a beta package as well. I am actually working on an online game but was unsure if I wanted to try and attempt handling VOIP when there were solutions out there already. It would be fun to try this out.

    -Dane
     
  20. SevenBits

    SevenBits

    Joined:
    Dec 26, 2011
    Posts:
    1,953
    Well, you have a nice package here. Besides, the only other person doing this never released a beta, so this is a chance for us to try voice chat in our games. ;)

    Also, you'll get feedback as well.
     
  21. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Yes, that is part of the reason I'm giving these away.
    I'm only one person, so having 20 people be able to use it in real cases and report any bugs or oversights is incredibly useful
     
  22. orb

    orb

    Joined:
    Nov 24, 2010
    Posts:
    3,037
    It's strange nobody had something like this officially available already. I was even considering it myself, but fortunately PhobicGunner got started first :)
     
  23. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    I find it kind of strange too. Especially since Unity already uses RakNet, why they don't go ahead and use RakVoice (since it appears to be included in the whole RakNet package) is beyond me.
    Oh well. Lack of built-in voice chat does provide us indie developers with market opportunities ;)
     
  24. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    For those of you who requested a beta download...
    I wish to apologize for a mistake in my code. Remote speakers will not process voice data. To fix it, do the following:
    Move the entire IF block starting on line 235 (it starts with the comment 'receive voice packets') and the line below (receiveTimer -= Time.deltaTime) to before line 205 ( which reads 'if( !ready || SpeakerMode == USpeakerMode.Remote )' )
    Sorry for the inconvenience.
     
    Last edited: Jul 30, 2012
  25. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    A couple of users have reported some issues with uSpeak, which I'll outline here...
    1.) It seems the codec I am using (NSpeex) is prohibitively slow. I'm going to be replacing the codec with Mu-law compression.
    2.) On some microphones, uSpeak doesn't transmit anything in Volume Activated mode. I'm going to try replacing my current VAD algorithm with something much more robust, likely based on zero-crossing rate and short-term energy.
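    For reference, a minimal sketch of a detector based on zero-crossing rate and short-term energy (Python, illustration only). The thresholds are placeholder values that would need tuning per microphone, and this is not the algorithm uSpeak ended up shipping; frames are assumed to be lists of at least two float samples in [-1, 1].

```python
def short_term_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_speech(frame, energy_threshold=0.01, zcr_threshold=0.25):
    """Voiced speech tends to be loud with few zero crossings;
    hiss and background noise tend to be quiet or cross zero often."""
    return (short_term_energy(frame) > energy_threshold
            and zero_crossing_rate(frame) < zcr_threshold)
```

    Using both measures helps on quiet microphones, where a pure volume threshold may never trigger.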

    Once I'm finished implementing these changes, I will send everybody a new version. Thanks so much for helping me work out some issues :)
     
  26. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    OK, on second thought scratch Mu-Law compression.
    After some digging I found that even though encoding does take quite a bit of time (20ms on average), it's only ever called on one player in the scene, so you will very likely never notice a slowdown. Decoding, on the other hand, is blazing fast (about 1ms on average).
    I did some experimentation, and now I think volume detection is more stable.
    I also fixed quite a few bugs I noticed while looking through my code.
    All current beta users should be receiving a new version. If you don't get a new version within an hour, please contact me.
    Thank you guys so much for testing this thing.
     
  27. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    15 uSpeak beta packages left...
    I really need help testing this thing! I want it to be as rock solid as possible before I release it.
     
  28. Tikaro

    Tikaro

    Joined:
    May 18, 2011
    Posts:
    23
    Sign me up! I'll throw it into our FPS test lab and see how it performs under some load.
     
  29. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Sent!
    Now 14 left
     
  30. GodlyPerfection

    GodlyPerfection

    Joined:
    Jun 18, 2012
    Posts:
    23
    Would love to try this out Phobic. I won't have it implemented for a couple weeks as basic multiplayer code is on my todo list for the next couple of weeks, but I'm a big believer in the importance of voice chat to solidify a community. I'll be adding it to my stealth multiplayer game "Covet" which is currently staying public throughout development on the browser.
     
  31. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Sent via PM :D
    There are now 13 left
     
  32. ahmetDP_1

    ahmetDP_1

    Joined:
    Sep 23, 2010
    Posts:
    113
    Nice work. I'd love to beta test uSpeak.
    We will consider buying it for our project when it is released.
     
  33. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    No need to purchase, privilege of a beta tester :)
     
  34. ahmetDP_1

    ahmetDP_1

    Joined:
    Sep 23, 2010
    Posts:
    113
    Great! thanks
     
  35. GodlyPerfection

    GodlyPerfection

    Joined:
    Jun 18, 2012
    Posts:
    23
    This makes me very happy. Thanks phobic. :) Like I said I'll let you know when I get it implemented into my public version of Covet.
     
  36. UnLogick

    UnLogick

    Joined:
    Jun 11, 2011
    Posts:
    1,745
    Sign me up. I doubt I can make it scale on an mmo, but I'll give it a try. :)
     
  37. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Well, it could get tricky on an mmo. Think of a couple hundred people in a region all trying to talk at once. Not only is that a lot of bandwidth, it's also going to be very noisy. More likely, you'd want to either have voice chat in a few specific areas (like PvP arenas, where the number of players in one area is likely to be much smaller), or use the voice channels feature to separate parties into different channels (so a party of players can voice chat amongst themselves)
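    To put rough numbers on that, here is a small back-of-the-envelope sketch (Python). It assumes a relay server, that everyone in a voice channel talks at once, and that each listener receives one stream per other talker; the per-stream bitrate is left as a parameter because no measured uSpeak figure is given here.

```python
def downlink_streams(players: int, group_size: int) -> int:
    """Total streams a relay server forwards when `players` are split into
    voice channels of `group_size` and everyone in a channel is talking."""
    full_groups, remainder = divmod(players, group_size)
    # Each member of a channel receives one stream from every other member.
    total = full_groups * group_size * (group_size - 1)
    total += remainder * max(remainder - 1, 0)
    return total


def total_kbps(streams: int, kbps_per_stream: float) -> float:
    """Aggregate bandwidth for a given per-stream bitrate (an assumed input)."""
    return streams * kbps_per_stream
```

    For 200 players in one open channel this gives 200 × 199 = 39,800 forwarded streams, versus 800 when the same players are split into parties of five. At a placeholder 16 kbit/s per stream (an assumption, not a uSpeak measurement), that is roughly 637 Mbit/s versus 12.8 Mbit/s.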
     
  38. GodlyPerfection

    GodlyPerfection

    Joined:
    Jun 18, 2012
    Posts:
    23
    If you have a proximity system, that could work as well: do distance checks every so often to turn chat on/off between particular people.
     
  39. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    That could be done super easily on the server-side. If you have a quadtree system, you could very quickly locate players around a given player, so if he sends voice chat it queries the quadtree for nearby players and sends the data to them. You'd also make the voices 3D so the player doesn't notice a hard line where voice data stops.
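    A compact sketch of the server-side lookup (Python, illustration only). The class, its parameters, and the 2D world layout are assumptions; a real server would also need to rebuild or update the tree as players move.

```python
class Quadtree:
    """Point quadtree over a square region; stores (x, y, player_id)."""

    def __init__(self, cx, cy, half, capacity=4, depth=0, max_depth=8):
        self.cx, self.cy, self.half = cx, cy, half
        self.capacity, self.depth, self.max_depth = capacity, depth, max_depth
        self.points = []
        self.children = None

    def _contains(self, x, y):
        return abs(x - self.cx) <= self.half and abs(y - self.cy) <= self.half

    def insert(self, x, y, player_id):
        if not self._contains(x, y):
            return False
        if self.children is None:
            # Leaves at max depth keep growing so stacked players cannot
            # cause endless splitting.
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y, player_id))
                return True
            self._split()
        return any(c.insert(x, y, player_id) for c in self.children)

    def _split(self):
        h = self.half / 2
        self.children = [
            Quadtree(self.cx + dx * h, self.cy + dy * h, h,
                     self.capacity, self.depth + 1, self.max_depth)
            for dx in (-1, 1) for dy in (-1, 1)
        ]
        old, self.points = self.points, []
        for x, y, pid in old:
            any(c.insert(x, y, pid) for c in self.children)

    def query_radius(self, x, y, r, found=None):
        """Return ids of all players within distance r of (x, y)."""
        found = [] if found is None else found
        # Skip nodes whose square cannot overlap the circle's bounding box.
        if abs(x - self.cx) > self.half + r or abs(y - self.cy) > self.half + r:
            return found
        for px, py, pid in self.points:
            if (px - x) ** 2 + (py - y) ** 2 <= r * r:
                found.append(pid)
        for child in self.children or []:
            child.query_radius(x, y, r, found)
        return found
```

    When a voice packet arrives, the server would call something like `tree.query_radius(speaker_x, speaker_y, hearing_range)`, drop the speaker's own id from the result, and forward the packet only to the remaining players.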
     
  40. orb

    orb

    Joined:
    Nov 24, 2010
    Posts:
    3,037
    LotRO and DDO do group-only chat. That's the best way to do it. A chatroom per party of up to 9. Special care can be taken to make a raid leader and each party's designated leader the only ones speaking in a raid, if you have that option. Then you can keep the bandwidth abuse low and people can hear orders barked over the tubes :)
     
  41. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Hm... a few more thoughts on bandwidth-saving.
    One thing you could do is, when a party is formed, create a separate private room for them whose sole purpose is to ferry voice communication. Depending on your multiplayer tech, you might even be able to just open up a P2P connection between party members. (In my case, I'm using Player.IO for an upcoming game which will be pretty small in scale, so I'll probably just create a separate voice chat room when a player hosts a room. I could just as easily use Unity Networking to open up a direct connection between the players.)
     
  42. JClaus

    JClaus

    Joined:
    Aug 1, 2012
    Posts:
    5
    Interested in a beta of this as well. I'm working on a game with a group of recent grads from my university, to turn what we learned in our game design class into a video game. I have been looking into voice alternatives and was planning to port the Mumble client into Unity eventually... but this could save me the trouble :)

    Would be using SmartFox as a transport layer for voice packets and since the project is open source you are more than welcome to steal the SmartFox code once I get it working ;)

    Sounds like you are handing out the source. If that is the case, let me know if you don't want that public, and I can package it and only put the package on the repo.
     
  43. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Well, I'm not worried about people distributing the source without permission. The chances of that happening are actually fairly low, and if it happens I haven't deleted any messages and have enough proof that I am indeed the original creator ;)
    Also, as far as I know, Mumble uses the exact same codec I am using - Speex. So essentially you're getting at least comparable voice quality (I think in the C++ version of Speex you get more quality control, but other than that nothing is different)
     
  44. Bioblaze

    Bioblaze

    Joined:
    May 8, 2012
    Posts:
    103
    Throw me a Beta for this if you will. :D if that's ok with you ^_^
     
  45. SevenBits

    SevenBits

    Joined:
    Dec 26, 2011
    Posts:
    1,953
    I believe that's the case as well.
     
  46. UnLogick

    UnLogick

    Joined:
    Jun 11, 2011
    Posts:
    1,745
    Well I'll be going with the voice channels for party, guild, etc. That way I can mix the voice server side, make rules like if guild master / raid leader speaks others are faded down so orders can be barked, etc.

    Unless you did P2P, bandwidth usage for free-roaming voice would be insane. Imagine early Ironforge scenarios in WoW, where hundreds or even thousands of players would be hanging around in one plaza. And you definitely don't want to do P2P, because then players can get each other's IPs. I'm certain that some people would resort to killing a router to disconnect an opponent in PvP.

    It's worth a shot, and I'll let you guys know what kind of bandwidth this consumes IRL.

    Cheers,
    UnLogick
     
  47. SevenBits

    SevenBits

    Joined:
    Dec 26, 2011
    Posts:
    1,953
    I've always wondered about bandwidth usage as well from solutions like this.
     
  48. JClaus

    JClaus

    Joined:
    Aug 1, 2012
    Posts:
    5
    Correct. The biggest advantage of Mumble is that it (and Murmur) are already established. Not saying they are a superior product, but I would rather reuse something that has a proven track record than go through the pains of testing from alpha myself. Which is why I would gladly use your solution instead :)
     
  49. JClaus

    JClaus

    Joined:
    Aug 1, 2012
    Posts:
    5
    It seems P2P is actually more likely to kill clients. Admittedly, the bandwidth requirement on your server(s) would need to be insane to handle free-roaming voice at any sort of scale. However, imagine if you walked into a room and 100 separate peers all wanted to resolve a UDP port to use simultaneously. I think someone magically reverse engineering a peer's IP (which is hidden by NATs in most cases) would be the least of your worries...

    Changing the voice mixing based on a set of rules is an interesting idea. I wonder how it would affect player experience if their party leader automatically was given 'command' when speaking? Would it support a more authoritative party structure in your game? Would it make players subconsciously less apt to argue/contradict a direct command from their raid leader? Essentially, you are allowing the leader to suppress freedom of speech...sorry I am straying. Interesting food for thought though :)
     
  50. PhobicGunner

    PhobicGunner

    Joined:
    Jun 28, 2011
    Posts:
    1,813
    Yeah, I think for any moderately scaled MMO you'd probably just want to leave out free roaming chat and stick with party chat, and possibly direct chat (click on another player to talk to them or something). Free roaming is bound to generate an insane amount of data.

    Also, for leader speaking, I think you could probably use the Gain for that, just check the isTalking function of the leader and if it is true set Gain of all other members to 0.5, otherwise 1.0. Maybe add a lerp to smooth it out.
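    A small sketch of that ducking logic (Python, illustration only). `Gain` and `isTalking` are the names mentioned above; the function names, the `speed` parameter, and the per-frame `dt` are assumptions.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation from a to b, with t clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return a + (b - a) * t


def update_member_gain(current_gain: float, leader_is_talking: bool, dt: float,
                       ducked: float = 0.5, normal: float = 1.0,
                       speed: float = 5.0) -> float:
    """Move a party member's gain toward 0.5 while the leader talks,
    and back toward 1.0 otherwise. Call once per frame with the frame time."""
    target = ducked if leader_is_talking else normal
    return lerp(current_gain, target, speed * dt)
```

    Each frame, every non-leader speaker would set its gain to `update_member_gain(gain, leader_is_talking, dt)`, so voices fade down and back up instead of jumping.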