History
When WebRTC was first introduced, it was meant to be used in a peer-to-peer environment with a small number of users broadcasting and subscribing to each other. At the time, this was a wicked-cool new development for modern web browsers, and it allowed for small-group audio and video calls.
Nowadays, WebRTC is being used to broadcast to a larger population of subscribers (for example, a Crowdcast event with dozens or hundreds of live attendees). When you start broadcasting to a bigger group, it doesn’t make sense to have all the peers connected to each other. The concept of a Selective Forwarding Unit (SFU) was introduced to start scaling WebRTC. The SFU is a server that all the browsers connect to, and it forwards the WebRTC packets from the publisher's browser to each subscriber's browser. It looks a little like this.
- Publisher - a browser with its camera and mic on, broadcasting video and audio data. In Crowdcast we call this the host.
- Subscriber - a browser that is consuming the broadcast by receiving video and audio data. In Crowdcast we call this an attendee.
- Bitrate - the number of bits per second being transferred over the network. A higher bitrate generally means better video and audio quality.
By introducing an SFU we can solve the first problem of having too many connections to the publisher. If there are 10 subscribers, the publisher doesn’t need to be connected to each of them. The publisher sends their video and audio bytes to the SFU and the SFU forwards those bytes to each of the 10 subscribers. That is how we can start using WebRTC with a larger number of subscribers.
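To make the difference concrete, here is a small sketch of the connection arithmetic. The helper names are hypothetical and only illustrate the counting; they are not part of any WebRTC API.

```typescript
// Count connections in a full mesh versus an SFU topology.
// All function names here are made up for illustration.

/** Full mesh: every participant connects directly to every other participant. */
function meshConnections(participants: number): number {
  return (participants * (participants - 1)) / 2;
}

/** SFU: every participant holds exactly one connection, to the server. */
function sfuConnections(participants: number): number {
  return participants;
}

/** Number of copies of the stream the publisher has to upload. */
function publisherUploads(subscribers: number, viaSfu: boolean): number {
  return viaSfu ? 1 : subscribers;
}

// One publisher plus 10 subscribers is 11 participants.
console.log(meshConnections(11)); // 55
console.log(sfuConnections(11)); // 11
console.log(publisherUploads(10, false)); // 10
console.log(publisherUploads(10, true)); // 1
```

The mesh count grows quadratically with the number of participants, while the SFU count grows linearly and the publisher's upload stays at one stream.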
Setting publisher bitrate
Without an SFU, when all the clients are connected to each other, the publisher sends video and audio to each subscriber at the bitrate that subscriber requests. Subscribers on slow connections receive media at a lower bitrate, and subscribers on fast connections receive media at a higher bitrate.
With the topology pictured above, there is now an SFU in the middle. The SFU receives feedback from each subscriber about its network connection and, out of necessity, instructs the publisher to satisfy the needs of the worst subscriber (the lowest bitrate). This ensures that every subscriber in the session is able to consume the publisher's content.
The obvious problem with this is that one bad subscriber will bring down the quality for all the subscribers. See this image:
Let’s put this in the context of Crowdcast. When you’re a host broadcasting to a group, there is a really high likelihood that as soon as you get a few attendees, someone is going to have a bad connection. We can’t afford to force all the attendees to suffer because of one person on a bad connection.
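The "worst subscriber wins" behavior can be sketched in a few lines. This is a simplified model, assuming the SFU has a bandwidth estimate for each subscriber; the `SubscriberReport` type and `publisherTargetKbps` function are hypothetical names, and a real SFU's feedback handling is more involved.

```typescript
// Simplified model: without Simulcast, a single stream has to serve everyone,
// so the publisher's bitrate is capped by the slowest subscriber.
interface SubscriberReport {
  id: string;
  estimatedKbps: number; // bandwidth estimate the SFU holds for this subscriber
}

function publisherTargetKbps(
  reports: SubscriberReport[],
  publisherMaxKbps: number
): number {
  if (reports.length === 0) return publisherMaxKbps;
  const slowest = Math.min(...reports.map((r) => r.estimatedKbps));
  return Math.min(slowest, publisherMaxKbps);
}

const reports: SubscriberReport[] = [
  { id: "a", estimatedKbps: 2500 },
  { id: "b", estimatedKbps: 1800 },
  { id: "c", estimatedKbps: 300 },
];

// Subscriber "c" drags everyone down to 300 kbps.
console.log(publisherTargetKbps(reports, 2000)); // 300
```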
Enter Simulcast
Luckily, WebRTC has matured to introduce a feature that deals with this. It's called Simulcast. With Simulcast enabled, the publisher encodes their video at several different bitrates and sends all of the streams to the SFU. The SFU negotiates with each subscriber and sends each subscriber whichever stream is best for them.
Simulcast is a publisher feature and, when properly implemented, the subscriber does not know that Simulcast is happening behind the scenes.
Genius! Now we can actually run WebRTC in a broadcast scenario. No matter how many subscribers we have, we do not need to bring down the bitrate of the publisher for weak subscribers. Each subscriber can consume whichever bitrate is best for them. There is a small tradeoff: more overhead on the publisher side (higher CPU, memory utilization and bandwidth). But the benefits for every subscriber far outweigh this downside for the publisher.
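Here is a rough sketch of the SFU side of that idea: given the layers a publisher sends, pick one per subscriber. The field names mirror the standard `RTCRtpEncodingParameters` dictionary (`rid`, `maxBitrate`, `scaleResolutionDownBy`), but the bitrate values, the `SimulcastLayer` type and the `selectLayer` function are assumptions made up for illustration, not how any particular SFU works.

```typescript
// Three illustrative simulcast layers, lowest to highest quality.
interface SimulcastLayer {
  rid: string; // identifier for the layer
  maxBitrate: number; // bits per second
  scaleResolutionDownBy: number; // 1 = full resolution, 2 = half, ...
}

const layers: SimulcastLayer[] = [
  { rid: "low", maxBitrate: 150000, scaleResolutionDownBy: 4 },
  { rid: "mid", maxBitrate: 500000, scaleResolutionDownBy: 2 },
  { rid: "high", maxBitrate: 1500000, scaleResolutionDownBy: 1 },
];

/**
 * Pick the highest-bitrate layer that fits the subscriber's bandwidth.
 * If nothing fits, fall back to the lowest layer so the subscriber
 * still receives something.
 */
function selectLayer(
  available: SimulcastLayer[],
  subscriberBps: number
): SimulcastLayer {
  const sorted = [...available].sort((a, b) => a.maxBitrate - b.maxBitrate);
  let chosen: SimulcastLayer | undefined;
  for (const layer of sorted) {
    if (chosen === undefined || layer.maxBitrate <= subscriberBps) {
      chosen = layer;
    }
  }
  if (chosen === undefined) {
    throw new Error("publisher is not sending any layers");
  }
  return chosen;
}

console.log(selectLayer(layers, 600000).rid); // mid
console.log(selectLayer(layers, 5000000).rid); // high
console.log(selectLayer(layers, 100000).rid); // low (fallback)
```

On the publishing side, in browsers that implement the standard `sendEncodings` option, objects shaped like these can be passed to `RTCPeerConnection.addTransceiver(track, { direction: "sendonly", sendEncodings: layers })`. Whether a given browser honors them for a given codec is exactly the support question discussed in the next section.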
So where can we use Simulcast now?
Since Simulcast is a publisher feature, the publishing browser has to implement Simulcast for each codec. Right now, the main codecs used for WebRTC are VP8 and H264.
- Chrome supports: VP8 and H264 for publishing and subscribing. Simulcast is supported in VP8.
- Safari supports: H264 for publishing and subscribing. Simulcast is not supported.
- Firefox supports: VP8 and H264 for publishing and subscribing. Simulcast is supported in VP8.
For now, since we require Simulcast, we have to force hosts to publish in Chrome or Firefox with the VP8 codec. The downside of that is that Safari cannot subscribe to streams in the VP8 codec. We work around this by showing compatibility mode on Safari, which is an HLS version of the session.
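The decision described above can be written down as a small lookup. The support table is a snapshot of the browser list in this post, not a live capability check, and the type and function names are hypothetical.

```typescript
type Browser = "chrome" | "firefox" | "safari";
type Codec = "VP8" | "H264";

interface Support {
  codecs: Codec[]; // codecs the browser can publish and subscribe with
  simulcast: Codec[]; // codecs the browser can publish with Simulcast
}

// Snapshot of the browser list above; this will go stale as browsers change.
const support: Record<Browser, Support> = {
  chrome: { codecs: ["VP8", "H264"], simulcast: ["VP8"] },
  firefox: { codecs: ["VP8", "H264"], simulcast: ["VP8"] },
  safari: { codecs: ["H264"], simulcast: [] },
};

/** A host needs Simulcast for the session codec. */
function canHost(browser: Browser, sessionCodec: Codec): boolean {
  return support[browser].simulcast.indexOf(sessionCodec) !== -1;
}

/** A subscriber gets real-time video if it can decode the codec, else HLS. */
function subscriberMode(
  browser: Browser,
  sessionCodec: Codec
): "realtime" | "hls" {
  return support[browser].codecs.indexOf(sessionCodec) !== -1
    ? "realtime"
    : "hls";
}

console.log(canHost("chrome", "VP8")); // true
console.log(canHost("safari", "H264")); // false
console.log(subscriberMode("firefox", "VP8")); // realtime
console.log(subscriberMode("safari", "VP8")); // hls
```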
The good news is that Chrome is in the process of adding H264 Simulcast support. Soon after this lands in Chrome, we'll be able to switch our sessions to the H264 codec (with Simulcast) and users in Safari will be able to subscribe to the real-time sessions. In fact, this functionality already landed in Chrome Canary a couple months ago, so it should be making its way into Chrome Stable soon. [1, 2] Hopefully Firefox will follow suit. [3] I'll be sure to post an update here when the status changes.
The Future: Scalable Video Coding (SVC)
There is an even better idea on the horizon that improves on Simulcast. It's called Scalable Video Coding (SVC). SVC is a codec-level feature that allows a publisher to encode their video stream in multiple layers. Each layer builds on the previous layer to go from the lowest bitrate to the highest bitrate. The benefit is that the publisher is not required to encode and send multiple versions of the same video (they only have to encode and send 1 stream), and the SFU can peel off layers when sending video to each subscriber.
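A toy model of that layer peeling might look like the following. It assumes each layer adds a fixed amount of bitrate on top of the layers beneath it; the layer names, the numbers and the `layersToForward` function are invented for illustration, and real SVC streams have more structure than a flat list.

```typescript
// Each SVC layer only makes sense together with all the layers below it.
interface SvcLayer {
  name: string;
  additionalKbps: number; // extra bitrate this layer adds on top of lower layers
}

const svcLayers: SvcLayer[] = [
  { name: "base", additionalKbps: 200 },
  { name: "enhancement-1", additionalKbps: 400 },
  { name: "enhancement-2", additionalKbps: 900 },
];

/**
 * Forward layers from the base upward while the running total fits the
 * subscriber's bandwidth. The base layer is always forwarded, because
 * without it the subscriber has nothing to decode.
 */
function layersToForward(layers: SvcLayer[], subscriberKbps: number): string[] {
  const forwarded: string[] = [];
  let totalKbps = 0;
  for (const layer of layers) {
    const wouldExceed = totalKbps + layer.additionalKbps > subscriberKbps;
    if (wouldExceed && forwarded.length > 0) break;
    totalKbps += layer.additionalKbps;
    forwarded.push(layer.name);
  }
  return forwarded;
}

console.log(layersToForward(svcLayers, 700)); // [ 'base', 'enhancement-1' ]
console.log(layersToForward(svcLayers, 2000)); // all three layers
console.log(layersToForward(svcLayers, 100)); // [ 'base' ]
```

Unlike the Simulcast case, the publisher uploads a single stream here, and the SFU decides per subscriber how many layers of it to pass along.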
Right now, SVC is not supported by the current codecs in a WebRTC environment. Support is arriving with VP9 (the successor to VP8) and H265 (the successor to H264). When these codecs mature and WebRTC is able to support SVC, it will be a big step forward. We're keeping our eyes peeled for when we can start using this. Until then, we will be using Simulcast.
Update October 10, 2018:
Safari gets H.264 Simulcast in Safari Technology Preview Version 67
- Release notes and Changelog
- Safari Technology Preview is Safari's early developer release browser.
- This is good news: if Chrome and Safari both support H.264 simulcast, then we can start using this codec for Crowdcast sessions, and all three of the main browsers will be able to subscribe to sessions in realtime.
- Surprisingly, this Safari Preview release also includes experimental VP8 support with Simulcast. It is off by default and not mentioned in the release notes, but it is more good news for supporting realtime simulcast sessions in Safari on Crowdcast. See the discussion here
Notes:
[1] http://webrtcbydralex.com/index.php/2018/06/21/h-264-finally-a-first-class-citizen-in-webrtc-stacks/
[2] https://bugs.chromium.org/p/webrtc/issues/detail?id=5840
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1210175