WebRTC
WebRTC (Web Real-Time Communication) is a technology that enables real-time communication directly between browsers and applications without requiring additional plugins or software. Here are the key concepts and keywords associated with WebRTC:
Core WebRTC Concepts
- Peer-to-Peer Communication: Direct browser-to-browser connections; media is relayed through a server only when no direct path can be established (see TURN below)
- MediaStream API: Captures audio and video streams from devices
- RTCPeerConnection: Manages the peer-to-peer connection
- RTCDataChannel: Enables bidirectional data transfer between peers
Key Components and Protocols
- ICE (Interactive Connectivity Establishment): Framework for connecting peers
- STUN (Session Traversal Utilities for NAT): Lets a peer discover its public IP address and port as seen from outside its NAT
- TURN (Traversal Using Relays around NAT): Falls back to relay servers when direct connection isn't possible
- SDP (Session Description Protocol): Format for describing connection parameters
- Signaling: Process of coordinating communication between peers (not specified by WebRTC)
Security Features
- Mandatory encryption: All media and data sent over WebRTC is encrypted; there is no unencrypted mode
- DTLS (Datagram Transport Layer Security): Secures data channels and negotiates the keys used by SRTP
- SRTP (Secure Real-time Transport Protocol): Encrypts audio/video
Applications
- Video conferencing: Real-time video meetings
- Voice calling: Audio-only communication
- File sharing: Direct peer-to-peer file transfers
- Screen sharing: Broadcasting screen content
Key Callback Methods
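When three clients connect to each other, these are the main callbacks that get triggered. The on... names are event handlers of the WebSocket and RTCPeerConnection APIs; the handle... names and createPeerConnection are functions defined by the application.
WebSocket callbacks: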
- onopen: When the WebSocket connection to the server is established
- onmessage: When signaling messages are received
Peer connection establishment callbacks:
- handleJoined: When a client successfully joins a room
- handlePeersInRoom: When a client receives the list of existing peers
- handleNewPeer: When a new peer joins the room
- createPeerConnection: Creates a connection object for a specific peer
- createOffer/handleOffer: Generate and process session offers
- createAnswer/handleAnswer: Generate and process session answers
Media and connectivity callbacks:
- onicecandidate: Triggered when ICE candidates are generated
- oniceconnectionstatechange: When the ICE connection state changes
- ontrack: When media tracks are received from a peer
Disconnection callbacks:
- handlePeerLeave: When a peer leaves the room
- onclose: When the WebSocket connection closes
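The callback list above implies a dispatch step inside onmessage that routes each signaling message to the right handler. A minimal sketch of that step follows. The message envelope ({ type, ... }), the type strings, and the function name createSignalingDispatcher are assumptions made for illustration, since WebRTC does not define a signaling format.

```javascript
// Build a function that routes each signaling message to the handler
// registered for its "type" field. The envelope shape and the type names
// are hypothetical; use whatever your signaling server actually sends.
function createSignalingDispatcher(handlers) {
  return function onSignalingMessage(rawData) {
    let message;
    try {
      message = JSON.parse(rawData);
    } catch (err) {
      console.warn("Ignoring non-JSON signaling message");
      return false;
    }
    if (message === null || typeof message !== "object") {
      return false;
    }
    // Only accept types that were explicitly registered by the caller.
    const known = Object.prototype.hasOwnProperty.call(handlers, message.type);
    if (!known || typeof handlers[message.type] !== "function") {
      console.warn("No handler for signaling message type:", message.type);
      return false;
    }
    handlers[message.type](message);
    return true;
  };
}

// Usage in a browser, where socket is a WebSocket to the signaling server
// and the handle... functions are the application handlers listed above:
//   const dispatch = createSignalingDispatcher({
//     joined: handleJoined,
//     offer: handleOffer,
//     answer: handleAnswer,
//   });
//   socket.onmessage = (event) => dispatch(event.data);
```

Keeping the routing in one table makes it easy to see which message types the client understands and to ignore anything unexpected.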
In a mesh topology with three clients, the connection process becomes more complex as each client needs to establish direct connections with every other client. This requires:
- 3 separate peer connections (Client1↔Client2, Client1↔Client3, Client2↔Client3), so each client maintains 2 RTCPeerConnection objects
- 3 offer/answer exchanges, one per connection (6 SDP messages in total: one offer and one answer each)
- Multiple ICE candidate exchanges for each connection
The signaling server facilitates these connections but doesn't participate in the actual media streaming. Once connected, audio and video flow directly between peers, or through a TURN relay when no direct path exists. This implementation demonstrates how WebRTC can be used to build a multi-party video conferencing application with relatively simple server-side code, as the heavy lifting of media transmission happens directly between browsers.
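To see how the mesh grows beyond three clients, the counting can be sketched in a few lines. The function names meshPairs and meshCost are made up for this example; the arithmetic is the usual n(n-1)/2 count of unordered pairs, with one offer and one answer per pair.

```javascript
// List every unordered pair of clients that needs its own connection.
function meshPairs(clientIds) {
  const pairs = [];
  for (let i = 0; i < clientIds.length; i++) {
    for (let j = i + 1; j < clientIds.length; j++) {
      pairs.push([clientIds[i], clientIds[j]]);
    }
  }
  return pairs;
}

// Summarize the signaling cost of a full mesh with n clients (n >= 1).
function meshCost(n) {
  const connections = (n * (n - 1)) / 2;
  return {
    connections,                  // peer-to-peer links in the whole mesh
    sdpMessages: connections * 2, // one offer plus one answer per link
    connectionsPerClient: n - 1,  // RTCPeerConnection objects each client holds
  };
}

console.log(meshPairs(["Client1", "Client2", "Client3"]));
// [["Client1","Client2"], ["Client1","Client3"], ["Client2","Client3"]]
console.log(meshCost(3)); // { connections: 3, sdpMessages: 6, connectionsPerClient: 2 }
console.log(meshCost(6)); // { connections: 15, sdpMessages: 30, connectionsPerClient: 5 }
```

The quadratic growth is why mesh topologies are usually kept to small rooms.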
The Process When "Sending an Offer":
- Client 1 wants to initiate a connection with Client 2
- Client 1 creates an offer using createOffer()
- Client 1 sets this as its "local description" using setLocalDescription(offer)
- Client 1 sends this offer to Client 2 through the signaling server
- Client 2 receives the offer and sets it as its "remote description" using setRemoteDescription(offer)
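- Client 2 then creates an "answer" in response using createAnswer() (which is another SDP document)
- Client 2 sets the answer as its local description and sends it back to Client 1 through the signaling server
- Client 1 sets the answer as its remote description, which completes the offer/answer exchange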
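The offer steps, together with the answer's return trip, can be sketched as three small functions. This is an illustration under assumptions: sendSignal is a hypothetical helper that forwards a message to the other peer through the signaling server, and the { type, description } message shape is invented for this example. The RTCPeerConnection methods used (createOffer, createAnswer, setLocalDescription, setRemoteDescription) are the standard ones.

```javascript
// Caller (Client 1): create the offer, apply it locally, then send it.
async function sendOffer(pc, sendSignal) {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendSignal({ type: "offer", description: pc.localDescription });
}

// Callee (Client 2): apply the received offer, then create, apply,
// and send back the answer.
async function handleOffer(pc, message, sendSignal) {
  await pc.setRemoteDescription(message.description);
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  sendSignal({ type: "answer", description: pc.localDescription });
}

// Caller (Client 1) again: applying the answer completes the exchange.
async function handleAnswer(pc, message) {
  await pc.setRemoteDescription(message.description);
}
```

Passing the connection and the send function in as parameters keeps the same code usable for every peer in a mesh, since each remote peer has its own RTCPeerConnection.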
The ICE Process in WebRTC
The ICE (Interactive Connectivity Establishment) process follows a specific sequence of operations from candidate gathering to establishing a connection. Here's a detailed breakdown of what happens during the ICE process:
1. ICE Process Initialization
- Triggered by: Setting the local description (setLocalDescription(offer/answer))
- Creates: ICE agent with unique identifiers (ufrag/password)
2. Candidate Gathering
The ICE framework begins discovering all possible ways your device could be reached:
Host Candidate Collection
- Gathers local IP addresses from all network interfaces
- Includes IPv4, IPv6, VPN, physical and virtual interfaces
- Example: 192.168.1.5, 10.0.0.5, fe80::1234:5678:9abc
- Callback: onicecandidate triggers for each host candidate
STUN Server Queries
- Sends requests to STUN servers (e.g., stun.l.google.com:19302)
- STUN server replies with your public IP and port as seen from the internet
- Creates "server reflexive" candidates
- Example: Your public IP like 203.0.113.5
- Callback: onicecandidate triggers for each STUN candidate
TURN Server Allocation (if configured)
- Requests relay address from TURN server
- Creates "relay" candidates as fallback options
- Example: TURN server address like 74.125.143.127
- Callback: onicecandidate triggers for each relay candidate
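To tell which of the three candidate kinds a given onicecandidate event carries, the candidate string can be inspected. The sketch below parses the leading fields of a candidate attribute (foundation, component, transport, priority, address, port, then "typ" and the type). The function name parseCandidate and the sample string are illustrative; recent browsers also expose some of these values directly as properties such as type on RTCIceCandidate.

```javascript
// Parse the fixed leading fields of an ICE candidate attribute:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
// Returns null when the text does not look like a candidate line.
function parseCandidate(candidateLine) {
  const text = candidateLine.trim().replace(/^a=/, "");
  if (!text.startsWith("candidate:")) return null;
  const parts = text.slice("candidate:".length).trim().split(/\s+/);
  if (parts.length < 8 || parts[6] !== "typ") return null;
  const [foundation, component, transport, priority, address, port, , type] = parts;
  return {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type, // "host", "srflx" (server reflexive), "prflx", or "relay"
  };
}

// Sample string in the usual format; the values are made up.
const sample =
  "candidate:842163049 1 udp 1677729535 203.0.113.5 62788 typ srflx raddr 192.168.1.5 rport 45664";
const parsed = parseCandidate(sample);
console.log(parsed.type, parsed.address, parsed.port); // srflx 203.0.113.5 62788
```

In a browser, the string to parse is available as event.candidate.candidate inside the onicecandidate handler.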
3. Candidate Transmission (Trickle ICE)
- Each candidate is sent to the remote peer as soon as it's discovered
- This happens in parallel with continuing to gather more candidates
- Uses signaling server to relay candidates
- Code: socket.send(JSON.stringify({type: 'candidate', candidate: event.candidate}))
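A slightly fuller sketch of the trickle step is shown below. It adds one detail the one-liner leaves out: the onicecandidate event also fires with a null candidate when gathering finishes, and that event should not be forwarded as a candidate. The names wireTrickleIce, sendSignal, and the to field are assumptions for illustration.

```javascript
// Forward each locally gathered candidate to one remote peer as soon as
// it is discovered. sendSignal is a hypothetical helper that delivers a
// message through the signaling server, for example:
//   (msg) => socket.send(JSON.stringify(msg))
function wireTrickleIce(peerConnection, sendSignal, remotePeerId) {
  peerConnection.onicecandidate = (event) => {
    if (!event.candidate) {
      // A null candidate signals that gathering is complete.
      return;
    }
    sendSignal({
      type: "candidate",
      to: remotePeerId,
      candidate: event.candidate,
    });
  };
}
```

In a mesh, this would be called once per RTCPeerConnection so that every candidate is addressed to the peer that connection belongs to.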
4. Candidate Reception and Processing
- Remote peer receives each candidate via signaling channel
- Adds it to its RTCPeerConnection
- Code: peerConnection.addIceCandidate(new RTCIceCandidate(message.candidate))
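Because candidates trickle in over the signaling channel, one can arrive before the offer or answer it belongs to has been applied, and addIceCandidate() rejects while the connection has no remote description. A common workaround is to buffer early candidates. The sketch below is one possible shape for that buffer; createCandidateQueue is an illustrative name, not part of the WebRTC API.

```javascript
// Buffer remote candidates until the remote description is set, then
// hand them to the RTCPeerConnection in arrival order.
function createCandidateQueue(peerConnection) {
  const pending = [];
  return {
    // Call for every candidate message received from signaling.
    async add(candidate) {
      if (peerConnection.remoteDescription) {
        await peerConnection.addIceCandidate(candidate);
      } else {
        pending.push(candidate);
      }
    },
    // Call right after setRemoteDescription() has resolved.
    async flush() {
      while (pending.length > 0) {
        await peerConnection.addIceCandidate(pending.shift());
      }
    },
    // Number of candidates still waiting for a remote description.
    get size() {
      return pending.length;
    },
  };
}
```

With one queue per peer connection, the candidate handler calls add() and the offer and answer handlers call flush() after applying the remote description.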
5. Connectivity Checks
For each pair of local and remote candidates, WebRTC performs:
- STUN Binding Requests: Sent from each local candidate to each remote candidate
- Consent Freshness: Periodic checks to ensure the connection remains valid
- Prioritization: Tests pairs in order of priority (host pairs first, then server reflexive, then relay)
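The ordering of host, server reflexive, and relay candidates falls out of the priority formula in the ICE specification: priority = 2^24 × type preference + 2^8 × local preference + (256 − component ID). The sketch below uses the type preferences the specification recommends (host 126, peer reflexive 110, server reflexive 100, relay 0). The default local preference of 65535 is an assumption for illustration; browsers choose their own values, so real priorities will differ.

```javascript
// Type preferences recommended by the ICE specification.
const TYPE_PREFERENCE = new Map([
  ["host", 126],
  ["prflx", 110],
  ["srflx", 100],
  ["relay", 0],
]);

// Candidate priority:
//   2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(type, localPreference = 65535, componentId = 1) {
  if (!TYPE_PREFERENCE.has(type)) {
    throw new RangeError("Unknown candidate type: " + type);
  }
  return (
    TYPE_PREFERENCE.get(type) * 2 ** 24 +
    localPreference * 2 ** 8 +
    (256 - componentId)
  );
}

console.log(candidatePriority("host"));  // 2130706431
console.log(candidatePriority("srflx")); // 1694498815
console.log(candidatePriority("relay")); // 16777215
```

Because the type preference occupies the highest bits, any host candidate outranks any server reflexive candidate, which in turn outranks any relay candidate.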
6. ICE State Transitions
The RTCPeerConnection goes through various ICE states:
- new: Initial state, no network activity yet
- checking: Connectivity checks in progress
- connected: At least one usable candidate pair found
- completed: ICE has found the final candidate pair to use
- failed: All candidate pairs have failed connectivity checks
- disconnected: Connectivity lost but may recover
- closed: ICE agent has shut down
Code: These transitions are monitored via:
peerConnection.oniceconnectionstatechange = function() {
  console.log("ICE state changed to: " + peerConnection.iceConnectionState);
};
7. Candidate Selection and Connection
- The "best" working candidate pair is selected (typically the highest priority pair that passes checks)
- Media begins flowing through this selected path
- If this path fails, ICE will switch to another working candidate pair
8. ICE Restarts (if needed)
- If connection quality degrades or fails, ICE can be restarted
- Creates new ufrag/password and gathers fresh candidates
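- Previous candidates are discarded once the restart succeeds; until then, traffic continues on the existing path
- Code: peerConnection.restartIce() (takes effect on the next offer/answer exchange, which carries the new ICE credentials)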
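One way to tie the state transitions to the restart step is to request a restart whenever the ICE state reaches "failed". The sketch below shows that policy; watchIceState and the onStateChange callback are names invented for this example. Note that restartIce() only requests the restart: the application still has to run a new offer/answer exchange, typically from its negotiationneeded handler.

```javascript
// Report every ICE state change and request an ICE restart on failure.
function watchIceState(peerConnection, onStateChange = () => {}) {
  peerConnection.oniceconnectionstatechange = () => {
    const state = peerConnection.iceConnectionState;
    onStateChange(state);
    if (state === "failed") {
      // Requests new ICE credentials; they are sent with the next offer.
      peerConnection.restartIce();
    }
  };
}

// Usage in a browser:
//   watchIceState(peerConnection, (state) => console.log("ICE state:", state));
```

Restarting only on "failed" is a deliberately conservative choice, since "disconnected" often recovers on its own.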
9. DTLS/SRTP Establishment
Once ICE establishes the connection path:
- DTLS handshake occurs over the selected candidate pair
- SRTP encryption keys are derived from the DTLS handshake (DTLS-SRTP)
- Secure media channels are established
Real-World Example: Callback Sequence
When a new participant joins our video chat, here's the actual sequence:
1. setLocalDescription(offer) is called
2. onicecandidate callback: received host candidate 192.168.1.5:45664
3. [Send candidate to remote peer via signaling server]
4. onicecandidate callback: received srflx candidate 203.0.113.5:62788
5. [Send candidate to remote peer via signaling server]
6. onicecandidate callback: received relay candidate 74.125.143.127:49203
7. [Send candidate to remote peer via signaling server]
8. [Meanwhile, receiving remote peer's candidates via signaling server]
9. addIceCandidate() for each received remote candidate
10. iceconnectionstatechange: "checking" - testing candidate pairs
11. iceconnectionstatechange: "connected" - usable candidate pair found
12. ontrack callback: remote media stream received
This entire process typically takes from about 200 ms to a few seconds, depending on network conditions and whether the peers can establish a direct connection or need to use a TURN relay.
========================================
NAT: Network Address Translation
SDP: Session Description Protocol
Session Description Protocol (SDP) Document:
- An offer is an SDP document created by the initiating peer (the caller)
- It contains details about:
  - Media types supported (audio, video)
  - Codecs supported (H.264, VP8, Opus, etc.)
  - Bandwidth requirements
  - Network transport information
  - Security parameters
Example of an SDP Offer
v=0
o=- 7645642868968801390 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1
a=extmap-allow-mixed
a=msid-semantic: WMS 8f1617fd-cad5-465e-9a3c-2fdeb441e04e
m=audio 9 UDP/TLS/RTP/SAVPF 111 63 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:+5Be
a=ice-pwd:vDdvUFxj6jrKbWr5iuWN8t/f
...
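Every SDP line has the form <type>=<value>, and each m= line starts a new media section, so an offer like this one can be split into session-level lines and per-media sections with a small parser. The sketch below is a simplified illustration, not a full SDP implementation; the function name parseSdp and the shortened sample document are made up for this example.

```javascript
// Split an SDP document into session-level fields and media sections.
// Each field becomes { key, value }; each "m=" line opens a new section.
function parseSdp(sdpText) {
  const session = [];
  const media = [];
  let current = session;
  for (const rawLine of sdpText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.length < 2 || line[1] !== "=") continue; // skip blank or malformed lines
    const field = { key: line[0], value: line.slice(2) };
    if (field.key === "m") {
      // m=<media> <port> <protocol> <format list>
      const [kind, port, protocol, ...formats] = field.value.split(" ");
      const section = { kind, port: Number(port), protocol, formats, lines: [] };
      media.push(section);
      current = section.lines;
    } else {
      current.push(field);
    }
  }
  return { session, media };
}

// Shortened, made-up sample in the same style as the offer above.
const sampleSdp = [
  "v=0",
  "o=- 123456 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
  "c=IN IP4 0.0.0.0",
  "a=ice-ufrag:abcd",
].join("\r\n");

const result = parseSdp(sampleSdp);
console.log(result.session.length);   // 4
console.log(result.media[0].kind);    // audio
console.log(result.media[0].formats); // [ '111', '0' ]
```

In a browser, the text to parse is the sdp property of the object returned by createOffer() or createAnswer().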