Media API v1

VideoSDK's v1 media surface. Clean class hierarchy, async/await throughout, no MediaStreamTrack in your code unless you reach for the documented escape hatch.

Quick start

import { VideoSDK, MediaKind } from '@videosdk/js';

// 1. Connect — Promise resolves only after the join handshake completes
const room = await VideoSDK.join({
  token: '...',
  roomId: 'my-room',
  name: 'Alice',
});

// 2. Publish your camera and mic — Promise returns the stream directly
const camStream = await room.localParticipant.publishVideo({ deviceId: 'default', resolution: 'h720' });
const micStream = await room.localParticipant.publishAudio({ deviceId: 'default' });
camStream.attach(document.querySelector('#self-view'));

// 3. Render every remote participant — both already-present and future joiners.
//    The SDK fires `participant-joined` retroactively for participants who
//    were in the room before you, so a single handler covers both cases.
room.on('participant-joined', async (p) => {
  // Register handler FIRST — fires once per kind as each pipeline becomes ready
  p.on('stream-subscribed', sub => {
    if (sub.video) document.querySelector('#grid').appendChild(sub.video.createElement());
    if (sub.audio) sub.audio.setVolume(0.7);
  });

  // Then subscribe — accepts MediaKind | MediaKind[]. Promise resolves on server ack.
  await p.subscribe([MediaKind.Audio, MediaKind.Video]);
});

Core concepts

Streams (not tracks)

The SDK exposes VideoStream, AudioStream, and ScreenStream as managed handles. Raw MediaStreamTrack is hidden by default; reach it via the getMediaStreamTrack() escape hatch only when integrating with browser APIs like MediaRecorder.
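As a rough illustration of the escape hatch, the sketch below records a managed stream with the browser's MediaRecorder. It assumes getMediaStreamTrack() returns a MediaStreamTrack synchronously. The helper name recordStream and its deps parameter are hypothetical (not part of the SDK) and exist only so the browser constructors can be swapped out in tests.

```javascript
// Sketch only: record an SDK-managed stream through the raw-track escape hatch.
// Assumption: stream.getMediaStreamTrack() synchronously returns a MediaStreamTrack.
// `recordStream` and `deps` are hypothetical names, not SDK API.
function recordStream(stream, deps = globalThis) {
  const track = stream.getMediaStreamTrack();
  // MediaRecorder consumes a MediaStream, so wrap the single raw track.
  const recorder = new deps.MediaRecorder(new deps.MediaStream([track]));
  const chunks = [];
  recorder.ondataavailable = (event) => {
    // Skip empty chunks; keep everything else for later assembly into a Blob.
    if (event.data && event.data.size > 0) chunks.push(event.data);
  };
  recorder.start(1000); // ask for a chunk roughly every second
  return {
    chunks,
    stop: () => recorder.stop(),
  };
}
```

In a browser, calling recordStream(camStream) and later stop() would leave the recorded chunks in the returned chunks array. Everything else in the application can keep using the managed stream handle.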

Local vs remote subtypes

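Each kind has a base interface plus distinct local and remote subtypes:

| Base | Local subtype | Remote subtype |
| --- | --- | --- |
| VideoStream | LocalVideoStream | RemoteVideoStream |
| AudioStream | LocalAudioStream | RemoteAudioStream |
| ScreenStream | LocalScreenStream | RemoteScreenStream |

Screen-share audio has its own pair, LocalScreenAudioStream and RemoteScreenAudioStream, reached via .audio on the screen stream.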

Asymmetric publish vs subscribe

Local publish: publishVideo / publishAudio / publishScreen return the stream directly via Promise. The same delivery also fires the consolidated stream-published event on LocalParticipant, which covers explicit calls, JoinOptions async publish, and pre-call auto-promote, so one handler serves all paths.

Remote subscribe: subscribe(kind) accepts a single kind or an array and returns Promise<void> (server ack). The stream payload is delivered via the consolidated stream-subscribed event, which fires once per kind as each pipeline becomes ready independently. This avoids blocking fast kinds (audio ~100ms) on slow kinds (screen ~300ms).
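The split between the subscribe ack and per-kind stream delivery can be sketched as a small wrapper. The helper name subscribeEach is hypothetical and not part of the SDK. The sketch assumes only what this section describes: participant.on('stream-subscribed', ...), participant.subscribe(kinds), and a Subscription carrying kind plus one of the video / audio / screen slots.

```javascript
// Sketch only: `subscribeEach` is a hypothetical helper, not SDK API.
// It forwards each stream to `onStream` as soon as that kind's pipeline is ready.
function subscribeEach(participant, kinds, onStream) {
  const wanted = new Set(kinds);
  // Register the handler before subscribing so no early delivery is missed.
  participant.on('stream-subscribed', (sub) => {
    if (!wanted.has(sub.kind)) return; // a kind this call did not ask for
    // Assumption: only the slot matching `sub.kind` is populated.
    const stream = sub.video ?? sub.audio ?? sub.screen;
    onStream(sub.kind, stream);
  });
  // Resolves on server ack only; streams arrive later through the handler.
  return participant.subscribe(kinds);
}
```

With this shape, a caller could start audio playback as soon as the audio Subscription arrives, without waiting for the slower screen pipeline.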

Lifecycle vs operational events

Publish notifications on RemoteParticipant (stream-published, stream-unpublished) fire when a remote participant's publish state changes. The payload is { kind: MediaKind }. stream-published also fires retroactively for kinds that were already published.

Subscribe lifecycle on RemoteParticipant (stream-subscribed, stream-unsubscribed) — stream-subscribed delivers a Subscription with kind + the matching video? / audio? / screen? stream slot when your subscription pipeline is ready.

Operational events fire on the stream object itself — register them after the stream exists. Remote streams expose a single state-changed event delivering all transitions of streamState — routing (paused / ended) for all three subtypes, plus decoder observations (frozen / stuck) on video and screen. Video and screen also fire a separate quality-changed event for simulcast layer switches. Local subtypes expose source-level events (ended, plus silent-detected on LocalAudioStream).

// Notification on participant — they published, decide to subscribe
p.on('stream-published', ({ kind }) => p.subscribe(kind));

// Subscribe lifecycle — payload is Subscription with the freshly-subscribed stream
p.on('stream-subscribed', sub => {
  if (sub.video) {
    sub.video.attach(videoEl);
    // Single operational event on the stream covers every transition
    sub.video.on('state-changed', ({ state }) => {
      if (state === 'frozen' || state === 'stuck') showFreezeIndicator();
      else if (state === 'ended') removeTile();
      else if (state === 'active') hideFreezeIndicator();
    });
  }
});
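Because a single state-changed event carries every transition, the UI decision can be factored into a pure function that is easy to test. The sketch below is one possible shape. tileActionFor and the action names are hypothetical. The state strings are the ones named in this section, with 'active' taken from the handler above.

```javascript
// Sketch only: `tileActionFor` and its return values are hypothetical.
// Maps a remote stream's state to the UI action a tile might take.
function tileActionFor(state) {
  switch (state) {
    case 'frozen':
    case 'stuck':
      return 'show-freeze-indicator'; // decoder observations (video and screen)
    case 'paused':
      return 'show-paused-placeholder'; // routing state
    case 'ended':
      return 'remove-tile'; // routing state
    case 'active':
      return 'clear-indicators';
    default:
      return 'ignore'; // unknown or future states
  }
}
```

A handler could then reduce to sub.video.on('state-changed', ({ state }) => applyAction(tileActionFor(state))), where applyAction is whatever the application uses to update the tile.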

Async + Promise rejection on failure

Every method that talks to the server (publishX, subscribe, setInputDevice, setPreferredQuality, pause/resume, …) returns a Promise. publishX resolves with the stream; everything else resolves with void. Initial failures (permission denied, server reject, timeout) come through Promise rejection — there are no *-publish-failed events.
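Since failures surface as rejections, ordinary Promise tooling is enough to handle them. The sketch below publishes camera and mic together and reports failures per kind rather than letting one rejection abort the other. publishBoth is a hypothetical helper, and the shape of the rejection reason is not specified here, so the sketch passes it through untouched.

```javascript
// Sketch only: `publishBoth` is a hypothetical helper, not SDK API.
// Publishes camera and mic in parallel and collects per-kind outcomes.
async function publishBoth(localParticipant, videoOpts, audioOpts) {
  const [video, audio] = await Promise.allSettled([
    localParticipant.publishVideo(videoOpts),
    localParticipant.publishAudio(audioOpts),
  ]);
  return {
    // A fulfilled publish resolves with the stream itself.
    video: video.status === 'fulfilled' ? video.value : null,
    audio: audio.status === 'fulfilled' ? audio.value : null,
    // Rejection reasons (permission denied, server reject, timeout), as received.
    errors: [video, audio]
      .filter((result) => result.status === 'rejected')
      .map((result) => result.reason),
  };
}
```

A caller might join audio-only when video is null and show the first entry of errors to the user. A plain try/catch around a single await publishVideo(...) works equally well when only one kind is involved.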

Class index

| Class | Purpose |
| --- | --- |
| VideoSDK | Static entry: join rooms, enumerate devices. |
| Room | Connected room handle. Owns participants and room-level controls. |
| LocalParticipant | You. Holds your streams and publish controls. |
| RemoteParticipant | Another participant. Holds their streams and subscribe controls. |
| VideoStream | Base shape for camera-style video streams. |
| AudioStream | Base shape for audio streams. |
| ScreenStream | Base shape for screen-share streams. |
| LocalVideoStream | Your camera, with device + processor control + frame capture. |
| LocalAudioStream | Your microphone, with device + processor control + silence detection. |
| LocalScreenStream | Your screen share, with processor control + frame capture. Bundled audio via .audio. |
| LocalScreenAudioStream | Your screen-share audio (when publishScreen({ audio: true })). Reached via me.screen.audio. |
| RemoteVideoStream | Their camera, with simulcast control and pause/resume. |
| RemoteAudioStream | Their mic, with volume + audio level + auto-play. |
| RemoteScreenStream | Their screen share, with simulcast control and pause/resume. Bundled audio via .audio. |
| RemoteScreenAudioStream | Their screen-share audio (when publisher enabled it). Reached via p.screen.audio. |
| VideoFrameProcessor | Per-frame video transform (virtual bg, filters, generative). |
| AudioFrameProcessor | Per-frame audio transform (noise cancel, mixing, TTS). |
| Enums | MediaKind, StreamState, MediaQuality, VideoCodec, DegradationPreference, ContentHint, LocalEvent, RemoteEvent. |
| Types | CameraDeviceInfo, MicrophoneDeviceInfo, SpeakerDeviceInfo, JoinOptions, PublishVideoOpts, PublishAudioOpts, PublishScreenOpts, StreamElementOpts. |