On the heels of a recent webinar that IHS Markit co-produced with Verizon Media, we decided to explore in more detail the webinar’s principal themes and discuss some of the underappreciated challenges of streaming live, premium video over the open internet. This post—the second in a two-part series—resumes where our prior analysis concluded.

Our first article revolved around encoding best practices. Our focus here is on the remainder of the video workflow: how to achieve stream and experience personalization at scale.

Is it difficult to instantiate and subsequently manage millions of viewing sessions? Yes, it is. Is it more difficult still to individualize each session, and employ targeted ads to generate personal, uniquely defined, superlatively monetized consumption experiences? Indeed. And when cost-per-thousand-impressions (CPM) advertising is what pays for high-traffic sessions, is the promise of personalized experiences and high-CPM monetization fundamentally beyond reach? It most certainly is not.

Read on as we delve into the complexities of individualized stream management and the practices that we believe allow media companies to address, and ultimately prevail over, complexity.

The video distribution chain: quick recap and personalization primer

After a live feed has been received from a venue or event, the video must be encoded, secured, described by an ensemble of business and consumption rules, and loaded with advertising content. We call this post-ingest, content-processing stage the workflow phase. The workflow phase subsequently yields to the delivery phase, during which content is offloaded to, and transmitted across, one or more content delivery networks (CDNs) and ISP-owned access networks.

IHS Markit graphic: end-to-end distribution chain for video

It is useful to think of personalization as a series of processes that occur after the video has been encoded. Prescribing business rules, applying them, and marrying ad inventory to ad assets obey a simple logic: treating the encoded content as an undifferentiated, shared input; and from it, creating a stream that is fully individualized and uniquely defined by its regionality, regional blackouts, commercial blackouts, advertisements, and advertising beacons.

Video personalization: goals, architecture, and functions

Personalizing a live stream has three goals: to ensure that the video content, as well as all associated advertising, is compliant with any regional blocks or blackout mandates; to facilitate a consumption experience that—be it through selectable camera angles, or ancillary live-event data and statistics—is responsive to viewer preferences and tastes; and to suffuse the video with targeted advertisements that secure high CPMs and render more favorable the fundamental economics of streaming video distribution.

IHS Markit graphic: Traditional manifest server architecture

In turn, the ability to manage, personalize, and advertise relies on a single functional unit whose name could conceivably befit a bellicose Space Opera novel, but whose function is entirely benign: the manifest server.

At its essence, the manifest server generates a manifest file. This file describes, defines, and coordinates, in minute detail, every aspect of the consumption experience: from which piece of content a subscriber is entitled to watch, to whether an advertisement was properly delivered and, indeed, viewed. The manifest file comprises elements that revolve primarily around controlling the core playback stream, along with ad monetization and reporting.

On the playback and compliance side, the manifest server does three things. One, the server talks with the media and asset library to understand which video assets are available, and to whom they should be delivered. Two, the server talks to each viewer’s video player, and instructs the player on where to find—and request—the appropriate assets. Three, the server receives instructions from the content management system on any blackouts or regional policies that need enforcing.

On the monetization side, the manifest server manages three entirely separate processes. One, the server communicates with the ad decision service (ADS) to receive information on which ads should be shown to whom. Two, having received and parsed that information, the server talks to the ad server to stitch ads, and their locations, into the manifest. Three, the server collects information on ad viewership and sends it, in the form of ad beacons, to external measurement services.
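The playback and monetization duties described above can be illustrated with a minimal sketch. It assumes an HLS-style text playlist, which this article does not specify, and every name in it (Segment, Viewer, build_manifest, the sample URIs, the sid query parameter) is hypothetical. The point is only that the encoded segments are a shared input, while the manifest wrapped around them differs per viewer.

```python
from dataclasses import dataclass
from math import ceil
from typing import List, Set


@dataclass(frozen=True)
class Segment:
    uri: str
    duration: float  # seconds


@dataclass(frozen=True)
class Viewer:
    session_id: str
    region: str


def build_manifest(viewer: Viewer, content: List[Segment], ads: List[Segment],
                   blackout_regions: Set[str], slate: Segment,
                   ad_break_index: int) -> str:
    """Build one viewer's playlist from shared content (content must be non-empty)."""
    # Compliance: viewers in a blacked-out region get a slate instead of content.
    blacked_out = viewer.region in blackout_regions
    body = [slate] * len(content) if blacked_out else list(content)
    ad_list = list(ads)

    target = max(ceil(s.duration) for s in body + ad_list)
    lines = ["#EXTM3U", "#EXT-X-VERSION:3", f"#EXT-X-TARGETDURATION:{target}"]
    for i, seg in enumerate(body):
        # Monetization: stitch this viewer's ads in at the ad break position.
        if i == ad_break_index and ad_list:
            lines.append("#EXT-X-DISCONTINUITY")
            for ad in ad_list:
                lines.append(f"#EXTINF:{ad.duration:.3f},")
                # Hypothetical per-session tag so ad delivery can be reported.
                lines.append(f"{ad.uri}?sid={viewer.session_id}")
            lines.append("#EXT-X-DISCONTINUITY")
        lines.append(f"#EXTINF:{seg.duration:.3f},")
        lines.append(seg.uri)
    return "\n".join(lines)


# Illustrative shared inputs (hypothetical URIs).
CONTENT = [Segment("live/seg1.ts", 6.0), Segment("live/seg2.ts", 6.0)]
SLATE = Segment("slate/blackout.ts", 6.0)
```

Calling build_manifest for two viewers with the same CONTENT but different regions or ad lists yields two different playlists, which is the per-session work that the sections below are concerned with.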

The nature of the problem

Generating a manifest sounds simple enough. How difficult can it be to put together a mere file? The bottleneck hides in plain sight. Creating a static, monolithic manifest that every video player downloads in the context of a video-on-demand (VOD) session is not onerous. Creating individualized manifests for each and every viewer, doing so in real time, and doing it in the context of live consumption—where audiences ebb, flow, and scale very quickly, and very unpredictably—is categorically tough. Tougher still is maintaining the ability to personalize the manifest in contexts where live audiences reach into the millions.

In a world without advertising, the mere process of creating and managing millions of individual playback sessions is difficult enough. It is economically impractical—indeed, unjustifiable—to over-procure manifest infrastructure and to keep the server infrastructure running at full capacity. Establishing 1:1 sessions requires that manifest resources be tightly coupled to demand, and be capable of scaling—instantaneously—to match both audience spikes and audience valleys.

The insertion of targeted ads increases manifest complexity by a full order of magnitude. Suppose that 1 million viewers are consuming a live stream. Suppose, further, in this stylized example that a single data point can fully describe and render unique each viewer’s manifest file. The manifest server is on the hook for generating 1 million unique data points, or manifests: one per viewer.

Now suppose that a single, targeted ad is inserted into each viewer’s video stream. The problem is that ad delivery cannot occur in the absence of measurement and reporting. Per ad within a given ad break, the manifest server is charged with generating and sending diagnostic data—chiefly to adjudicate and reconcile ad delivery against ad consumption—to external measurement services. This diagnostic data takes the form of ad beacons, and on average, ads are saddled with 8-10 of them; in the digital marketing ecosystem, some ads can carry up to 50 beacons.

And there we have it: from a base of 1 million unique data points, the act of targeted ad insertion and reporting now requires the manifest server to generate not 1 million, but on the order of 10 million unique data points, in real time. At a minimum, this additional load during ad breaks places a major burden on CPU resources and CPU utilization.
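The arithmetic behind this stylized example can be written out in a few lines. The viewer count and beacon counts come from the figures above; the function name and the assumption of exactly one ad per viewer per break are ours, for illustration only.

```python
def beacons_per_ad_break(viewers: int, ads_per_viewer: int, beacons_per_ad: int) -> int:
    """Beacons generated across the whole audience for one ad break."""
    return viewers * ads_per_viewer * beacons_per_ad


VIEWERS = 1_000_000  # stylized audience size from the example above

for label, beacons in [("low average", 8), ("high average", 10), ("heavy ad", 50)]:
    total = beacons_per_ad_break(VIEWERS, 1, beacons)
    print(f"{label:>12}: {total:,} beacons on top of {VIEWERS:,} manifests")
```

At 8 to 10 beacons per ad, a single ad per viewer produces 8 to 10 million beacons; at the 50-beacon extreme, the same break would produce 50 million.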

First principles and best practices

Three features characterize high-viewership live audiences: total viewership can increase quickly and, at key times, diminish precipitously; the audience’s aggregate geographic footprint can be very large; and in spite of this wide geographic distribution, viewers often cluster in multiple, spatially disparate metro areas.

Accordingly, we believe that manifest generation and manifest infrastructure alike need to obey three principles. First, manifest infrastructure should be placed in the cloud. With few exceptions, we believe that cloud implementations and cloud architectures are a necessary condition for coupling server resources to viewership demand. Second, cloud-based manifest resources should be distributed across multiple, spatially disparate clusters. Third, an allocation system should assign viewers, at random, to one of the manifest-generation clusters. We believe that clustering and random assignment achieve resource and load-bearing distribution; avoid unchecked, difficult-to-manage resource dilution; and ensure that any given cluster does not become overburdened as the audience grows.
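To see why random assignment spreads load evenly, consider the small simulation below. It is a sketch under stated assumptions: the three cluster names are hypothetical, assignment is uniform, and a real allocation system would also have to account for cluster health and capacity, which we omit here.

```python
import random
from collections import Counter
from typing import List

# Hypothetical manifest-generation clusters in different regions.
CLUSTERS = ["cluster-east", "cluster-west", "cluster-europe"]


def assign_cluster(clusters: List[str], rng: random.Random) -> str:
    """Pick a cluster uniformly at random for a new viewing session."""
    return rng.choice(clusters)


def simulate(n_viewers: int, clusters: List[str], seed: int = 0) -> Counter:
    """Count how many of n_viewers sessions land on each cluster."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    return Counter(assign_cluster(clusters, rng) for _ in range(n_viewers))


if __name__ == "__main__":
    for cluster, sessions in sorted(simulate(30_000, CLUSTERS).items()):
        print(f"{cluster}: {sessions} sessions")
```

With tens of thousands of sessions, each cluster ends up close to an equal share, and the shares stay balanced as the audience grows, because no cluster is favored by the assignment rule.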

IHS Markit graphic: Manifest architecture - a new approach

What of the manifest server’s ability to manage ad stitching and reporting? A robust implementation has to be capable of handling two issues. One, portions of the AdTech ecosystem are external to, and fully beyond the control of, the infrastructure of media companies. The manifest server must be able to continue to generate manifests and ensure seamless playback experiences, even when external ad servers, ad networks, or measurement servers are unresponsive. Two, during ad breaks in particular, we know that ad beaconing exerts a significant additional burden on the manifest server’s CPU and computational resources. The manifest server must be able to handle this load, but it must also avoid a scenario where beaconing’s resource monopolization simultaneously hinders the server’s ability to handle unpredictable swells in audience size and playback requests.

Consequently, we believe there exists a strong rationale for decoupling the manifest server’s computational resources from the manifest server’s ad monetization responsibilities. This decoupling, implemented in the form of a proxy service, achieves two things. First, a proxy architecture creates a buffer between manifest generation—and uninterrupted playback control—and the whims of an AdTech ecosystem that is beyond the purview of media companies. Second, given the unique, computationally intensive nature of ad monetization, a proxy architecture allows stitching, beaconing, and reporting to draw upon separate, dedicated resources. While we do not wish to argue that a proxy architecture is uniquely and categorically superior to all other forms of mitigating complexity, we do maintain that this architecture will constitute a best-practice example for many different types of media companies, operators, and service providers.
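The decoupling can be sketched as follows. This is an illustration rather than a description of any particular product: the proxy is modeled as an in-process thread pool, whereas in practice it would more likely be a separate service, and all names (AdProxy, manifest_with_ads, the two stand-in ad decision functions, the timeout values) are hypothetical. The sketch shows the two properties discussed above: manifest generation proceeds without ads when the ad ecosystem is slow or failing, and beaconing runs on the proxy's own workers.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


class AdProxy:
    """Buffer between manifest generation and external ad services."""

    def __init__(self, decide: Callable[[str], List[str]],
                 send_beacon: Callable[[str], None],
                 timeout_s: float = 0.2, workers: int = 4) -> None:
        self._decide = decide
        self._send_beacon = send_beacon
        self._timeout_s = timeout_s
        # Dedicated workers: ad calls and beacons do not block manifest work.
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def ads_for(self, session_id: str) -> List[str]:
        """Return ads for a session, or no ads if the ADS is slow or fails."""
        future = self._pool.submit(self._decide, session_id)
        try:
            return future.result(timeout=self._timeout_s)
        except Exception:  # timeout or ADS error: fall back to ad-free playback
            return []

    def report(self, beacon_urls: List[str]) -> None:
        """Queue beacons on the proxy's workers and return immediately."""
        for url in beacon_urls:
            self._pool.submit(self._send_beacon, url)

    def close(self) -> None:
        self._pool.shutdown(wait=True)


def manifest_with_ads(session_id: str, content_uris: List[str],
                      proxy: AdProxy) -> List[str]:
    """Content is always returned; ads are added only if the proxy has them."""
    # Ad placement is simplified: ads are appended after the content.
    return list(content_uris) + proxy.ads_for(session_id)


def unresponsive_ads(session_id: str) -> List[str]:
    """Stand-in for an ad decision service that answers too late."""
    time.sleep(0.5)
    return ["ads/late.ts"]


def healthy_ads(session_id: str) -> List[str]:
    """Stand-in for an ad decision service that answers promptly."""
    return [f"ads/targeted.ts?sid={session_id}"]
```

With unresponsive_ads behind the proxy and a short timeout, manifest_with_ads still returns the content URIs; with healthy_ads, the targeted ad is included. One design caveat: a timed-out call keeps occupying a worker until it finishes, so a production proxy would also need limits on outstanding requests.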

To learn more

For more information, watch our webinar, “Live sports: ensuring global streaming doesn’t leave audiences screaming,” on demand at any time and for free. The webinar was presented by IHS Markit | Technology, now a part of Informa Tech, with Verizon Media.