SMPTE Discusses Video over IP

By Matthew Brennesholtz

At the recent meeting of the New York Chapter of the SMPTE, the topic of discussion was “IP Facilities for Content Creators”, also known as Video over Internet Protocol (VoIP): the use of information technology (IT) protocols and hardware to distribute real-time video within a production or post-production facility. The January 29th meeting was hosted by Cisco Systems and drew a “standing room only” crowd.

“Standing room only” crowd at the SMPTE NY meeting “IP Facilities for Content Creators”

There were four speakers at the meeting: Fred Huffman, Huffman Technical Services; Hugo Gaggioni, CTO, Sony Professional Products; John Mailhot, Imagine Communications; and Ken Morse, CTO, Video Software and Solutions, Cisco Systems.

The discussion centered on high-quality, real-time video, as needed for HD and UHD/4k production of live content. Two things were not discussed because they do not meet the needs of the production community: HDBaseT and distribution of video to consumers, e.g. YouTube or OTT TV. While HDBaseT uses the same Cat 5/6 cables as VoIP, it is a specialized video interface and cannot use the standard I/O cards or routers used for other IP applications. Internet distribution of video to consumers is highly compressed and does not run in what the production community would consider “real time,” which requires a reliable and known latency with no more than a couple of frames of delay.

The main driver behind the use of VoIP in production and post-production is, not surprisingly, cost. VoIP would allow standard IT components such as I/O cards and routers to replace dedicated SDI I/O cards and matrix switchers. Because the IT industry's volumes are huge compared to those of the video production industry, these components can be significantly less expensive. Mailhot showed a slide with three cards on it, each of comparable capability. The generic network interface card (NIC) costs about 10% of what a specialized HD-SDI I/O card for a dedicated piece of production equipment costs. Mailhot also said that while the production community (i.e. SMPTE) was struggling to standardize 12G-SDI interfaces, the IT community is already using 100G IP interfaces with roughly 8x the bit rate. He added that the IT community is working on 200G and 400G, with 200G expected to be available to the end-user community within the year.
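As a rough illustration of the gap Mailhot described, the nominal line rates can be compared directly. The sketch below uses rounded nominal rates (3, 12 and 100 Gbit/s) and ignores all protocol overhead, so the stream counts are upper bounds rather than engineering figures. The function name is purely illustrative.

```python
# Nominal line rates in Gbit/s. These are rounded assumptions; real SDI
# rates are slightly lower (3G-SDI runs at about 2.97 Gbit/s).
SDI_3G = 3.0
SDI_12G = 12.0
ETHERNET_100G = 100.0


def streams_per_link(link_gbps: float, stream_gbps: float) -> int:
    """Upper bound on whole streams that fit in a link, ignoring overhead."""
    return int(link_gbps // stream_gbps)


ratio_100g_to_12g = ETHERNET_100G / SDI_12G
print(f"100G vs 12G-SDI: about {ratio_100g_to_12g:.1f}x the bit rate")
print(f"12G streams per 100G link: {streams_per_link(ETHERNET_100G, SDI_12G)}")
print(f"3G streams per 100G link: {streams_per_link(ETHERNET_100G, SDI_3G)}")
```

The ratio of about 8.3 matches the “roughly 8x” figure quoted at the meeting.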


The savings are not just in the purchase price of the equipment, according to Gaggioni. He used the example of a 4k production truck designed to produce a program from the feeds of eight cameras. (Sony F55 cameras, of course!) In a conventional, SDI-based truck, he said, 362 BNC cables would be needed, for a total weight of 268 kg. In a similar production truck where the SDI interfaces are replaced by VoIP, the weight of the cables alone would be reduced by 85% to about 40 kg.

Gaggioni said there were three major challenges to VoIP and its use by the production community. First, there needs to be real-time operation with a minimum of latency. Second, there needs to be synchronous processing (i.e. Genlock). Finally, there needs to be video stream switching without picture disruption.

These issues arise when IT routers, switches and interfaces are used, because these devices don’t know about video signals; they just know about data packets. Every frame of video requires a relatively large number of packets, so if an IT router is allowed to route signals based on packets alone, it can switch a video stream in the middle of a frame rather than cleanly on a frame boundary, the way a video switcher would. This can lead to glitches in the video and noise in the audio that would be unacceptable to the production community.
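To give a sense of how many packets a single frame occupies, here is a back-of-the-envelope sketch. All the numbers in it are assumptions chosen for illustration and were not quoted at the meeting: active picture only, 10-bit 4:2:2 sampling (20 bits per pixel) and a hypothetical payload of 1,400 bytes per packet. Real encapsulations add headers and may use different payload sizes.

```python
import math


def packets_per_frame(width: int, height: int,
                      bits_per_pixel: int, payload_bytes: int) -> int:
    """Packets needed to carry one uncompressed frame (active picture only)."""
    frame_bytes = width * height * bits_per_pixel // 8
    return math.ceil(frame_bytes / payload_bytes)


# Assumed values: 20 bits/pixel (10-bit 4:2:2) and 1,400-byte payloads.
hd_packets = packets_per_frame(1920, 1080, 20, 1400)
uhd_packets = packets_per_frame(3840, 2160, 20, 1400)
print(f"HD frame:  {hd_packets} packets")
print(f"UHD frame: {uhd_packets} packets")
```

Under these assumptions a single HD frame spans several thousand packets, so a switch made at an arbitrary packet will almost always land mid-frame.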

One proposed solution was to add ASICs and/or software to the IT routers so they would be video-aware. This, however, goes against the philosophy of using Commercial Off-the-Shelf (COTS) IT hardware. Once you have special software or ASICs in your router, it is no longer truly a COTS product.

One solution proposed for the switching problem is to send duplicate streams. At the receiving end, the system uses the packets from the first stream until the non-video-aware IT router switches it, and then completes the frame with the packets from the second stream. In the worst case, this doubles the amount of data transmitted. Since the fastest SDI interfaces available are 12G, with 3G more common, while IT networks regularly use 100G systems, the move from SDI to IT still brings a huge increase in the capacity of a cable, even counting the duplicate data.
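The receiving-end logic described above can be sketched in a few lines. This is a toy model under stated assumptions, not the method of any speaker, vendor or standard: each packet is assumed to carry its position within the frame, and the function name `complete_frame` and the dictionary representation are hypothetical.

```python
from typing import Dict, List, Optional


def complete_frame(
    first: Dict[int, bytes],
    second: Dict[int, bytes],
    packets_in_frame: int,
) -> Optional[List[bytes]]:
    """Rebuild one frame, preferring the first stream and filling gaps
    from the second.

    Each dict maps a packet's position within the frame to its payload.
    Returns None if some position is missing from both streams.
    """
    frame: List[bytes] = []
    for seq in range(packets_in_frame):
        payload = first.get(seq, second.get(seq))
        if payload is None:
            return None  # neither copy delivered this packet
        frame.append(payload)
    return frame


# Toy frame of 6 packets; the first stream is cut off after packet 2,
# as if a router had switched it in the middle of the frame.
full = {seq: f"pkt{seq}".encode() for seq in range(6)}
first_stream = {seq: data for seq, data in full.items() if seq <= 2}
second_stream = dict(full)

rebuilt = complete_frame(first_stream, second_stream, packets_in_frame=6)
print(rebuilt == [full[seq] for seq in range(6)])
```

The cost of this approach is the doubled traffic noted above, since both copies must reach the receiver for the gaps to be filled.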

The issues involved are not simple, and one confounding factor is that different companies are developing different proprietary standards. SMPTE is working on standards for the industry as a whole, including the SMPTE ST 2022 series for the media plane and the draft SMPTE ST 2059 series for the timing plane. However, SMPTE is not the only organization working on standards for VoIP. Gaggioni tried to explain the advantages of SMPTE PTP (ST 2059-2) over AVB gPTP (IEEE 802.1AS) for synchronization of video. I’m not sure I fully understood the issues, but one thing was clear to me and to the SRO audience of the New York production community: Gaggioni’s vision of a fully IP-based 4k production truck with no SDI interfaces isn’t going to be possible this year. Or next year, for that matter.

If you’d like to watch the complete 1 hour, 52 minute meeting, SMPTE NY has posted the video on YouTube. – Matthew Brennesholtz