VR Guidelines from the VRIF

The Virtual Reality Industry Forum (VRIF) has published a set of draft guidelines for virtual reality (VR). The version for public comment was placed online on September 13th and can be downloaded for free here.

The guidelines were then discussed at a business supersession co-hosted by the VRIF and the DASH Industry Forum (DASH-IF) on Saturday, September 16th at the IBC in Amsterdam. The DASH-IF was set up to promote MPEG-DASH, an adaptive bitrate streaming standard for OTT video that is said to be suitable for VR.

Charter Members of the VRIF. Including Charter, Contributor and Associate Members, there are currently 40 VRIF members. (Image Credit: VRIF)

Standards for VR are (or should be) a major topic for everybody in the VR community. I have already discussed these standards in a previous Display Daily article plus in subscriber-only articles. I’m not the only one to discuss the topic – Neil Trevett and Yuval Boger cited my DD in their article “4 myths are blocking real, needed VR standards.”

One reason VR needs so many standards is that the technology is essentially new from end to end: from content creation to content consumption by the consumer, as shown in the figure. The VRIF guidelines identify some standards that can be used unchanged, others that can be used with a different selection of profiles than are generally used for non-VR content, and many cases where new or modified standards are needed. Most of the VRIF guidelines apply to the portion of the chain from the VR content creator through the distributor. There are only relatively modest discussions of VR head-mounted displays or the hardware, software and standards needed to drive them. On the other hand, once companies like Intel and Qualcomm know what the incoming format is, they can develop the hardware and software needed to display it.

The end-to-end complexity of streamed, encrypted VR. (Image credit: VRIF)

In my earlier Display Daily, I didn’t even discuss the VRIF, in part because it isn’t a Standards Development Organization (SDO) and doesn’t write standards. Instead, the VRIF is composed of a broad range of participants, all interested in VR, from sectors including the movie, television, mobile, broadcast and interactive gaming ecosystems. VRIF members include content creators, content distributors, consumer electronics manufacturers, professional equipment manufacturers and other technology companies.

The VRIF says “The VR Industry Forum is not a standards development organization, but will rely on, and liaise with, standards development organizations for the development of standards in support of VR services and devices. Adoption of any of the work products of the VR Industry Forum is voluntary; none of the work products of the VR Industry Forum shall be binding on Members or third parties.”

While the VRIF calls its Draft Guidelines a “Work Product,” they look an awful lot like standards to me. I’m not saying this is bad – SDOs typically work quite slowly and can take years to develop formal standards. When the formal standards are promulgated, they typically look an awful lot like industry-sponsored guidelines, such as the ones from the VRIF. The VR industry is developing so quickly that it cannot afford to wait on SDOs such as the IEEE, ITU, SMPTE, etc. to provide guidance via formal standards.

The initial release of the VRIF guidelines focuses on the delivery ecosystem of 360° video with three degrees of freedom (3DOF) and includes: documentation of cross-industry interoperability points (based on ISO MPEG’s Omnidirectional MediA Format (OMAF)); best industry practices for production of VR360 content, with an emphasis on human factors such as motion sickness; and security considerations for VR360 streaming, including user privacy and content protection.

With 3DOF VR content, the viewer can look in any direction to see what is around him, but any change in the point of view (i.e. motion of the center of the 360° image) is controlled by the content maker. VR headsets powered by smartphones are typically used to show 3DOF content. Content with six degrees of freedom (6DOF) allows the viewer not only to look in any direction, but also to move around (normally in a limited area) and control the motion of the center of the 360° point of view. Typically, 6DOF content is viewed on dedicated headsets such as the HTC Vive. These dedicated headsets can also, of course, be used to view 3DOF content. One reason for the separation of 3DOF and 6DOF VR content is that 6DOF content requires much more complex processing before display. Typically, a smartphone processor cannot handle all the required processing, at least not now.
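To make the 3DOF/6DOF distinction concrete, here is a minimal Python sketch. It is my own illustration and not taken from the VRIF guidelines: the names `Orientation`, `view_direction` and `image_center`, and the axis convention (x forward, y left, z up), are assumptions made for the example. In both cases the viewer's head orientation sets the viewing direction; only in the 6DOF case does the viewer's own movement shift the center of the 360° image.

```python
import math
from dataclasses import dataclass


@dataclass
class Orientation:
    """Head orientation in radians (hypothetical convention: yaw is
    counter-clockwise about the vertical axis, pitch is upward tilt)."""
    yaw: float
    pitch: float


def view_direction(o: Orientation) -> tuple:
    """Unit vector the viewer looks along (x forward, y left, z up).
    Available to the viewer in both 3DOF and 6DOF content."""
    return (
        math.cos(o.pitch) * math.cos(o.yaw),
        math.cos(o.pitch) * math.sin(o.yaw),
        math.sin(o.pitch),
    )


def image_center(content_center: tuple, viewer_offset: tuple, dof: int) -> tuple:
    """Center of the 360-degree point of view.

    3DOF: the content maker alone decides where the center is, so the
          viewer's own translation is ignored.
    6DOF: the viewer's translation is added to the content position.
    """
    if dof == 3:
        return tuple(content_center)
    if dof == 6:
        return tuple(c + v for c, v in zip(content_center, viewer_offset))
    raise ValueError("dof must be 3 or 6")
```

For example, with the content center at `(1.0, 2.0, 0.0)` and the viewer stepping half a meter forward, `image_center` leaves the center where it was for 3DOF and moves it forward for 6DOF. A real 6DOF renderer also has to re-synthesize the scene from the new position, which is the extra processing load mentioned above.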

The 68 page VRIF draft guidelines focus on five main topics:

  • Production: Technical aspects of the media formats used in the interface between the content provider and the service provider along with human factors considerations for compelling and usable 360° video experiences.
  • Compression: Media codecs for VR, i.e. encoding of different production formats and related media profiles for video, audio and possibly also other media types such as text, graphics, etc. This includes decoding and rendering of the media based on an abstracted distribution data model.
  • Storage: Media formats for VR content (e.g. file/segment encapsulation) for different distribution means, including but not limited to storage, download, adaptive bitrate streaming and broadcasting.
  • Delivery: Interfaces and protocols for Live, Linear and VOD delivery over streaming (unicast), and broadcast applications.
  • Security: VR specific threat identification and mitigation techniques as well as methods for implementing security and privacy protection functions.

One very valuable item in the VRIF guidelines is Section 2: References. This lists 36 different standards, either current or under development, plus the SDOs developing them. All of these standards and SDOs are of interest to anyone interested in developing best practices, guidelines, standards or actual VR products. You can’t tell the players without a program.

The VRIF guidelines give four options for delivering 3DOF VR content, as shown in the table.

VRIF recommended master file formats for delivery of 3DOF VR content. (Image Credit: VRIF)

In a related discussion, the guidelines discuss frame rates for 3DOF content. They say 25Hz and 30Hz can be used for monoscopic VR if motion within the content is restricted but 50Hz or 60Hz is better. For stereoscopic VR, they recommend no frame rate below 60Hz and frame rates up to 120Hz can be used. They recommend a minimum of 50Hz for monoscopic and 100Hz for stereoscopic content.

Audio is an important part of virtual reality and the guidelines pay a lot of attention to this issue. One of the problems with audio delivered by headphones is that when the viewer moves his head, the apparent source of the audio must change to reflect where the head is now pointing. This requirement favors scene-based or object-oriented sound, rather than simple stereo or 5.1 surround sound.
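The reason scene-based audio suits head tracking can be shown with a small Python sketch. This is my own simplified illustration, not the MPEG-H renderer or anything specified by the VRIF: it assumes a first-order ambisonic sound field with components named W, X, Y and Z (x forward, y left, z up), ignores normalization conventions, and handles only head yaw. The function names `encode_horizontal` and `compensate_head_yaw` are hypothetical.

```python
import math


def encode_horizontal(sample: float, azimuth: float) -> tuple:
    """Encode one mono sample arriving from `azimuth` (radians,
    counter-clockwise from straight ahead, in the horizontal plane)
    into first-order components (W, X, Y, Z). Normalization gains
    are deliberately omitted in this sketch."""
    return (
        sample,                       # W: omnidirectional part
        sample * math.cos(azimuth),   # X: front/back
        sample * math.sin(azimuth),   # Y: left/right
        0.0,                          # Z: up/down (source is horizontal)
    )


def compensate_head_yaw(wxyz: tuple, head_yaw: float) -> tuple:
    """Rotate the whole sound field by -head_yaw about the vertical
    axis so sources stay fixed in the room while the head turns.
    `head_yaw` is in radians, counter-clockwise (turning left)."""
    w, x, y, z = wxyz
    c, s = math.cos(head_yaw), math.sin(head_yaw)
    return (w, x * c + y * s, y * c - x * s, z)
```

A source encoded straight ahead and then compensated for a 90° head turn to the left ends up with X near 0 and Y near -1, meaning it is now heard from the listener's right. The whole scene is rotated with a few multiplications per sample, regardless of how many sources it contains. Channel-based stereo or 5.1 has no equivalent of this simple operation.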

One of the things that attracted my attention to the VRIF guidelines in the first place was an announcement from b<>com and Fraunhofer IIS that they had teamed up for a demo of MPEG-H as an end-to-end audio solution for VR sound at IBC in Amsterdam. Ludovic Noblet, Director of Hypermedia at b<>com, said, “High Order Ambisonics, or Scene Based Audio, is an essential format in order to produce compelling VR experiences. Together with DVGroup and Fraunhofer, we are proud to showcase, for the very first time, an end-to-end ecosystem in line with the VRIF recommendations for the distribution of such experiences.”

As a draft document, the guidelines include a number of editor’s notes. My favorite is the one shown below. While this one was in a section on bit rates and image quality, I feel it applies to almost everything in VR. Many of the other editor’s notes were more specific and asked for feedback on particular topics.

(Image Source: VRIF)

This draft of the guidelines is by no means a finished document. The VRIF indicates five areas that need special attention in the future and are ongoing work:

  • The role of HDR in content presentation – what works and what does not
  • Recommendations for stitching video content in the production domain
  • Master format adaptations to permit distribution over constrained links
  • Passing the desired viewport to the secure trust zone
  • Content encryption and watermarking on viewport dependent media profile

There is a lot of information in the VRIF guidelines related to VR content, distribution and hardware. I’ve reviewed the entire document and, while I’ve never made any VR content, it all makes sense to me. Anyone interested in making VR content, distributing VR or supplying professional hardware would do well to follow these guidelines. Anyone who is supplying consumer hardware or software should ensure it will work with content encoded according to the VRIF guidelines. Anyone who has made or distributed VR content or hardware and either feels the guidelines are wrong or has information the editor’s notes say is needed should provide feedback to the VRIF, whether or not they or their company is a VRIF member. This can be done either on the VRIF website or on the VRIF GitHub Issue Tracker. I suspect comments that fill in missing information from editor’s notes will be especially welcome. The VRIF would prefer that comments be submitted by October 31, 2017 so they can continue their work in a timely manner.

While it is not on the speaking agenda, the VRIF is one of the organizational sponsors of Insight Media’s 2017 Display Summit, to be held October 4-5, 2017 in Sterling, Virginia. This conference will have a half-day on Oct. 5th dedicated to Virtual Reality and Augmented Reality topics. A member discount is available for VRIF members at Display Summit. –Matthew Brennesholtz