News

Nvidia leaves a ‘paper’ trail

Company teams with academics to present record number of research papers at SIGGRAPH 2022

Karen Moltenbrey

Groundbreaking research has always been an important aspect of SIGGRAPH, as scientists and researchers present the latest industry advancements to conference-goers. The acceptance rate for technical papers at SIGGRAPH averages between 17% and 27%, following a recently instituted double-blind peer-review process. So, the fact that Nvidia, in collaboration with top academic researchers at 14 universities, will be presenting a record number (16) of research papers at this year’s conference is astounding.

The research addresses some of the most challenging issues pertaining to computer graphics in the fields of neural rendering, 3D simulation, holography, and more. Below, Nvidia summarizes this work. 

Neural tool for multi-skilled simulated characters

A character can learn more than one action at a time using AI reinforcement learning. (Source: Nvidia)

When a reinforcement learning model is used to develop a physics-based animated character, the AI typically learns just one skill at a time: walking, running, or perhaps cartwheeling. But researchers from UC Berkeley, the University of Toronto, and Nvidia have created a framework that enables AI to learn a whole repertoire of skills—demonstrated with a warrior character who can wield a sword, use a shield, and get back up after a fall. 

Achieving these smooth, lifelike motions for animated characters is usually tedious and labor-intensive, with developers starting from scratch to train the AI for each new task. As outlined in the paper, the research team allowed the reinforcement learning AI to reuse previously learned skills when responding to new scenarios, improving efficiency and reducing the need for additional motion data.
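
To make the idea concrete, here is a minimal toy sketch in Python (a hypothetical illustration, not Nvidia’s code): the previously learned skills live in a frozen low-level policy behind a latent “skill code,” and a new task trains only a small high-level controller that steers those skills instead of relearning motion from scratch.

    import numpy as np

    rng = np.random.default_rng(0)
    OBS_DIM, SKILL_DIM, ACT_DIM = 16, 8, 4

    # Frozen low-level policy: maps (observation, skill code) -> joint actions.
    # In a real system this is a network trained on motion data; a fixed
    # random linear map stands in for it here.
    W_low = rng.standard_normal((ACT_DIM, OBS_DIM + SKILL_DIM)) * 0.1

    def low_level_policy(obs, skill):
        return np.tanh(W_low @ np.concatenate([obs, skill]))

    # Trainable high-level controller: picks a skill code for the current
    # state. Training only this small map is what skill reuse buys: the
    # motor skills below it never have to be relearned for a new task.
    W_high = rng.standard_normal((SKILL_DIM, OBS_DIM)) * 0.1

    def act(obs):
        return low_level_policy(obs, np.tanh(W_high @ obs))

    print("action:", act(rng.standard_normal(OBS_DIM)))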

According to Nvidia, tools like this one can be used by creators in animation, robotics, gaming, and therapeutics. At SIGGRAPH, Nvidia researchers will also present papers about 3D neural tools for surface reconstruction from point clouds and interactive shape editing, plus 2D tools for AI to better understand gaps in vector sketches and improve the visual quality of time-lapse videos. 

Bringing virtual reality to lightweight glasses 

New research into VR glasses. (Source: Nvidia)

Most virtual reality users access 3D digital worlds by putting on bulky head-mounted displays, but researchers are working on lightweight alternatives that resemble standard eyeglasses. 

A collaboration between Nvidia and Stanford researchers has packed the technology needed for 3D holographic images into a wearable display just a couple of millimeters thick. The 2.5-mm display is less than half the thickness of other thin VR displays, known as pancake lenses, which use a folded-optics technique that can only support 2D images. 

The researchers accomplished this feat by approaching display quality and display size as a computational problem, and co-designing the optics with an AI-powered algorithm. 

While prior VR displays require distance between a magnifying eyepiece and a display panel to create a hologram, this new design uses a spatial light modulator, a tool that can create holograms right in front of the user’s eyes, without needing this gap. Additional components—a pupil-replicating waveguide and geometric phase lens—further reduce the device’s bulkiness. 
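
The paper co-designs its optics with learned algorithms, which are not reproduced here; but the underlying computational problem, finding a phase pattern for the modulator whose propagated light forms a target image, can be illustrated with the classic Gerchberg-Saxton algorithm. The Python sketch below uses a simple far-field (Fourier) propagation model and is only a stand-in for the paper’s own method.

    import numpy as np

    def gerchberg_saxton(target_amplitude, iters=50):
        # Start from a random phase guess on the modulator (SLM) plane.
        phase = np.random.default_rng(0).uniform(0, 2 * np.pi, target_amplitude.shape)
        for _ in range(iters):
            slm_field = np.exp(1j * phase)        # unit-amplitude, phase-only field
            image_field = np.fft.fft2(slm_field)  # propagate to the image plane
            # Keep the propagated phase, impose the target amplitude...
            constrained = target_amplitude * np.exp(1j * np.angle(image_field))
            # ...then propagate back, keeping only the phase (the SLM
            # modulates phase, not amplitude).
            phase = np.angle(np.fft.ifft2(constrained))
        return phase

    target = np.zeros((64, 64))
    target[24:40, 24:40] = 1.0                    # a bright square as the goal
    phase = gerchberg_saxton(target)
    reconstruction = np.abs(np.fft.fft2(np.exp(1j * phase)))
    print("correlation with target:",
          np.corrcoef(reconstruction.ravel(), target.ravel())[0, 1])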

It’s one of two VR collaborations between Stanford and Nvidia at the conference, with another paper proposing a new computer-generated holography framework that improves image quality while optimizing bandwidth usage. A third paper in this field of display and perception research, co-authored with New York University and Princeton University scientists, measures how rendering quality affects the speed at which users react to on-screen information.  

Lightbulb moment: New levels of real-time lighting complexity

Accurately simulating the pathways of light in a scene in real time has always been considered the “holy grail” of graphics. Work detailed in a paper by the University of Utah’s School of Computing and Nvidia is raising the bar, introducing a path resampling algorithm that enables real-time rendering of scenes with complex lighting, including hidden light sources.

Think of walking into a dim room with a glass vase on a table, illuminated indirectly by a streetlight outside. The glossy surface creates a long light path, with rays bouncing many times between the light source and the viewer’s eye. Computing these light paths is usually too complex for real-time applications like games, so it’s mostly done for films and other offline rendering applications. 

The paper introduces statistical resampling techniques that reuse computations thousands of times while tracing these complex light paths, allowing the renderer to approximate them efficiently in real time. The researchers applied the algorithm to a classic challenging scene in computer graphics, shown below.
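
The statistical workhorse behind this family of techniques is weighted reservoir resampling: many cheap candidate samples are streamed through a tiny “reservoir” that keeps each one with probability proportional to its weight, so expensive shading work is spent only on the survivor. The Python sketch below is a generic illustration of that building block, not the paper’s full algorithm; the target function is a hypothetical stand-in for a real path contribution.

    import random

    class Reservoir:
        def __init__(self):
            self.sample = None   # the candidate that currently survives
            self.w_sum = 0.0     # running sum of candidate weights

        def update(self, candidate, weight, rng):
            self.w_sum += weight
            # Keep the newcomer with probability weight / w_sum; over the
            # whole stream, each candidate survives with probability
            # proportional to its weight, using O(1) memory.
            if weight > 0 and rng.random() < weight / self.w_sum:
                self.sample = candidate

    rng = random.Random(0)
    reservoir = Reservoir()

    # Hypothetical stand-in for the quantity a renderer would estimate for
    # each candidate light path (e.g., its unshadowed contribution).
    def target(x):
        return max(0.0, 1.0 - 3.0 * abs(x - 0.7))

    for _ in range(1000):
        candidate = rng.random()   # a cheap candidate sample
        reservoir.update(candidate, target(candidate), rng)

    print("surviving sample:", reservoir.sample)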

An indirectly lit set of teapots made of metal, ceramic, and glass. (Source: Nvidia)

Related Nvidia-authored papers at SIGGRAPH include a new sampling strategy for inverse volume rendering, a novel mathematical representation for 2D shape manipulation, software to create samplers with improved uniformity for rendering and other applications, and a way to turn biased rendering algorithms into more efficient unbiased ones.

Neural rendering: NeRFs, GANs power synthetic scenes

Neural rendering algorithms learn from real-world data to create synthetic images—and Nvidia research projects are developing state-of-the-art tools to do so in 2D and 3D. 

A text-driven method allows shifting a generative model to new domains, without collecting a single image. (Source: Nvidia)

In 2D, the StyleGAN-NADA model, developed in collaboration with Tel Aviv University, generates images with specific styles based on a user’s text prompts, without requiring example images for reference. For instance, a user could generate vintage car images, turn their dog into a painting, or transform houses into huts.
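
The trick that makes this work is a “directional” loss computed in the joint text-image embedding space of a model like CLIP: the shift between generated images should point the same way as the shift between text prompts. The Python sketch below illustrates that loss with hypothetical stand-in encoders in place of CLIP’s; it is an illustration of the idea, not the StyleGAN-NADA code.

    import numpy as np

    rng = np.random.default_rng(0)
    _text_cache = {}

    def embed_text(prompt):
        # Hypothetical stand-in for CLIP's text encoder.
        if prompt not in _text_cache:
            v = rng.standard_normal(64)
            _text_cache[prompt] = v / np.linalg.norm(v)
        return _text_cache[prompt]

    def embed_image(image):
        # Hypothetical stand-in for CLIP's image encoder.
        v = image[:64]
        return v / np.linalg.norm(v)

    def directional_loss(img_src, img_tgt, text_src, text_tgt):
        # Direction the text moved (e.g., "photo" -> "painting") ...
        d_text = embed_text(text_tgt) - embed_text(text_src)
        # ... should match the direction the generated image moved.
        d_img = embed_image(img_tgt) - embed_image(img_src)
        d_text /= np.linalg.norm(d_text)
        d_img /= np.linalg.norm(d_img)
        return 1.0 - d_text @ d_img   # zero when the directions align

    img_src = rng.standard_normal(128)                  # frozen generator output
    img_tgt = img_src + 0.1 * rng.standard_normal(128)  # fine-tuned output
    print(directional_loss(img_src, img_tgt, "photo", "painting"))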

And in 3D, researchers at Nvidia and the University of Toronto are developing tools that can support the creation of large-scale virtual worlds. Instant neural graphics primitives, the Nvidia paper behind the popular Instant NeRF tool, will be presented at SIGGRAPH. NeRFs (neural radiance fields, which reconstruct 3D scenes from collections of 2D images) are just one capability of the neural graphics primitives technique. It can be used to represent any complex spatial information, with applications including image compression, highly accurate representations of 3D shapes, and ultra-high-resolution images.
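
Much of the technique’s speed comes from a multiresolution hash encoding: a 3D point is looked up in several virtual grids of increasing resolution, each grid corner hashes into a small table of trainable feature vectors, and the corner features are blended and fed to a tiny neural network. The NumPy sketch below is a simplified illustration (table sizes and level counts are arbitrary), not Nvidia’s optimized CUDA implementation.

    import numpy as np

    LEVELS, TABLE_SIZE, FEAT_DIM = 4, 2 ** 14, 2
    PRIMES = (1, 2654435761, 805459861)   # hash primes from the paper
    rng = np.random.default_rng(0)
    tables = rng.standard_normal((LEVELS, TABLE_SIZE, FEAT_DIM)) * 1e-2

    def hash_corner(ixyz):
        # Spatial hash of integer grid coordinates into a table slot.
        h = 0
        for i, p in zip(ixyz, PRIMES):
            h ^= int(i) * p
        return (h & 0xFFFFFFFF) % TABLE_SIZE

    def encode(point):
        # point: three floats in [0, 1); returns LEVELS * FEAT_DIM features.
        features = []
        for level in range(LEVELS):
            res = 16 * 2 ** level            # grid resolution at this level
            scaled = np.asarray(point) * res
            base = np.floor(scaled).astype(int)
            frac = scaled - base
            feat = np.zeros(FEAT_DIM)
            for corner in range(8):          # the 8 corners of the voxel
                offset = [(corner >> d) & 1 for d in range(3)]
                w = np.prod([f if o else 1 - f for f, o in zip(frac, offset)])
                feat += w * tables[level, hash_corner(base + offset)]
            features.append(feat)
        return np.concatenate(features)      # fed to a tiny MLP in practice

    print(encode([0.3, 0.6, 0.9]).shape)     # -> (8,)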

This work pairs with a University of Toronto collaboration that compresses 3D neural graphics primitives, just as JPEG is used to compress 2D images. This can help users store and share 3D maps and entertainment experiences across small devices such as phones and robots. 

There are more than 300 Nvidia researchers around the globe, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars, and robotics.