Evolve Benchmark

Evolve is a revolutionary benchmark that can measure how well your GPU performs in various categories. Neural graphics and ray-tracing are a big part of that.
I became the product owner for Evolve's rendering team. This meant keeping track of the tasks for a team of seven rendering engineers and steering the team's direction based on a weekly demo review. Alongside this, I kept developing rendering features for the product.



Evolve Benchmark | Official Website
Evolve Benchmark | Steam Store

Neural Radiance Caching

During my time on Evolve, I implemented the Neural Radiance Caching paper by Thomas Müller et al. Because measuring cutting-edge GPU workloads is central to Evolve, neural graphics make up a big part of the benchmark. My implementation stayed close to the paper, apart from some minor bug fixes. Later, the team built on that foundation to add improvements from follow-up papers. The video on the left shows the final implementation, including those improvements from my colleagues.

LOD System & AABBs

For Evolve, I was tasked with building a LOD system from scratch. The main goal of having LODs was to run very complex scenes on mobile. We needed simplified material lobes at a distance as well as simpler geometry, and all of it had to work with both ray tracing and rasterization. I ended up building a system that selects the LOD for each object in the scene based on its projected area on a sphere around the camera. This makes LOD selection independent of the view direction, which is important for ray tracing, where rays can hit objects outside the camera frustum. A set of compute passes updates the BLAS instance buffers before a TLAS rebuild is triggered.
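To make the idea concrete, here is a minimal CPU-side sketch in C++ of how a LOD could be picked from the projected area on the camera sphere. It is not the Evolve implementation: the function names, the use of a bounding sphere per object, and the rule that each LOD step corresponds to a 4x drop in coverage are all assumptions made for illustration. Note that neither function takes a view direction, only a distance and a size.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Fraction (0..1) of a sphere around the camera that is covered by an object's
// bounding sphere. Hypothetical helper: assumes each object has a bounding sphere.
double sphere_coverage(double radius, double distance) {
    assert(radius >= 0.0 && distance >= 0.0);
    if (distance <= radius) {
        return 1.0;  // camera inside or touching the bounding sphere: treat as fully covered
    }
    const double s = radius / distance;  // sine of the half-angle of the visible cone
    // Solid angle of the cone is 2*pi*(1 - cos(theta)); dividing by 4*pi gives the fraction.
    return 0.5 * (1.0 - std::sqrt(1.0 - s * s));
}

// Picks a LOD index in [0, lod_count - 1]. LOD 0 is used at or above
// full_detail_coverage; each 4x drop in coverage (roughly a halving of apparent
// size) moves one LOD down. The 4x rule is an assumption, not taken from Evolve.
int select_lod(double coverage, int lod_count, double full_detail_coverage) {
    assert(lod_count > 0 && full_detail_coverage > 0.0);
    if (coverage <= 0.0) {
        return lod_count - 1;  // nothing visible: coarsest LOD
    }
    const double level = 0.5 * std::log2(full_detail_coverage / coverage);  // log base 4
    const double clamped =
        std::min(std::max(std::floor(level), 0.0), static_cast<double>(lod_count - 1));
    return static_cast<int>(clamped);
}
```

For example, `select_lod(sphere_coverage(radius, distance), 4, 0.01)` would return the LOD index for an object with four LOD levels, with full detail kicking in once the object covers about 1% of the sphere. On the GPU, a compute pass could run the same arithmetic per instance before writing the instance buffer.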

I also worked on a parallel AABB refitting algorithm for animated geometry. The wind system animating all the plants in the scene looked great, but it ate up our budget because we had to recalculate the AABBs using atomics. The same went for other animated geometry, such as the explorer and the dinosaurs.

To fix this, I developed a highly parallel AABB refitting algorithm that doesn't need atomics. Using wave intrinsics and a float-flip trick, I could compare AABB bounds as uints within a wave. Doing this in a tree structure (much like in my CDF blog post) saved multiple milliseconds on AABB calculations, bringing the total down to just ~200 microseconds.
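The float-flip trick and the tree-shaped reduction can be sketched on the CPU in C++. This is an illustration under assumptions, not the shader code from Evolve: `float_flip`, `float_unflip`, and `reduce_bounds` are hypothetical names, only one axis is reduced for brevity, and the pairwise loop stands in for what a GPU would do with wave-level min/max operations (for example HLSL's `WaveActiveMin` and `WaveActiveMax`) on the encoded uints.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Maps a float to a uint32 whose unsigned ordering matches the float ordering:
// negative floats have all bits inverted, non-negative floats get the sign bit set.
std::uint32_t float_flip(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    const std::uint32_t mask = (u & 0x80000000u) ? 0xFFFFFFFFu : 0x80000000u;
    return u ^ mask;
}

// Inverse of float_flip.
float float_unflip(std::uint32_t u) {
    const std::uint32_t mask = (u & 0x80000000u) ? 0x80000000u : 0xFFFFFFFFu;
    u ^= mask;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

struct Bounds1D {
    float lo;
    float hi;
};

// Pairwise tree reduction of min/max over encoded values (one axis of an AABB).
// Each pass halves the number of active elements, so there are O(log n) passes
// and no atomics are involved.
Bounds1D reduce_bounds(const std::vector<float>& values) {
    assert(!values.empty());
    const std::size_t n = values.size();
    std::vector<std::uint32_t> lo(n), hi(n);
    for (std::size_t i = 0; i < n; ++i) {
        lo[i] = hi[i] = float_flip(values[i]);
    }
    for (std::size_t stride = 1; stride < n; stride *= 2) {
        for (std::size_t i = 0; i + stride < n; i += 2 * stride) {
            lo[i] = std::min(lo[i], lo[i + stride]);  // unsigned compare == float compare
            hi[i] = std::max(hi[i], hi[i + stride]);
        }
    }
    return {float_unflip(lo[0]), float_unflip(hi[0])};
}
```

Because the encoded values are plain unsigned integers, they can be fed to integer-only min/max operations and decoded back to floats only once, at the end of the reduction.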

Location

Breda, Noord-Brabant
The Netherlands