Exploring the effects of synthetic data generation: a case study on autonomous driving for semantic segmentation

Sample renders tested with different material authoring strategies.

Abstract

Rendering 3D virtual scenarios has become a popular way to generate per-pixel-labeled image datasets, especially in fields like autonomous driving. The approach is valuable for training neural perception models, such as semantic segmentation networks, when real data is scarce, expensive, or difficult to collect. However, fundamental questions persist within the research community regarding the generation and processing of these synthetic images, in particular about the key factors that influence the performance of deep learning models trained on them. In response, we conducted a series of experiments to elucidate the impact that common aspects of rendered synthetic image generation have on neural semantic segmentation performance. Our study used a recent autonomous driving synthetic dataset as its main testbed, allowing us to investigate the effect of different approaches to modeling geometric, material, and lighting detail. We also studied the impact of the rendering noise typically produced by path-tracing algorithms, as well as that of different color transformations and tonemapping algorithms.
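To give a concrete sense of the tonemapping step mentioned above, here is a minimal sketch of one common global operator (the Reinhard operator), which compresses linear HDR radiance from a renderer into displayable low-dynamic-range values. This is only an illustrative example of the kind of transformation studied; the paper does not specify that this exact operator was used, and the variable names and sample values are hypothetical.

```python
def reinhard_tonemap(x, gamma=2.2):
    # Reinhard global operator: compress radiance in [0, inf)
    # into [0, 1), then gamma-encode for display.
    ldr = x / (1.0 + x)
    return ldr ** (1.0 / gamma)

# Hypothetical linear HDR pixel values from a path tracer
hdr = [0.0, 0.5, 1.0, 4.0, 16.0]
ldr = [reinhard_tonemap(v) for v in hdr]
```

Because such operators are nonlinear, they change local contrast and color statistics, which is one reason the choice of tonemapping can affect a segmentation model trained on the resulting images.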

Publication
The Visual Computer
Omar A. Mures