Realistic Object Placement Matters

Wed Apr 10 2024

Synthetic data has become a key tool for training object detection models. Yet a persistent challenge remains: the domain gap between synthetic training data and real test data. At Inverted AI, we focus on the content gap. Specifically, we explore the impact of realistic, diverse object placements.

The performance of machine learning models, especially object detectors, suffers when a domain gap exists between training and test data. This paper by Inverted AI isolates the object placement distribution variable and investigates its influence on the performance of vision-based object detectors in driving contexts.

Using the CARLA driving simulator, we generate synthetic data for 3D object detection. The experiment pits a baseline model, in which objects move freely, against our commercial model, INITIALIZE. What sets INITIALIZE apart is its ability to sample realistic vehicle placements, which brings the synthetic dataset closer to real traffic scenes. To ensure a fair comparison, we carefully control other data aspects, including object types, appearances, counts, weather conditions, and locations.

Here are some samples of training set images with the baseline vehicle placement on the left and realistic vehicle placement on the right.

We train a monocular 3D detection model (PGD) on both the baseline and INITIALIZE datasets and evaluate each on the KITTI validation set. Realistic object placements yield remarkable improvements in average precision for 2D and 3D bounding boxes, bird's eye view, and average orientation similarity across different difficulty levels. Visualizations highlight the enhanced performance of the model trained on INITIALIZE data in predicting 3D bounding boxes on real-world data. The images on the left depict predictions of the model trained on synthetic data with baseline vehicle placements, while the images on the right show predictions from the model trained on synthetic data with realistic vehicle placements.
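For readers less familiar with the metrics named above, here is a minimal Python sketch of their building blocks: box overlap (IoU), the per-detection orientation similarity term that underlies average orientation similarity, and an interpolated average precision. It is a simplified illustration, not the official KITTI evaluation code or the code used in our experiment. The function names, the axis-aligned box format, and the default of 11 recall points are assumptions made for this example.

```python
import math


def iou_2d(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def orientation_similarity(delta_theta):
    """Map an angle error in radians to [0, 1]; 1 means identical heading."""
    return (1.0 + math.cos(delta_theta)) / 2.0


def interpolated_ap(recalls, precisions, num_points=11):
    """Average, over evenly spaced recall thresholds, of the best precision
    achieved at any recall at or above the threshold.

    num_points is an assumption for this sketch; benchmarks differ in how
    many recall points they sample.
    """
    total = 0.0
    for i in range(num_points):
        threshold = i / (num_points - 1)
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(candidates) if candidates else 0.0
    return total / num_points


if __name__ == "__main__":
    print(iou_2d((0, 0, 2, 2), (1, 1, 3, 3)))
    print(orientation_similarity(math.pi / 2))
    print(interpolated_ap([0.5, 1.0], [1.0, 0.5]))
```

A detection counts as correct when its overlap with a ground-truth box exceeds a chosen threshold. The same interpolation can be applied to orientation similarity instead of precision to obtain an orientation-aware score.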

Our experiment provides compelling evidence that realistic object placement distribution significantly enhances the performance of object detection models in real-world scenarios. This sheds light on a critical consideration often overlooked—realistic object placement—when curating synthetic datasets. By prioritizing realistic object placement, we pave the way for datasets that mirror real-world complexities, setting the stage for more resilient and accurate object detection models.