Unsupervised Moving Object Segmentation with Atmospheric Turbulence

Clemson University, Arizona State University, George Mason University

Comparison with SOTA.

Paper abstract

Moving object segmentation in the presence of atmospheric turbulence is highly challenging due to turbulence-induced irregular and time-varying distortions. In this paper, we present an unsupervised approach for segmenting moving objects in videos degraded by atmospheric turbulence. Our key idea is a detect-then-grow scheme: we first identify a small set of pixels that belong to moving objects with high confidence, then gradually grow a foreground mask from those seeds that segments all moving objects in the scene. To disentangle different types of motions, we check the rigid geometric consistency among video frames and use the Sampson distance to initialize the seed pixels. After growing per-frame foreground masks, we use a spatial grouping loss and a temporal consistency loss to further refine the masks and ensure their spatio-temporal consistency. Our method is unsupervised and does not require training on labeled data. For validation, we collect and release the first real-captured long-range turbulent video dataset with ground-truth masks for moving objects. We evaluate our method both qualitatively and quantitatively on this real dataset. Results show that our method achieves good accuracy in segmenting moving objects and is robust for long-range videos with various turbulence strengths.

Pipeline

Our method starts by calculating bidirectional optical flow. To disentangle actual object motion from turbulent motion, we use a novel epipolar geometry-based consistency check to generate motion feature maps that only preserve object motions. We then adopt a region-growing scheme that generates per-object motion segmentation masks from a small set of seed pixels, as sketched below. Finally, we develop a U-Net trained with our proposed bidirectional consistency losses and a pixel grouping function to improve the spatio-temporal consistency of the estimated motion segmentation masks.
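
To make the region-growing scheme concrete, here is a minimal sketch of hysteresis-style growing from seed pixels on a motion feature map. The quantile thresholds tau_seed and tau_grow and the 4-connected neighborhood are illustrative assumptions, not the paper's exact procedure.

import numpy as np
from collections import deque

def grow_mask(motion_feat, tau_seed=0.9, tau_grow=0.5):
    """Hysteresis-style region growing on a motion feature map.

    motion_feat: (H, W) map; higher values mean the pixel is more
    likely to belong to a moving object. tau_seed / tau_grow are
    illustrative quantile thresholds (assumptions): seeds come from
    high-confidence pixels, growth accepts weaker neighbors.
    """
    seed_thr = np.quantile(motion_feat, tau_seed)
    grow_thr = np.quantile(motion_feat, tau_grow)
    H, W = motion_feat.shape
    mask = np.zeros((H, W), dtype=bool)

    # Initialize the frontier with high-confidence seed pixels.
    frontier = deque(zip(*np.where(motion_feat >= seed_thr)))
    for y, x in frontier:
        mask[y, x] = True

    # Breadth-first growth: accept 4-connected neighbors that pass
    # the lower threshold, so the mask expands from seeds outward.
    while frontier:
        y, x = frontier.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < H and 0 <= nx < W and not mask[ny, nx] \
                    and motion_feat[ny, nx] >= grow_thr:
                mask[ny, nx] = True
                frontier.append((ny, nx))
    return mask

The two-threshold design mirrors the detect-then-grow idea: a strict threshold keeps the seeds reliable, while the looser growth threshold lets the mask cover the full object without picking up isolated turbulence-induced responses.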

Motion disentanglement

We first tackle the problem of motion disentanglement, a major challenge posed by turbulence perturbation in rigid motion analysis. Our key idea is to check the rigid geometric consistency among video frames: pixels on moving objects do not obey the geometric consistency constraint posed by the image formation model. Specifically, we use the Sampson distance, which measures how well a pair of corresponding pixels conforms to a given epipolar geometry, to identify such pixels. Since direct optical flow estimates are susceptible to turbulence perturbation, we first average the optical flow between adjacent frames to stabilize them. We then calculate the Sampson distance using fundamental matrices estimated from the averaged optical flow. Next, we merge the Sampson distance maps into the motion feature maps {M_t | t = 1, 2, ..., T}. We use the motion feature map values as indicators of how likely a pixel undergoes rigid object motion (the higher the value, the higher the likelihood).
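
As a concrete sketch of this step, the snippet below estimates a fundamental matrix from subsampled correspondences of the averaged flow via RANSAC and evaluates the per-pixel Sampson distance. The subsampling stride and RANSAC parameters are illustrative assumptions, not values from the paper.

import numpy as np
import cv2

def sampson_map(avg_flow, stride=8):
    """Per-pixel Sampson distance under an estimated epipolar model.

    avg_flow: (H, W, 2) optical flow averaged over adjacent frames
              (averaging stabilizes turbulence-perturbed estimates).
    Returns an (H, W) map; high values mark pixels whose motion
    violates the rigid epipolar constraint, i.e. likely moving objects.
    """
    H, W = avg_flow.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    pts1 = np.stack([xs, ys], axis=-1).reshape(-1, 2)
    pts2 = pts1 + avg_flow.reshape(-1, 2)

    # Estimate the fundamental matrix from subsampled correspondences;
    # RANSAC treats moving-object pixels as outliers to the rigid model.
    F, _ = cv2.findFundamentalMat(
        pts1[::stride].astype(np.float32),
        pts2[::stride].astype(np.float32),
        cv2.FM_RANSAC, ransacReprojThreshold=1.0, confidence=0.999)
    if F is None:
        raise RuntimeError("fundamental matrix estimation failed")

    # Sampson distance:
    # (x2^T F x1)^2 / ((F x1)_1^2 + (F x1)_2^2 + (F^T x2)_1^2 + (F^T x2)_2^2)
    x1 = np.concatenate([pts1, np.ones((H * W, 1))], axis=1)
    x2 = np.concatenate([pts2, np.ones((H * W, 1))], axis=1)
    Fx1 = x1 @ F.T      # rows are F @ x1_i
    Ftx2 = x2 @ F       # rows are F^T @ x2_i
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return (num / np.maximum(den, 1e-12)).reshape(H, W)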

Quantitative comparison results

We organize videos into two sets, “normal turb.” and “severe turb.”, according to their exhibited turbulence strength. Our method significantly outperforms the state-of-the-art methods in motion segmentation accuracy under various turbulence strengths. In the normal cases, some methods can still achieve decent performance, yet our method scores much higher in all metrics. Compared to TMO, whose overall score is the highest among the three state-of-the-art methods, our accuracy is 60.1% higher in J and 34.9% higher in F. In the severe cases, the performance of all state-of-the-art methods degrades significantly, with all J values lower than 0.25 and all F values lower than 0.35. In contrast, our method remains relatively robust under strong turbulence.
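
For reference, J is region similarity (the Jaccard index, i.e. intersection-over-union of predicted and ground-truth masks) and F is the boundary F-measure, following the DAVIS evaluation protocol. Below is a simplified sketch of both metrics; the boundary-matching tolerance tol is an assumption, not the benchmark's exact morphology.

import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def jaccard_j(pred, gt):
    """Region similarity J: intersection-over-union of binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = (pred | gt).sum()
    return 1.0 if union == 0 else (pred & gt).sum() / union

def boundary_f(pred, gt, tol=3):
    """Boundary F-measure: precision/recall of mask contours matched
    within a tol-pixel band (tol is an illustrative assumption)."""
    def contour(m):
        m = m.astype(bool)
        return m & ~binary_erosion(m)   # 1-pixel-wide mask boundary
    pb, gb = contour(pred), contour(gt)
    if pb.sum() == 0 and gb.sum() == 0:
        return 1.0
    # A boundary pixel counts as matched if the other mask's boundary
    # passes within tol pixels of it.
    pb_hit = (pb & binary_dilation(gb, iterations=tol)).sum()
    gb_hit = (gb & binary_dilation(pb, iterations=tol)).sum()
    precision = pb_hit / max(pb.sum(), 1)
    recall = gb_hit / max(gb.sum(), 1)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)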

BibTeX

@inproceedings{qin2024unsupervised,
  title={Unsupervised Object Segmentation for Video With Atmospheric Turbulence},
  author={Qin, Dehao and Saha, Ripon Kumar and Chung, Woojeh and Ye, Jinwei and Jayasuriya, Suren and Li, Nianyi},
  booktitle={European Conference on Computer Vision},
  year={2024},
  organization={Springer}
}