Perform semantic segmentation on images, using Facebook's Mask2Former models.
Streetview Segmentation script performs semantic segmentation on images, using models from Facebook's collection of Mask2Former models. Output consists of a CSV-file detailing, for each input image, the number of pixels in each semantic class in that image. Optionally, it can save a copy of each input image overlaid with the semantic segmentation. If the input images are 360° photo's, the script provides the possibility of tranforming them by projecting them onto a cube, resulting in six images per input image.
This script is specifically designed to run on a computer without a GPU. Some of the underlying libraries require the presence of CUDA-drivers to run, even if the actual device is absent. As it can be problematic to install such drivers on a computer without an actual GPU, the program is packaged as Docker-container, based on an official NVIDIA-image, which comes with pre-installed drivers. Building and running the container requires the presence of Docker engine. Note that the container is based on a Linux image (Ubuntu 18.04); running it on a Windows computer may require extra configuration.