January 30, 2023

A benchmark for satellite imagery segmentation models


Earth observation quietly supports us in our everyday lives. We rely on remote sensing when we watch a weather forecast, plan a holiday trip, commute, or even consider buying land for a new house. These are, of course, not the only applications. Other notable examples include:

  • Agriculture - monitoring crop health and assessing irrigation and fertilization needs.
  • Environmental monitoring - tracking land use changes or identifying potential pollution sources.
  • Disaster management - assessing damage from natural events like floods and wildfires.
  • Urban planning - observing and predicting cities' development.
  • Military and intelligence - gathering information regarding conflict situations and troop movements.
  • Oceanography - studying the world's oceans.
  • Geology and resource management - identifying mineral deposits.

Put simply, remote sensing is the gathering of information about objects or phenomena by acquiring data from satellites, radars, unmanned aerial vehicles (UAVs), various sensors, and more. Remote sensing projects are not only a source of valuable data; they also include essential activities such as processing, interpreting, and analyzing that data to form usable and accessible solutions.

Remote sensing is the subject of active scientific research in Earth and environmental sciences. Specialists in geographic information systems (GIS) are working on various issues. One such issue is the automatic classification of land use and land cover (LULC). 

At ReasonField Lab, we had the pleasure of working on the LULC problem with Dr. Marta Nalej, a scientist from The Faculty of Geographical Sciences, University of Łódź. The project was funded by the National Science Center.

Our responsibility was to support the researcher by providing her with several machine-learning models capable of semantic segmentation of satellite imagery.


There are many classification and semantic segmentation model architectures available. They differ in the number of parameters, in training and inference times, and in the quality of the results they produce. Moreover, a LULC segmentation model can be fed with various kinds of input data and trained with one of many loss functions. It is essential to know which combination to use to get the best results.
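To compare segmentation models on result quality, a common choice is the per-class intersection over union (IoU) between the predicted and the reference masks. The snippet below is a minimal sketch of that metric, not the evaluation code used in the project: the article does not state which metrics were used, and the function name `per_class_iou` and the tiny example masks are our own illustrative assumptions.

```python
import numpy as np


def per_class_iou(y_true: np.ndarray, y_pred: np.ndarray, num_classes: int) -> np.ndarray:
    """Return the IoU for each class; NaN where a class occurs in neither mask."""
    ious = np.full(num_classes, np.nan)
    for class_id in range(num_classes):
        true_mask = y_true == class_id
        pred_mask = y_pred == class_id
        union = np.logical_or(true_mask, pred_mask).sum()
        if union > 0:
            ious[class_id] = np.logical_and(true_mask, pred_mask).sum() / union
    return ious


# Hypothetical 2 x 3 masks with three land cover classes (0, 1, 2).
y_true = np.array([[0, 0, 1], [1, 2, 2]])
y_pred = np.array([[0, 1, 1], [1, 2, 0]])

ious = per_class_iou(y_true, y_pred, num_classes=3)
mean_iou = np.nanmean(ious)  # ignores classes absent from both masks
print(ious, mean_iou)
```

Reporting the per-class values next to the mean helps when some land cover classes are much rarer than others, because a single averaged number can hide poor performance on a small class.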

The research project aims to compare how well classifiers with different deep learning (DL) architectures based on convolutional neural networks can classify satellite images to obtain land use/land cover (LULC) data for the area of Poland.

We were present during all phases of the scientific process.


For the study, we used a benchmark dataset prepared by the researcher. The dataset consisted of preprocessed Sentinel-2 imagery labeled with segmentation masks created from the Topographic Objects Database (BDOT10k). We split each satellite scene into multiple smaller multispectral patches (from 128 × 128 px to 512 × 512 px). The dataset's spatial extent covers the whole country.
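For readers who want to picture the patching step, here is a minimal NumPy sketch of cutting a multispectral scene into square patches. It only illustrates the idea: the function name `split_into_patches`, the `(bands, height, width)` layout, the four-band synthetic scene, and the choice to drop incomplete edge patches are all assumptions made for this example.

```python
from typing import Optional

import numpy as np


def split_into_patches(scene: np.ndarray, patch_size: int, stride: Optional[int] = None) -> np.ndarray:
    """Cut a (bands, height, width) scene into square patches.

    Returns an array of shape (n_patches, bands, patch_size, patch_size).
    Pixels at the right and bottom edges that do not fill a whole patch are dropped.
    """
    if stride is None:
        stride = patch_size  # non-overlapping tiles by default
    bands, height, width = scene.shape
    patches = [
        scene[:, top:top + patch_size, left:left + patch_size]
        for top in range(0, height - patch_size + 1, stride)
        for left in range(0, width - patch_size + 1, stride)
    ]
    if not patches:
        return np.empty((0, bands, patch_size, patch_size), dtype=scene.dtype)
    return np.stack(patches)


# Hypothetical 4-band scene (e.g. red, green, blue, near-infrared), 300 x 400 pixels.
scene = np.arange(4 * 300 * 400, dtype=np.float32).reshape(4, 300, 400)
patches = split_into_patches(scene, patch_size=128)
print(patches.shape)  # (6, 4, 128, 128): 2 rows x 3 columns of patches
```

The same function has to be applied to the label mask with identical `patch_size` and `stride`, so that every image patch stays aligned with its segmentation mask.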

The preparation of the dataset was a complicated and time-consuming process. The appropriate satellite scenes had to be selected, corrected for the presence of clouds, and then divided into patches. The whole procedure was carried out using ArcGIS. With the dataset ready, we could proceed with training semantic segmentation models.

Input data source: Sentinel-2 scene, Pilica River region. RGB composition (left), Near-infrared (right).


In our task, implementing the chosen neural network architecture, based on the research paper selected by the researcher, was not the only key requirement. Ensuring that the results were repeatable and comparable was just as crucial to the scientific process.

Experiment tracking is important in machine learning (ML) because it lets you record the experiments you run, including the model architecture, hyperparameter settings, and datasets used in each. This information helps you understand how these factors influence your model's performance and identify which configurations work best.

In other words, we had to ensure that we could simultaneously provide relevant data to support the researcher in answering the specified scientific question and to enable the reproducibility of the results.

To do this, we needed a solution for training and testing models with various configuration parameters. We therefore created a high-quality code repository, which we decided to curate in cooperation with our client/researcher. In the future, we plan to make the code open-source to enable other researchers to validate and utilize our work. Open-science rocks!
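To make the idea of configurable, reproducible runs more concrete, here is a minimal sketch that uses only the Python standard library. It is not the code from our repository: the `ExperimentConfig` fields, the example values such as `"unet"` and `"cross_entropy"`, and the placeholder score are all hypothetical. The point is that every run is described by an explicit configuration, seeded from it, and stored together with its metrics.

```python
import hashlib
import json
import random
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Everything needed to repeat a run (hypothetical fields)."""
    architecture: str
    loss: str
    patch_size: int
    learning_rate: float
    seed: int

    def run_id(self) -> str:
        # Identical configurations always map to the same identifier.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def run_experiment(config: ExperimentConfig) -> dict:
    """Stand-in for training and evaluation; returns a record ready for logging."""
    rng = random.Random(config.seed)  # all randomness derives from the stored seed
    placeholder_score = rng.random()  # a real run would train and evaluate a model here
    return {
        "run_id": config.run_id(),
        "config": asdict(config),
        "metrics": {"placeholder_score": placeholder_score},
    }


config = ExperimentConfig("unet", "cross_entropy", 256, 1e-3, seed=42)
record = run_experiment(config)
print(json.dumps(record, sort_keys=True))  # one JSON line per run is easy to compare later
```

Appending one such JSON line per run to a log file is a lightweight starting point; a dedicated experiment tracking tool can replace it later without changing how runs are configured.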

Here are some of the results that we were able to produce with our models. We are sure they will be incredibly interesting for everyone dealing with LULC.

Semantic segmentation model results. Different colors represent different land cover classes, e.g., red segments represent urban areas. Background: OpenStreetMap.


We are delighted that we implemented the project with a scientist from the University of Łódź. Supporting the scientific process by providing scientists access to advanced machine learning and deep learning techniques is part of the ReasonField Lab mission. We hope to continue the work in cooperation with the Faculty of Geographical Sciences.

At ReasonField Lab, we provide a holistic scientific approach to machine learning projects with Science as a Service. If you want to find out how we can help you with your research, let us know at hello@reasonfieldlab.com.

If you are interested in the work of Dr. Marta Nalej or would like to apply the results of her study on utilizing semantic segmentation models in LULC, please visit her profile on ResearchGate or contact her at marta.nalej@geo.uni.lodz.pl.