Automated Metadata Extraction and Spatiotemporal Cataloging of High-Resolution PlanetScope Imagery

Teacher: Giovanna Venuti

Tutor: Dr. Daniela Stroppiana – CNR/IREA

Description: This project involves the development of a robust, automated pipeline in Python (or R) to parse, extract, and structure metadata from large PlanetScope datasets. PlanetScope data are characterized by high temporal frequency and high spatial resolution (~3 m), but the volume of scenes requires efficient data management and checks before analysis. The script will run locally on a directory structure and perform the following tasks: navigate through nested folders to identify image files (in .tiff format or, where applicable, inside zip archives); extract spatial and temporal metadata (spatial extent, cloud coverage, date of acquisition) and the number of bands; check whether each image is accompanied by a UDM layer; and, for each scene, extract the categorical information contained in the UDM layer (classes and histogram). The result will be a reusable toolkit that significantly reduces the manual labor involved in data preparation. The script will be “exportable,” allowing researchers to point it at any folder and receive a clean, searchable inventory of their PlanetScope assets, facilitating the subsequent analysis.
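As a rough illustration of the folder-navigation and cataloging steps described above, the following minimal Python sketch uses only the standard library. It rests on assumptions that are not part of the project description: that scene file names start with an acquisition timestamp of the form YYYYMMDD_HHMMSS_, and that UDM files can be recognized by the substring "udm" in their name. The function names (describe, build_inventory, write_inventory) and the CSV columns are hypothetical. Spatial extent, cloud coverage, band count and the UDM histogram are deliberately left out, since they would require opening the rasters with a geospatial library.

```python
import csv
import re
import zipfile
from datetime import datetime
from pathlib import Path

# Assumption: file names begin with YYYYMMDD_HHMMSS_ (adjust to the actual delivery).
SCENE_RE = re.compile(r"^(\d{8})_(\d{6})_")
RASTER_SUFFIXES = {".tif", ".tiff"}
FIELDS = ["file", "container", "acquired", "is_udm"]


def describe(name, container):
    """Build one inventory row from a raster file name."""
    base = Path(name).name
    acquired = ""
    match = SCENE_RE.match(base)
    if match:
        try:
            stamp = datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")
            acquired = stamp.isoformat()
        except ValueError:
            acquired = ""  # digits present but not a valid date/time
    return {
        "file": base,
        "container": str(container),  # parent folder, or the zip archive path
        "acquired": acquired,
        # Assumption: UDM layers carry "udm" somewhere in the file name.
        "is_udm": "udm" in base.lower(),
    }


def build_inventory(root):
    """Walk root recursively and list rasters found on disk or inside zip archives."""
    rows = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in RASTER_SUFFIXES:
            rows.append(describe(path.name, path.parent))
        elif suffix == ".zip" and zipfile.is_zipfile(path):
            # Only the archive listing is read; nothing is extracted.
            with zipfile.ZipFile(path) as archive:
                for member in archive.namelist():
                    if Path(member).suffix.lower() in RASTER_SUFFIXES:
                        rows.append(describe(member, path))
    return rows


def write_inventory(rows, csv_path):
    """Write the inventory rows to a CSV file."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

A call such as write_inventory(build_inventory("path/to/scenes"), "inventory.csv") would then produce a table that can be filtered by date or by the presence of a UDM file. Reading the listing of a zip archive without extracting it keeps the scan light on large deliveries; the per-scene raster metadata could later be added as extra columns to the same rows.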

Technologies: Python or R