Authors
Jerry Gammie (University of Colorado Boulder, Computer Science), Cibele Amaral (CIRES), Nayani Ilangakoon (CIRES), Anna LoPresti (University of Colorado Boulder, Ecology and Evolutionary Biology), Laura Dee (University of Colorado Boulder, Ecology and Evolutionary Biology), Claire Monteleoni (University of Colorado Boulder, Computer Science), Natasha Stavros (Space & Mission Systems, BAE Systems, Inc.)
Abstract
Wildfires pose an ever-increasing threat to human health, safety, and infrastructure due to climate change, land use, and past management strategies. Ecology research has shown that more prescribed burns are necessary to reduce the negative impacts of wildfires. As part of a data-driven approach to predicting the ecological outcomes of prescribed burns, we aim to predict proxies of ecological functions such as ECOSTRESS Water Use Efficiency (WUE) from other remotely sensed data, including elevation, land cover, and 30-year climate data. Our work focuses on making predictions within the state of Colorado, with the goal that our methods could be scaled to the Western United States. These datasets must be combined and reformulated before they can be analyzed. Widely used paradigms for extracting analysis-ready datasets from rasterized sources, particularly data cubes, struggle when working with multiple resolutions: downsampling fine layers to a common grid loses information, while upsampling coarse layers needlessly replicates parameters and bloats dataset size. Both are serious problems for the machine learning (ML) pipeline and its scalability.
We have developed a novel formulation for analysis-ready datasets designed specifically for multi-resolution geospatial data, the "data pyramid," that keeps each layer at its original resolution. This retains the information in fine layers without upsampling coarse layers and needlessly replicating parameters. With our experimental data, this method reduced the number of parameters per sample, and thus the storage cost of the entire analysis-ready dataset, by more than 36% versus full-resolution data cubes.
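The storage argument can be illustrated with a small counting exercise. The sketch below is not the authors' code: the `Layer` class, the helper functions, the layer list, and the 900 m patch size are all hypothetical values chosen for illustration, so the resulting ratio is not comparable to the 36% figure reported above. It counts the values stored per sample when every layer is resampled to the finest resolution (a full-resolution data cube) and when each layer keeps its native resolution (a data pyramid).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One raster source; all values used below are hypothetical."""
    name: str
    bands: int
    resolution_m: int  # ground length of one pixel edge, in meters


def pixels_per_side(patch_side_m: int, resolution_m: int) -> int:
    # A layer coarser than the patch still contributes at least one pixel.
    return max(1, patch_side_m // resolution_m)


def cube_values(layers, patch_side_m: int) -> int:
    """Values per sample when every layer is resampled to the finest grid."""
    finest = min(layer.resolution_m for layer in layers)
    side = pixels_per_side(patch_side_m, finest)
    return sum(layer.bands for layer in layers) * side * side


def pyramid_values(layers, patch_side_m: int) -> int:
    """Values per sample when each layer keeps its native resolution."""
    return sum(
        layer.bands * pixels_per_side(patch_side_m, layer.resolution_m) ** 2
        for layer in layers
    )


# Hypothetical layer set, not the dataset described in the abstract.
LAYERS = [
    Layer("elevation", bands=1, resolution_m=30),
    Layer("land_cover", bands=1, resolution_m=30),
    Layer("climate", bands=4, resolution_m=900),
]
PATCH_SIDE_M = 900

cube = cube_values(LAYERS, PATCH_SIDE_M)
pyramid = pyramid_values(LAYERS, PATCH_SIDE_M)
saving = 1 - pyramid / cube
print(f"cube: {cube} values, pyramid: {pyramid} values, saving: {saving:.1%}")
```

In this toy setup the saving comes entirely from the coarse climate layer, which the cube replicates across every fine pixel while the pyramid stores it once per band.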
We have developed novel multi-task, multi-scale machine learning architectures to take advantage of data pyramids. Our convolutional neural networks (CNNs) take each resolution group as a separate branch and combine them within the convolutional layers of the model, implicitly aligning layers and reconciling resolutions. The model makes multi-task predictions in a mirrored manner. This architecture, paired with data pyramids, reaches higher performance far more quickly than similar single-task models. It achieves similar performance to models paired with full-resolution data cubes, but with substantial storage savings and slightly faster model training times. Our contributions lie in our novel data formulation, our novel model architecture, and the high performance of our model, which may help to provide more information to decision-makers planning prescribed burns that enhance ecosystem functions and services.
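The abstract does not specify how the branches are merged, so the following is only a minimal sketch of the shape bookkeeping involved, under stated assumptions. A fine-resolution branch is reduced to the coarse grid and then stacked with the coarse branch along the channel axis. Fixed block averaging stands in for the learned (for example, strided) convolutions a real CNN would use, and the names `avg_pool` and `fuse` are hypothetical.

```python
def avg_pool(grid, factor):
    """Average non-overlapping factor x factor blocks of a square 2D list."""
    size = len(grid) // factor
    return [
        [
            sum(
                grid[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ) / factor ** 2
            for c in range(size)
        ]
        for r in range(size)
    ]


def fuse(fine_channels, coarse_channels, factor):
    """Bring fine channels onto the coarse grid, then stack all channels."""
    pooled = [avg_pool(channel, factor) for channel in fine_channels]
    return pooled + coarse_channels


# Toy inputs: one 4x4 fine channel and one 2x2 coarse channel.
fine = [[[float(r * 4 + c) for c in range(4)] for r in range(4)]]
coarse = [[[10.0, 20.0], [30.0, 40.0]]]

fused = fuse(fine, coarse, factor=2)
print(len(fused), "channels on a", len(fused[0]), "x", len(fused[0][0]), "grid")
```

The point of the sketch is that the coarse branch is never upsampled: the two resolutions only meet after the fine branch has been reduced, which is one way the alignment described above could happen inside the network rather than in the dataset.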