Publications
* denotes equal contribution and joint lead authorship.
2024
Transformer Approach to Nowcasting Solar Energy Using Geostationary Satellite Data
Applied Energy. 2025
Unpredicted spatial and temporal variability of global horizontal irradiance (GHI) reaching the photovoltaic panels presents a challenge for integrating solar power into the grid stably and cost-effectively at a regional scale. Therefore, there is a recognized demand for large-scale GHI nowcasting that is both timely and accurate, an area where most existing studies fall short. This study introduces the SolarFormer model, which utilizes satellite data and incorporates a gated recurrent unit for near real-time GHI estimation. It also includes a space-time transformer to provide forecasts with a 3-h lead time at 15-min intervals, maintaining accuracy without significant degradation over extended lead times. SolarFormer requires only the selected satellite band information shared by GOES-16 and Himawari-8 as the dynamic input, enabling near-real-time application across all areas covered by these satellites. This feature makes it accessible and efficient for large-scale energy planning. We validate the forecasting result with the ground-measured GHI over seven SURFRAD stations in 2018. The model achieves an hourly prediction root-mean-square error (relative root-mean-square error) of 93.8 W/m2 (15.0%), 118.9 W/m2 (19.8%), and 129.1 W/m2 (24.2%) with 1–3 h lead time respectively. These results demonstrate lower root-mean-square error compared to existing hourly updated numerical weather prediction modeling, such as High-Resolution Rapid Refresh, and deep learning models, such as ConvLSTM. Moreover, the study highlights the potential of SolarFormer for extended lead-time forecasting due to its high computation and memory efficiency compared with the above-mentioned models, potentially benefiting long-term energy planning and power market bidding and clearing. 
However, SolarFormer exhibits accumulated bias as the predicted lead time increases and faces challenges in predicting GHI in the early morning due to the invalid visible satellite bands during the night, suggesting areas for improvement in future studies.
When are Foundation Models Effective? Understanding the Suitability for Pixel-Level Classification Using Multispectral Imagery
Preprint, 2024.
Foundation models, i.e., very large deep learning models, have demonstrated impressive performances in various language and vision tasks that are otherwise difficult to reach using smaller-size models. The major success of GPT-type language models is particularly exciting and raises expectations on the potential of foundation models in other domains including satellite remote sensing. In this context, great efforts have been made to build foundation models to test their capabilities in broader applications, and examples include Prithvi by NASA-IBM, Segment-Anything-Model, ViT, etc. This leads to an important question: Are foundation models always a suitable choice for different remote sensing tasks, and when or when not? This work aims to enhance the understanding of the status and suitability of foundation models for pixel-level classification using multispectral imagery at moderate resolution, through comparisons with traditional machine learning (ML) and regular-size deep learning models. Interestingly, the results reveal that in many scenarios traditional ML models still have similar or better performance compared to foundation models, especially for tasks where texture is less useful for classification. On the other hand, deep learning models did show more promising results for tasks where labels partially depend on texture (e.g., burn scar), while the difference in performance between foundation models and deep learning models is not obvious. The results conform with our analysis: The suitability of foundation models depends on the alignment between the self-supervised learning tasks and the real downstream tasks, and the typical masked autoencoder paradigm is not necessarily suitable for many remote sensing problems.
SimFair: Physics-Guided Fairness-Aware Learning with Simulation Models
The 38th AAAI Conference on Artificial Intelligence (AAAI'24). Vancouver, Canada. 2024
Fairness-awareness has emerged as an essential building block for the responsible use of artificial intelligence in real applications. In many cases, inequity in performance is due to the change in distribution over different regions. While techniques have been developed to improve the transferability of fairness, a solution to the problem is not always feasible with no samples from the new regions, which is a bottleneck for pure data-driven attempts. Fortunately, physics-based mechanistic models have been studied for many problems with major social impacts. We propose SimFair, a physics-guided fairness-aware learning framework, which bridges the data limitation by integrating physical-rule-based simulation and inverse modeling into the training design. Using temperature prediction as an example, we demonstrate the effectiveness of the proposed SimFair in fairness preservation.
2023
High-Fidelity Deep Approximation of Ecosystem Simulation over Long-Term at Large Scale
ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL'23). Hamburg, Germany. 2023.
Oral
Ecosystem services, such as carbon sequestration, biodiversity, and climate regulation, play essential roles in combating climate change. Projection of ecosystem dynamics under various scenarios is critical in understanding potential impacts and informing policies and mitigation strategies. The Ecosystem Demography (ED) model is a major mechanistic model for ecosystem dynamics projection, but its computational cost has been a major bottleneck in performing large-scale (e.g., global, national) projections at very high spatial resolution. We aim to approximate the ED model using deep neural networks at operational high accuracy to assist large-scale climate studies. The deep approximation is non-trivial due to challenges posed by long-term error accumulation (e.g., 40 years), highly diverse scenarios, and high cost in training data generation. We propose a Deep-ED approximation model to address the challenges with a multi-scale cumulative loss reduction structure, significance-based scenario partitioning, self-guided forwarding, and physics-aware active learning strategies. Experiment results in the northeastern US demonstrate the high accuracy of Deep-ED and its potential in large-scale ecosystem projection.
An Evaluation of the NOAA Global Daily Gap-filled VIIRS Surface Albedo
Remote Sensing of Environment (RSE), 2023
The land surface albedo product of visible infrared imaging radiometer suite (VIIRS) in National Oceanic and Atmospheric Administration (NOAA)’s operational system provides real-time, global daily mean surface albedo, which is a required parameter in the estimation of the daily shortwave net radiation budget. The global gridded VIIRS albedo product is derived from the level-2 granule surface albedo product, which is generated using a Direct Estimation Method. Special gridding and compositing algorithms were developed for aggregating the granular albedo data into the gridded albedo product. This paper describes the design and evaluation of the NOAA VIIRS gridded daily surface albedo product. The cloudy condition, retrieval path, retrieval method, and observing geometry are the criteria in deciding the priority order in the composition processing. The proposed albedo product possesses a complete spatial coverage over global land and ice surface and provides a timely response to surface dynamics. The validation of satellite retrieved daily mean albedo against ground counterparts over a series of well-maintained networks demonstrates the reliability of the composed albedo considering the interference of the seasonal surface heterogeneity conditions around each site. The inter-comparison between the S-NPP and NOAA-20 VIIRS albedo shows good agreement, except for a minor bias related to solar/view angle differences. The cross-comparison between VIIRS albedo and MODIS albedo shows good consistency with some deviations related to the controversy between their upstream snow mask.
Global Hourly, 5 km, All-sky Land Surface Temperature Data from 2011 to 2021 Based on Integrating Geostationary and Polar-orbiting Satellite Data
Earth System Science Data (ESSD), 2023
Land surface temperature (LST) plays a dominant role in the surface energy budget (SEB) and hydrological cycling. Thermal infrared (TIR) remote sensing is the primary method of estimating LST globally. However, cloud cover leaves numerous data gaps in satellite LST products, which seriously restricts their applications. Efforts have been made to produce gap-free LST products from polar-orbiting satellites (e.g., Terra and Aqua); however, satellite data from limited overpasses are not suitable for characterizing the diurnal temperature cycle (DTC), which is directly related to heat waves, plant water stress, and soil moisture. Considering the high temporal variability in LST and the importance of the DTC, we refined the SEB-based cloudy-sky LST recovery method by improving its feasibility and efficiency and produced a global hourly, 5 km, all-sky land surface temperature (GHA-LST) dataset from 2011 to 2021. The GHA-LST product was generated using TIR LST products from geostationary and polar-orbiting satellite data from the Copernicus Global Land Service (CGLS) and the Moderate Resolution Imaging Spectroradiometer (MODIS). Based on ground measurements at the 201 global sites from the Surface Radiation Budget (SURFRAD), Baseline Surface Radiation Network (BSRN), Fluxnet, AmeriFlux, Heihe River basin (HRB), and Tibetan Plateau (TP) networks, the overall root-mean-square error (RMSE) of the hourly GHA-LST product was 3.31 K, with a bias of −0.57 K and R2 of 0.95. Thus, this product was more accurate than the clear-sky CGLS and MODIS MYD21C1 LST samples. The RMSE value of the daily mean LST was 1.76 K. Validation results at individual sites indicate that the GHA-LST dataset has relatively larger RMSEs for high-elevation regions, which can be attributed to high surface heterogeneity and input data uncertainty. 
Temporal and spatial analyses suggested that GHA-LST has satisfactory spatiotemporal continuity and reasonable variation and matches the reference data well at hourly and daily scales. Furthermore, the regional comparison of GHA-LST with other gap-free hourly datasets (ERA5 and Global Land Data Assimilation System, GLDAS) demonstrated that GHA-LST can provide more spatial texture information. The monthly anomaly analysis suggests that GHA-LST couples well with global surface air temperature datasets and other LST datasets at daily mean and minimum temperature scales, whereas the maximum temperature and diurnal temperature range of LST and air temperature (AT) have different anomalous magnitudes. The GHA-LST dataset is the first global gap-free LST dataset at an hourly, 5 km scale with high accuracy, and it can be used to estimate global evapotranspiration, monitor extreme weather, and advance meteorological forecasting models. GHA-LST is freely available at https://doi.org/10.5281/zenodo.7487284 (Jia et al., 2022b) and http://glass.umd.edu/allsky_LST/GHA-LST (last access: 10 February 2023; Jia et al., 2022c).
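For readers unfamiliar with the validation statistics quoted in the abstract above (RMSE, bias, and R2), the following is a minimal Python sketch of how such metrics are commonly computed from paired product and ground values. It is not the authors' code: the function name validation_metrics is hypothetical, and R2 is assumed here to be the squared Pearson correlation, which the abstract does not specify.

```python
import math


def validation_metrics(predicted, observed):
    """Return (rmse, bias, r2) for paired predicted and observed values.

    Assumption: r2 is the squared Pearson correlation coefficient.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(predicted)
    errors = [p - o for p, o in zip(predicted, observed)]

    # Root-mean-square error: typical magnitude of the error.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Bias: mean signed error (negative means underestimation).
    bias = sum(errors) / n

    # Squared Pearson correlation between the two series.
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    r2 = (cov * cov) / (var_p * var_o)

    return rmse, bias, r2
```

Called with temperatures in kelvin, the function returns RMSE and bias in kelvin and a unitless R2, matching the form of the figures reported for the hourly product.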
2022
Characterizing the Dynamics of Wildland-urban Interface and the Potential Impacts on Fire Activity in Alaska from 2000 to 2010
Landscape and Urban Planning (LUP), 2022
Climate change is exacerbating the fire activity in Alaska, which exposes lives and properties to great risk, especially residents living in Wildland-Urban Interface (WUI). Therefore, it is crucial to characterize the spatial distribution and temporal dynamics of WUI and assess its impacts on fire activity. However, existing WUI delineations in Alaska do not cover all communities and apply different mapping approaches, making it difficult to examine the WUI distribution and dynamics across the state. This study created the first statewide WUI map using census data and National Land Cover Database, and characterized the dynamics of WUI from 2000 to 2010. Furthermore, the relationship between WUI and fire was identified using fire ignition and fire perimeter datasets from Alaska Interagency Coordination Center. The findings showed that WUI, which covered only 0.22 % of the total area in Alaska, contained 73.45 % of the housing units. Nearly 85 % of newly added WUI housing units were found in WUI, and the growth rates in WUI housing units far exceeded that in non-WUI. As the distance from WUI increased, both human and lightning ignition density decreased but the percentage of fire perimeters increased within 30 km from WUI. Our results demonstrated the importance of tracking the dynamics of WUI and characterizing the social change behind the pattern to strengthen wildfire preparedness and facilitate community-adapted management.
Deep Semantic Segmentation for Building Detection Using Knowledge-informed Features from LiDAR Point Clouds
ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL'22). Seattle. 2022.
Airborne LiDAR point clouds record three-dimensional structures of ground surfaces with high precision, and have been widely used to identify geospatial objects, facilitating the understanding of the distribution and changing dynamics of the environment. Detection can be complicated by the complex structures of ground objects and noises in LiDAR point clouds. Related work has explored the use of deep learning techniques such as YOLO in detecting geospatial objects (e.g., building footprints) on both optical imagery and LiDAR point clouds. However, deep networks are data hungry and there are often limited labeled samples available for many geospatial object mapping tasks, making it difficult for the models to generalize to unseen test regions. This paper describes the framework used in the 11th SIGSPATIAL Cup Competition (GIS CUP 2022), which received the top-3 performance. Our framework incorporates domain knowledge to reduce the difficulty of learning and the model's reliance on large training sets. Specifically, we present knowledge-informed feature generation and filtering based on morphological characteristics to improve the generalizability of learned features. Then, we use a deep segmentation backbone (U-Net) with training- and test-time augmentation to generate preliminary candidates for building footprints. Finally, we utilize domain rules (e.g., geometric properties) to regularize and filter the detections to create the final map of building footprints. Experiment results show that the strategies can effectively improve detection results in different landscapes.
Estimating Global Downward Shortwave Radiation from VIIRS Data Using a Transfer-learning Neural Network
Remote Sensing of Environment (RSE), 2022
In recent years, machine learning (ML) has been successfully used in estimating downward shortwave radiation (DSR). To achieve global estimations, traditional ML models need sufficient ground measurements covering various atmospheric and surface conditions globally, which is difficult to accomplish. Training on the simulated data of a radiative transfer model (RTM) is a possible solution, but widely used RTMs ignore some complex cloud conditions, which introduces bias into the simulations. In this study, a neural network applied with the transfer-learning (TL) concept is introduced to utilize both radiative transfer simulations and ground measurement data, achieving global DSR estimation with only top-of-atmosphere and surface albedo at local solar noon as inputs. The proposed method estimates both instantaneous and daily DSR from Visible Infrared Imaging Radiometer Suite (VIIRS) data at 750-m resolution, and both the estimates are validated by 40 independent stations globally. The root mean-square error and relative root mean square error of instantaneous DSR validation over 25 Baseline Surface Radiation Network, seven Surface Radiation Network, and eight Greenland Climate Network stations in 2013 were 91.2 (16.1%), 106.3 (18.3%), 75.0 (24.2%) W/m2, respectively, and the daily validation achieved 30.8 (15.5%), 33.5 (17.6%), and 31.3 (14.4%) W/m2, respectively. The proposed method presents significant high accuracy over polar regions and similar performances over other areas compared with traditional ML models, physics models (e.g., look-up tables and direct estimations), and existing DSR products. The algorithm is also applied to VIIRS swath data to test its global efficacy. Instantaneous mapping captures the spatial pattern of the cloud-mask product, and daily mapping shows spatial patterns similar to the Clouds and the Earth's Radiant Energy System Synoptic TOA and surface fluxes and clouds product, but with more detail.
Further analysis indicates that model performance is less sensitive to the quantity of training data after TL has been incorporated. This study demonstrates the advantages of TL on boosting both the generality and accuracy of DSR estimation, which can potentially be applied to other variable retrievals.
Impacts of Forest Loss on Local Climate Across the Conterminous United States: Evidence from Satellite Time-series Observations
Science of The Total Environment (STTE), 2022
Forest disturbances alter land biophysics. Their impacts on local climate and land surface temperature (LST) cannot be directly measured by comparing pre- and post-disturbance observations of the same site over time (e.g., due to confounding such as background climate fluctuations); a common remedy is to compare spatially-adjacent undisturbed sites instead. This space-for-time substitution ignores the inherent biases in vegetation between two paired sites, interannual variations, and temporal dynamics of forest recovery. Besides, there is a lack of observation-based analyses at fine spatial resolutions capable of capturing spatial heterogeneity of small-scale forest disturbances. To address these limitations, here we report new satellite analyses on local climate impacts of forest loss at 30 m resolution. Our analyses combined multiple long-term satellite products (e.g., albedo and evapotranspiration [ET]) at 700 sites across major climate zones in the conterminous United States, using time-series trend and changepoint detection methods. Our method helped isolate the biophysical changes attributed to disturbances from those attributed to climate backgrounds and natural growth. On average, forest loss increased surface albedo, decreased ET, and reduced leaf area index (LAI). Net annual warming—an increase in LST—was observed after forest loss in the arid/semiarid, northern, tropical, and temperate regions, dominated by the warming from decreased ET and attenuated by the cooling from increased albedo. The magnitude of post-disturbance warming was related to precipitation; climate zones with greater precipitation showed stronger and longer warming. Reduction in leaf or LAI was larger in evergreen than deciduous forests, but the recovery in LAI did not always synchronize with those of albedo and ET. 
Overall, this study presents new evidence of biophysical effects of forest loss on LST at finer spatial resolutions; our time-series method can be further leveraged to derive local policy-relevant ecosystem climate regulation metrics or support model-based climate-biosphere studies.