Chen et al. (2026): Integrating transformer-based learning and Sentinel-2 bare soil composites for soil organic carbon mapping in the black soil region of Northeast China
Na Chen, Zhikang Wei, Xuancheng Jin, Nan Lin, Fan Yang, Ling Zhao and Song Wu, IN: Scientific Reports, https://doi.org/10.1038/s41598-025-33682-4
Accurate assessment of soil organic carbon (SOC) is essential for sustainable cropland management and carbon sequestration monitoring. However, high-resolution SOC mapping remains challenging due to two persistent limitations: (1) the difficulty of extracting true bare-soil reflectance, especially when single-date imagery is used and spectral signals remain influenced by vegetation, residue, and soil moisture; and (2) reliance on models that require large training datasets and may underperform in typical small-sample soil survey settings. To address these challenges, the authors developed an approach that integrates multi-temporal Sentinel-2 bare-soil composites with a transformer-based foundation model, the Tabular Prior-data Fitted Network (TabPFN), for SOC prediction in the black soil region of Northeast China. Bare-soil pixels were extracted using a Normalized Difference Vegetation Index (NDVI) threshold range of 0.1–0.4, and two compositing strategies, the 50th percentile (P50) and the 90th percentile (P90), were compared. The authors systematically evaluated three advanced algorithms: TabPFN, a convolutional neural network (CNN), and Extreme Gradient Boosting (XGBoost).
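The NDVI filtering and percentile compositing described above can be illustrated with a minimal per-pixel sketch. It is not the authors' implementation. The band keys `red` and `nir` (standing in for Sentinel-2 B4 and B8), the inclusive treatment of the 0.1–0.4 bounds, the linear-interpolation percentile, and the function names are all assumptions made for illustration; the abstract does not specify these details.

```python
from typing import Dict, List, Optional, Sequence

# NDVI range quoted in the abstract; treating both bounds as inclusive is an assumption.
NDVI_MIN, NDVI_MAX = 0.1, 0.4


def ndvi(red: float, nir: float) -> Optional[float]:
    """Return (NIR - red) / (NIR + red), or None when the denominator is zero."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    if not values:
        raise ValueError("percentile of an empty sequence is undefined")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def bare_soil_composite(
    observations: List[Dict[str, float]], q: float
) -> Optional[Dict[str, float]]:
    """Composite one pixel's time series into a single bare-soil spectrum.

    observations: one dict per acquisition date mapping band name to reflectance;
    each dict must contain the (hypothetical) keys "red" and "nir".
    Returns the per-band q-th percentile over dates whose NDVI lies in the
    bare-soil range, or None if no date qualifies.
    """
    kept = []
    for obs in observations:
        value = ndvi(obs["red"], obs["nir"])
        if value is not None and NDVI_MIN <= value <= NDVI_MAX:
            kept.append(obs)
    if not kept:
        return None
    return {band: percentile([o[band] for o in kept], q) for band in kept[0]}
```

Calling `bare_soil_composite(series, 50)` and `bare_soil_composite(series, 90)` on the same pixel series yields the P50 and P90 variants. A real workflow would apply the same logic across whole raster stacks with an array library rather than per-pixel Python loops.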