Wheat Yield Prediction Using a Hybrid CNN-LSTM Deep Learning Framework with Tabular Agricultural Data

Authors

  • Alishba Rasool, Department of Physics, University of Agriculture, Faisalabad, Pakistan. Email: alishbarasool59@gmail.com
  • Rimsha Shareef, Department of Physics, University of Agriculture, Faisalabad, Pakistan. Email: rshareef938@gmail.com
  • Saeed Rasheed, Department of Computer Science, University of Agriculture, Faisalabad, Pakistan. Email: saeed.rasheed0211@gmail.com

DOI:

https://doi.org/10.63163/jpehss.v4i2.1452

Keywords:

CNN-LSTM, Crop Yield Prediction, Deep Learning, Tabular Data, Wheat, Food Security, Uncertainty Quantification, Pakistan.

Abstract

Wheat yield prediction is critical for ensuring food security and enabling timely agricultural policy decisions in Pakistan, where wheat is the primary staple crop. This study proposes a hybrid Convolutional Neural Network–Long Short-Term Memory (CNN-LSTM) deep learning framework trained on open-access tabular agricultural data from Kaggle. The data comprise climatic and vegetation-derived features (average temperature, rainfall, NDVI, soil moisture, solar radiation, and wind speed) drawn from county-level historical records spanning 1990–2020 across the United States Great Plains. The CNN component extracts local non-linear feature interactions across the multivariate input vector at each monthly timestep, while the LSTM component models temporal crop growth dynamics across a 12-month growing season window. The proposed CNN-LSTM achieved R² = 0.446 and RMSE = 222.0 kg ha⁻¹ on the held-out test set, performing comparably to the XGBoost (R² = 0.477) and standalone LSTM (R² = 0.447) baselines. Prediction uncertainty was quantified via Monte Carlo Dropout, yielding a mean uncertainty of 46.4 kg ha⁻¹. Practically, these results show that a deep learning pipeline for crop yield prediction can be built, validated, and benchmarked end-to-end using freely available data alone, without requiring any proprietary imagery or in-house infrastructure. This establishes a reproducible foundation for agricultural planners and researchers in data-scarce regions such as Pakistan. By substituting locally collected satellite imagery and district-level yield statistics, they can build on it directly to support more timely, risk-aware decisions on buffer-stock planning, import scheduling, and early-season farm advisories than current deterministic forecasting tools allow.
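
The evaluation quantities named in the abstract (R², RMSE, and a Monte Carlo Dropout uncertainty) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the single linear layer with input dropout stands in for the CNN-LSTM, and the function names, `weights`, `bias`, `drop_prob`, and `n_passes` are hypothetical values chosen only for illustration. The idea shown is that dropout stays active at prediction time, the same input is passed through the model many times, and the spread of the resulting predictions is read as the uncertainty.

```python
import math
import random
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the target (e.g. kg/ha)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def dropout_forward(features, weights, bias, drop_prob, rng):
    """One stochastic pass through a toy linear layer (assumed stand-in model).

    Each input is kept with probability 1 - drop_prob and rescaled by
    1 / (1 - drop_prob), i.e. inverted dropout.
    """
    keep = 1.0 - drop_prob
    total = bias
    for x, w in zip(features, weights):
        if rng.random() < keep:
            total += w * x / keep
    return total


def mc_dropout_predict(features, weights, bias, drop_prob=0.2, n_passes=100, seed=0):
    """Repeat stochastic passes; return (mean prediction, sample std deviation)."""
    rng = random.Random(seed)
    samples = [
        dropout_forward(features, weights, bias, drop_prob, rng)
        for _ in range(n_passes)
    ]
    return statistics.fmean(samples), statistics.stdev(samples)
```

Under these assumptions, a "mean uncertainty" like the one reported would correspond to averaging the per-sample standard deviation returned by `mc_dropout_predict` over all test samples; the abstract does not state exactly how the authors aggregated it.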

Author Biography

  • Rimsha Shareef, Department of Physics, University of Agriculture, Faisalabad, Pakistan. Email: rshareef938@gmail.com

    Department of Computer Science

Published

2026-06-22

Section

Computer Science and Information Technology