Explainable Ensemble Learning for IoT Intrusion Detection: Multi-device Evaluation using SHAP-based Interpretability and Class Balancing
DOI:
https://doi.org/10.63163/jpehss.v4i2.1433
Abstract
The existing literature on IoT intrusion detection (ID) has two common drawbacks: most models are tested on a single device type, and they offer little insight into the reasons for their decisions. This paper addresses both issues by running interpretable ensemble-learning experiments on seven types of IoT devices, ranging from consumer appliances to industrial sensors and environmental monitors, and by studying the behaviour of the resulting models in detail. Three challenges are identified: first, extreme class imbalance, in which attacks make up a very small share of the samples; second, limited interpretability, which restricts how much trust security teams can place in detection results; and third, a lack of evidence that detection generalizes across device types. Gradient-boosting ensembles (LightGBM and XGBoost) were used together with class balancing techniques such as the Synthetic Minority Over-sampling Technique (SMOTE) and with SHapley Additive exPlanations (SHAP) for interpretability. On 197,811 traffic samples, SMOTE raised ROC-AUC from 0.88–0.94 to 0.94–1.00, while inference latency increased from 2.3–3.1 ms to 3.2–3.4 ms. The SHAP analysis identified packet-size statistics, inter-arrival timing, and protocol attributes as the most important features, with these three feature groups capturing about 68–73% of the predictive signal. Compared with a Long Short-Term Memory (LSTM) baseline (ROC-AUC 0.86–0.91, latency 47 ms), the ensemble models achieved higher recall and significantly better interpretability while requiring only a sub-50 MB memory footprint for edge deployment.
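To make the class-balancing step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic attack sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. It is an illustration only, not the implementation used in the paper. The function name `smote`, its parameters, and the two-feature toy data (mean packet size, mean inter-arrival time) are all hypothetical.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between minority samples.

    minority: list of equal-length feature vectors (lists of floats).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (assumed values): rows are [mean packet size, mean inter-arrival time].
benign = [[500.0 + 10 * i, 0.20 + 0.01 * i] for i in range(20)]
attack = [[60.0, 0.010], [64.0, 0.012], [70.0, 0.011], [58.0, 0.009]]

# Oversample the attack class until both classes have the same size.
new_attack = smote(attack, n_new=len(benign) - len(attack), k=3)
balanced_attack = attack + new_attack
print(len(benign), len(balanced_attack))
```

In practice, oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as ROC-AUC are computed on untouched, still-imbalanced test data.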
License
Copyright (c) 2026 Muhammad Irfan (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.