Multimodal Deep Learning Framework for Proactive Plant Disease Diagnosis

Authors

  • Purshottam J. Assudani, Assistant Professor, School of Computer Science and Engineering, Ramdeobaba University, India
  • V. Rama Krishna, Assistant Professor, School of Engineering, Anurag University, India

DOI:

https://doi.org/10.63278/1419

Keywords:

RGB Imaging, Hyperspectral Imaging, Thermal Imaging, Convolutional Neural Network (CNN), Vision Transformer (ViT), Real-Time Inference, Knowledge Distillation, Model Quantization.

Abstract

Early and accurate diagnosis of plant diseases is important for sustainable agriculture and for minimizing production losses. Traditional approaches to plant disease detection, which rely on manual inspection and single-modality imaging, are cumbersome, error-prone, and fail to capture subtle disease characteristics. Recent advances in deep learning point toward automated plant disease diagnosis; however, most current models still suffer from limited generalization, high computational cost, and difficulty of real-time deployment. To address these difficulties, this article introduces a novel multimodal deep learning framework that combines RGB, hyperspectral, and thermal imaging to improve the precision and efficiency of plant disease detection. The framework uses an EfficientNet-based CNN for spatial feature extraction from RGB images, a 1D-CNN for learning spectral features from hyperspectral data, and Vision Transformers (ViT) for learning long-range contextual dependencies. The per-sensor features are fused by weighted summation, which dynamically adjusts the contribution of each modality to obtain robust and accurate predictions. To achieve real-time performance, the model is optimized via quantization, knowledge distillation, and model pruning, which substantially decreases its computational load. The final optimized model is deployed on an NVIDIA Jetson Nano to allow low-latency inference in support of high-precision agriculture. Experimental results show that the proposed multimodal framework achieves 97.8% accuracy, 96.5% precision, 95.7% recall, and an F1-score of 96.1%, all exceeding those of conventional deep learning models such as ResNet-50, VGG-16, EfficientNet, and Vision Transformers (ViT). Moreover, the framework performs inference in 20 milliseconds, which makes it suitable for real-time applications.
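
As a quick consistency check on the metrics reported above, and assuming the reported F score is the standard F1 (the harmonic mean of precision and recall), the stated precision and recall reproduce the stated value. If the paper's figures are averages over several classes, the identity need not hold exactly, so this is only a sanity check:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.965 \times 0.957}{0.965 + 0.957}
    = \frac{1.84701}{1.922}
    \approx 0.961
```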
Successfully integrating multimodal data fusion with model optimization not only increases classification performance but also makes the solution practical and deployable in real-world agricultural environments. The proposed framework offers a promising solution for smart farming, enabling early disease detection and effective crop management.
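
The weighted-summation fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the per-modality weights are computed, so this example assumes one scalar score per modality normalized with a softmax, and assumes that the three feature vectors have already been projected to a common dimension. The names `fuse_modalities`, `features`, and `scores` are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def fuse_modalities(features, scores):
    """Fuse per-modality feature vectors by weighted summation.

    features: dict mapping modality name -> 1-D feature vector (same length for all)
    scores:   dict mapping modality name -> scalar score (assumed learnable)
    Returns the fused vector and the normalized weight of each modality.
    """
    names = sorted(features)
    stacked = np.stack([np.asarray(features[n], dtype=float) for n in names])  # (M, d)
    # Assumption: weights come from a softmax so they are positive and sum to 1.
    weights = softmax(np.array([scores[n] for n in names], dtype=float))      # (M,)
    fused = weights @ stacked                                                 # (d,)
    return fused, dict(zip(names, weights))


# Toy usage with made-up 4-dimensional features for the three modalities.
features = {
    "rgb": np.array([1.0, 0.0, 2.0, 1.0]),
    "hyperspectral": np.array([0.0, 1.0, 1.0, 3.0]),
    "thermal": np.array([2.0, 2.0, 0.0, 0.0]),
}
scores = {"rgb": 1.0, "hyperspectral": 0.5, "thermal": -0.5}
fused, weights = fuse_modalities(features, scores)
print(weights)
print(fused)
```

In a trained model the scores would typically be learned parameters or the output of a small gating network conditioned on the input, which is one way the contribution of each modality could be adjusted dynamically; that design choice is an assumption of this sketch.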

Published

2025-04-16

How to Cite

Purshottam J. Assudani, and V. Rama Krishna. 2025. “Multimodal Deep Learning Framework for Proactive Plant Disease Diagnosis”. Metallurgical and Materials Engineering 31 (4):153-61. https://doi.org/10.63278/1419.

Section

Research