Visual-language transformer-based tomato leaf disease detection for portable greenhouse monitoring device

dc.contributor.author: Kaur, Manveen
dc.contributor.author: Singh, Rajmeet
dc.contributor.author: Alirezaee, Shahpour
dc.contributor.author: Hussain, Irfan
dc.date.accessioned: 2025-11-11T21:04:53Z
dc.date.issued: 2025-10-28
dc.description.abstract: Tomato leaf diseases pose a significant threat to global food security, necessitating accurate and efficient detection methods. This paper introduces the Tomato Leaf Disease Visual Language Model (TLDVLM), a novel approach based on the BLIP-2 architecture enhanced with Low-Rank Adaptation (LoRA), for precise classification of 10 distinct tomato leaf diseases. Our methodology integrates a sophisticated image preprocessing pipeline, utilizing GroundingDINO for robust leaf detection and SAM-2 for pixel-level segmentation, ensuring that the model focuses solely on relevant plant tissue. The TLDVLM leverages the powerful multimodal understanding of BLIP-2, with LoRA applied to its Q-Former module, enabling parameter-efficient fine-tuning without compromising performance. Comparative experiments demonstrate that the TLDVLM significantly outperforms baseline models, including CLIP-LoRA and ConvNeXT-tiny, achieving an accuracy of 97.27%, a precision of 0.9587, a recall of 0.9789, and an F1-score of 0.9681. Beyond classification, the finetuned TLDVLM checkpoints are integrated into a practical application for new image inference. This application displays the raw and segmented images, the predicted disease, and offers functionalities to fetch comprehensive information on disease causes and remedies using external APIs (e.g., OpenAI), with an option to download a PDF summary for offline access on a portable device. This research highlights the potential of LoRA-adapted Vision-Language Models in developing highly accurate, efficient, and user-friendly agricultural diagnostic tools.
dc.identifier.uri: https://hdl.handle.net/20.500.14776/19136
dc.language.iso: en_CA
dc.publisher: BioMed Central Ltd
dc.rights.uri: https://creativecommons.org/licenses/by-nc/4.0/
dc.subject: Tomato leaf
dc.subject: Disease
dc.subject: BLIP-2
dc.subject: LoRA
dc.subject: VLM
dc.subject: LLM
dc.title: Visual-language transformer-based tomato leaf disease detection for portable greenhouse monitoring device
oaire.citation.issue: 1
oaire.citation.title: Plant Methods
oaire.citation.volume: 21
person.affiliation.name: Department of Mechanical Engineering, University of Windsor
person.affiliation.name: Department of Mechanical Engineering, Khalifa University, Abu Dhabi, United Arab Emirates

Files

Original bundle

Name: s13007-025-01456-8.pdf
Size: 6.42 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.65 KB
Format: Item-specific license agreed to upon submission