2025/12/5
Fatemeh Daneshfar

Academic rank: Assistant Professor
Education: PhD.
Faculty: Faculty of Engineering
E-mail: f.daneshfar [at] uok.ac.ir

Research

Title
Vision–Language Models: Applications in Image Retrieval, Fine-Grained Classification, and Parameter-Efficient Fine-Tuning (PEFT)
Type
Workshop
Keywords
Vision–Language Models; Image Retrieval
Year
2025
Researchers
Fatemeh Daneshfar

Abstract

Vision–Language Models (VLMs) have emerged as a powerful paradigm that bridges computer vision and natural language processing, enabling machines to jointly understand images and text. This seminar explores three key application areas of VLMs: image retrieval, where multimodal embeddings allow for efficient cross-modal search; fine-grained classification, where VLMs capture subtle distinctions across categories by leveraging both visual and textual cues; and parameter-efficient fine-tuning (PEFT), which provides practical strategies for adapting large-scale models to specific tasks without the need for full retraining. By examining recent advances and case studies, we will highlight the strengths, limitations, and future directions of VLMs in real-world applications.
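The cross-modal search idea mentioned in the abstract can be sketched in a few lines: a VLM's image and text encoders map both modalities into a shared embedding space, and retrieval reduces to ranking images by cosine similarity against a text query's embedding. The vectors and file names below are toy stand-ins for real encoder outputs (e.g. from a CLIP-style model), not actual VLM embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical image embeddings: in a real system these come from the
# VLM's image encoder, pre-computed and indexed once per image.
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}

def retrieve(text_embedding, top_k=1):
    """Rank indexed images by similarity to a text query embedding."""
    ranked = sorted(image_embeddings.items(),
                    key=lambda kv: cosine(text_embedding, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Hypothetical embedding of the query "a photo of a cat".
query = [0.85, 0.15, 0.05]
print(retrieve(query))  # → ['cat.jpg']
```

Because both modalities live in one space, the same index serves text-to-image and image-to-image search; at scale the linear scan would be replaced by an approximate nearest-neighbour index.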