Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/150706
Full metadata record
DC Field | Value | Language
dc.contributor.author | Sánchez Quijada, Marina | -
dc.date.accessioned | 2024-07-10T17:36:46Z | -
dc.date.available | 2024-07-10T17:36:46Z | -
dc.date.issued | 2024-06 | -
dc.identifier.uri | http://hdl.handle.net/10609/150706 | -
dc.description.abstract | In recent years, large vision-language models (LVLMs) have gained considerable attention due to their accessibility and impressive performance on a range of language and vision tasks. Consequently, their applications in medical imaging are being studied and already show great potential in clinical settings. However, very few studies have evaluated the potential of LVLMs for disease diagnosis, especially on microscopy images. In this work, we explore for the first time the capabilities of three of the most advanced LVLMs (GPT-4, Claude3, and LLaVa) in the analysis and classification of peripheral blood cells. To do so, we build multiple prompts based on different prompting techniques, including few-shot learning and chain of thought (CoT), to study and improve the performance of these LVLMs on blood cell image analysis. We also explore how the assistant and system roles affect model behaviour and performance. Moreover, we perform a comprehensive comparison of their accuracy rates and create a web application for white blood cell classification. Our experiments show that the best-performing combination of method and LVLM is GPT-4o with a two-shot learning strategy and the addition of the assistant role. When testing this approach on 100 images of leukocytes, we attained an accuracy of 78%. Although this performance is not reliable enough and LVLMs should not be used as diagnostic tools, we believe that, given the rapid advancement of large vision-language models, LVLMs could become a great asset in the analysis of pathology images, working as an assistant for quick blood cell description and classification. | en
dc.format.mimetype | application/pdf | ca
dc.language.iso | eng | ca
dc.publisher | Universitat Oberta de Catalunya (UOC) | ca
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | -
dc.subject | Machine learning | en
dc.subject | Medical imaging analysis | en
dc.subject | Large visual-language models | en
dc.title | Exploring large vision-language models with prompt engineering for peripheral blood cell image analysis and classification | ca
dc.type | info:eu-repo/semantics/masterThesis | ca
dc.contributor.tutor | Alférez Baquero, Edwin Santiago | -
Appears in Collections: Trabajos finales de carrera, trabajos de investigación, etc.

Files in This Item:
File | Description | Size | Format
TFM_Memoria_MarinaSanchez.pdf | | 2.2 MB | Adobe PDF

Items in repository are protected by copyright, with all rights reserved, unless otherwise indicated.