IMPROVING DOCUMENT UNDERSTANDING OF MEDICAL INVOICES USING MULTIMODAL APPROACH: A COMPARATIVE STUDY
| dc.contributor.author | Mahendratama, Abyakta Nadhif | |
| dc.contributor.author | Ipung, Heru Purnomo | |
| dc.contributor.author | Taqwim, Andi Darma | |
| dc.date.accessioned | 2026-05-21T08:44:24Z | |
| dc.date.issued | 2025-08-14 | |
| dc.description.abstract | This thesis presents a comprehensive comparative study on document understanding of medical invoices using both OCR and advanced multimodal models. A stratified dataset of 186 real-world Indonesian medical invoices, encompassing diverse forms of visual degradation, was manually annotated at the field level to ensure robust and representative evaluation. The research investigates the effectiveness of OCR systems, advanced multimodal models, and the combination of both systems in extracting structured information from preprocessed invoice images. Standard image preprocessing techniques were applied to all samples prior to evaluation. Performance was quantitatively assessed using established metrics, including precision, recall, and F1-score. The results demonstrate that multimodal models consistently outperform OCR systems on visually degraded invoices. This work offers practical insights for deploying robust automated document understanding solutions in claims processing, highlighting the advantages of integrating preprocessing with modern multimodal models for real-world, domain-specific applications. | |
| dc.identifier.uri | https://dspace-repository.sgu.ac.id/handle/123456789/201 | |
| dc.language.iso | en | |
| dc.publisher | Swiss German University | |
| dc.subject | Document understanding | |
| dc.subject | Optical Character Recognition | |
| dc.subject | multimodal models | |
| dc.subject | medical invoices | |
| dc.subject | Gemini | |
| dc.subject | structured data extraction | |
| dc.subject | preprocessing | |
| dc.subject | Mistral OCR | |
| dc.subject | Pixtral | |
| dc.subject | Qwen 2.5-VL | |
| dc.subject | PaddleOCR | |
| dc.title | IMPROVING DOCUMENT UNDERSTANDING OF MEDICAL INVOICES USING MULTIMODAL APPROACH: A COMPARATIVE STUDY | |
| dc.type | Thesis | |