OPTIMIZING FINE-TUNING METHODS FOR BASE LARGE LANGUAGE MODELS: A CASE STUDY AT SWISS GERMAN UNIVERSITY

Chatbots and large language models (LLMs) have grown rapidly in popularity in recent years. These chatbots can be customized for personal or commercial use, in this case for the customer acquisition team at Swiss German University (SGU). One way to customize an LLM is fine-tuning. This thesis aims to identify which existing base LLM performs best when fine-tuned on SGU’s dataset of question-and-answer pairs about the university. A major drawback of fine-tuning is that it requires substantial computational resources. One solution is a parameter-efficient fine-tuning method called Low-Rank Adaptation (LoRA). LoRA has several hyperparameters that must be configured for fine-tuning to be efficient. The best values are not fixed and vary with many factors, so optimization is needed to find the best LoRA configuration for each base LLM. To do this, various LoRA configurations are created and used to fine-tune each base LLM. The base LLMs used in this thesis are Llama2 7B Chat by Meta and Mistral 7B Instruct by Mistral AI. Once the optimal configuration has been found for each model, the two fine-tuned models are compared using ROUGE, BLEU, and METEOR. The fine-tuned Mistral 7B Instruct achieves a higher score on all three metrics than the fine-tuned Llama2 7B Chat.
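As a rough illustration of why LoRA reduces the cost of fine-tuning and what a search over configurations can look like, the sketch below enumerates a small grid of LoRA settings and counts the parameters one adapter adds to a single weight matrix. The search values, the helper names, and the choice of a 4096 x 4096 matrix are assumptions made for illustration and are not taken from the thesis. The keys `r`, `lora_alpha`, and `lora_dropout` only mirror commonly used naming.

```python
from itertools import product


def lora_trainable_params(d_in, d_out, rank):
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def config_grid(ranks, alphas, dropouts):
    """Enumerate every combination of the (hypothetical) search values."""
    return [
        # LoRA scales the low-rank update B @ A by alpha / r.
        {"r": r, "lora_alpha": a, "lora_dropout": p, "scaling": a / r}
        for r, a, p in product(ranks, alphas, dropouts)
    ]


HIDDEN = 4096  # assumed size of one square projection matrix in a 7B model
full = HIDDEN * HIDDEN  # parameters updated by full fine-tuning of that matrix

# Hypothetical search space, not the configurations used in the thesis.
grid = config_grid(ranks=[8, 16, 64], alphas=[16, 32], dropouts=[0.05, 0.1])

for cfg in grid:
    added = lora_trainable_params(HIDDEN, HIDDEN, cfg["r"])
    print(cfg, f"adapter params: {added} ({added / full:.2%} of the full matrix)")
```

In an actual search, each configuration in such a grid would be used for one fine-tuning run, and the resulting models would then be scored with metrics such as ROUGE, BLEU, and METEOR.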

Keywords

Fine-tuning, LLM, Llama, Mistral, LoRA, Chatbot

URI

https://dspace-repository.sgu.ac.id/handle/123456789/254

Collections

Thesis
