OPTIMIZING FINE-TUNING METHODS FOR BASE LARGE LANGUAGE MODELS: A CASE STUDY AT SWISS GERMAN UNIVERSITY
Publisher
Swiss German University
Abstract
Chatbots and Large Language Models (LLMs) have grown rapidly in popularity. These chatbots can be customized for personal or commercial use; in this case, for the customer acquisition team at Swiss German University (SGU). One way to customize an LLM is fine-tuning. This thesis aims to identify which existing base LLM is most suitable when fine-tuned on SGU's dataset of question-and-answer pairs about SGU. A known problem with fine-tuning is that it requires substantial computational resources. One solution is a parameter-efficient fine-tuning method called Low-Rank Adaptation (LoRA). LoRA has several parameters that must be configured for fine-tuning to be efficient. The best settings are not fixed and vary with many factors, so LoRA optimization is needed to find the optimal configuration for each base LLM. To do this, a range of LoRA configurations is created and each is used to fine-tune the base LLMs. The base LLMs used in this research are Llama2 7B Chat by Meta and Mistral 7B Instruct by Mistral AI. Once the optimal configuration has been found for each model, the two fine-tuned models are compared using ROUGE, BLEU, and METEOR. The fine-tuned Mistral 7B Instruct achieves a higher score than the fine-tuned Llama2 7B Chat on all three metrics.
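As a rough illustration of what a LoRA configuration sweep involves, the sketch below enumerates a small grid of rank and alpha values and reports, for each, the standard LoRA scaling factor (alpha divided by rank) and the number of trainable parameters one adapter adds to a single weight matrix. The values in `RANKS`, `ALPHAS`, and `HIDDEN`, and the helper names, are assumptions chosen for illustration; the abstract does not state the configurations actually tested in the thesis.

```python
from itertools import product


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter.

    LoRA learns two low-rank matrices, A (rank x d_in) and B (d_out x rank),
    so the adapter adds rank * (d_in + d_out) trainable parameters.
    """
    return rank * (d_in + d_out)


# Hypothetical search space, NOT the values used in the thesis.
RANKS = [8, 16, 32]
ALPHAS = [16, 32]
HIDDEN = 4096  # assumed size of a square projection matrix, for illustration


def build_grid(ranks, alphas, hidden):
    """Return one dictionary per (rank, alpha) combination."""
    grid = []
    for rank, alpha in product(ranks, alphas):
        grid.append({
            "rank": rank,
            "alpha": alpha,
            # The low-rank update is scaled by alpha / rank.
            "scaling": alpha / rank,
            "params_per_matrix": lora_trainable_params(hidden, hidden, rank),
        })
    return grid


grid = build_grid(RANKS, ALPHAS, HIDDEN)
full_matrix_params = HIDDEN * HIDDEN

for cfg in grid:
    share = cfg["params_per_matrix"] / full_matrix_params
    print(cfg, f"share of full matrix: {share:.4%}")
```

In a real sweep, each entry in the grid would be used for one fine-tuning run, and the resulting models would then be scored with metrics such as ROUGE, BLEU, and METEOR to pick the best configuration per base model.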