Software Principal Engineer | Dell Technologies
From Generic to Specialized: Fine-Tuning Large Language Models
With the emergence of conversational AI tools such as ChatGPT and Google Bard, the world has been exposed to new possibilities enabled by Large Language Models (LLMs). A large language model is a type of artificial intelligence algorithm that applies deep learning techniques to massive data sets, backed by large-scale computing infrastructure. However, training LLMs is a complex task that requires substantial computational resources and infrastructure. Fine-tuning LLMs on domain-specific data has emerged as a crucial technique for improving their performance in specialized tasks and industries. In this talk, we give an overview of the basic concepts of LLMs and their pre-training process, highlighting the transfer learning paradigm that forms the basis of fine-tuning.
We will look into the preparatory steps required for successful fine-tuning, including dataset acquisition, cleaning, and structuring. Furthermore, we will discuss the workings of the fine-tuning process, which involves adapting the pre-trained LLM’s parameters to domain-specific language patterns, contextual nuances, and task requirements. Architectural considerations, such as selecting an appropriate model size, are explored in relation to the computational resources available and the complexity of the target task.
We evaluate different fine-tuning approaches, ranging from traditional fine-tuning to more advanced techniques like adapter-based architectures. The talk also covers techniques to prevent overfitting, including data augmentation, regularization, and transfer learning from related domains. Lastly, we will address the ethical aspects of fine-tuning LLMs, highlighting potential challenges related to bias, fairness, and unintended consequences. The audience will gain an overall understanding of LLMs and learn how to apply them to their own data domains.
TRACK: AI & ML
19 Oct 2023 | Time: 04:45-05:15 PM
Binitha MT is a Software Principal Engineer at Dell EMC with a total of 6 years of experience in the Server Administration domain. She is committed to delivering high-quality products, providing training, and mentoring peers. She is passionate about data mining and search engines. Besides this, she is a core member of Women in Technology and an NGO activist. She is a happy-go-lucky person and a great team player.