Senior Software Engineer | Dell Technologies
From Generic to Specialized: Fine-Tuning Large Language Models
With the emergence of conversational AI tools like ChatGPT and Google Bard, the world has been exposed to the new possibilities opened up by Large Language Models (LLMs). A large language model is a type of artificial intelligence model that applies deep learning techniques to massive data sets. Training an LLM from scratch is a complex task that requires substantial computational resources and infrastructure. Fine-tuning pre-trained LLMs on domain-specific data has therefore emerged as a crucial technique to enhance their performance in specialized tasks and industries.
In this talk, we give an overview of the basic concepts of LLMs and their pre-training process, highlighting the transfer learning paradigm that forms the basis of fine-tuning. We will look into the preparatory steps required for successful fine-tuning, including dataset acquisition, cleaning, and structuring. Furthermore, we will discuss the workings of the fine-tuning process, which involves adapting the pre-trained LLM’s parameters to domain-specific language patterns, contextual nuances, and task requirements.
Architectural considerations, such as selecting an appropriate model size, are explored in relation to the available computational resources and the complexity of the target task. We evaluate different fine-tuning approaches, ranging from traditional fine-tuning to more advanced techniques like adapter-based architectures. We also cover techniques to prevent overfitting, including data augmentation, regularization, and transfer learning from related domains. Lastly, we will address the ethical scope of fine-tuning LLMs, highlighting potential challenges related to bias, fairness, and unintended consequences. The audience will gain an overall understanding of LLMs and learn how to apply them to their own data domains.
TRACK: AI & ML
19 Oct 2023 | Time: 04:45-05:15 PM
Subhankar is a Senior Software Engineer at Dell EMC with a total of 8 years of experience in IT. He has worked in various domains, including system administration in the server domain, CRM (Salesforce), and test automation. He is passionate about innovation and the application of AI to improve existing solutions or implement new ones.