Efficient Large Language Models

Efficient LLM Project

This project focuses on incorporating memory-efficient LoRA (Low-Rank Adaptation) layers into the attention and feed-forward layers of DistilBERT (distilbert-base-uncased) from HuggingFace.
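
Below is a minimal sketch, in PyTorch, of what such a low-rank adapter layer can look like. The class name LoRALinear and the default rank/alpha values are illustrative assumptions for this README, not the project's actual implementation.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Wrap a frozen nn.Linear with a trainable low-rank update (illustrative sketch)."""

        def __init__(self, base_linear: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base_linear
            # Freeze the pretrained weights; only the low-rank factors below are trained.
            self.base.weight.requires_grad_(False)
            if self.base.bias is not None:
                self.base.bias.requires_grad_(False)
            in_features, out_features = base_linear.in_features, base_linear.out_features
            # Standard LoRA initialization: A is small random, B starts at zero.
            self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
            self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
            self.scaling = alpha / rank

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # y = base(x) + scaling * B A x
            return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)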

Key Features:

  • Memory Efficiency: Implemented LoRA layers to reduce the memory footprint during fine-tuning
  • Transformer Architecture: Applied LoRA to the attention and feed-forward layers of each transformer block (see the injection sketch after this list)
  • HuggingFace Integration: Built on top of the distilbert-base-uncased model
  • Performance Optimization: Maintained model performance while reducing computational requirements
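
The sketch below shows one way the adapters could be injected into the attention and feed-forward projections of every DistilBERT block (q_lin, k_lin, v_lin, out_lin, lin1, lin2 in the HuggingFace implementation), reusing the LoRALinear class sketched above. The rank value and the sequence-classification head are assumptions for illustration, not details taken from this project.

    from transformers import AutoModelForSequenceClassification

    # Load the pretrained backbone (assumed task: binary sequence classification).
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2
    )

    # Freeze all pretrained parameters; LoRALinear adds its own trainable factors.
    for param in model.parameters():
        param.requires_grad_(False)

    # Wrap the attention and feed-forward linear layers in each transformer block.
    for block in model.distilbert.transformer.layer:
        attn, ffn = block.attention, block.ffn
        attn.q_lin = LoRALinear(attn.q_lin)
        attn.k_lin = LoRALinear(attn.k_lin)
        attn.v_lin = LoRALinear(attn.v_lin)
        attn.out_lin = LoRALinear(attn.out_lin)
        ffn.lin1 = LoRALinear(ffn.lin1)
        ffn.lin2 = LoRALinear(ffn.lin2)

    # Keep the task head trainable (a common choice alongside LoRA).
    for param in model.classifier.parameters():
        param.requires_grad_(True)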

Technologies Used:

  • Python
  • PyTorch
  • HuggingFace Transformers
  • LoRA (Low-Rank Adaptation)

This work contributes to making large language models more accessible by reducing the memory required for fine-tuning: only the small low-rank matrices are trained while the pretrained weights remain frozen, and the learned adapters can be merged back into the base weights so that inference adds no extra overhead.
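
As a rough sanity check of the savings, a comparison of trainable versus total parameters after the wrapping above might look like this (illustrative, using the hypothetical setup from the earlier sketches):

    # Count trainable vs. total parameters after LoRA wrapping.
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable: {trainable:,} / total: {total:,} "
          f"({100 * trainable / total:.2f}% of parameters updated)")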