Efficient Large Language Models
Efficient LLM Project
This project focuses on incorporating memory-efficient LoRA (Low-Rank Adaptation) layers into the attention and feed-forward layers of DistilBERT (distilbert-base-uncased) from HuggingFace.
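The core idea is to freeze the pretrained weights and train only small low-rank update matrices. Below is a minimal sketch of such a layer in PyTorch; the class name, rank, and alpha values are illustrative assumptions, not the project's actual API.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update (B @ A)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights

        in_features, out_features = base.in_features, base.out_features
        # Low-rank factors: A projects down to `rank`, B projects back up.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen pretrained path plus the scaled low-rank update.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

Only `lora_A` and `lora_B` receive gradients, so optimizer state and gradient memory scale with the rank rather than with the full weight matrices.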
Key Features:
- Memory Efficiency: Implemented LoRA layers to reduce the memory footprint during fine-tuning
- Transformer Architecture: Applied LoRA adapters to both the attention and feed-forward layers (see the sketch after this list)
- HuggingFace Integration: Built on top of the DistilBERT (distilbert-base-uncased) model
- Performance Optimization: Maintained model performance while reducing computational requirements
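One possible way to attach LoRA adapters to DistilBERT's attention and feed-forward projections is via the HuggingFace `peft` library; the module names, rank, and task type below are assumptions for illustration rather than the project's exact configuration.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

lora_config = LoraConfig(
    r=8,              # low-rank dimension
    lora_alpha=16,    # scaling factor
    lora_dropout=0.1,
    # DistilBERT attention projections (q_lin, v_lin) and FFN layers (lin1, lin2).
    target_modules=["q_lin", "v_lin", "lin1", "lin2"],
    task_type="SEQ_CLS",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA parameters remain trainable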
Technologies Used:
- Python
- PyTorch
- HuggingFace Transformers
- LoRA (Low-Rank Adaptation)
This work contributes to making large language models more accessible by reducing their memory requirements during training and inference.