law-LLM
AdaptLLM

Introduction
The AdaptLLM repository contains domain-specific language models developed from LLaMA-1-7B. The project explores continual pre-training of large language models (LLMs) on domain-specific corpora, enriching them with domain knowledge while preserving their prompting performance. It introduces a method that transforms large-scale pre-training corpora into reading comprehension texts, which improves performance on prompting tasks. AdaptLLM models cover biomedicine, finance, and law, and the 7B version is competitive with significantly larger domain-specific models.
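The reading comprehension transformation is the central idea: each raw domain text is followed by comprehension tasks about it, so continual pre-training also exercises question answering. The sketch below is only a schematic stand-in for that format, not the project's actual mining and templating pipeline; the helper name and the hard-coded Q&A pair are illustrative.

```python
# Schematic illustration of the reading-comprehension format: a raw domain text
# followed by comprehension tasks about it. This is NOT the project's actual
# mining pipeline; the helper and the hard-coded Q&A pair are illustrative.

def to_reading_comprehension(passage, qa_pairs):
    """Render one passage plus question-answer tasks as a single training text."""
    parts = [passage.strip(), ""]
    for question, answer in qa_pairs:
        parts.append(f"Question: {question}")
        parts.append(f"Answer: {answer}")
        parts.append("")
    return "\n".join(parts).strip()

example = to_reading_comprehension(
    "A valid contract generally requires offer, acceptance, and consideration.",
    [("What elements does a valid contract generally require?",
      "Offer, acceptance, and consideration.")],
)
print(example)
```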
Architecture
AdaptLLM builds on the LLaMA-1 architecture and offers models tailored to specific domains: biomedicine, finance, and law. The project also scales to larger backbones such as LLaMA-1-13B and supports LLaMA-2-Chat configurations. More recently, the team has released a LLaMA-3-8B variant that leverages a context-based instruction synthesizer to enhance performance. The architecture supports domain-specific tasks and adapts to different data formats, including the multi-turn conversation format that chat models require.
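For the chat configurations, prompts must be rendered in the underlying chat model's multi-turn conversation format. A minimal sketch using the transformers chat-template API follows; the model ID AdaptLLM/law-chat and the presence of a bundled chat template are assumptions here, so consult the card of the chat model you actually use.

```python
# Sketch: rendering a multi-turn conversation into the prompt layout a chat
# model expects. The model ID "AdaptLLM/law-chat" and its bundled chat template
# are assumptions; substitute the chat model you are actually using.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AdaptLLM/law-chat")

messages = [
    {"role": "user", "content": "What is consideration in contract law?"},
    {"role": "assistant", "content": "Something of value exchanged by both parties."},
    {"role": "user", "content": "Can past consideration support a new promise?"},
]

# apply_chat_template emits the turns in the model's expected format
# (e.g., the [INST] ... [/INST] layout for LLaMA-2-Chat derivatives).
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```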
Training
The training process involves continual pre-training on domain-specific corpora, enriching the models with relevant knowledge. AdaptLLM applies the reading comprehension transformation to these corpora to improve prompting performance across tasks. Pre-templatized testing splits allow the prompting results to be reproduced, ensuring consistent evaluation across models. Instructions and datasets are provided for the biomedicine, finance, and law domains, supporting thorough model evaluation.
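A minimal sketch of pulling one pre-templatized testing split for inspection with the datasets library; the dataset ID AdaptLLM/law-tasks and the subset name SCOTUS are assumptions based on the AdaptLLM Hugging Face organization, so verify them against the repository.

```python
# Sketch: loading a pre-templatized testing split for inspection. The dataset ID
# "AdaptLLM/law-tasks" and the "SCOTUS" subset are assumptions; verify the exact
# names against the AdaptLLM Hugging Face organization.
from datasets import load_dataset

eval_split = load_dataset("AdaptLLM/law-tasks", "SCOTUS", split="test")
print(eval_split[0])  # each record carries an already-templatized prompt
```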
Guide: Running Locally
- Set Up Dependencies
  - Clone the repository:
    git clone https://github.com/microsoft/LMOps
  - Navigate to the project directory:
    cd LMOps/adaptllm
  - Install the required packages:
    pip install -r requirements.txt
- Evaluate the Model
  - Select a domain, e.g., 'law'.
  - Specify a model, e.g., 'AdaptLLM/law-LLM'.
  - Configure model parallelization based on the model size.
  - Choose the number of GPUs (1, 2, 4, or 8).
  - Set add_bos_token to False for AdaptLLM models.
  - Run the evaluation script (a Python-based smoke test is sketched after this guide):
    bash scripts/inference.sh ${DOMAIN} ${MODEL} ${add_bos_token} ${MODEL_PARALLEL} ${N_GPU}
    For example, evaluating the 7B law model on a single GPU with model parallelization disabled:
    bash scripts/inference.sh law AdaptLLM/law-LLM False False 1
- Cloud GPUs
  - For larger models, consider cloud GPU services such as AWS, Google Cloud, or Azure to handle the increased computational demands.
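As referenced in the guide above, the adapted checkpoints can also be queried directly through transformers as a quick smoke test. A minimal sketch, assuming enough memory for the 7B checkpoint; the prompt is illustrative, and add_special_tokens=False mirrors the add_bos_token=False setting from the evaluation step.

```python
# Minimal smoke test: query AdaptLLM/law-LLM directly via transformers.
# Assumes enough CPU/GPU memory for a 7B model; the prompt is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AdaptLLM/law-LLM"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Question: What is the role of consideration in contract formation?\nAnswer:"
# add_special_tokens=False mirrors the add_bos_token=False setting used for
# AdaptLLM models in the evaluation script above.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```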
License
AdaptLLM and its associated resources are available under the licenses provided by Hugging Face and Microsoft. Users must comply with these terms when using the models, datasets, and scripts from the repository.