Fine-tuning Llama
To fine-tune with Llama, create a LlamaConfig and use it to create a FinetuningRun. You'll need a LLaMA dataset.
LLaMA is not licensed for commercial use
The LLaMA model is not currently licensed for commercial use. LLaMA fine-tuning is offered for research purposes.
Build a config
Create your LlamaConfig:
from baseten.training import LlamaConfig
config = LlamaConfig(
    input_dataset=dataset,
    source_col_name="my_source_column_name",
    target_col_name="my_target_column_name"
)
Run fine-tuning
Once your config is set, use it to create a FinetuningRun:
from baseten.training import FinetuningRun
my_run = FinetuningRun.create(
    trained_model_name="My LLaMA Model",
    fine_tuning_config=config
)
You can check your run's status with my_run.status. Once the run starts (my_run.status returns RUNNING), you can stream its logs.
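The status check can be sketched as a small polling loop. This is a sketch, not the official API: the stub class below stands in for a real FinetuningRun, and only the .status attribute mentioned above is modeled.

```python
import time

class StubRun:
    """Stand-in for a FinetuningRun; only .status is modeled here."""
    def __init__(self):
        self._polls = 0

    @property
    def status(self):
        # Simulate a run that starts after a couple of polls.
        self._polls += 1
        return "PENDING" if self._polls < 3 else "RUNNING"

def wait_until_running(run, poll_seconds=0.01):
    # Block until the run reports RUNNING, then return the status.
    while run.status != "RUNNING":
        time.sleep(poll_seconds)
    return run.status

print(wait_until_running(StubRun()))  # -> RUNNING
```

With a real run, you would pass my_run instead of the stub and use a longer polling interval.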
Tips
Monitor your run with Weights & Biases
Blueprint fine-tuning can integrate with Weights & Biases to monitor your fine-tuning run:
- Pass wandb_api_key="YOUR_API_KEY" to enable the integration.
- Use image_log_steps to control how often image samples are logged.
This will enable you to see your model's sample generations as it is being fine-tuned.
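Putting this together with the config from above, the integration might be enabled like this (a sketch reusing the earlier dataset and column names; "YOUR_API_KEY" and the image_log_steps value are placeholders):

```python
from baseten.training import LlamaConfig

config = LlamaConfig(
    input_dataset=dataset,
    source_col_name="my_source_column_name",
    target_col_name="my_target_column_name",
    wandb_api_key="YOUR_API_KEY",  # enables the Weights & Biases integration
    image_log_steps=100,           # placeholder: how often samples are logged
)
```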
If your model doesn't generate the results you'd like, there may be a couple of issues:
- Learning rate: Your learning rate may have been too high or too low. A smaller learning rate ensures more gradual convergence but may require more training time, while a larger learning rate may cause the model to overshoot the optimal solution. To remedy this, use the learning_rate parameter in the LlamaConfig to set your learning rate.
- Dataset: Your dataset might not have enough examples, or the examples may be poorly formatted. It's important to thoroughly clean and preprocess your dataset before kicking off the job. Extraneous HTML tags, erratic whitespace, or unknown characters in the dataset can cause model quality issues.
- Weight decay: You may have overfit or underfit on the dataset. Overfitting is when the model begins to memorize your training data; underfitting is when the model struggles to find a function that maps your input data to your output data. You can apply weight decay (L2 regularization) to prevent overfitting and improve generalization. Start with a small weight decay value (e.g., 1e-5 or 1e-4) and experiment with different values to find the right balance between overfitting and underfitting.
Check the LlamaConfig reference for a complete list of parameters.
What's next?
Your model will be automatically deployed
Once the fine-tuning run is complete, your model will be automatically deployed. You'll receive an email when the deployment is finished and the model is ready to invoke.
You can turn off this behavior by setting auto_deploy=False in FinetuningRun.create() and instead deploy your model manually.
Once your model is deployed, you can invoke it:
from baseten.models import Llama
model = Llama(model_id="model_123")
completion = model("What is the meaning of life?")
View our docs on the Llama() model object for more details.
The model returns: