Tutorial: Fine-Tuning Llama 3.1 8B
The rapid advancements in open-source models like Llama-3.1 demonstrate the growing potential of smaller, fine-tuned models that can rival even the top closed-source alternatives. Lumino allows you to fine-tune Llama-3.1 8B on your proprietary datasets, creating powerful, accurate, and fully owned AI models.
In this guide, we’ll walk through fine-tuning a Llama-3.1 8B model on a sample dataset using the Lumino SDK, covering everything from dataset preparation to running evaluations, so you can turn a base model into a fine-tuned, high-performance model for your specific needs.
What You Will Learn
- How to prepare and format your dataset for fine-tuning
- Uploading datasets to Lumino using the SDK
- Initiating fine-tuning jobs with customizable parameters
- Running evaluations to assess model performance
Prerequisites
- Lumino SDK: Install using
pip install lumino
- API Key: Obtain it from the Lumino dashboard and set it as an environment variable (a quick verification snippet follows this list):
export LUMINO_API_KEY=your_api_key
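Optionally, verify that the key is visible to Python before making any SDK calls. This is just a convenience check, not part of the Lumino SDK:

import os

# Fail fast if the API key environment variable is missing.
if not os.environ.get("LUMINO_API_KEY"):
    raise RuntimeError("LUMINO_API_KEY is not set; export it before using the SDK.")
print("LUMINO_API_KEY found.")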
Dataset Preparation
To begin, let’s prepare a dataset for fine-tuning. In this tutorial, we’ll use a simple trivia-based dataset. If you’re following along, feel free to create or use any dataset that fits your use case.
Example Dataset - Original Format
[
    {
        "instruction": "What is the capital of France?",
        "output": "The capital of France is Paris."
    },
    {
        "instruction": "Who wrote the novel '1984'?",
        "output": "The novel '1984' was written by George Orwell."
    },
    {
        "instruction": "What is the speed of light?",
        "output": "The speed of light is approximately 299,792 kilometers per second."
    }
]
Formatting the Dataset to JSONL
To use this dataset with the Lumino SDK, we need to convert it to a .jsonl format, where each line represents a structured conversation between a system, user, and assistant. Here’s a Python script that formats the dataset into the correct structure for fine-tuning:
import json

dataset = [
    {
        "instruction": "What is the capital of France?",
        "output": "The capital of France is Paris."
    },
    {
        "instruction": "Who wrote the novel '1984'?",
        "output": "The novel '1984' was written by George Orwell."
    },
    {
        "instruction": "What is the speed of light?",
        "output": "The speed of light is approximately 299,792 kilometers per second."
    }
]

# Format each example as a system/user/assistant conversation
formatted_data = []
for example in dataset:
    formatted_data.append({
        "messages": [
            {"role": "system", "content": "You are a helpful assistant that provides accurate answers to general knowledge questions."},
            {"role": "user", "content": example["instruction"]},
            {"role": "assistant", "content": example["output"]}
        ]
    })

# Write one JSON object per line to a .jsonl file
with open("Formatted_Trivia.jsonl", "w", encoding="utf-8") as f:
    for item in formatted_data:
        f.write(json.dumps(item) + "\n")

print("Dataset has been formatted and saved to 'Formatted_Trivia.jsonl'.")
Result: Formatted_Trivia.jsonl
{"messages": [{"role": "system", "content": "You are a helpful assistant that provides accurate answers to general knowledge questions."}, {"role": "user", "content": "What is the capital of France?"}, {"role": "assistant", "content": "The capital of France is Paris."}]}
{"messages": [{"role": "system", "content": "You are a helpful assistant that provides accurate answers to general knowledge questions."}, {"role": "user", "content": "Who wrote the novel '1984'?"}, {"role": "assistant", "content": "The novel '1984' was written by George Orwell."}]}
{"messages": [{"role": "system", "content": "You are a helpful assistant that provides accurate answers to general knowledge questions."}, {"role": "user", "content": "What is the speed of light?"}, {"role": "assistant", "content": "The speed of light is approximately 299,792⬤
Uploading the Dataset to Lumino
Method 1: Uploading via the Lumino SDK
Once the dataset is prepared, it’s time to upload it to Lumino using the SDK. Here’s how to do that:
import os
import asyncio
from lumino.sdk import LuminoSDK

async def main():
    async with LuminoSDK(os.environ.get("LUMINO_API_KEY")) as client:
        await client.dataset.upload_dataset("Formatted_Trivia.jsonl")
        print("Dataset uploaded successfully!")

asyncio.run(main())
This script uploads the dataset via the Lumino SDK and prints a confirmation once the upload succeeds. Replace “Formatted_Trivia.jsonl” with the path to your dataset file.
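To confirm the upload programmatically, you can list the datasets attached to your account. Note that list_datasets below is an assumed method name on the SDK’s dataset module; check the Lumino SDK reference for the exact call:

import os
import asyncio
from lumino.sdk import LuminoSDK

async def main():
    async with LuminoSDK(os.environ.get("LUMINO_API_KEY")) as client:
        # NOTE: `list_datasets` is an assumed method name; consult the
        # Lumino SDK reference for the actual dataset-listing API.
        datasets = await client.dataset.list_datasets()
        print("Datasets:", datasets)

asyncio.run(main())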
Method 2: Uploading via the Lumino Web App UI
Alternatively, you can upload your dataset through the Lumino Web App, which provides an intuitive UI for managing datasets.
- Log in to the Lumino Web App: Go to the Lumino Web App and log in with your credentials.
- Navigate to the Dataset Page: Once logged in, head to the “Datasets” section on the dashboard.
- Upload Dataset:
- Click the Upload Dataset button.
- Select your formatted .jsonl file (e.g., Formatted_Trivia.jsonl) from your local machine.
- Confirm the Upload:
- After uploading, you’ll see the dataset listed in the Datasets section. You can now use this dataset for fine-tuning jobs.
Fine-tuning
After uploading the dataset, we can start a fine-tuning job.
Create Fine Tune Job
The fine-tuning parameters are flexible, allowing you to adjust aspects like batch size, learning rate, and the number of epochs. In this example, we’ll use 5 epochs and a small batch size to keep things lightweight.
import os
import asyncio
from lumino.sdk import LuminoSDK

async def main():
    async with LuminoSDK(os.environ.get("LUMINO_API_KEY")) as client:
        job = await client.fine_tuning.create_fine_tuning_job(
            base_model_name="llm_llama3_1_8b",
            dataset_name="Formatted_Trivia.jsonl",  # must match the uploaded dataset
            name="sample_fine_tune_job",
            parameters={
                "batch_size": 4,
                "num_epochs": 5,
                "learning_rate": 1e-5,
                "use_lora": True,
            }
        )
        print(f"Fine-tuning job started: {job}")

asyncio.run(main())
Monitoring the Job
While the job is running, you can monitor its status or adjust certain parameters by interacting with the Lumino API. Below is an example of retrieving the fine-tuning job’s status:
import os
import asyncio
from lumino.sdk import LuminoSDK

async def main():
    async with LuminoSDK(os.environ.get("LUMINO_API_KEY")) as client:
        jobs = await client.fine_tuning.list_fine_tuning_jobs()
        print("Fine-tuning jobs:", jobs)

asyncio.run(main())
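If you want to block until training finishes, a simple polling loop over list_fine_tuning_jobs works. The sketch below assumes each job record exposes name and status fields and that finished jobs report a terminal status; the exact attribute names and status strings may differ, so check the SDK reference:

import os
import asyncio
from lumino.sdk import LuminoSDK

async def wait_for_job(job_name, poll_seconds=60):
    # Minimal polling sketch: `name`, `status`, and the terminal status
    # values below are assumptions about the job records the SDK returns.
    async with LuminoSDK(os.environ.get("LUMINO_API_KEY")) as client:
        while True:
            jobs = await client.fine_tuning.list_fine_tuning_jobs()
            job = next((j for j in jobs if getattr(j, "name", None) == job_name), None)
            status = getattr(job, "status", None) if job else None
            print(f"Job {job_name!r} status: {status}")
            if status in ("completed", "failed"):  # assumed terminal statuses
                return job
            await asyncio.sleep(poll_seconds)

asyncio.run(wait_for_job("sample_fine_tune_job"))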
Evaluating the Fine-tuned Model
Since the Lumino SDK does not currently support model evaluation, we can download the fine-tuned model and evaluate it manually using libraries like Hugging Face’s transformers. This lets you load the model and compare its outputs against a set of evaluation prompts and expected responses.
Step 1: Export Your Fine-tuned Model
Once your model has been fine-tuned using the Lumino SDK, download the model files (the exact steps depend on your hosting method) or save them locally if you’re running the fine-tuning job on your own infrastructure.
Step 2: Prepare the Evaluation Dataset (EvalTrivia.jsonl)
We’ll evaluate on a held-out set of trivia questions in the same instruction/output format:
{"instruction": "What is the capital of Japan?", "output": "The capital of Japan is Tokyo."}
{"instruction": "Who painted the Mona Lisa?", "output": "The Mona Lisa was painted by Leonardo da Vinci."}
{"instruction": "What is the boiling point of water?", "output": "The boiling point of water is 100 degrees Celsius at sea level."}
Step 3: Load the Model and Evaluate using Hugging Face Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
import json

# Load your fine-tuned model and tokenizer (replace with your model path or Hugging Face hub path)
model_path = "path_to_your_finetuned_model"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Generate a prediction, decoding only the newly generated tokens.
# Decoding the full sequence would include the prompt itself and
# break the exact-match comparison below.
def generate_answer(prompt):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    generated_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(generated_tokens, skip_special_tokens=True)

# Load the evaluation dataset
with open("EvalTrivia.jsonl", "r", encoding="utf-8") as f:
    eval_data = [json.loads(line) for line in f]

correct_predictions = 0
total_predictions = len(eval_data)

# Evaluate the model with an exact string match
for example in eval_data:
    predicted_answer = generate_answer(example["instruction"]).strip()
    if predicted_answer == example["output"].strip():
        correct_predictions += 1
    else:
        print(f"Incorrect prediction:\nQuestion: {example['instruction']}\nExpected: {example['output']}\nGot: {predicted_answer}\n")

# Calculate accuracy
accuracy = (correct_predictions / total_predictions) * 100
print(f"Evaluation Complete. Accuracy: {accuracy:.2f}%")
Conclusion
By using the Lumino SDK, you can quickly fine-tune open-source models like Llama-3.1 8B for specific tasks, creating faster, more accurate models at a fraction of the cost of closed-source alternatives. The flexibility of Lumino’s fine-tuning API allows you to customize the entire process, from dataset management to model deployment.
For more details, check out our full documentation or explore the PyPI package to get started.