Tracing Text Generation Inference calls

How to trace Hugging Face Text Generation Inference (TGI) calls with Langfuse.
Author

Daniel van Strien

Published

April 5, 2024

%pip install openai langfuse --quiet
Note: you may need to restart the kernel to use updated packages.
LANGFUSE_SECRET_KEY="sk-lf-..."
LANGFUSE_PUBLIC_KEY="pk-lf-..."
LANGFUSE_HOST="https://cloud.langfuse.com" # 🇪🇺 EU region
from google.colab import userdata
import os
os.environ["LANGFUSE_SECRET_KEY"] = userdata.get('LANGFUSE_SECRET_KEY')
os.environ["LANGFUSE_PUBLIC_KEY"] = userdata.get('LANGFUSE_PUBLIC_KEY')
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"  # 🇪🇺 EU region
HF_TOKEN = userdata.get('HF_TOKEN')
from langfuse.decorators import observe
from langfuse.openai import openai, OpenAI  # drop-in replacement for the OpenAI client that logs calls to Langfuse


client = OpenAI(
    base_url="https://api-inference.huggingface.co/models/mistralai/Mixtral-8x7B-Instruct-v0.1/v1",
    api_key=HF_TOKEN,
)
chat_completion = client.chat.completions.create(
    model="tgi",  # TGI serves a single model per endpoint, so this field is a placeholder
    messages=[
        {"role": "user", "content": "What is Hugging Face?"}
    ],
    stream=False
)
chat_completion
ChatCompletion(id='', choices=[Choice(finish_reason='length', index=0, logprobs=None, message=ChatCompletionMessage(content=" Hugging Face is a technology company that specializes in natural language processing (NLP) and artificial intelligence (AI). The company is best known for its development of Transformers, an open-source library that provides a wide range of pre-trained models for various NLP tasks, such as text classification, question answering, and language translation.\n\nHugging Face's Transformers library has gained widespread popularity among developers and researchers due to its ease of use, flexibility, and", role='assistant', function_call=None, tool_calls=None))], created=1712314124, model='text-generation-inference/Mixtral-8x7B-Instruct-v0.1-medusa', object='text_completion', system_fingerprint='1.4.3-sha-e6bb3ff', usage=CompletionUsage(completion_tokens=100, prompt_tokens=15, total_tokens=115))
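The returned `ChatCompletion` object can be unpacked to get just the generated text and token counts. A minimal sketch (the `stub` below imitates the response printed above so the helper can be run standalone; with a live call you would pass `chat_completion` itself):

```python
from types import SimpleNamespace


def summarize_completion(completion):
    """Pull the generated text, finish reason, and token usage out of a ChatCompletion."""
    choice = completion.choices[0]
    return {
        "content": choice.message.content,
        "finish_reason": choice.finish_reason,
        "prompt_tokens": completion.usage.prompt_tokens,
        "completion_tokens": completion.usage.completion_tokens,
        "total_tokens": completion.usage.total_tokens,
    }


# Stub mirroring the fields of the response shown above, used here
# only so the helper can be exercised without a live endpoint.
stub = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content=" Hugging Face is a technology company..."),
        )
    ],
    usage=SimpleNamespace(prompt_tokens=15, completion_tokens=100, total_tokens=115),
)

summary = summarize_completion(stub)
```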

Langfuse Trace