mlflow

instrumenting-with-mlflow-tracing

Instruments Python and TypeScript code with MLflow Tracing for observability. Triggers on questions about adding tracing, instrumenting agents/LLM apps, getting started with MLflow tracing, or tracing specific frameworks (LangGraph, LangChain, OpenAI, DSPy, CrewAI, AutoGen). Examples - "How do I add tracing?", "How to instrument my agent?", "How to trace my LangChain app?", "Getting started with MLflow tracing", "Trace my TypeScript app"

mlflow 46 15 Updated 4mo ago

Resources

1
GitHub

Install

npx skillscat add mlflow/skills/instrumenting-with-mlflow-tracing

Install via the SkillsCat registry.

SKILL.md

MLflow Tracing Instrumentation Guide

Language-Specific Guides

Based on the user's project, load the appropriate guide:

  • Python projects: Read references/python.md
  • TypeScript/JavaScript projects: Read references/typescript.md

If unclear, check for package.json (TypeScript) or requirements.txt/pyproject.toml (Python) in the project.


What to Trace

Trace these operations (high debugging/observability value):

Operation Type Examples Why Trace
Root operations Main entry points, top-level pipelines, workflow steps End-to-end latency, input/output logging
LLM calls Chat completions, embeddings Token usage, latency, prompt/response inspection
Retrieval Vector DB queries, document fetches, search Relevance debugging, retrieval quality
Tool/function calls API calls, database queries, web search External dependency monitoring, error tracking
Agent decisions Routing, planning, tool selection Understand agent reasoning and choices
External services HTTP APIs, file I/O, message queues Dependency failures, timeout tracking

Skip tracing these (too granular, adds noise):

  • Simple data transformations (dict/list manipulation)
  • String formatting, parsing, validation
  • Configuration loading, environment setup
  • Logging or metric emission
  • Pure utility functions (math, sorting, filtering)

Rule of thumb: Trace operations that are important for debugging and identifying issues in your application.


Feedback Collection

Log user feedback on traces for evaluation, debugging, and fine-tuning. Essential for identifying quality issues in production.

See references/feedback-collection.md for:

  • Recording user ratings and comments with mlflow.log_feedback()
  • Capturing trace IDs to return to clients
  • LLM-as-judge automated evaluation

Reference Documentation

Production Deployment

See references/production.md for:

  • Environment variable configuration
  • Async logging for low-latency applications
  • Sampling configuration (MLFLOW_TRACE_SAMPLING_RATIO)
  • Lightweight SDK (mlflow-tracing)
  • Docker/Kubernetes deployment

Advanced Patterns

See references/advanced-patterns.md for:

  • Async function tracing
  • Multi-threading with context propagation
  • PII redaction with span processors

Distributed Tracing

See references/distributed-tracing.md for:

  • Propagating trace context across services
  • Client/server header APIs