MLflow Tracking Skill

- [Databricks MLflow Guide](https://docs.databricks.com/mlflow/tracking.html)

vivekgana 4 Updated 6mo ago

Resources

GitHub

Install

npx skillscat add vivekgana/databricks-platform-marketplace/plugins-databricks-mlops-skills-mlflow-tracking

Install via the SkillsCat registry.

SKILL.md

MLflow Tracking Skill

Last Updated: 2026-01-01 22:45:49
Version: 1.0.0
Category: Tracking

Overview

Master MLflow experiment tracking patterns for reproducible ML workflows on Databricks. This skill covers experiment organization, parameter tracking, metric logging, artifact management, and model registration.

Learning Objectives

By using this skill, you will understand:

Experiment Organization
- Hierarchical experiment naming
- Run organization strategies
- Tag-based filtering
- Nested runs for complex workflows
Auto-logging
- Framework-specific auto-logging
- Custom metric logging
- Artifact capture
- Configuration
Model Packaging
- Model signatures
- Input examples
- Custom Python models
- Dependencies management
Model Registry Integration
- Model registration workflows
- Version management
- Stage transitions
- Unity Catalog integration

Key Patterns

Pattern 1: Structured Experiment Tracking

import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier

# Set hierarchical experiment
mlflow.set_experiment(f"/Users/{username}/projects/{project_name}/training")

# Start run with descriptive name
with mlflow.start_run(run_name=f"{model_type}_v{version}") as run:

    # Log dataset version
    mlflow.set_tag("data_version", "v2.1")
    mlflow.set_tag("feature_set", "standard")

    # Enable auto-logging
    mlflow.sklearn.autolog(log_input_examples=True, log_model_signatures=True)

    # Log hyperparameters
    params = {"n_estimators": 100, "max_depth": 10}
    mlflow.log_params(params)

    # Train model
    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)

    # Log custom business metrics
    mlflow.log_metric("business_metric", calculate_business_value(model, X_test))

    # Log artifacts
    mlflow.log_artifact("feature_importance.png")
    mlflow.log_dict(feature_names, "features.json")

    print(f"Run ID: {run.info.run_id}")

Pattern 2: Model Registration with Governance

from mlflow.tracking import MlflowClient

client = MlflowClient()

def register_model_with_validation(run_id: str, model_name: str):
    """Register model with validation gates"""

    # Get run metrics
    run = client.get_run(run_id)
    metrics = run.data.metrics

    # Validation gate
    if metrics.get("test_accuracy", 0) < 0.85:
        raise ValueError("Model does not meet minimum accuracy threshold")

    # Register to Unity Catalog
    model_uri = f"runs:/{run_id}/model"
    full_name = f"catalog.schema.{model_name}"

    result = mlflow.register_model(
        model_uri=model_uri,
        name=full_name,
        tags={
            "source_run_id": run_id,
            "validation_status": "passed",
            "deployed_by": "ml_pipeline"
        }
    )

    return result

Pattern 3: Custom Python Model

import mlflow.pyfunc

class CustomModel(mlflow.pyfunc.PythonModel):
    """Custom model with preprocessing"""

    def load_context(self, context):
        """Load model and artifacts"""
        import joblib
        self.model = joblib.load(context.artifacts["model"])
        self.scaler = joblib.load(context.artifacts["scaler"])

    def predict(self, context, model_input):
        """Custom prediction logic"""
        # Preprocess
        scaled_input = self.scaler.transform(model_input)

        # Predict
        predictions = self.model.predict(scaled_input)

        # Post-process
        return self.post_process(predictions)

    def post_process(self, predictions):
        """Custom post-processing"""
        # Add business logic
        return predictions

# Log custom model
artifacts = {
    "model": "model.pkl",
    "scaler": "scaler.pkl"
}

mlflow.pyfunc.log_model(
    artifact_path="custom_model",
    python_model=CustomModel(),
    artifacts=artifacts
)

Templates

experiment_tracking_template.py: Complete experiment tracking example
model_registration_template.py: Model registry workflow
custom_model_template.py: Custom Python model pattern
nested_runs_template.py: Nested runs for hyperparameter tuning

Examples

sklearn_classification.py: Scikit-learn classification with MLflow
xgboost_regression.py: XGBoost regression tracking
pytorch_training.py: PyTorch model tracking
hyperparameter_tuning.py: Hyperopt integration

Best Practices

Always set experiment name before starting runs
Use descriptive run names that indicate model type and version
Log all hyperparameters for reproducibility
Include model signatures for input validation
Tag runs with relevant metadata (data version, feature set, etc.)
Log artifacts (plots, reports, configs)
Use Unity Catalog for production models
Version control experiment code alongside tracking

Common Pitfalls

Forgetting to set experiment (uses default experiment)
Not logging model signature (causes serving issues)
Missing input examples (harder to debug)
Not tagging runs (hard to filter/search)
Logging too many artifacts (storage costs)

Integration Points

Feature Store: Track feature lineage
Model Serving: Automatic endpoint updates
Workflows: Automated training pipelines
Unity Catalog: Model governance

MLflow Tracking Skill

Resources

Install

MLflow Tracking Skill

Overview

Learning Objectives

Key Patterns

Pattern 1: Structured Experiment Tracking

Pattern 2: Model Registration with Governance

Pattern 3: Custom Python Model

Templates

Examples

Best Practices

Common Pitfalls

Integration Points

References

Categories

Install

MLflow Tracking Skill

Resources

Install

MLflow Tracking Skill

Overview

Learning Objectives

Key Patterns

Pattern 1: Structured Experiment Tracking

Pattern 2: Model Registration with Governance

Pattern 3: Custom Python Model

Templates

Examples

Best Practices

Common Pitfalls

Integration Points

References

Categories

Install

Recommended Skills