Install
Install via the SkillsCat registry:
npx skillscat add vivekgana/databricks-platform-marketplace/plugins-databricks-mlops-skills-mlflow-tracking
SKILL.md
MLflow Tracking Skill
Last Updated: 2026-01-01 22:45:49
Version: 1.0.0
Category: Tracking
Overview
Master MLflow experiment tracking patterns for reproducible ML workflows on Databricks. This skill covers experiment organization, parameter tracking, metric logging, artifact management, and model registration.
Learning Objectives
By using this skill, you will understand:
Experiment Organization
- Hierarchical experiment naming
- Run organization strategies
- Tag-based filtering
- Nested runs for complex workflows
Auto-logging
- Framework-specific auto-logging
- Custom metric logging
- Artifact capture
- Configuration
Model Packaging
- Model signatures
- Input examples
- Custom Python models
- Dependency management
Model Registry Integration
- Model registration workflows
- Version management
- Stage transitions (or aliases, which replace stages in Unity Catalog)
- Unity Catalog integration
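The tag-based filtering and nested-run objectives listed above can be sketched as follows. This is a minimal sketch under stated assumptions, not one of the skill's templates: `build_tag_filter`, `run_sweep`, and `find_runs` are hypothetical helper names, `train_fn` is a caller-supplied callback that returns a score, and tag keys are assumed to be plain identifiers whose values contain no single quotes. MLflow is imported inside the functions that need it so the filter helper can be used on its own.

```python
from typing import Callable, Dict, Iterable


def build_tag_filter(tags: Dict[str, str]) -> str:
    """Return an MLflow search filter matching runs that carry every given tag.

    Assumption: keys are plain identifiers and values contain no single quotes.
    """
    return " AND ".join(
        f"tags.{key} = '{value}'" for key, value in sorted(tags.items())
    )


def run_sweep(
    experiment_name: str,
    param_grid: Iterable[Dict[str, float]],
    train_fn: Callable[[Dict[str, float]], float],
) -> str:
    """Log one parent run with a nested child run per parameter set."""
    import mlflow  # lazy import: the filter helper above works without MLflow

    mlflow.set_experiment(experiment_name)
    with mlflow.start_run(run_name="sweep") as parent:
        for i, params in enumerate(param_grid):
            # nested=True attaches the child to the currently active parent run
            with mlflow.start_run(run_name=f"trial_{i}", nested=True):
                mlflow.log_params(params)
                mlflow.log_metric("score", train_fn(params))
    return parent.info.run_id


def find_runs(experiment_name: str, tags: Dict[str, str]):
    """Return a pandas DataFrame of runs in the experiment that match the tags."""
    import mlflow

    return mlflow.search_runs(
        experiment_names=[experiment_name],
        filter_string=build_tag_filter(tags),
    )
```

For example, `find_runs(path, {"data_version": "v2.1"})` would return only the runs tagged with that data version, which is why consistent tagging at logging time pays off later.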
Key Patterns
Pattern 1: Structured Experiment Tracking
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier

# Assumes username, project_name, model_type, version, feature_names,
# X_train, y_train, X_test and calculate_business_value are defined elsewhere.

# Set hierarchical experiment
mlflow.set_experiment(f"/Users/{username}/projects/{project_name}/training")

# Start run with descriptive name
with mlflow.start_run(run_name=f"{model_type}_v{version}") as run:
    # Log dataset version
    mlflow.set_tag("data_version", "v2.1")
    mlflow.set_tag("feature_set", "standard")

    # Enable auto-logging (must happen before fit() is called)
    mlflow.sklearn.autolog(log_input_examples=True, log_model_signatures=True)

    # Log hyperparameters
    params = {"n_estimators": 100, "max_depth": 10}
    mlflow.log_params(params)

    # Train model
    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)

    # Log custom business metrics
    mlflow.log_metric("business_metric", calculate_business_value(model, X_test))

    # Log artifacts (feature_importance.png must already exist on disk)
    mlflow.log_artifact("feature_importance.png")
    mlflow.log_dict(feature_names, "features.json")

    print(f"Run ID: {run.info.run_id}")
Pattern 2: Model Registration with Governance
import mlflow
from mlflow.tracking import MlflowClient

# Point the registry at Unity Catalog (needed when it is not already the default)
mlflow.set_registry_uri("databricks-uc")
client = MlflowClient()

def register_model_with_validation(run_id: str, model_name: str):
    """Register model with validation gates"""
    # Get run metrics
    run = client.get_run(run_id)
    metrics = run.data.metrics

    # Validation gate: a missing metric counts as 0 and fails the check
    if metrics.get("test_accuracy", 0) < 0.85:
        raise ValueError("Model does not meet minimum accuracy threshold")

    # Register to Unity Catalog ("catalog.schema" is a placeholder)
    model_uri = f"runs:/{run_id}/model"
    full_name = f"catalog.schema.{model_name}"
    result = mlflow.register_model(
        model_uri=model_uri,
        name=full_name,
        tags={
            "source_run_id": run_id,
            "validation_status": "passed",
            "deployed_by": "ml_pipeline"
        }
    )
    return result
Pattern 3: Custom Python Model
import mlflow.pyfunc

class CustomModel(mlflow.pyfunc.PythonModel):
    """Custom model with preprocessing"""

    def load_context(self, context):
        """Load model and artifacts"""
        import joblib
        self.model = joblib.load(context.artifacts["model"])
        self.scaler = joblib.load(context.artifacts["scaler"])

    def predict(self, context, model_input):
        """Custom prediction logic"""
        # Preprocess
        scaled_input = self.scaler.transform(model_input)
        # Predict
        predictions = self.model.predict(scaled_input)
        # Post-process
        return self.post_process(predictions)

    def post_process(self, predictions):
        """Custom post-processing"""
        # Add business logic
        return predictions

# Log custom model
artifacts = {
    "model": "model.pkl",
    "scaler": "scaler.pkl"
}
mlflow.pyfunc.log_model(
    artifact_path="custom_model",
    python_model=CustomModel(),
    artifacts=artifacts
)
Templates
- experiment_tracking_template.py: Complete experiment tracking example
- model_registration_template.py: Model registry workflow
- custom_model_template.py: Custom Python model pattern
- nested_runs_template.py: Nested runs for hyperparameter tuning
Examples
- sklearn_classification.py: Scikit-learn classification with MLflow
- xgboost_regression.py: XGBoost regression tracking
- pytorch_training.py: PyTorch model tracking
- hyperparameter_tuning.py: Hyperopt integration
Best Practices
- Always set the experiment name before starting runs
- Use descriptive run names that indicate model type and version
- Log all hyperparameters for reproducibility
- Include model signatures for input validation
- Tag runs with relevant metadata (data version, feature set, etc.)
- Log artifacts (plots, reports, configs)
- Use Unity Catalog for production models
- Version control experiment code alongside tracking
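Several of these practices (consistent experiment and run names, model signatures, input examples) can be wrapped in small helpers. The sketch below is illustrative only: the helper names are hypothetical, the path layout mirrors the one used in Pattern 1, and `log_sklearn_model_with_signature` assumes a fitted scikit-learn estimator plus a sliceable sample such as a NumPy array or pandas DataFrame. Newer MLflow releases may prefer a `name` argument over `artifact_path`, so treat that keyword as version-dependent.

```python
def experiment_path(username: str, project_name: str, stage: str = "training") -> str:
    """Build the hierarchical experiment path used in Pattern 1."""
    for part in (username, project_name, stage):
        # Reject empty parts and embedded slashes so the hierarchy stays predictable
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return f"/Users/{username}/projects/{project_name}/{stage}"


def run_name(model_type: str, version: int) -> str:
    """Build a descriptive run name from model type and version."""
    return f"{model_type}_v{version}"


def log_sklearn_model_with_signature(model, X_sample):
    """Log a fitted scikit-learn model with an inferred signature and input example."""
    import mlflow  # lazy imports: the naming helpers above work without MLflow
    import mlflow.sklearn
    from mlflow.models import infer_signature

    # Infer the input/output schema from sample inputs and the model's predictions
    signature = infer_signature(X_sample, model.predict(X_sample))
    return mlflow.sklearn.log_model(
        model,
        artifact_path="model",
        signature=signature,
        input_example=X_sample[:5],
    )
```

Centralizing the naming logic keeps every training script writing to the same hierarchy, which makes later searches and comparisons simpler.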
Common Pitfalls
- Forgetting to set the experiment (runs land in the default experiment)
- Not logging a model signature (causes serving issues)
- Omitting input examples (makes debugging harder)
- Not tagging runs (makes them hard to filter and search)
- Logging too many artifacts (increases storage costs)
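Two of these pitfalls, untagged runs and runs logged to the default experiment, can be caught with a small post-run check. The following is a rough sketch: `REQUIRED_TAGS`, `missing_tags`, and `check_run` are hypothetical names, the required keys are taken from Pattern 1, and the `"Default"` experiment name is an assumption that holds for open-source MLflow (Databricks notebooks typically fall back to a notebook-scoped experiment instead).

```python
from typing import Dict, Iterable, List

# Assumption: the tag keys from Pattern 1 are the ones every run must carry
REQUIRED_TAGS = ("data_version", "feature_set")


def missing_tags(
    run_tags: Dict[str, str], required: Iterable[str] = REQUIRED_TAGS
) -> List[str]:
    """Return the required tag keys that are absent or empty on a run."""
    return [key for key in required if not run_tags.get(key)]


def check_run(run_id: str) -> List[str]:
    """Collect human-readable problems for a finished run."""
    from mlflow.tracking import MlflowClient  # lazy import, see lead-in

    client = MlflowClient()
    run = client.get_run(run_id)
    problems = [f"missing tag: {key}" for key in missing_tags(run.data.tags)]

    # Assumption: the fallback experiment is named "Default" (open-source MLflow)
    experiment = client.get_experiment(run.info.experiment_id)
    if experiment.name == "Default":
        problems.append("run was logged to the default experiment")
    return problems
```

A pipeline could call `check_run(run_id)` after training and fail the job when the returned list is non-empty.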
Integration Points
- Feature Store: Track feature lineage
- Model Serving: Automatic endpoint updates
- Workflows: Automated training pipelines
- Unity Catalog: Model governance
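For the Workflows and Unity Catalog integration points, the validation gate from Pattern 2 can be factored into a pure function that an automated pipeline calls before registering a model. This is a sketch under assumptions rather than the skill's own implementation: `MIN_THRESHOLDS`, `failed_gates`, and `register_if_valid` are hypothetical names, the 0.85 accuracy threshold is the one used in Pattern 2, and the model is assumed to have been logged under the `model` artifact path.

```python
from typing import List, Mapping

# Assumption: same minimum accuracy as the gate in Pattern 2
MIN_THRESHOLDS = {"test_accuracy": 0.85}


def failed_gates(
    metrics: Mapping[str, float],
    thresholds: Mapping[str, float] = MIN_THRESHOLDS,
) -> List[str]:
    """Return a description of every metric that is missing or below its minimum."""
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing")
        elif value < minimum:
            failures.append(f"{name}: {value:.3f} < {minimum:.3f}")
    return failures


def register_if_valid(run_id: str, full_name: str):
    """Register the run's model only when every validation gate passes."""
    import mlflow  # lazy imports so failed_gates can be unit-tested without MLflow
    from mlflow.tracking import MlflowClient

    metrics = MlflowClient().get_run(run_id).data.metrics
    failures = failed_gates(metrics)
    if failures:
        raise ValueError("validation gate failed: " + "; ".join(failures))
    # full_name is expected in catalog.schema.model form for Unity Catalog
    return mlflow.register_model(model_uri=f"runs:/{run_id}/model", name=full_name)
```

Keeping the gate separate from the registry call means the thresholds can be tested in isolation, and a missing metric is reported explicitly instead of being treated as zero.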