Guide to transform prototypes into robust, distributable Python packages using the src layout, hybrid paradigm, and strict configuration management.
Install
npx skillscat add fmind/mlops-python-package/mlops-industrialization Install via the SkillsCat registry.
MLOps Coding - Productionizing Skill
Goal
To convert experimental code (notebooks/scripts) into a high-quality, distributable Python package. This skill enforces the src/ layout, a Hybrid Paradigm (OOP structure + Functional purity), and Strict Configuration to ensure scalability, security, and maintainability.
Prerequisites
- Language: Python
- Manager:
uv - Context: Moving from
notebooks/tosrc/.
Instructions
1. Packaging Structure (src Layout)
Adopt the src layout to prevent import errors and separate source from tooling.
Directory Tree:
my-project/ ├── pyproject.toml # Dependencies & Metadata ├── uv.lock ├── README.md └── src/ └── my_package/ # Main package directory ├── __init__.py ├── io/ # Side-effects (Datasets, APIs) ├── domain/ # Pure business logic (Models, Features) └── application/ # Orchestration (Training loops, Inference)Configuration: Use
pyproject.tomlfor all build metadata and dependencies.
2. Modularity & Paradigm (Hybrid Style)
Balance structure with predictability.
- Domain Layer (Pure):
- Rule: Code here must be deterministic and free of side effects (no I/O).
- Use Case: Feature transformations, Model architecture definitions.
- Style: Functional (pure functions) or Immutable Objects (dataclasses).
- I/O Layer (Impure):
- Rule: Isolate external interactions here.
- Use Case: Loading data from S3, saving models to disk, logging to MLflow.
- Style: OOP (Classes to manage connections/state).
- Application Layer (Orchestration):
- Rule: Wire Domain and I/O together.
- Use Case: Tuning, Training, Inference, Evaluation, etc.
3. Application Entrypoints
Create standard, installable CLI tools.
Define Script: Create
src/my_package/scripts.pywith amain()function.Register: Add to
pyproject.toml:[project.scripts] my-tool = "my_package.scripts:main"CLI Execution:
- Dev:
uv run my-tool(No install needed). - Prod:
pip install .->my-tool(Installed on PATH).
- Dev:
Guard: Always use
if __name__ == "__main__":in scripts to prevent execution on import.
4. Configuration Management
Decouple settings from code using OmegaConf (Parsing) and Pydantic (Validation).
Define Schema (Pydantic):
- Create a class that defines expected types and defaults.
from pydantic import BaseModel class TrainingConfig(BaseModel): batch_size: int = 32 learning_rate: float = 0.001 use_gpu: bool = FalseParse & Validate (OmegaConf):
- Load YAML, merge with CLI args, and validate against the schema.
import omegaconf # 1. Load YAML conf = omegaconf.OmegaConf.load("config.yaml") # 2. Merge with CLI (optional) cli_conf = omegaconf.OmegaConf.from_cli() merged = omegaconf.OmegaConf.merge(conf, cli_conf) # 3. Validate -> Returns a validated Pydantic object cfg: TrainingConfig = TrainingConfig(**omegaconf.OmegaConf.to_container(merged))Secrets: Use Environment Variables (
os.getenv), never commit them.
5. Documentation & Quality
Make code usable and maintainable.
Docstrings: Use Google Style docstrings for all modules, classes, and functions.
def calculate_metric(y_true: np.ndarray, y_pred: np.ndarray) -> float: """Calculates the accuracy score. Args: y_true: Ground truth labels. y_pred: Predicted labels. Returns: The accuracy as a float between 0 and 1. """Type Hints: Use standard python typing (
typing,list[str]) everywhere.
6. Best Practices Summary
- Config != Code: Never hardcode paths or hyperparams; use the
Pydantic + OmegaConfpattern. - Entrypoints are APIs: Design your CLI (
[project.scripts]) as the public interface for your automation tools. - Immutable Core: Keep your domain logic side-effect free; push I/O to the edges.
Self-Correction Checklist
- No Side Effects on Import: Does
import my_packagerun any code? (It shouldn't). - Src Layout: Is code inside
src/? - Config Safety: Are secrets excluded from
pyproject.tomland YAML? - Typing: Are function signatures fully type-hinted?
- Entrypoints: Is the CLI registered in
pyproject.toml?