tylertitsworth
@tylertitsworth
Public Skills
github-actions
by tylertitsworth
"GitHub Actions — workflows, triggers, matrix, reusable workflows, ARC runners, caching, security, supply chain attestations. Use when building CI pipelines. NOT for GitOps delivery (see argocd)."
cert-manager
by tylertitsworth
"cert-manager — Issuers, ACME/Let's Encrypt, CA, Vault, Certificate resources, Ingress/Gateway TLS, ACME certificate profiles. Use when managing X.509 certificates in Kubernetes. NOT for manual TLS."
flash-attention
by tylertitsworth
"Flash Attention, FlashInfer, SDPA backends, PagedAttention, and attention kernel selection/configuration. Use when choosing or configuring attention backends for training or inference (FlashAttention-2/3, FlashInfer, SDPA, xFormers, PagedAttention, Ring Attention, FlexAttention/FlexDecoding, varlen_attn)."
docker-buildx
by tylertitsworth
"Docker Buildx — multi-platform builds, BuildKit mounts, Bake, build policies (Rego/OPA), cache backends, CI integration. Use when building container images. NOT for GPU runtime or Compose."
argocd
by tylertitsworth
"Argo CD GitOps on Kubernetes — Applications, ApplicationSets, sync policies, multi-cluster, Helm/Kustomize sources, RBAC, Source Hydrator, PreDelete hooks, KEDA integration, and troubleshooting. Use when deploying or managing apps via GitOps. NOT for CI pipelines or image builds."
kubeflow-trainer
by tylertitsworth
"Kubeflow Trainer v2 — TrainJob CRD, Training Runtimes, Python SDK, JobSet, multi-framework support. Use when orchestrating distributed training on K8s. NOT for inference or Ray Train."
gpu-operator
by tylertitsworth
"NVIDIA GPU Operator on Kubernetes — ClusterPolicy, DRA (Dynamic Resource Allocation), time-slicing, MIG, DCGM metrics, driver management, GPUDirect RDMA/GDS, CDI, and GPU scheduling. Use when installing, configuring, or troubleshooting GPU infrastructure on K8s."
karpenter
by tylertitsworth
"Karpenter — NodePool/EC2NodeClass CRDs, GPU provisioning, consolidation, drift, Spot, EFA placement. Use when autoscaling EKS nodes. NOT for Cluster Autoscaler or pod autoscaling."
axolotl
by tylertitsworth
"Axolotl — config-driven LLM fine-tuning framework. YAML-based SFT, instruction tuning, chat fine-tuning, DPO/IPO/KTO/ORPO preference optimization, GRPO reinforcement learning, reward modeling, LoRA/QLoRA, full fine-tuning with FSDP/DeepSpeed multi-GPU, N-D Parallelism (TP+CP+FSDP composition), multimodal VLM training, text diffusion training, QAT with NVFP4, sample packing, Flash Attention, dataset preprocessing, and checkpoint management. Use when fine-tuning or post-training LLMs with Axolotl."
keda
by tylertitsworth
"KEDA — ScaledObject/ScaledJob CRDs, 60+ triggers, scale-to-zero for GPU inference, TriggerAuthentication. Use when configuring event-driven pod autoscaling. NOT for HPA basics or Kueue."
cilium
by tylertitsworth
"Cilium — eBPF-based CNI for Kubernetes networking, security, and observability. Use when installing, configuring network policies, Hubble, ClusterMesh, service mesh, BGP, encryption, and troubleshooting. NOT for Calico/Flannel or standalone eBPF."
kueue
by tylertitsworth
"Kueue — ClusterQueues, ResourceFlavors, fair sharing, preemption, TAS, MultiKueue. Use when managing batch workload queuing and GPU quotas on K8s. NOT for Volcano."
aiconfigurator
by tylertitsworth
"NVIDIA AIConfigurator — optimal LLM serving configuration for disaggregated/aggregated deployments, parallelism selection (TP/PP/EP/DP), quantization, and MoE planning. Use when planning model deployment topology on NVIDIA GPUs."
external-secrets
by tylertitsworth
"External Secrets Operator (ESO) — SecretStore, ClusterSecretStore, ExternalSecret, PushSecret, templating, and multi-tenant patterns. Use when syncing secrets from AWS Secrets Manager or Vault into K8s. NOT for sealed-secrets or SOPS."
dvc
by tylertitsworth
"DVC — Git-based data/model versioning, reproducible pipelines, experiment tracking, remote storage. Use when versioning ML data or building reproducible pipelines. NOT for databases."
aws-fsx
by tylertitsworth
"FSx for Lustre — performance tuning, striping, S3 data repositories, EKS integration. Use when configuring high-performance storage for ML on EKS. NOT for EBS or EFS."