Run end-to-end fine-tuning for request_text-to-hidden-params JSON prediction with Ministral-3-3B in this repository. Use when rebuilding train/validation/test splits from train-only data, converting data to prompt-completion format, launching TRL SFT on Hugging Face Jobs, validating strict JSON output behavior, merging LoRA adapters into full models, and reporting job/model status with links.
Resources
1Install
npx skillscat add haruk1y/mistral-hackathon/mistral-hidden-params-ft Install via the SkillsCat registry.
SKILL.md
Mistral Hidden Params Ft
Use this skill to run the full loop from dataset preparation to deployed model validation.
Key Files
scripts/ft/build_train_valid_test.mjsscripts/ft/convert_to_prompt_completion_dataset.pyscripts/hf/train_sft_request_to_hidden_lm.pyscripts/hf/debug_full_model_json_inference.pyscripts/hf/merge_and_upload_full_model.pyartifacts/hf_jobs/submissions.jsonl
Workflow
- Rebuild split quality first.
- Use train-only data and regenerate
train/validation/testbefore training when existingvalidation/testquality is low.
- Normalize data to prompt-completion.
- Ensure each sample is
{prompt, completion}with completion as strict JSON text. - Keep the inference prompt text exactly aligned between conversion, training, and debug scripts.
- Launch TRL SFT job on Hugging Face Jobs.
- Train with
scripts/hf/train_sft_request_to_hidden_lm.py. - Record job IDs and model IDs in
artifacts/hf_jobs/submissions.jsonl.
- Validate JSON output behavior with real generations.
- Run
scripts/hf/debug_full_model_json_inference.pyagainst either: - Base + adapter mode (
--base-model-id+--adapter-model-id) - Full merged model mode (
--model-id) - Report
valid_cases=X/Yand include representative malformed output when failures occur.
- Merge adapter into full model only when distribution simplicity is needed.
- Use
scripts/hf/merge_and_upload_full_model.py. - Explain that merge does not inherently improve prediction quality; it simplifies inference deployment.
- Report status with concrete links.
- Return Hugging Face job URL(s), output model repo URL(s), and whether each repo is adapter-only or full.
Guardrails
- Keep prompt wording identical across train/inference unless explicitly requested.
- Keep target keys exactly:
energy,warmth,brightness,acousticness,complexity,nostalgia. - Fail evaluation when output is not valid JSON object with exactly the required keys and integer range
0..10. - Prefer evidence from actual job logs over assumptions.