§01
Abstract
LumiChat Coder v2.1 is a precision-tuned language model built on Qwen2.5-Coder-1.5B-Instruct, specialised for converting natural language queries into executable JSON function calls. The model achieves 96.5% tool-selection accuracy and a 99.8% JSON validity rate when combined with transformers-CFG grammar-constrained decoding. At 1.54 billion parameters, with only 0.45% trained via LoRA, the model is deployable on GPUs with as little as 1 GB of VRAM in 4-bit quantised form. The architecture inherits Qwen2.5-Coder's foundation of 5.5 trillion training tokens across code repositories, text-code grounding, and synthetic function-calling data, then fine-tunes it for reliable, production-grade tool invocation.
§02
Architecture & Configuration
LumiChat Coder v2.1 is built on unsloth/Qwen2.5-Coder-1.5B-Instruct using Low-Rank Adaptation (LoRA) — a parameter-efficient fine-tuning technique. Only 0.45% of parameters are updated.
| Field | Value |
|---|---|
| Architecture | Transformer (GQA, RoPE, SwiGLU, RMSNorm), causal language model |
| Total Parameters | 1.54B (1,540,000,000); 1.31B non-embedding |
| Trainable Parameters | ~6.9M via LoRA (0.45%) |
| Context Length | 32,768 tokens |
| Quantization | 4-bit NF4 (1 GB VRAM deployment) / FP16 SafeTensors |
| LoRA Rank (r) | 16 |
| LoRA Alpha (α) | 16 |
| LoRA Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Languages | English (primary) |
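The quantisation and LoRA settings above map directly onto standard Hugging Face tooling. The following is a minimal sketch, assuming the published repo id from the citation section and the transformers, bitsandbytes, and peft libraries; it illustrates the configuration rather than reproducing the exact production setup. The `model` and `tokenizer` objects defined here are reused in the sketches that follow.

```python
# Sketch: 4-bit NF4 loading (the "1 GB VRAM" deployment path) and the LoRA
# adapter settings from the table above, expressed via transformers/peft.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig

MODEL_ID = "adityakum667388/lumichat_coder-v2.1"  # repo id from §08

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # 4-bit NF4 quantisation
    bnb_4bit_compute_dtype=torch.float16,  # FP16 compute
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, quantization_config=bnb_config)

# The LoRA adapter configuration listed above, as a peft LoraConfig:
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```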
§03
Training Details
| Field | Value |
|---|---|
| Dataset | Proprietary function-calling dataset (LumiChats curation) |
| Dataset Composition | Diverse tool schemas and real-world API examples (size not disclosed) |
| Objective | Supervised fine-tuning on tool-call completions with JSON schema enforcement |
| Framework | Unsloth + TRL |
| Hardware | Tesla T4 / A100-class GPU |
| Training Time | Production training run (duration not disclosed) |
| Peak Memory | Kept low via 4-bit quantisation + Unsloth |
| Max Steps | Production run (step count not disclosed) |
Hyperparameters

| Hyperparameter | Value |
|---|---|
| Learning Rate | 2e-4 |
| Batch Size | 2 |
| Gradient Accumulation | 4 |
| Effective Batch Size | 8 |
| Optimizer | AdamW 8-bit |
| LR Scheduler | Linear |
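These hyperparameters map one-to-one onto a TRL `SFTConfig`. The sketch below reuses the imports and `lora_config` from the §02 sketch; since the training data is proprietary, the one-row dataset here is a hypothetical stand-in, not the real corpus.

```python
# Sketch: the listed hyperparameters in a TRL SFT run. `train_dataset` is a
# one-row placeholder; the real proprietary dataset is not released.
from datasets import Dataset
from peft import get_peft_model
from trl import SFTConfig, SFTTrainer

base = AutoModelForCausalLM.from_pretrained("unsloth/Qwen2.5-Coder-1.5B-Instruct")
peft_model = get_peft_model(base, lora_config)  # lora_config from the §02 sketch

train_dataset = Dataset.from_list(
    [{"text": "<user query + JSON tool-call completion would go here>"}]
)

args = SFTConfig(
    output_dir="lumichat-sft",
    per_device_train_batch_size=2,  # Batch Size
    gradient_accumulation_steps=4,  # Gradient Accum. (effective batch = 2 * 4 = 8)
    learning_rate=2e-4,             # Learning Rate
    optim="adamw_bnb_8bit",         # AdamW 8-bit (bitsandbytes)
    lr_scheduler_type="linear",     # LR Scheduler
)

SFTTrainer(model=peft_model, train_dataset=train_dataset, args=args).train()
```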
§04
Evaluation & Benchmarks
| Metric | Value | Baseline | Description |
|---|---|---|---|
| Correct function/tool selection | 96.5% | 78% (base Qwen2.5-Coder) | Whether the model selects the appropriate tool from available schemas |
| JSON output validity | 99.8% (with grammar constraints) | 85% (base model, unconstrained) | Percentage of outputs that parse as valid JSON |
| Argument formatting accuracy | 94.2% | 65% (base model) | Correct argument names, types, and nesting |
| Context preservation across turns | 92.1% | — | Multi-turn tool call consistency within 32K context |
| Inference speed (T4) | ~95 tokens/s | — | With Unsloth optimisations enabled |
| Inference speed (RTX 4090) | ~145 tokens/s | — | Consumer flagship GPU |
| Min VRAM (4-bit quantised) | 1 GB | — | Minimum VRAM for 4-bit quantised deployment |
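The JSON-validity figure depends on grammar-constrained decoding. Below is a minimal sketch using the transformers-CFG library named in the abstract, reusing `model` and `tokenizer` from the §02 sketch; the grammar is a deliberately simplified JSON subset for illustration, not the production grammar.

```python
# Sketch: grammar-constrained decoding with transformers-CFG. The logits
# processor masks any token that would break the grammar, which is how the
# 99.8% JSON-validity rate is enforced at decode time.
from transformers_cfg.grammar_utils import IncrementalGrammarConstraint
from transformers_cfg.generation.logits_process import GrammarConstrainedLogitsProcessor

# Simplified, illustrative JSON subset (flat strings/numbers/objects only):
json_grammar = r"""
root   ::= object
object ::= "{" pair ("," pair)* "}"
pair   ::= string ":" value
value  ::= string | number | object
string ::= "\"" [a-zA-Z0-9_ ]* "\""
number ::= [0-9]+
"""

grammar = IncrementalGrammarConstraint(json_grammar, "root", tokenizer)
processor = GrammarConstrainedLogitsProcessor(grammar)

inputs = tokenizer("Call a tool to fetch the weather in Paris.",
                   return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, logits_processor=[processor])
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```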
§05
Base Model vs Fine-Tuned
Key improvements from fine-tuning on the proprietary function-calling dataset (LumiChats curation) versus the Qwen2.5-Coder-1.5B-Instruct base model.
| Dimension | Base (Qwen2.5-Coder-1.5B-Instruct) | LumiChat Coder v2.1 |
|---|---|---|
| Tool selection accuracy | 78% | 96.5% 🎯 |
| JSON validity rate | 85% | 99.8% ✨ |
| Argument formatting | 65% | 94.2% 🚀 |
| Context maintained across turns | Good | 92.1% with 32K context |
§06
Use Cases
- AI agents that interact with external APIs and databases
- Workflow automation and multi-step data processing pipelines
- Natural language to API call translation (sketched below)
- Conversational UIs with actionable tool invocation
- Database query generation from natural language
- Email and calendar automation via function calls
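As a concrete example of the natural-language-to-API-call use case, the sketch below sends a user query alongside a hypothetical `get_weather` schema, assuming the fine-tune retains the base Qwen2.5 chat template's `tools` support; both the schema and the expected output are illustrative.

```python
# Sketch: turning a natural-language request into a JSON tool call.
# `get_weather` is a hypothetical schema, not one from the training set;
# `model` and `tokenizer` come from the §02 sketch.
get_weather = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

messages = [{"role": "user", "content": "What's the weather like in Tokyo right now?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
# Expected shape of the completion: {"name": "get_weather", "arguments": {"city": "Tokyo"}}
```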
§07
Limitations & Disclaimers
LumiChat Coder v2.1 inherits limitations of its base architecture and training data.
- Requires well-defined tool schemas in the prompt for optimal performance
- Struggles with deeply nested function calls (more than 5 levels)
- Primarily optimised for English; other languages have reduced accuracy
- Designed for batch processing; not optimised for streaming applications
- Not recommended for creative or long-form text generation tasks
§08
Citation
If you use LumiChat Coder v2.1 in research or products, please cite:
```bibtex
@misc{lumichat-coder-v2.1,
  author    = {Jha, Aditya Kumar and LumiChats},
  title     = {LumiChat Coder v2.1: Advanced Function-Calling Language Model},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/adityakum667388/lumichat_coder-v2.1}
}
```

License: Apache 2.0. View the full license on Hugging Face.