Conversational · v1.4 · October 1, 2025

LumiChats 4B v1.4

Gemma-3-4B fine-tuned for natural multi-turn dialogue

Parameters: 4,314,980,720
Trainable: 0.35%
Training time: ~7 minutes (30-step demonstration)
Dataset: 99,990 samples after quality filtering
Gemma License (Google DeepMind) — commercial use permitted
Only 0.35% of parameters trained — 14.9M of 4.31B
+15% conversational coherence over Gemma-3-4B-IT (estimated)
+20% multi-turn context retention (estimated)
Supports 140+ languages inherited from Gemma-3 base
128K token context window
Training completed in ~7 minutes on Tesla T4
§01

Abstract

LumiChats 4B v1.4 is a conversational language model fine-tuned from Google's Gemma-3-4B-IT using LoRA and 4-bit quantisation. Trained on 99,990 curated dialogue samples from FineTome-100k using the response-only training objective, the model improves conversational coherence by approximately 15% and multi-turn context retention by approximately 20% over the base model while preserving Gemma-3's strong reasoning, coding, and multilingual capabilities across 140+ languages. With only 14.9M of 4.31B parameters updated (0.35%), the model retains full base model intelligence while gaining purpose-built conversational structure and instruction-following reliability.
§02

Architecture & Configuration

LumiChats 4B v1.4 is built on unsloth/gemma-3-4b-it (Google DeepMind) using Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning technique in which small adapter matrices are trained while the base weights stay frozen. Only 0.35% of parameters are updated; a configuration sketch follows the table below.

Architecture: Transformer-based LLM (Gemma-3 architecture) with 4-bit NF4 quantisation
Total Parameters: 4,314,980,720 (4.31B)
Trainable Parameters: 14,901,248 (14.9M, 0.35%)
Context Length: 128,000 tokens
Quantisation: 4-bit NF4 (~4 GB VRAM for inference)
LoRA Rank (r): 8
LoRA Alpha (α): 8
LoRA Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Languages: 140+ (inherited from Gemma-3)
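
A minimal configuration sketch matching the table above. It uses plain Hugging Face transformers, peft, and bitsandbytes rather than the Unsloth wrappers named in §03, and the compute dtype, dropout value, and model class are assumptions not stated in this card.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantisation as listed above; the bf16 compute dtype is an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Depending on the transformers version, the multimodal Gemma-3 4B checkpoint may
# need Gemma3ForConditionalGeneration instead of AutoModelForCausalLM.
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/gemma-3-4b-it",           # base checkpoint named in §02
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=8,                               # LoRA rank from the table
    lora_alpha=8,                      # LoRA alpha from the table
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,                  # assumption; dropout is not listed in the card
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()     # should report roughly 14.9M trainable parameters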
§03

Training Details

Dataset: mlabonne/FineTome-100k
Dataset Size: 99,990 samples after quality filtering (ShareGPT → HuggingFace format)
Objective: Response-only causal LM — loss on assistant turns only (labels = −100 for user turns); see the masking sketch after this table
Framework: Unsloth + TRL
Hardware: NVIDIA Tesla T4 (Google Colab)
Training Time: ~7 minutes (30-step demonstration)
Peak Memory: ~9.2 GB VRAM
Max Steps: 30
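
To make the response-only objective concrete, below is a minimal sketch of the label masking it describes: every token outside an assistant ("model") turn receives label −100 and is ignored by the loss. The turn markers come from Gemma-3's chat template; the helper function is illustrative only, since the actual run reportedly used Unsloth/TRL tooling rather than manual masking.

from transformers import AutoTokenizer

IGNORE_INDEX = -100  # tokens with this label are ignored by the cross-entropy loss
tokenizer = AutoTokenizer.from_pretrained("unsloth/gemma-3-4b-it")  # base repo from §02

def response_only_labels(input_ids: list[int]) -> list[int]:
    """Keep labels only inside <start_of_turn>model ... <end_of_turn> spans."""
    start = tokenizer.encode("<start_of_turn>model\n", add_special_tokens=False)
    end = tokenizer.encode("<end_of_turn>", add_special_tokens=False)
    labels = [IGNORE_INDEX] * len(input_ids)
    i = 0
    while i < len(input_ids):
        if input_ids[i : i + len(start)] == start:
            j = i + len(start)
            # copy assistant tokens into the labels until the turn closes
            while j < len(input_ids) and input_ids[j : j + len(end)] != end:
                labels[j] = input_ids[j]
                j += 1
            i = j
        else:
            i += 1
    return labels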
Hyperparameters
Learning Rate: 2e-4
Batch Size: 2
Gradient Accumulation: 4
Effective Batch Size: 8
Optimizer: AdamW 8-bit
LR Scheduler: Linear
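
These hyperparameters map onto a TRL SFTConfig roughly as sketched below, assuming a recent TRL release; individual field names (for example max_seq_length and optim) vary slightly across versions, and the 2,048-token sequence length comes from the Limitations section (§07).

from trl import SFTConfig, SFTTrainer

training_args = SFTConfig(
    per_device_train_batch_size=2,   # batch size 2
    gradient_accumulation_steps=4,   # effective batch = 2 * 4 = 8
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    optim="adamw_8bit",              # AdamW 8-bit ("adamw_bnb_8bit" on older transformers)
    max_steps=30,                    # 30-step demonstration run
    max_seq_length=2048,             # training sequence length noted in §07
    logging_steps=1,
    output_dir="outputs",            # assumption; output path is not stated in the card
)

# trainer = SFTTrainer(model=model, train_dataset=dataset, args=training_args)
# trainer.train()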
§04

Evaluation & Benchmarks

Metric / Value / Description
HellaSwag (base — inherited): 77.2% (10-shot). Common sense NLI reasoning.
PIQA (base — inherited): 79.6% (0-shot). Physical intuition question answering.
MMLU (base — inherited): 59.6% (5-shot). Massive multitask language understanding.
Conversational coherence (estimated): +15% over base. Improvement from SFT on 100K dialogue examples.
Multi-turn context retention (estimated): +20% over base. Improvement in tracking conversational state.
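
The three base-model scores are inherited from Gemma-3 rather than re-measured here. A hedged sketch for reproducing them with EleutherAI's lm-evaluation-harness follows; the task names, harness version, and batch size are assumptions, not details from this card.

import lm_eval

for task, shots in [("hellaswag", 10), ("piqa", 0), ("mmlu", 5)]:
    results = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=unsloth/gemma-3-4b-it",  # base model; scores are inherited
        tasks=[task],
        num_fewshot=shots,
        batch_size=8,                                   # assumption
    )
    print(task, results["results"].get(task))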
§05

Base Model vs Fine-Tuned

Key improvements from fine-tuning on the mlabonne/FineTome-100k dataset versus the gemma-3-4b-it (Google DeepMind) base model.

Dimension: base gemma-3-4b-it (Google DeepMind) → LumiChats 4B v1.4
Multi-turn conversation: generic, not optimised → ✅ specifically fine-tuned for dialogue
Instruction following: moderate pretrained behaviour → ✅ reinforced via response-only SFT
Chat template: requires manual configuration → ✅ Gemma-3 template pre-applied (see the usage sketch after this table)
Training data: N/A (base pretraining) → ✅ 99,990 curated conversational samples
Training objective: predict every token equally → ✅ loss on assistant responses only, no prompt memorisation
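
A minimal usage sketch, assuming the Hugging Face repo named in the citation (§08) hosts a text-only checkpoint with the Gemma-3 chat template attached; the prompt and generation settings are illustrative.

from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "adityakum667388/lumichats_4Bz_v1.4"  # repo name taken from the citation in §08
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

messages = [
    {"role": "user", "content": "Explain LoRA in one paragraph."},
]
# The Gemma-3 template wraps turns in <start_of_turn>user / <start_of_turn>model markers.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))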
§06

Use Cases

Multilingual conversational AI applications
Personal assistant with broad language support
Content creation: essays, articles, creative writing
Question answering and knowledge retrieval
Code generation and debugging assistance
Document summarisation
§07

Limitations & Disclaimers

LumiChats 4B v1.4 inherits limitations of its base architecture and training data.

30-step demonstration fine-tune; a full epoch (99,990 samples ÷ effective batch 8 ≈ 12,500 steps) is expected to yield stronger alignment
Factual hallucination possible — verify outputs for high-stakes decisions
Training data cutoff at January 2025; no awareness of later events
4-bit quantisation may introduce slight precision reduction in some edge cases
Fine-tuning used 2,048-token sequences; although the base model supports a 128K context window, quality on very long contexts may degrade
§08

Citation

If you use LumiChats 4B v1.4 in research or products, please cite:

@misc{lumichats4b2025,
  title     = {LumiChats 4B v1.4: Fine-Tuned Gemma-3 for Conversational AI},
  author    = {LumiChats Team},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/adityakum667388/lumichats_4Bz_v1.4}
}
License: Gemma License (Google DeepMind) — commercial use permitted. View the full license on Hugging Face.
