§01
Abstract
LumiChats v1.2 7B is a specialised vision-language model fine-tuned from Qwen2.5-VL-7B-Instruct for image-to-LaTeX optical character recognition. The model learns to convert photographs and scans of handwritten mathematical formulae — including integrals, partial derivatives, Greek symbols, and multi-line expressions — into properly formatted LaTeX code. Fine-tuned via LoRA on 68,686 samples from the unsloth/LaTeX_OCR dataset, training completed in 3.27 minutes on a Tesla T4 GPU consuming only 0.674 GB of additional memory, demonstrating extreme parameter efficiency: only 0.62% of the model's 8.3 billion total parameters were updated.
§02
Architecture & Configuration
LumiChats v1.2 7B is built on unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bit using Low-Rank Adaptation (LoRA) — a parameter-efficient fine-tuning technique. Only 0.62% of parameters are updated.
Architecture
Vision-Language Transformer: Visual Encoder + 7B Language Decoder with multimodal fusion
Total Parameters
~8,343,688,192 (8.3B total)
Trainable Parameters
51,521,536 (51.5M) (0.62%)
Context Length
2,048 tokens (max_seq_length during training)
Quantization
4-bit bitsandbytes NF4; compute dtype bfloat16
LoRA Rank (r)
16
LoRA Alpha (α)
16
LoRA Target Modules
Vision layers, Language layers, Attention modules, MLP modules
Languages
Mathematical notation (language-agnostic LaTeX output)
§03
Training Details
Dataset
unsloth/LaTeX_OCR
Dataset Size
68,686 handwritten formula image-LaTeX pairs
Objective
Conversational image-to-text: User (image + instruction) → Assistant (LaTeX code)
Framework
Unsloth FastVisionModel + TRL SFTTrainer
Hardware
Tesla T4 (Google Colab)
Training Time
3.27 minutes (30 steps)
Peak Memory
0.674 GB additional GPU memory
Max Steps
30
Hyperparameters
Learning Rate
2e-4
Batch Size
2
Gradient Accum.
4
Effective Batch
8
Optimizer
AdamW 8-bit
LR Scheduler
Linear
§04
Evaluation & Benchmarks
| Metric | Value | Baseline | Description |
|---|---|---|---|
| Symbol recognition improvement over base | Significant (qualitative) | Base model: incorrect denominator, wrong ∂ subscripts | Corrects denominator: 2B²N² → 2β²PN²; fixes partial derivative subscripts |
| LaTeX formatting adherence | Improved | Base: irregular spacing, inconsistent delimiter usage | Proper \left\{, \right\}, consistent spacing between operators |
| Min VRAM (4-bit) | 6 GB | — | Minimum GPU VRAM for inference |
| Recommended VRAM | 8 GB+ | — | For reliable batch processing |
§05
Base Model vs Fine-Tuned
Key improvements from fine-tuning on the unsloth/LaTeX_OCR dataset versus the Qwen2.5-VL-7B-Instruct-bnb-4bit base model.
| Dimension | Base (Qwen2.5-VL-7B-Instruct-bnb-4bit) | LumiChats v1.2 7B |
|---|---|---|
| Symbol accuracy | ❌ Incorrect (hallucinated B²N² as denominator) | ✅ Correct (2β²PN²) |
| Partial derivative subscripts | ❌ Wrong (\partial_\lambda) | ✅ Correct (\partial _ { s }) |
| Delimiter usage | ⚠️ Inconsistent | ✅ Proper \left\{, \right\} |
| Spacing style | ⚠️ Irregular | ✅ Standard LaTeX conventions |
§06
Use Cases
Converting handwritten lecture notes or exam solutions to typeset LaTeX
Digitising legacy mathematical manuscripts and textbooks
Educational tools for automatic grading of handwritten math
Research paper digitisation pipelines
Scientific documentation automation
§07
Limitations & Disclaimers
LumiChats v1.2 7B inherits limitations of its base architecture and training data.
Optimised for mathematical notation — not suitable for general handwriting OCR
May struggle with very ambiguous handwriting or extremely dense notation
4-bit quantisation may introduce minor precision loss on subtle symbol distinctions
Limited to single-image inference; does not process multi-page documents natively
30-step fine-tune; extended training will improve robustness further
§08
Citation
If you use LumiChats v1.2 7B in research or products, please cite:
@misc{lumichats_v1.2_2025,
title = {LumiChats v1.2 7B: Vision-Language Model for Handwritten LaTeX OCR},
author = {LumiChats Team},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/adityakum667388/lumichats-v1.2-7b-bnb-4bit}
}License: Apache 2.0 — View full license on Hugging Face