§01

Abstract

LumiChats v1.2 7B is a specialised vision-language model fine-tuned from Qwen2.5-VL-7B-Instruct for image-to-LaTeX optical character recognition. The model learns to convert photographs and scans of handwritten mathematical formulae — including integrals, partial derivatives, Greek symbols, and multi-line expressions — into properly formatted LaTeX code. Fine-tuned via LoRA on 68,686 samples from the unsloth/LaTeX_OCR dataset, training completed in 3.27 minutes on a Tesla T4 GPU consuming only 0.674 GB of additional memory, demonstrating extreme parameter efficiency: only 0.62% of the model's 8.3 billion total parameters were updated.

§02

Architecture & Configuration

LumiChats v1.2 7B is built on unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bit using Low-Rank Adaptation (LoRA) — a parameter-efficient fine-tuning technique. Only 0.62% of parameters are updated.

Architecture

Vision-Language Transformer: Visual Encoder + 7B Language Decoder with multimodal fusion

Total Parameters

~8,343,688,192 (8.3B total)

Trainable Parameters

51,521,536 (51.5M) (0.62%)

Context Length

2,048 tokens (max_seq_length during training)

Quantization

4-bit bitsandbytes NF4; compute dtype bfloat16

LoRA Rank (r)

LoRA Alpha (α)

LoRA Target Modules

Vision layers, Language layers, Attention modules, MLP modules

Languages

Mathematical notation (language-agnostic LaTeX output)

§03

Training Details

Dataset

unsloth/LaTeX_OCR

Dataset Size

68,686 handwritten formula image-LaTeX pairs

Objective

Conversational image-to-text: User (image + instruction) → Assistant (LaTeX code)

Framework

Unsloth FastVisionModel + TRL SFTTrainer

Hardware

Tesla T4 (Google Colab)

Training Time

3.27 minutes (30 steps)

Peak Memory

0.674 GB additional GPU memory

Max Steps

Hyperparameters

Learning Rate

2e-4

Batch Size

2

Gradient Accum.

4

Effective Batch

8

Optimizer

AdamW 8-bit

LR Scheduler

Linear

Dataset: https://huggingface.co/datasets/unsloth/LaTeX_OCR

§04

Evaluation & Benchmarks

Metric	Value	Baseline	Description
Symbol recognition improvement over base	Significant (qualitative)	Base model: incorrect denominator, wrong ∂ subscripts	Corrects denominator: 2B²N² → 2β²PN²; fixes partial derivative subscripts
LaTeX formatting adherence	Improved	Base: irregular spacing, inconsistent delimiter usage	Proper \left\{, \right\}, consistent spacing between operators
Min VRAM (4-bit)	6 GB	—	Minimum GPU VRAM for inference
Recommended VRAM	8 GB+	—	For reliable batch processing

§05

Base Model vs Fine-Tuned

Key improvements from fine-tuning on the unsloth/LaTeX_OCR dataset versus the Qwen2.5-VL-7B-Instruct-bnb-4bit base model.

Dimension	Base (Qwen2.5-VL-7B-Instruct-bnb-4bit)	LumiChats v1.2 7B
Symbol accuracy	❌ Incorrect (hallucinated B²N² as denominator)	✅ Correct (2β²PN²)
Partial derivative subscripts	❌ Wrong (\partial_\lambda)	✅ Correct (\partial _ { s })
Delimiter usage	⚠️ Inconsistent	✅ Proper \left\{, \right\}
Spacing style	⚠️ Irregular	✅ Standard LaTeX conventions

§06

Use Cases

Converting handwritten lecture notes or exam solutions to typeset LaTeX

Digitising legacy mathematical manuscripts and textbooks

Educational tools for automatic grading of handwritten math

Research paper digitisation pipelines

Scientific documentation automation

§07

Limitations & Disclaimers

LumiChats v1.2 7B inherits limitations of its base architecture and training data.

Optimised for mathematical notation — not suitable for general handwriting OCR

May struggle with very ambiguous handwriting or extremely dense notation

4-bit quantisation may introduce minor precision loss on subtle symbol distinctions

Limited to single-image inference; does not process multi-page documents natively

30-step fine-tune; extended training will improve robustness further

§08

Citation

If you use LumiChats v1.2 7B in research or products, please cite:

@misc{lumichats_v1.2_2025,
  title     = {LumiChats v1.2 7B: Vision-Language Model for Handwritten LaTeX OCR},
  author    = {LumiChats Team},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/adityakum667388/lumichats-v1.2-7b-bnb-4bit}
}

License: Apache 2.0 — View full license on Hugging Face

LumiChats v1.2 7B