Recent Posts
-
April 04, 2026
Accelerating LLM Inference - Speculative Decoding
LLMs are incredibly capable, but they are notoriously slow to run. Because they generate text one token at a time, decoding $K$ tokens requires $K$ serial runs of the model. Speculative Decoding is a novel algorithm that breaks this bottleneck, all...
-
February 15, 2026
Preference Leakage: Contamination in LLM as a Judge
Modern development relies on two pillars of efficiency and scalability: Synthetic Data Generation and Automated Evaluations, as shown in the pipeline that is now the industry standard for aligning and benchmarking models. Recently I explore...
-
February 06, 2026
LayoutLM vs. LLMs + OCR: When Specialized Models Still Win
Over the last couple of years, I’ve seen more and more document pipelines quietly converge on the same pattern: OCR a PDF, dump everything into a large language model, and hope the model figures out the rest. And to be fair — sometimes it works r...
-
January 25, 2026
Transformer -- Decoder-Only Model Explained in Code
This is the very first post of many in my Transformer series, keeping track of my own learning notes. Two types of model classes: I will mainly use the transformers library from HuggingFace. The transformers library has two types of model classes: AutoM...
-
January 19, 2026
Floating-Point Computation Errors
Floating-point computations are well known for their susceptibility to round-off errors. In this post, I aim to document a couple of scenarios where these errors occur and explore potential workarounds where applicable. Scenario 1: When the component...