Recent Posts
-
February 15, 2026
Preference Leakage: Contamination in LLM as a Judge
Modern development relies on two pillars of efficiency and scalability: Synthetic Data Generation and Automated Evaluations, as shown in the pipeline, which is now the industry standard for aligning and benchmarking models. Recently I explore...
-
February 06, 2026
LayoutLM vs. LLMs + OCR: When Specialized Models Still Win
Over the last couple of years, I’ve seen more and more document pipelines quietly converge on the same pattern: OCR a PDF, dump everything into a large language model, and hope the model figures out the rest. And to be fair, sometimes it works r...
-
January 25, 2026
Transformer -- Decoder-Only Model Explained In Codes
This is the first post in a series on Transformers, where I keep track of my own learning notes. Two types of model classes: I will mainly use the transformers library from HuggingFace. The transformers library has two types of model classes: AutoM...
-
January 19, 2026
Floating Point Computations Errors
Floating-point computations are well known for their susceptibility to round-off errors. In this post, I aim to document a couple of scenarios where these errors occur and explore potential workarounds where applicable. Scenario 1: When the component...
-
December 25, 2024
Simple Way to Use Mathjax with Jekyll
To add MathJax to a Jekyll blog, the easiest option is to add the following script to your layout file, such as _layouts/head.html: <script type="text/javascript" id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/...