AI Engineering, JavaScript Solutions, Competitive programming in JavaScript, MCQ in JS

Tuesday, May 12, 2026

Day 10 — Chunking in RAG (The Most Underrated Part of AI Systems)

AI Engineering — Day by Day
My journey to becoming an AI Engineer




After learning about embeddings and semantic search, I started feeling like I finally understood how retrieval works.

But then I realized something important:

Even perfect embeddings cannot save a badly chunked system.

And honestly, this completely changed how I think about RAG pipelines.


The Question That Started Everything

Once I understood embeddings, the next question became:

What exactly are we embedding and retrieving?

The answer:

Chunks of text

And this process of splitting documents into smaller pieces is called:

Chunking


Why Chunking Exists

Documents are usually:

  • Large
  • Unstructured
  • Too big for direct retrieval

For example:

100-page PDF  
Large knowledge base  
Long policy documents

We cannot simply embed an entire document as one giant block.

So, we split it into smaller meaningful units.


What I Initially Thought

At first, chunking sounded trivial to me.

I thought:

“Just split text every few hundred words.”

But the deeper I explored, the more I realized:

Chunking is not text splitting.
It is context preservation engineering.

What Happens with Large Chunks

Suppose one chunk contains:

  • Refund policy
  • Shipping policy
  • Account setup

Now imagine the user asks:

What is the refund policy?

The embedding generated for this chunk becomes:

A mixed representation of multiple topics

This creates a problem:

  • The semantic meaning becomes diluted
  • Retrieval accuracy decreases

This was a major realization for me:

Larger chunks don’t always mean better context.

What Happens with Small Chunks

Now let’s go to the opposite extreme.

Suppose the chunk is:

"Returned within 7 days"

What’s missing?

  • What is being returned?
  • Under what conditions?

This leads to:

  • Loss of surrounding context
  • Fragmented retrieval
  • Incomplete answers

At this point, I realized:

Chunks that are too small improve precision… but reduce meaning.

Why Overlapping Chunks Matter

This was one of the most interesting concepts.

Instead of splitting like this:

Chunk 1 → 1–500  
Chunk 2 → 501–1000

We overlap:

Chunk 1 → 1–500  
Chunk 2 → 400–900

Why?

Because ideas often continue across boundaries.

Overlap helps:

  • Preserve continuity
  • Reduce context loss
  • Improve retrieval reliability
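Since this blog is JavaScript-focused, here is a minimal sketch of a sliding-window chunker that produces exactly the kind of overlapping ranges shown above (1–500, then 400–900). The function name and the chunkSize/overlap values are illustrative choices, not tuned recommendations:

```javascript
// Sliding-window chunker: each chunk overlaps the previous one,
// so ideas that cross a boundary appear whole in at least one chunk.
function chunkWithOverlap(text, chunkSize = 500, overlap = 100) {
  const chunks = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

With chunkSize 500 and overlap 100, the windows start at characters 0, 400, 800, and so on, matching the example ranges above.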

Types of Chunking

1. Fixed Chunking

Split based on:

  • Character count
  • Token count

Example:

Every 500 tokens

Pros:

  • Simple
  • Fast

Cons:

  • Breaks meaning
  • Ignores structure
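A fixed chunker really is this simple in JavaScript, which is both its pro and its con. This sketch splits purely by character count (a token-count version would work the same way, just over token IDs), and it will happily cut a sentence in half:

```javascript
// Fixed chunking by character count: simple and fast,
// but blind to sentence and paragraph boundaries.
function fixedChunk(text, size = 500) {
  const chunks = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}
```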

2. Semantic Chunking

Split based on:

  • Paragraphs
  • Topics
  • Meaning

Pros:

  • Preserves context
  • Improves retrieval quality

Cons:

  • More complex
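A simple way to approximate semantic chunking is to split on paragraph boundaries and then merge short paragraphs up to a size budget, so each chunk stays one coherent unit. This is only a sketch of the idea; the maxChars budget is an illustrative value, and real semantic chunkers may also use headings, sentence boundaries, or embedding similarity:

```javascript
// Paragraph-based chunking: split on blank lines, then greedily merge
// adjacent paragraphs while they fit within the size budget.
function semanticChunk(text, maxChars = 800) {
  const paragraphs = text.split(/\n\s*\n/).map(p => p.trim()).filter(Boolean);
  const chunks = [];
  let current = "";
  for (const para of paragraphs) {
    if (current && current.length + para.length + 2 > maxChars) {
      chunks.push(current); // budget exceeded: close the current chunk
      current = para;
    } else {
      current = current ? current + "\n\n" + para : para;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Unlike the fixed chunker, this one never cuts a paragraph in half, which is exactly the context preservation the pros above describe.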

Questions I Had While Learning

Why do large chunks reduce retrieval accuracy?

Because large chunks contain multiple topics, causing embeddings to represent a broad mixture of meanings instead of a focused semantic concept.


Why do very small chunks reduce answer quality?

Because they often lose surrounding context and relationships between ideas, resulting in fragmented retrieval and incomplete answers.


Why is overlap important?

Overlap preserves continuity between chunks and ensures important contextual information spanning chunk boundaries is not lost during retrieval.


The Biggest Insight I Got

At this point, something became very clear:

Most RAG failures are not model failures.
They are retrieval and chunking failures.

This changed my perspective completely.

Earlier:

  • I focused mainly on prompts and models

Now:

  • I understand the retrieval pipeline is equally important

What’s Next

Now that I understand chunking conceptually, the next step is:

Actually implementing and experimenting with chunking strategies.

In the next post, I’ll:

  • Create different chunking strategies
  • Compare retrieval quality
  • Observe real-world failures

Final Thought

Before today, chunking felt like a preprocessing step.

Now it feels like:

One of the most important design decisions in a RAG system.


 

This is Day 10 of my AI engineering journey — and this concept completely changed how I think about retrieval systems.
