Day 10 — Chunking in RAG (The Most Underrated Part of AI Systems)
AI Engineering — Day by Day
My journey to becoming an AI Engineer
After learning about embeddings and semantic search, I started feeling like I finally understood how retrieval works.
But then I realized something important:
Even perfect embeddings cannot save a badly chunked system.
And honestly, this completely changed how I think about RAG pipelines.
The Question That Started Everything
Once I understood embeddings, the next question became:
What exactly are we embedding and retrieving?
The answer:
Chunks of text
And this process of splitting documents into smaller pieces is called:
Chunking
Why Chunking Exists
Documents are usually:
- Large
- Unstructured
- Too big for direct retrieval
For example:
- A 100-page PDF
- A large knowledge base
- Long policy documents
We cannot simply embed an entire document as one giant block.
So, we split it into smaller meaningful units.
What I Initially Thought
At first, chunking sounded trivial to me.
I thought:
“Just split text every few hundred words.”
But the deeper I explored, the more I realized:
Chunking is not text splitting.
It is context preservation engineering.
What Happens with Large Chunks
Suppose one chunk contains:
- Refund policy
- Shipping policy
- Account setup
Now imagine the user asks:
What is the refund policy?
The embedding generated for this chunk becomes:
A mixed representation of multiple topics
This creates a problem:
- The semantic meaning becomes diluted
- Retrieval accuracy decreases
This was a major realization for me:
Larger chunks don’t always mean better context.
What Happens with Small Chunks
Now let’s go to the opposite extreme.
Suppose the chunk is:
"Returned within 7 days"
What’s missing?
- What is being returned?
- Under what conditions?
This leads to:
- Loss of surrounding context
- Fragmented retrieval
- Incomplete answers
At this point, I realized:
Chunks that are too small improve precision… but lose meaning.
Why Overlapping Chunks Matter
This was one of the most interesting concepts.
Instead of splitting like this:
Chunk 1 → 1–500
Chunk 2 → 501–1000
We overlap:
Chunk 1 → 1–500
Chunk 2 → 400–900
Why?
Because ideas often continue across boundaries.
Overlap helps:
- Preserve continuity
- Reduce context loss
- Improve retrieval reliability
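The overlap idea above can be sketched in a few lines. This is a minimal character-based splitter (hypothetical helper, not from any specific library), where each chunk re-includes the tail of the previous one so that sentences crossing a boundary survive intact in at least one chunk:

```python
def chunk_with_overlap(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character chunks, where each chunk
    starts `overlap` characters before the previous one ended."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than the chunk size")
    chunks = []
    step = size - overlap  # how far the window advances each time
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

With `size=500` and `overlap=100`, the last 100 characters of chunk 1 are also the first 100 characters of chunk 2 — exactly the `1–500` / `400–900` split shown above.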
Types of Chunking
1. Fixed Chunking
Split based on:
- Character count
- Token count
Example:
Every 500 tokens
Pros:
- Simple
- Fast
Cons:
- Breaks meaning
- Ignores structure
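A bare-bones fixed chunker might look like this sketch. Whitespace-separated words stand in for tokens here; a real pipeline would count tokens with the embedding model's own tokenizer instead:

```python
def fixed_chunks(text: str, max_tokens: int = 500) -> list[str]:
    """Split text every `max_tokens` words, ignoring structure entirely.
    Note: whitespace words approximate tokens; production code would use
    the embedding model's tokenizer for exact counts."""
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]
```

Simple and fast — but as the cons above note, the cut can land in the middle of a sentence or a policy section, which is precisely what semantic chunking tries to avoid.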
2. Semantic Chunking
Split based on:
- Paragraphs
- Topics
- Meaning
Pros:
- Preserves context
- Improves retrieval quality
Cons:
- More complex
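A lightweight version of semantic chunking splits on paragraph boundaries and merges short paragraphs so each chunk remains a coherent unit. This is a rough sketch (the `min_chars` threshold and blank-line splitting are my own simplifying assumptions; fuller approaches cluster by topic or embedding similarity):

```python
def paragraph_chunks(text: str, min_chars: int = 200) -> list[str]:
    """Split on blank lines, then merge consecutive short paragraphs
    until each chunk reaches a minimum size, so no chunk is a lone
    fragment like "Returned within 7 days"."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        # accumulate paragraphs into the current chunk
        current = f"{current}\n\n{para}".strip() if current else para
        if len(current) >= min_chars:
            chunks.append(current)
            current = ""
    if current:  # flush whatever is left over
        chunks.append(current)
    return chunks
```

The extra complexity buys context preservation: a refund-policy heading stays attached to the sentences that explain it, instead of being split off into its own tiny chunk.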
Questions I Had While Learning
Why do large chunks reduce retrieval accuracy?
Because large chunks contain multiple topics, causing embeddings to represent a broad mixture of meanings instead of a focused semantic concept.
Why do very small chunks reduce answer quality?
Because they often lose surrounding context and relationships between ideas, resulting in fragmented retrieval and incomplete answers.
Why is overlap important?
Overlap preserves continuity between chunks and ensures important contextual information spanning chunk boundaries is not lost during retrieval.
The Biggest Insight I Got
At this point, something became very clear:
Most RAG failures are not model failures.
They are retrieval and chunking failures.
This changed my perspective completely.
Earlier:
- I focused mainly on prompts and models
Now:
- I understand the retrieval pipeline is equally important
What’s Next
Now that I understand chunking conceptually, the next step is:
Actually implementing and experimenting with chunking strategies.
In the next post, I’ll:
- Create different chunking strategies
- Compare retrieval quality
- Observe real-world failures
Final Thought
Before today, chunking felt like a preprocessing step.
Now it feels like:
One of the most important design decisions in a RAG system.
This is Day 10 of my AI engineering journey — and this concept completely changed how I think about retrieval systems.