AI Engineering, JavaScript Solutions, Competitive programming in JavaScript, MCQ in JS

Thursday, May 7, 2026

Day 9 - AI Engineering journey - RAG Learning - Embeddings

Day 9 — Embeddings
(The Backbone of RAG Systems)

AI Engineering — Day by Day
My journey to becoming an AI Engineer




After understanding why prompting alone is not enough and how RAG changes the system design, I reached a point where one question became unavoidable:

How does a system actually find the “right” information?

This is where I came across one of the most important concepts in modern AI systems:

👉 Embeddings


🧠 What I Initially Thought

At first, I assumed search would work like:

  • Match keywords
  • Find exact words

But that approach quickly breaks:

  • "refund" ≠ "money back"
  • "car" ≠ "vehicle"

And that’s when I realized:

Machines don’t need to match words — they need to match meaning.

🔍 What is an Embedding?

An embedding is a way to convert text into numbers such that:

Similar meaning → similar numerical representation

For example:

"Apple is a fruit"
"Banana is a fruit"
"Car is fast"

After converting into embeddings:

Apple  → [0.12, 0.87, 0.44, ...]  
Banana → [0.11, 0.85, 0.46, ...]  
Car    → [0.91, 0.02, 0.77, ...]  

Here:

  • Apple and Banana are close in vector space
  • Car is far away

🧠 The Important Shift

This changed how I think about search:

  • It’s not about matching words
  • It’s about matching meaning

📐 How Similarity is Measured

To compare embeddings, we use something called:

Cosine Similarity

Without going into heavy math, it simply measures:

How similar the direction of two vectors is

Result (values range from −1 to 1):

  • Close to 1 → very similar
  • Close to 0 → unrelated
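The measurement can be sketched in a few lines of Python. This is just an illustration using NumPy; the vectors are the made-up toy numbers from the earlier example, trimmed to three dimensions:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity = dot product divided by the product of the magnitudes
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors from the example above (first three dimensions only)
apple  = [0.12, 0.87, 0.44]
banana = [0.11, 0.85, 0.46]
car    = [0.91, 0.02, 0.77]

print(cosine_similarity(apple, banana))  # close to 1
print(cosine_similarity(apple, car))     # much lower
```

Even with these made-up numbers, Apple and Banana point in nearly the same direction, while Car points somewhere else entirely.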

📊 Visualizing Embeddings

You can imagine embeddings as points in a high-dimensional space where:

  • Similar concepts cluster together
  • Different concepts are far apart

🔄 How This Fits Into RAG

This is where everything connects:

User Query  
↓  
Convert to embedding  
↓  
Compare with document embeddings  
↓  
Find closest matches  
↓  
Send to LLM  
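The flow above can be sketched as a tiny retrieval function. This is only a toy illustration, not a real vector database: the document embeddings are the made-up vectors from earlier, and `retrieve` is a hypothetical helper I wrote for this sketch, not part of any library:

```python
import numpy as np

# Toy "document store": in a real system these embeddings would come
# from an embedding model; here they are made-up vectors for illustration.
documents = {
    "Apple is a fruit":  np.array([0.12, 0.87, 0.44]),
    "Banana is a fruit": np.array([0.11, 0.85, 0.46]),
    "Car is fast":       np.array([0.91, 0.02, 0.77]),
}

def retrieve(query_embedding, docs, top_k=2):
    # Score every document by cosine similarity to the query embedding
    scores = []
    for text, emb in docs.items():
        sim = np.dot(query_embedding, emb) / (
            np.linalg.norm(query_embedding) * np.linalg.norm(emb)
        )
        scores.append((sim, text))
    # Return the top_k closest documents; these are what gets sent to the LLM
    return [text for _, text in sorted(scores, reverse=True)[:top_k]]

# Pretend this is the embedding of a query like "Which things are fruits?"
query = np.array([0.10, 0.90, 0.40])
print(retrieve(query, documents))
```

With these toy numbers, the two fruit sentences come back as the closest matches, which is exactly the behavior retrieval depends on.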

This step is called:

Retrieval


🧪 My First Experiment (Code)

I used a local embedding model:

pip install sentence-transformers

Then I ran this:

from sentence_transformers import SentenceTransformer

# Load a small, fast local embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')

sentences = [
    "Apple is a fruit",
    "Banana is a fruit",
    "Car is fast"
]

# Each sentence becomes a 384-dimensional vector
embeddings = model.encode(sentences)
print(embeddings)

Next, I compared similarity:

from sklearn.metrics.pairwise import cosine_similarity

# Compare the first sentence ("Apple is a fruit") against all three
similarity = cosine_similarity([embeddings[0]], embeddings)
print(similarity)

What I Observed

  • Apple and Banana had very high similarity
  • Car was clearly different

This confirmed something important:

The model understands meaning, not just words.

Why This is Critical for RAG

Without embeddings:

  • Search is keyword-based
  • Many relevant results are missed

With embeddings:

  • Search becomes semantic
  • Results are more relevant

And this directly impacts:

The quality of final answers generated by the LLM.

🧠 Key Insight

At this point, I realized:

If embeddings are wrong → retrieval is wrong → final answer is wrong.

So embeddings are not just a technical detail…

They are the foundation of the entire RAG system.


🚀 What’s Next

Now that I understand how to represent meaning as vectors, the next challenge is:

How do we split documents so that retrieval actually works well?

Because:

Even perfect embeddings won’t help if the input chunks are bad.

In the next post (Day 10), I’ll explore:

  • Chunking strategies
  • Why most RAG systems fail at this step

💭 Final Thought

Embeddings completely changed how I think about search.

It’s no longer about:

  • Finding exact matches

It’s about:

  • Finding meaning

And that shift is what makes RAG systems possible.

This is Day 9 of my AI engineering journey — and this concept feels like a major unlock.
