On this page
What Is a Database Index?
How Indexes Speed Up Queries
B-Tree: The Data Structure Behind Indexes
The Trade-offs: Why Not Index Everything?
Conclusion
Grokking Database Indexing: The Secret Behind Blazing-Fast Queries


This blog explores the internal workings of database indexes, focusing on how structures like B-trees enable fast data retrieval. It explains how indexes optimize query performance, the trade-offs involved, and why understanding index internals is crucial for designing efficient, scalable systems.
Imagine running a query on a massive database table and getting results in milliseconds.
How is that possible?
For many developers, the answer comes as an epiphany the first time they add an index to a slow query – a report or API call that was taking 5-6 seconds suddenly runs in under a second after indexing the right column.
It may feel like magic, but there’s no sorcery involved – just clever use of data structures.
Let's explore the internals of database indexes to understand how they make data retrieval so fast (and even when not to use them!).
What Is a Database Index?
A database index is essentially an auxiliary data structure that helps the database find information without scanning every row (like the index of a book saves you from reading every page).
In practical terms, an index stores a sorted copy of certain column values along with pointers to the full rows.
This way, the database can search the smaller index instead of the entire table, drastically reducing the work needed to find the data.
How Indexes Speed Up Queries
Let’s dive a bit deeper into how an index makes queries faster.
Without an index, if you search for WHERE name = 'Alice' on a large table, the database might have to inspect every row until it finds all the Alices – this is akin to reading a whole phone book to find one name.
That approach is O(N) in computational terms (meaning time grows linearly with number of rows).
Now enter the index: with an index on the name column, the database can use a much more efficient strategy, roughly O(log N), by doing something like a binary search on the sorted index data.
In essence, the index lets the database skip over huge swaths of data that don’t matter to your query, homing in only on the relevant portions.
Here’s what happens under the hood when you query a table using an indexed column:
- 
Index lookup: The database checks the index (sorted by the target field) instead of scanning the entire table. 
- 
Traverse the tree: It follows the index’s B-tree pointers from root to leaf, quickly narrowing down to the matching entries with just a few comparisons. 
- 
Retrieve the row: Once the index finds the right key, it uses the stored pointer to fetch the full row from the main table. 
A well-chosen index can turn a brute-force scan into a surgical lookup, saving a ton of time. It’s worth noting that under the hood, the index is often maintained as a sorted “pseudo-table” with just the indexed column and a reference to the full row.
So when you CREATE INDEX on most_recent_activity in a table, the database builds a structure sorted by most_recent_activity containing values and pointers to each corresponding row.
This sorted order is the secret sauce that enables efficient binary search-like behavior.
B-Tree: The Data Structure Behind Indexes
So, what is this tree structure we keep alluding to?
Meet the B-tree (and its close relative, the B+tree) – a workhorse data structure used by most databases for indexing.
A B-tree is essentially a multi-level tree where each node can have many children (not just two, as in a binary tree).
It’s balanced, meaning all the leaf nodes (the bottom of the tree where the actual index entries live) are at the same depth.
This ensures that every lookup takes roughly the same number of steps, preventing some lookups from being slower than others.
Here are a few key points about how a B-tree index works internally:
- 
Wide & shallow tree: A B-tree node can hold many keys, keeping the tree broad and shallow. Even millions of records might only create a tree 3–4 levels deep, so the database only needs a few pointer hops (and thus a few disk reads) to find any record. 
- 
Sorted keys and pointers: Within each node, keys are maintained in sorted order, each with pointers to child nodes (or to data at the leaves). This ordering ensures the database can quickly decide which path to follow for a given lookup, rather than checking every entry. 
The Trade-offs: Why Not Index Everything?
Indexes sound awesome – and they are! – but they come with trade-offs.
It’s important to understand that the speed boost for reads comes at a cost for writes and storage:
- 
Storage Overhead: Indexes occupy extra disk space – often nearly as much as the data itself, since they store copies of keys plus pointers. 
- 
Slower Writes: Inserts, updates, and deletes all have to update every index on the table, so too many indexes can make write operations significantly slower. 
- 
Too Many Indexes: Besides wasting space and slowing writes, excessive indexes can even confuse the optimizer and hurt overall performance. 
In summary, indexes are a double-edged sword – they make retrieving data fast, but make changing data a bit slower and consume extra space.
This is why understanding their internals and behavior is so useful: it helps you decide what to index, and what not to.
Conclusion
It should be clear now that there’s no magic in what indexes do – just well-crafted data structures working behind the scenes.
Understanding these internals lets you use indexes wisely.
The next time you add an index to speed up a query, you’ll know exactly why it helps.
Remember, an index is just a tool – a powerful one, but not free.
Use indexes to improve query performance when it matters, but keep an eye on the trade-offs.
With this understanding of index internals, you can design better databases that strike the right balance between fast reads and efficient writes.
For further reading, check out the SQL vs NoSQL key differences to see how different database models impact concepts like indexing and data retrieval.
Happy querying!
What our users say
AHMET HANIF
Whoever put this together, you folks are life savers. Thank you :)
KAUSHIK JONNADULA
Thanks for a great resource! You guys are a lifesaver. I struggled a lot in design interviews, and this course gave me an organized process to handle a design problem. Please keep adding more questions.
Roger Cruz
The world gets better inch by inch when you help someone else. If you haven't tried Grokking The Coding Interview, check it out, it's a great resource!
Access to 50+ courses
New content added monthly
Certificate of completion
$33.25
/month
Billed Annually
Recommended Course
Grokking the System Design Interview
0+ students
4.7
Grokking the System Design Interview is a comprehensive course for system design interview. It provides a step-by-step guide to answering system design questions.
View Course