In around 2006-2009 I was working on my own database engine. A Graph db before they were cool. For the prototype, I mostly only needed one real on-disk data structure in addition to the blobs for attribute values, and that was the b-tree. Some structures I was using are like B-trees of b-trees, or even b-trees of b-trees of b-trees, but still yes this concept is central to the whole shebang. I eventually migrated to using Lucene, and wrote my own query language and server for it, just in time to close my company down and get a real job. But it was indeed the binary search that was central to the whole thing. I still have a soft spot in my heart for the algorithm, and I see binary trees or the potential to build them, everywhere. When it comes to data, I see it in two ways now, a 'pay now' or 'pay later' for consuming it: you either sort your data now when you're about to store it, or spend time sorting it on the way out when you're fetching it, at least if it has comparable keys!

Nicely written 👌👌. In the past, when I’ve had to introduce binary search to students, I’ve used the same example of the books.

It’s worth noticing that in real life, we (us as humans) don’t really do binary search, but randomised binary search instead, since it’s hard for us to pick the exact middle.

We could think of it as picking the book to check randomly. It’s not hard to prove that this way, the expected number of steps until having a single book is logarithmic wrt to the amount of books.

In around 2006-2009 I was working on my own database engine. A Graph db before they were cool. For the prototype, I mostly only needed one real on-disk data structure in addition to the blobs for attribute values, and that was the b-tree. Some structures I was using are like B-trees of b-trees, or even b-trees of b-trees of b-trees, but still yes this concept is central to the whole shebang. I eventually migrated to using Lucene, and wrote my own query language and server for it, just in time to close my company down and get a real job. But it was indeed the binary search that was central to the whole thing. I still have a soft spot in my heart for the algorithm, and I see binary trees or the potential to build them, everywhere. When it comes to data, I see it in two ways now, a 'pay now' or 'pay later' for consuming it: you either sort your data now when you're about to store it, or spend time sorting it on the way out when you're fetching it, at least if it has comparable keys!

Nicely written 👌👌. In the past, when I’ve had to introduce binary search to students, I’ve used the same example of the books.

It’s worth noticing that in real life, we (us as humans) don’t really do binary search, but randomised binary search instead, since it’s hard for us to pick the exact middle.

We could think of it as picking the book to check randomly. It’s not hard to prove that this way, the expected number of steps until having a single book is logarithmic wrt to the amount of books.

Beautiful piece Alejandro!

I am deeply curious to know though if you could talk about how search engines such as Google's search and sorting systems work.