Tech on Quach

How PostgreSQL Evaluates Execution Plans: Cost Estimation Deep Dive

Wed, 01 Oct 2025 10:00:00 +0700

If you have spent time with the EXPLAIN command, you already know what index scans, sequential scans, and hash joins are. But have you ever wondered exactly how the cost numbers are calculated? In this deep dive, we’ll explore the formulas behind PostgreSQL’s cost estimation for the three main scan types: sequential scan, index scan, and bitmap heap scan.
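To make "cost numbers" concrete before getting into the details, here is a minimal sketch of the sequential scan estimate as the PostgreSQL documentation describes it: pages read times seq_page_cost, plus rows scanned times cpu_tuple_cost, plus cpu_operator_cost per row for each filter condition. The constants are the default planner settings; the helper name `seq_scan_cost` and the table statistics (358 pages, 10,000 rows) are illustrative assumptions, not taken from a real table.

```python
# Default PostgreSQL planner cost constants.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025


def seq_scan_cost(pages, rows, filter_operators=0):
    """Estimate the total cost of a sequential scan (hypothetical helper).

    pages and rows play the role of pg_class.relpages and reltuples;
    filter_operators is the number of simple WHERE conditions checked per row.
    """
    disk_cost = pages * SEQ_PAGE_COST
    cpu_cost = rows * (CPU_TUPLE_COST + filter_operators * CPU_OPERATOR_COST)
    return disk_cost + cpu_cost


# Illustrative statistics: a table with 358 pages and 10,000 rows.
no_filter = seq_scan_cost(pages=358, rows=10_000)
one_filter = seq_scan_cost(pages=358, rows=10_000, filter_operators=1)
print(round(no_filter, 2), round(one_filter, 2))
```

Note that adding a filter raises the estimated cost even though fewer rows come out: the scan still reads every page and now evaluates one extra operator per row.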

1. Cost-based vs Rule-based

Consider this query:

SELECT * FROM table_a ta
JOIN table_b tb ON ta.id = tb.foreign_id
WHERE ta.status = 'active' AND tb.created_at > '2020-12-01';

How would you approach this query? The intuitive plan is to filter each table first, so that both inputs are smaller, and then join the results. Quite intuitive, right?

Why Is the Average Complexity of Bucket Sort O(n)?

Tue, 20 May 2025 17:29:51 +0700

Part 1: Introduction and Code

Bucket Sort is an efficient sorting algorithm when input values are uniformly distributed over a range. It works by distributing elements into different “buckets”, sorting each bucket, and then concatenating the results.

Here’s a typical Python implementation where each bucket is sorted with Insertion Sort:

def insertion_sort(bucket):
    for i in range(1, len(bucket)):
        key = bucket[i]
        j = i - 1
        while j >= 0 and bucket[j] > key:
            bucket[j + 1] = bucket[j]
            j -= 1
        bucket[j + 1] = key

def bucket_sort(arr):
    """Sort arr in place; assumes every value lies in [0, 1)."""
    n = len(arr)
    buckets = [[] for _ in range(n)]

    # Put array elements in different buckets
    for num in arr:
        bi = int(n * num)  # assuming input numbers are in [0,1)
        buckets[bi].append(num)

    # Sort individual buckets using insertion sort
    for bucket in buckets:
        insertion_sort(bucket)

    # Concatenate all buckets into arr[]
    index = 0
    for bucket in buckets:
        for num in bucket:
            arr[index] = num
            index += 1

Why Insertion Sort?

Insertion sort is simple and efficient for small or nearly sorted lists. With n buckets and uniformly distributed input, each bucket holds about one element on average, so sorting each bucket with insertion sort is fast.
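A quick way to check the "buckets stay small" claim is to simulate it. The sketch below (the helper name `bucket_sizes` and the seed are assumptions made for illustration) drops n uniform values into n buckets and measures the average bucket size and the average squared bucket size. Insertion sort costs on the order of the squared size per bucket, and for uniform input the expected squared size is 2 - 1/n, so the total sorting work stays linear in n.

```python
import random


def bucket_sizes(n, seed=0):
    """Count how many of n uniform values in [0, 1) land in each of n buckets."""
    rng = random.Random(seed)
    sizes = [0] * n
    for _ in range(n):
        # Same bucket index as in bucket_sort; min() guards against rounding up to n.
        index = min(int(n * rng.random()), n - 1)
        sizes[index] += 1
    return sizes


n = 10_000
sizes = bucket_sizes(n)
avg_size = sum(sizes) / n                          # always exactly 1: n items, n buckets
avg_squared_size = sum(s * s for s in sizes) / n   # close to 2 - 1/n for uniform input
print(avg_size, avg_squared_size, max(sizes))
```

If the input is heavily skewed instead of uniform, most values pile into a few buckets and the squared sizes grow toward n², which is where the O(n) average stops applying.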

How Do Computers Do Differentiation?

Thu, 24 Apr 2025 17:29:51 +0700

Differentiation is a key concept in machine learning, especially when optimizing functions like loss functions in neural networks. It helps us find the minimum of these functions, which is crucial for tasks like training a model. But have you ever wondered how popular libraries like TensorFlow and PyTorch perform differentiation? Let’s break it down!

1. Manual Differentiation: The Old-School Method

In school, we learn how to manually compute derivatives using calculus. You apply a set of rules to functions to find how they change with respect to their inputs. For example, given a simple function like:

Why Is the Average Complexity of QuickSort O(nlogn)?

Mon, 21 Apr 2025 17:29:51 +0700

For most developers, QuickSort is a fast and efficient sorting algorithm with an average time complexity of O(nlogn). This makes it significantly better on typical inputs than other common sorting algorithms, like Selection Sort or Bubble Sort, which have a time complexity of O(n²). However, the question remains: Why is the average time complexity of QuickSort O(nlogn)?

In this post, we will work through the mathematical and probabilistic principles behind this efficiency, so you can see why QuickSort is faster than those algorithms on average.
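Before the math, the claim can be checked empirically. The following is a minimal sketch, not a production sort: a randomized QuickSort (the function name `quicksort_count` is an assumption for this example) that counts one comparison per element partitioned against the pivot. For distinct keys, the expected count is roughly 2n ln n, far below the roughly n²/2 comparisons of Selection Sort.

```python
import math
import random


def quicksort_count(items, rng):
    """Return (sorted_list, comparisons) for a QuickSort with a random pivot."""
    if len(items) <= 1:
        return list(items), 0
    pivot_index = rng.randrange(len(items))
    pivot = items[pivot_index]
    rest = items[:pivot_index] + items[pivot_index + 1:]
    # Each remaining element is compared against the pivot once.
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    left, left_cmp = quicksort_count(smaller, rng)
    right, right_cmp = quicksort_count(larger, rng)
    return left + [pivot] + right, len(rest) + left_cmp + right_cmp


rng = random.Random(42)
n = 2_000
data = rng.sample(range(10 * n), n)   # n distinct integers in random order
result, comparisons = quicksort_count(data, rng)
estimate = 2 * n * math.log(n)        # the 2n ln n approximation
print(comparisons, round(estimate), n * n // 2)
```

The measured count lands in the same range as the 2n ln n estimate and sits about two orders of magnitude below n²/2 for this input size.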

Bloom Filters Explained: A Fast and Space-Efficient Probabilistic Solution

Fri, 18 Apr 2025 17:29:51 +0700

Part 1: Motivation

How do we check if something is in a set — fast?

The simplest way is a List:

if x in items:
  ...

But this is O(n) — too slow for large-scale systems.

A HashSet improves to O(1) lookups on average, but it stores the full elements, requiring more memory than raw data — especially for strings or objects.
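To get a feel for that memory overhead, here is a rough sketch that compares the raw size of some string keys with the footprint of a Python set holding them. It relies on `sys.getsizeof`, which reports CPython object sizes, so the exact numbers are an implementation detail and will vary between versions; the key format `user-000042` is made up for illustration.

```python
import sys

# 100,000 made-up keys of 11 ASCII characters each.
keys = [f"user-{i:06d}" for i in range(100_000)]

# Raw payload: just the bytes of the keys themselves.
raw_bytes = sum(len(k.encode("utf-8")) for k in keys)

# Set footprint: the hash table plus every string object it references.
key_set = set(keys)
set_bytes = sys.getsizeof(key_set) + sum(sys.getsizeof(k) for k in key_set)

overhead = set_bytes / raw_bytes
print(raw_bytes, set_bytes, round(overhead, 1))
```

The lookups are fast, but the set costs several times the size of the data it indexes, which is the overhead the rest of this post tries to avoid.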

So what if we trade a little accuracy for massive savings?

What if a structure could: