//
Imagine you run a URL shortener like bit.ly with 10 billion shortened URLs. When a user creates a new short URL, you need to check if it already exists. Querying a database of 10 billion records for every single creation request is prohibitively slow. A Bloom Filter can answer "Does this URL exist?" in constant time, using a tiny fraction of the memory a full database index would require.
A Bloom Filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. It can tell you with certainty:
It NEVER produces false negatives. If the Bloom Filter says an element is not in the set, you can trust it 100%.
0. Choose $k$ independent hash functions.1.0: The element is definitely NOT in the set.1: The element is probably in the set. (Other elements might have set those same bits.)1, increasing the probability that a query for a non-existent element finds all 1s by coincidence. The false positive rate can be tuned by adjusting $m$ (bit array size) and $k$ (number of hash functions).0 could affect other elements). Counting Bloom Filters replace each bit with a counter to support deletion.