"BloomFilter" (Data Structure)
"BloomFilter"
represents a set that tests whether elements are definitely not members.
Details
- A Bloom filter is typically used as a probabilistic set that can test if elements are definitely not members.
- Testing an element for membership of a Bloom filter will always return False if the element is not a member of the set.
- Testing an element for membership of a Bloom filter will not always return True if the element is a member of the set:
-
CreateDataStructure[ "BloomFilter",capacity] create a new "BloomFilter" of specified capacity CreateDataStructure["BloomFilter", capacity,p] create a new "BloomFilter" with specified false positive probability p. Typed[x,"BloomFilter"] give x the type "BloomFilter" - For a data structure of type "BloomFilter", the following operations can be used:
-
ds["Capacity"] the storage capacity of ds time: O(1) ds["Copy"] return a copy of ds time: O(n) ds["CouldContain",x] return True if ds could contain x and False otherwise time: O(1) ds["Insert",x] insert x into ds time: O(1) ds["LoadFactor"] the load factor of ds time: O(1) ds["FalsePositiveProbability"] return the false positive probability for ds time: O(1) ds["Visualization"] return a visualization of ds time: O(n) - The following functions are also supported:
-
dsi===dsj True, if dsi equals dsj FullForm[ds] full form of ds Information[ds] information about ds InputForm[ds] input form of ds Normal[ds] convert ds to a normal expression
Examples
open all close allBasic Examples (1)
A new "BloomFilter" can be created with CreateDataStructure:
ds = CreateDataStructure["BloomFilter", 2]Any expression can be inserted into a "BloomFilter" data structure:
ds["Insert", f[1]]It is possible to test if an expression is not present. A result of False means that f[2] is definitely not in the set:
ds["CouldContain", f[2]]A result of True means that f[1] is possibly in the set:
ds["CouldContain", f[1]]Return an expression version of ds:
Normal[ds]A visualization of the data structure can be generated:
ds["Visualization"]Scope (2)
Information (1)
A new "BloomFilter" can be created with CreateDataStructure:
ds = CreateDataStructure["BloomFilter", 2]Information about the data structure ds:
Information[ds]FalsePositiveProbability (1)
Create a "BloomFilter" with a high false positive probability:
dsH = CreateDataStructure["BloomFilter", 2, 0.7]Create a "BloomFilter" with the default false positive probability:
dsL = CreateDataStructure["BloomFilter", 2]dsH["Insert", f[1]]dsL["Insert", f[1]]The data structure with the higher false positive probability returns a higher count of false positives:
Count[Map[dsH["CouldContain", #]&, Range[100]], True]Count[Map[dsL["CouldContain", #]&, Range[100]], True]Possible Issues (1)
A Bloom filter that is very full is more likely to return false positives on testing for inclusion.
Create a Bloom filter and insert many elements. The visualization shows it appears quite full:
ds = CreateDataStructure["BloomFilter", 1];Do[ ds["Insert", RandomReal[]], {1000}];
ds["Visualization"]The load factor shows that the filter is full:
ds["LoadFactor"]The test for inclusion returns True even though the element was never inserted. This is because the filter is full:
ds["CouldContain", "element"]Related Guides
History
Introduced in 2020 (12.1)