"HashSet" (Data Structure)
"HashSet"
represents a set where the members are general expressions and membership is computed by using a hash function.
Details
- A hash set is useful for storing a collection of unique elements and provides highly efficient membership testing, insertion and removal.
-
CreateDataStructure["HashSet"] create a new empty "HashSet" CreateDataStructure["HashSet",elems] create a new "HashSet" containing elems Typed[x,"HashSet"] give x the type "HashSet" - For a data structure of type "HashSet", the following operations can be used:
-
ds["Complement",list] remove elements from ds that appear in list time: O(n) ds["Copy"] return a copy of ds time: O(n) ds["Elements"] return a list of the elements of ds time: O(n) ds["EmptyQ"] True, if ds has no elements time: O(1) ds["Delete",x] delete x from ds, return True if x was actually an element time: O(1) ds["DeleteAll"] delete all the elements from ds time: O(n) ds["Insert",x] add x to the set and return True if insertion succeeded time: O(1) ds["Intersection",list] remove elements from ds that do not appear in list time: O(n) ds["Length"] returns the number of elements stored in ds time: O(1) ds["MemberQ",x] True, if x is an element of ds time: O(1) ds["Pop"] remove an element from ds and return it time: O(1) ds["SubsetQ",ds1] return True if hash set ds1 is a subset of ds time: O(n) ds["Union",list] add elements to ds that appear in list time: O(n) ds["Visualization"] return a visualization of ds time: O(n) - The following functions are also supported:
-
dsi===dsj True, if dsi equals dsj FullForm[ds] full form of ds Information[ds] information about ds InputForm[ds] input form of ds Length[ds] the length of the ds Normal[ds] convert ds to a normal expression
Examples
open all close allBasic Examples (2)
A new "HashSet" can be created with CreateDataStructure:
ds = CreateDataStructure["HashSet"]Insert an element into the set:
ds["Insert", f[1]]There is one element in the set:
ds["Length"]Test if an expression is stored:
ds["MemberQ", f[1]]If an expression is not stored, False is returned:
ds["MemberQ", f[2]]Remove an element from the set, returning True if the element was successfully removed:
ds["Delete", f[1]]Return an expression version of ds:
Normal[ds]It is fast to insert elements:
ds = CreateDataStructure["HashSet"];
Do[ds["Insert", i], {i, 1000}]A visualization of the data structure can be generated:
ds["Visualization"]Scope (17)
Information (1)
A new "HashSet" can be created with CreateDataStructure:
ds = CreateDataStructure["HashSet"]Information about the data structure ds:
Information[ds]Creation (2)
Operations (14)
"Complement" (1)
Create a "HashSet" with initial elements:
ds = CreateDataStructure["HashSet", Range[100]]Create a new set containing elements that are members of the original "HashSet" but not members of the given list:
ds1 = ds["Complement", Range[50]]Verify the elements of the new hash set:
ds1["Elements"]"Copy" (1)
Create a "HashSet" with initial elements:
ds = CreateDataStructure["HashSet", Range[4]]ds1 = ds["Copy"]The contents of both arrays are the same:
{Normal[ds], Normal[ds1]}Modifying the copy will not affect the original set:
ds1["Insert", 42];
{Normal[ds], Normal[ds1]}"Elements" (1)
"EmptyQ" (1)
Test whether a "HashSet" is empty or not:
ds = CreateDataStructure["HashSet"];
ds["EmptyQ"]After inserting an element, the set is no longer empty:
ds["Insert", f[x]];
ds["EmptyQ"]"Delete" (1)
"DeleteAll" (1)
Delete all elements in a "HashSet":
ds = CreateDataStructure["HashSet", Range[300]];
ds["DeleteAll"]ds["EmptyQ"]"Insert" (1)
"Intersection" (1)
Create a "HashSet" with initial elements:
ds = CreateDataStructure["HashSet", Range[100]]Create a new set containing only those elements common to both the original "HashSet" and a given list:
ds1 = ds["Intersection", Range[50]]Verify the elements of the new hash set:
ds1["Elements"]"Length" (1)
"MemberQ" (1)
Test whether an element is present in a "HashSet":
ds = CreateDataStructure["HashSet", {f[x], f[y], f[z]}];
{ds["MemberQ", f[x]], ds["MemberQ", f[w]]}"Pop" (1)
Create a "HashSet" with initial elements:
ds = CreateDataStructure["HashSet", {f[x], f[y], f[z]}];
ds["Elements"]ds["Pop"]"SubsetQ" (1)
Test whether all elements of one set are contained in another:
ds1 = CreateDataStructure["HashSet", {1, 3, 5}];
ds2 = CreateDataStructure["HashSet", {1, 2, 3, 4, 5}];
ds1["SubsetQ", ds2]Check if the smaller set is a subset of the larger one:
ds2["SubsetQ", ds1]"Union" (1)
Create a "HashSet" with initial elements:
ds = CreateDataStructure["HashSet", Range[10]]Create a new set containing elements present both in the original "HashSet" and a given list:
ds1 = ds["Union", Range[10, 20]]Verify the elements of the new hash set:
ds1["Elements"]"Visualization" (1)
Create a "HashSet" with initial elements:
ds = CreateDataStructure["HashSet", Range[300]]Visualize the contents of the hash set:
ds["Visualization"]Applications (2)
Sets of Strings (1)
Built-in set operations are good for working with rectangular arrays of machine numbers. Data structure operations are good for working with sets of non-numeric data such as strings. Create a list of 1000000 strings:
list1 = Table[ StringJoin[RandomChoice[ {"A", "B", "C"}, 4]], 1000000];ds = CreateDataStructure["HashSet"];
ds["Union", list1];//AbsoluteTimingUnion[list1];//AbsoluteTimingA similar example shows benefits for "Complement":
list1 = Table[ StringJoin[ RandomChoice[ {"A", "B", "C"}, 10]], 100000];
listC = Table[ StringJoin[{"A", RandomChoice[ {"A", "B", "C"}, 9]}], 50];ds = CreateDataStructure["HashSet"];
ds["Union", list1];//AbsoluteTiminglistI = Union[list1];//AbsoluteTimingds["Complement", listC];//AbsoluteTimingComplement[listI, listC];//AbsoluteTimingGraph Cycle Detection (1)
Detect if a directed graph contains cycles using "HashSet" to track visited nodes during depth-first-search traversal:
hasCycle[graph_ ? GraphQ] := Module[{
visited = CreateDataStructure["HashSet"],
recursionStack = CreateDataStructure["HashSet"],
dfs},
dfs[node_] := If[!visited["MemberQ", node],
visited["Insert", node];
recursionStack["Insert", node];
Do[
If[recursionStack["MemberQ", neighbor],
Return[True, Module]
,
dfs[neighbor]
],
{neighbor, VertexOutComponent[graph, node, {1}]}
];
recursionStack["Delete", node];
];
Scan[dfs, VertexList[graph]];
False
]Check whether the following graphs have non-self cycles:
hasCycle /@ {[image], [image], [image]}Properties & Relations (5)
InputForm (1)
Length (1)
Normal (1)
SameQ (1)
SameQ can be used to test whether two hash sets contain identical elements, no matter the order:
ds1 = CreateDataStructure["HashSet", Range[10]];
ds2 = CreateDataStructure["HashSet", Reverse@Range[10]];
ds1 === ds2If one of the elements is removed, the sets are no longer the same:
ds1["Pop"];
ds1 === ds2"OrderedHashSet" (1)
Many algorithms that work for "OrderedHashSet" also work for "HashSet".
Unlike "OrderedHashSet", the order in which elements are inserted into a "HashSet" is not preserved.
See Also
Association Dispatch Hash CreateDataStructure DataStructure
Data Structures: OrderedHashSet HashTable OrderedHashTable DisjointSet SortedMultiset
Related Guides
History
Introduced in 2020 (12.1)