"ByteTrie" (Data Structure)
"ByteTrie"
represents a trie where the members are sequences of bytes.
Details
- A byte trie is useful for efficient insertion as well as membership testing:
-
CreateDataStructure["ByteTrie", start, end] create a new empty "ByteTrie" for sequences of bytes that range from start to end Typed[x,"ByteTrie"] give x the type "ByteTrie" - For a data structure of type "ByteTrie", the following operations can be used:
-
ds["ByteLists"] return the byte sequences stored in ds represented as a list of lists of bytes time: O(n) ds["ByteLists",s] return the byte sequences stored in ds that start with the sequence s time: O(n) ds["ByteRange"] the range of bytes that ds can hold time: O(1) ds["Copy"] return a copy of ds time: O(n) ds["EmptyQ"] True, if ds has no elements time: O(1) ds["FreeQ",s] True , if the byte sequence s is not stored in ds time: O(log n) ds["Insert",s] insert the byte sequence s in ds time: O(log n) ds["Length"] the number of byte sequences stored in ds time: O(1) ds["MemberQ",s] True, if the byte sequence s is stored in ds time: O(log n) ds["NumericArrays"] return the byte sequences stored in ds represented as a list of numeric arrays time: O(n) ds["NumericArrays",s] return the byte sequences stored in ds that start with the sequence s time: O(n) ds["Strings"] return the byte sequences stored in ds represented as a list of strings time: O(n) ds["Strings",s] return the byte sequences stored in ds that start with the sequence s time: O(n) - The following functions are also supported:
-
dsi===dsj True, if dsi equals dsj FullForm[ds] full form of ds Information[ds] information about ds InputForm[ds] input form of ds Normal[ds] convert ds to a normal expression - The byte sequences stored in a "ByteTrie" data structure can be represented as lists of bytes, strings or numeric arrays.
- A trie uses a prefix to store its elements so it never suffers from the collisions that are a problem for other data structures, such as those that use hash computations.
Examples
open all close allBasic Examples (6)
A new "ByteTrie" can be created with CreateDataStructure:
ds = CreateDataStructure["ByteTrie", "a", "c"]ds["Insert", "abc"]Test if a byte sequence is stored:
ds["MemberQ", "abc"]If a sequence is not stored, False is returned:
ds["MemberQ", "abb"]A list of the bytes represented as strings can be returned:
ds["Strings"]Numerical values for the range of bytes can be given:
ds = CreateDataStructure["ByteTrie", 97, 99]A list of bytes can be inserted:
ds["Insert", {97, 98, 99}]Test if a byte sequence is stored:
ds["MemberQ", {97, 98, 98}]If a sequence is not stored, False is returned:
ds["MemberQ", {97, 98, 98}]A list of the bytes represented as strings can be returned:
ds["ByteLists"]A byte trie with several entries:
ds = CreateDataStructure["ByteTrie", "a", "c"];
ds["Insert", "abc"];
ds["Insert", "acb"];
ds["Insert", "bbc"];ds["Strings"]Return all the entries with a specific prefix:
ds["Strings", "a"]The prefix to select entries can contain more than one byte:
ds["Strings", "ab"]A byte trie with several entries:
ds = CreateDataStructure["ByteTrie", 97, 99];
ds["Insert", {97, 98, 99}];
ds["Insert", {97, 99, 98}];
ds["Insert", {98, 98, 99}];ds["ByteLists"]Return all the entries with a specific prefix:
ds["ByteLists", {97}]The prefix to select entries can contain more than one byte:
ds["ByteLists", {97, 98}]A byte trie with some insertions:
ds = CreateDataStructure["ByteTrie", "a", "c"];
ds["Insert", "ab"];
ds["Insert", "abb"];
ds["Insert", "acb"];
ds["Insert", "bbc"];A visualization of the data structure can be generated:
ds["Visualization"]The elements colored in red show the end of actual entries.
It is an error to insert bytes that are outside of the range:
ds = CreateDataStructure["ByteTrie", "a", "c"];
ds["Insert", "d"]
Scope (1)
Information (1)
A new "ByteTrie" can be created with CreateDataStructure:
ds = CreateDataStructure["ByteTrie", "a", "c"]Information about the data structure ds:
Information[ds]Applications (2)
Command Completion (1)
One use of a byte trie is for command completion. This works well with the RandomWord function.
Create a list of 1000 words, removing hyphens and making everything lowercase:
words = ToLowerCase[ StringReplace[ RandomWord[{"CommonWords", "Noun"}, 1000], "-" -> ""]];Create a byte trie and insert the words:
ds = CreateDataStructure["ByteTrie", "a", "z"];
Scan[(ds["Insert", #])&, words]Now efficiently find all the words that start with a specific prefix:
ds["Strings", "br"]Storing Integers (1)
Integers can be stored in a byte trie by breaking them down into sequences of bytes.
One way to do this is with bit operations, but IntegerDigits is also effective:
trie = CreateDataStructure["ByteTrie", 0, 7];
trie["Insert", IntegerDigits[ 1024, 8]];
trie["ByteLists"]A utility function that inserts a number into a byte trie:
insert[ trie_, num_] :=
trie["Insert", IntegerDigits[num, 8]]A function that tests for membership:
test[trie_, num_] :=
trie["MemberQ", Developer`FromPackedArray[IntegerDigits[num, 8]]]Create a byte trie and make an insertion:
trie = CreateDataStructure["ByteTrie", 0, 7];
insert[ trie, 123456];test[trie, 123456]Neat Examples (1)
Visualization (1)
A collection of random byte sequences of random lengths:
data = Table[ RandomChoice[{97, 98, 99}, RandomInteger[{1, 10}]], {20}]Create a byte trie and insert the byte sequences:
ds = CreateDataStructure["ByteTrie", 97, 99];
Scan[ds["Insert", #]&, data]The visualization shows how the data is laid out:
ds["Visualization"]See Also
Related Guides
History
Introduced in 2020 (12.2)