NetUnfold[fnet]
produces the elementary net of the folded net fnet, exposing the recurrent states.
Details and Options
- A folded net is a net that iterates over a sequence in one direction by repeating the same operation. Examples include recurrent nets and unidirectional transformers.
- NetUnfold is typically used to extract the repeated operation so that sequences can be generated efficiently from a trained decoder, for applications such as text generation, audio generation and translation.
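- With a recurrent network with state equations $s_t = f(s_{t-1}, x_t; \theta)$ and output equation $y_t = g(s_t; \theta)$ for $t = 1, \dots, T$ and training parameters $\theta$, the unfolded net corresponds to just a single step of this recurrence: $s_{\mathrm{out}} = f(s_{\mathrm{in}}, x; \theta)$ and $y = g(s_{\mathrm{out}}; \theta)$.
- In particular, NetUnfold exposes the recurrent states of the following folded layers: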
    BasicRecurrentLayer[…]                                    one state vector
    GatedRecurrentLayer[…]                                    one state vector
    LongShortTermMemoryLayer[…]                               two state vectors, one of which is the internal cell state
    NetFoldOperator[net,{"out1"->"in1",…,"outn"->"inn"},…]    n state vectors
    AttentionLayer[…,"Mask"->"Causal"]                        two state sequences, holding the previous keys and values
- Exposed states of recurrent layers are vectors that are typically initialized with zeros. Exposed states of transformers are variable-length sequences of vectors that are typically initialized with empty sequences.
- NetUnfold can also be applied to a folded net that is followed by an operation on the last element of its output sequence. In such cases, the corresponding SequenceLastLayer is dropped.
- NetUnfold can be seen as the inverse operation of NetFoldOperator.
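The relationship between a folded operation and its single-step core can be sketched without any neural net, using only FoldList. This is an illustration under stated assumptions, not how NetUnfold is implemented: step, folded and manual are hypothetical names, and the core here simply returns its new state as its output.

```wolfram
(* Hypothetical core operation: one step maps (state, input) to a new state. *)
step[state_, input_] := state/2 + input

(* "Folded" form: iterate the core over a whole sequence, *)
(* analogous to what NetFoldOperator does for a net. *)
folded[seq_List, init_ : 0] := Rest[FoldList[step, init, seq]]

(* "Unfolded" use: call the core one element at a time, *)
(* carrying the exposed state by hand between calls. *)
manual[seq_List, init_ : 0] := Module[{state = init},
  Table[state = step[state, x], {x, seq}]]

folded[{1, 2, 3}]   (* {1, 5/2, 17/4} *)
manual[{1, 2, 3}]   (* {1, 5/2, 17/4} *)
```

Both forms give the same outputs. The manual form is the one that allows a new input to be chosen after each step, which is what sequence generation needs.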
Examples
Basic Examples (1)
Get the core operation folded in a GatedRecurrentLayer:
NetUnfold[GatedRecurrentLayer[5]]
Scope (5)
Unfold a single recurrent layer:
NetUnfold[LongShortTermMemoryLayer[5]]
Unfold an attention layer with causal masking:
AttentionLayer["Mask" -> "Causal"]
NetUnfold[%]
Unfold a chain of recurrent operations:
NetChain[Table[LongShortTermMemoryLayer[5], 4]]
NetUnfold[%]
Unfold a pretrained language model:
NetModel["Wolfram English Character-Level Language Model V1"]
NetUnfold[%]
Unfold a pretrained transformer and compare the summary graphics of the folded and unfolded nets:
folded = NetModel["GPT2 Transformer Trained on WebText Data"];
Information[folded, "SummaryGraphic"]
unfolded = NetUnfold[folded];
Show[Information[unfolded, "SummaryGraphic"], ImageSize -> {300, 700}]
Applications (1)
Implementing efficient text generation. First, get a trained language model:
lm = NetModel[{"GPT2 Transformer Trained on WebText Data", "Task" -> "LanguageModeling"}]
The most straightforward function to stochastically generate text is the following:
generateText[input_String, numTokens_] := Nest[Function[StringJoin[#, lm[#, "RandomSample"]]], input, numTokens];
generateText["I am", 20]
The problem with this function is that it has quadratic time complexity, because at every step the model is fed the entire text generated so far:
ListLinePlot[AssociationMap[First@AbsoluteTiming[generateText["I am", #]]&, Range[10, 210, 50]], Rule[...]]
NetUnfold lets you avoid recomputing the same activations by exposing the states:
unfolded = NetUnfold[lm]
Information[unfolded, "InputPortNames"]
Information[unfolded, "OutputPortNames"]
Write an efficient stochastic text generation function based on this unfolded net:
encoder = NetExtract[lm, "Input"]
generateTextEfficient[input_String, numTokens_] := Block[{encodedinput = encoder[input], index = 1, init, props, generated = {}},
init = Join[<|"Input" -> First@encodedinput, "Index" -> index|>, Association@Table["State" <> ToString[i] -> {}, {i, 24}]];
props = Append[Table[NetPort["OutState" <> ToString[i]], {i, 24}], NetPort["Output"] -> "RandomSample"];
Nest[Function@Block[{newinput = KeyMap[StringReplace["OutState" -> "State"], unfolded[#, props]]},
Join[newinput, <|"Index" -> ++index, "Input" -> If[index <= Length[encodedinput], encodedinput[[index]], AppendTo[generated, newinput["Output"]];Last@encoder@newinput["Output"]]|>]
],
init, numTokens + Length[encodedinput] - 1];
StringJoin[input, generated]
];
generateTextEfficient["I am", 20]
This efficient text generation has linear time complexity:
ListLinePlot[AssociationMap[First@AbsoluteTiming[generateTextEfficient["I am", #]]&, Range[10, 210, 50]], Rule[...]]
Properties & Relations (2)
NetUnfold is the inverse operation of NetFoldOperator:
core = NetGraph[{Plus}, {{NetPort["Input"], NetPort["State"]} -> 1}, "Input" -> "Real", "State" -> "Real"]
folded = NetFoldOperator[core]
NetUnfold[folded] == core
Any SequenceLastLayer after a recurrence is automatically removed:
NetChain[{LongShortTermMemoryLayer[5], SequenceLastLayer[], Ramp}]
NetUnfold[%]
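The quadratic versus linear behavior described in the Applications example can be made concrete by counting how many token positions are fed to the net. This is a rough proxy for running time rather than a measurement: naiveCost and cachedCost are hypothetical helper names, p stands for the number of prompt tokens and n for the number of generated tokens, and every token position is assumed to cost the same.

```wolfram
(* Naive generation: call k re-reads the whole text so far, *)
(* that is, p prompt tokens plus the k tokens already generated. *)
naiveCost[p_Integer, n_Integer] := Sum[p + k, {k, 0, n - 1}]

(* Generation with exposed states: each step feeds exactly one token; *)
(* the step count matches the Nest call in generateTextEfficient. *)
cachedCost[p_Integer, n_Integer] := p + n - 1

naiveCost[2, 20]    (* 230 *)
cachedCost[2, 20]   (* 21 *)
```

With a 2-token prompt and 20 generated tokens, the naive loop processes 230 token positions while the state-carrying loop processes 21. The gap widens as n grows because the first count contains the term n(n-1)/2.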
Wolfram Research (2021), NetUnfold, Wolfram Language function, https://reference.wolfram.com/language/ref/NetUnfold.html.