LearningRateMultipliers
Details
- With the default value of LearningRateMultipliers->Automatic, all layers learn at the same rate.
- LearningRateMultipliers->{rule1,rule2,…} specifies a set of rules that will be used to determine learning rate multipliers for every trainable array in the net.
- In LearningRateMultipliers->{rule1,rule2,…}, each of the rules can be of the following forms:
    "part"->r             use multiplier r for a named layer, subnetwork or array in a layer
    n->r                  use multiplier r for the nth layer
    m;;n->r               use multiplier r for layers m through n
    {part1,part2,…}->r    use multiplier r for a nested layer or array
    _->r                  use multiplier r for all layers
- LearningRateMultipliers->r specifies using the same multiplier r for all trainable arrays.
- If r is zero or None, it specifies that the layer or array should not undergo training and will be left unchanged by NetTrain.
- If r is a positive or negative number, it specifies a multiplier to apply to the global learning rate chosen by the training method to determine the learning rate for the given layer or array.
- For each trainable array, the rate used is given by the first matching rule, or 1 if no rule matches.
- Rules that specify a subnet (e.g. a nested NetChain or NetGraph) apply to all layers and arrays within that subnet.
- LearningRateMultipliers->{part->None} can be used to "freeze" a specific part.
- LearningRateMultipliers->{part->1,_->None} can be used to "freeze" all layers except for a specific part.
- The hierarchical specification {part1,part2,…} used by LearningRateMultipliers to refer to parts of a net is equivalent to that used by NetExtract and NetReplacePart.
- Information[net,"ArraysLearningRateMultipliers"] yields the default learning rate multipliers for all arrays of a net.
- The multipliers actually used during training can be obtained from a NetTrainResultsObject via the property "ArraysLearningRateMultipliers".
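The rule forms and the first-match behavior described above can be combined in a single specification. The following is a minimal sketch rather than part of the reference examples: the layer names "enc", "act" and "dec" are hypothetical, the multiplier values are arbitrary, and it assumes that layers named through an association can be addressed by name as in the "part"->r form.

```wolfram
(* Hypothetical three-layer net; the names "enc", "act", "dec" are illustrative only *)
net = NetInitialize@NetChain[
   <|"enc" -> LinearLayer[3], "act" -> ElementwiseLayer[Ramp], "dec" -> LinearLayer[{}]|>,
   "Input" -> "Real", "Output" -> "Real"];

(* Rules are tried in order, so the most specific rule is listed first *)
results = NetTrain[net, {1 -> 1.9, 2 -> 4.1, 3 -> 6.0, 4 -> 8.1}, All,
   LearningRateMultipliers -> {
     {"enc", "Biases"} -> None, (* freeze one array of the first layer *)
     "enc" -> 0.1,              (* remaining arrays of "enc" use a tenth of the global rate *)
     _ -> 1                     (* every other array uses the global rate *)
    },
   TimeGoal -> 1];

(* Inspect the multipliers that were applied to each array *)
results["ArraysLearningRateMultipliers"]
```

Because the first matching rule determines the multiplier, listing "enc" -> 0.1 before {"enc", "Biases"} -> None should leave the biases trainable at 0.1 instead of frozen, so more specific rules belong earlier in the list.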
Examples
Basic Examples (2)
Create and initialize a net with three layers, but train only the last layer:
net = NetInitialize@NetChain[{LinearLayer[3], Ramp, LinearLayer[{}]}, "Input" -> "Real", "Output" -> "Real"]
trained = NetTrain[net, {1 -> 1.9, 2 -> 4.1, 3 -> 6.0, 4 -> 8.1}, LearningRateMultipliers -> {3 -> 1, _ -> None}]
The biases of the first layer remain unmodified in the trained net:
NetExtract[net, {1, "Biases"}] == NetExtract[trained, {1, "Biases"}]
The biases of the third layer have been trained:
NetExtract[net, {3, "Biases"}] == NetExtract[trained, {3, "Biases"}]
Create a frozen layer with given array values:
frozen = LinearLayer[3, "Weights" -> {{1}, {2}, {3}}, "Biases" -> {-1, -2, -3}, LearningRateMultipliers -> None]
Nest this layer inside a bigger net:
net = NetChain[{frozen, Ramp, LinearLayer[{}]}, "Input" -> "Real", "Output" -> "Real"]
trained = NetTrain[net, {1 -> 1.9, 2 -> 4.1, 3 -> 6.0, 4 -> 8.1}]
The arrays of the frozen layer were unchanged during training:
Normal@NetExtract[trained, {{1, "Weights"}, {1, "Biases"}}]
Scope (1)
Replace LearningRateMultipliers in a Network (1)
net = NetInitialize@NetChain[{LinearLayer[3], Ramp, LinearLayer[{}]}, "Input" -> "Real", "Output" -> "Real"]
Set the LearningRateMultipliers of the first layer of this net to zero:
fnet = NetReplacePart[net, {1, LearningRateMultipliers} -> 0]
Check programmatically the values of LearningRateMultipliers options:
NetExtract[net, {1, LearningRateMultipliers}]
NetExtract[fnet, {1, LearningRateMultipliers}]
Applications (1)
Train an existing network to solve a new task. Obtain a pre-trained convolutional model that was trained on handwritten digits:
lenet = NetModel["LeNet Trained on MNIST Data"]
Remove the final two layers, and attach two new layers, in order to classify images into 3 classes:
net = NetJoin[
NetDrop[lenet, -2],
NetChain[{LinearLayer[], SoftmaxLayer[]}]];
net = NetReplacePart[net, "Output" -> NetDecoder[{"Class", {"x", "y", "z"}}]]
Generate training data by rasterizing the characters "x", "y", and "z" with a variety of fonts, sizes, and cases:
letterImage[str_, size_, style_, font_] := Rasterize[Style[str, style, FontSize -> size, FontFamily -> font], "Image", ImageSize -> {28, 28}];
trainingData = Table[
letterImage[case[class], size, style, font] -> class,
{class, {"x", "y", "z"}},
{font, {"Courier", "Helvetica", "Times New Roman"}},
{style, {Plain, Italic, Bold}},
{size, {7, 8, 9, 10}},
{case, {ToLowerCase, ToUpperCase}}
]//Flatten;
Length[trainingData]
RandomSample[trainingData, 10]
Train the modified network on the new task:
trained = NetTrain[net, trainingData, TimeGoal -> 10, LearningRateMultipliers -> {-2 -> 1, _ -> None}, ValidationSet -> Scaled[0.1]]
x = letterImage["x", 11, Italic, "Arial Narrow"]
trained[x, "Probabilities"]
Measure the performance on the original training data, which includes the training and validation set:
NetMeasurements[trained, trainingData, "Accuracy"]
Properties & Relations (1)
Train LeNet on the MNIST dataset with specific learning rate multipliers, returning a NetTrainResultsObject:
results = NetTrain[NetModel["LeNet"], ResourceData["MNIST"], All, LearningRateMultipliers -> {4 ;; -> 2}]
Obtain the actual learning rate multipliers used on individual weight arrays:
results["ArraysLearningRateMultipliers"]
Possible Issues (1)
When a shared array occurs at several places in the network, a single learning rate multiplier is applied to all occurrences of the shared array.
Create a network with shared arrays:
sharedlayer = NetInsertSharedArrays[LinearLayer[{}, "Input" -> "Real"]];
net = NetChain[{sharedlayer, Tanh, sharedlayer}]
Specifying a learning rate multiplier for a shared array in the network assigns the same multiplier to every occurrence of that array:
NetTrain[net, {1 -> 0, 0 -> 1}, "ArraysLearningRateMultipliers", TimeGoal -> 0.01, LearningRateMultipliers -> {{1, "Weights"} -> None}]
If there is a conflict, the first matching value will be used:
NetTrain[net, {1 -> 0, 0 -> 1}, "ArraysLearningRateMultipliers", TimeGoal -> 0.01, LearningRateMultipliers -> {{1, "Weights"} -> 0, {3, "Weights"} -> 2}]
The same happens when LearningRateMultipliers is specified when constructing the network:
sharedlayer = NetInsertSharedArrays[LinearLayer[{}, "Input" -> "Real"]];
net2 = NetChain[{sharedlayer, Tanh, sharedlayer}, LearningRateMultipliers -> {{1, "Weights"} -> 0, {3, "Weights"} -> 2}]
Information[net2, "ArraysLearningRateMultipliers"]
Wolfram Research (2017), LearningRateMultipliers, Wolfram Language function, https://reference.wolfram.com/language/ref/LearningRateMultipliers.html (updated 2020).