CTCLossLayer
represents a net layer that computes the connectionist temporal classification loss by comparing a sequence of class probability vectors with a sequence of indices representing the target classes.
Details and Options
- CTCLossLayer[] represents a net that takes an input matrix representing a sequence of vectors and a target vector representing a sequence of integers and outputs a real value.
- CTCLossLayer is typically used inside NetGraph.
- CTCLossLayer exposes the following ports for use in NetGraph etc.:
    "Input"     a sequence of probability vectors of size c+1
    "Target"    a sequence of integers between 1 and c
    "Output"    a real number
- The layer definition is based on Graves et al., "Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks", 2006.
- The input should be a sequence of probability vectors of size c+1 where each vector sums to 1. The last element of each vector represents the probability of a special blank class, with the remaining elements representing the probability of the indexed classes 1 to c. The target is a sequence of integers between 1 and c. The target sequence cannot be longer than the input sequence.
- CTCLossLayer[…][<|"Input"->in,"Target"->target|>] explicitly computes the output from applying the layer.
- CTCLossLayer[…][<|"Input"->{in_1,in_2,…},"Target"->{target_1,target_2,…}|>] explicitly computes outputs for each pair in_i, target_i.
- When given a NumericArray as input, the output will be a NumericArray.
- The size of the input is usually inferred automatically within a NetGraph.
- CTCLossLayer[n,"Input"->ishape,"Target"->tshape] allows the shape of the input and target to be specified. Possible forms for ishape are:
    NetEncoder[…]            encoder producing a sequence of vectors
    {len,c+1}                sequence of len length-(c+1) vectors
    {len,Automatic}          sequence of len vectors whose length is inferred
    {"Varying",c+1}          varying number of vectors each of length c+1
    {"Varying",Automatic}    varying number of vectors each of inferred length
- Possible forms for tshape are:
    NetEncoder[…]                              encoder producing a sequence of integers
    {len2}                                     sequence of len2 integers
    {"Varying"}                                varying number of integers
    RepeatingElement[Restricted[Integer,c]]    varying number of integers in the range 1 to c
- Options[CTCLossLayer] gives the list of default options to construct the layer. Options[CTCLossLayer[…]] gives the list of default options to evaluate the layer on some data.
- Information[CTCLossLayer[…]] gives a report about the layer.
- Information[CTCLossLayer[…],prop] gives the value of the property prop of CTCLossLayer[…]. Possible properties are the same as for NetGraph.
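The input and target conventions described above (blank class stored last, targets indexed 1 to c, target no longer than the input) can be illustrated with a small standalone sketch. The following Python code is not the layer's implementation; it assumes the loss is the negative log-likelihood of the target as defined in Graves et al., computed with the forward recursion from that paper. The function name ctc_nll and the tolerance used for the sum-to-1 check are hypothetical choices made for this illustration.

```python
import math


def ctc_nll(probs, target):
    """Negative log-likelihood of `target` under CTC (forward recursion).

    probs  : list of T rows; each row is a probability vector of length c+1
             whose LAST entry is the blank class.
    target : list of integers between 1 and c, no longer than probs.
    Hypothetical helper for illustration only.
    """
    T, c = len(probs), len(probs[0]) - 1
    if len(target) > T:
        raise ValueError("target cannot be longer than the input sequence")
    if any(not 1 <= k <= c for k in target):
        raise ValueError("target entries must lie between 1 and c")
    if any(abs(sum(row) - 1.0) > 1e-6 for row in probs):  # assumed tolerance
        raise ValueError("each input vector must sum to 1")

    blank = c + 1
    # Extended label sequence with blanks interleaved: [b, t1, b, t2, ..., b]
    ext = [blank]
    for k in target:
        ext += [k, blank]
    S = len(ext)

    def p(t, s):
        # Probability at time t of the symbol at extended position s
        # (1-based class index mapped to a 0-based column).
        return probs[t][ext[s] - 1]

    # alpha[s]: total probability of all partial alignments ending at ext[s]
    alpha = [0.0] * S
    alpha[0] = p(0, 0)
    if S > 1:
        alpha[1] = p(0, 1)

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                  # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]         # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * p(t, s)
        alpha = new

    # Valid alignments end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if S > 1 else 0.0)
    return math.inf if likelihood == 0.0 else -math.log(likelihood)


# Three time steps, two classes plus blank (c = 2), target {1, 2}.
probs = [[0.1, 0.2, 0.7], [0.4, 0.2, 0.4], [0.3, 0.5, 0.2]]
print(ctc_nll(probs, [1, 2]))  # -log(0.194), about 1.64
```

For this input, five alignments collapse to the target {1, 2}, with a total probability of 0.194, which is what the recursion sums without enumerating them. Whether CTCLossLayer returns exactly this quantity (for example, without additional normalization) is an assumption of the sketch.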
Examples
Basic Examples (2)
Create a CTCLossLayer object:
CTCLossLayer[]
Create a CTCLossLayer where the input is a matrix whose rows are probability vectors and the target is a vector of indices:
loss = CTCLossLayer[]
Apply it to an input and a target:
loss[<|"Input" -> {{0.1, 0.2, 0.7}, {0.4, 0.2, 0.4}, {0.3, 0.5, 0.2}}, "Target" -> {1, 2}|>]
Applications (1)
Train a net that classifies a sequence of characters in an image. First generate training and test data, which consist of images of words and the corresponding word strings:
imgMap = <|" " -> [image], "a" -> [image], "b" -> [image], "c" -> [image], "d" -> [image], "e" -> [image], "f" -> [image], "g" -> [image], "h" -> [image], "i" -> [image], "j" -> [image], "k" -> [image], "l" -> [image], "m" -> [image], "n" -> [image], "o" -> [image], "p" -> [image], "q" -> [image], "r" -> [image], "s" -> [image], "t" -> [image], "u" -> [image], "v" -> [image], "w" -> [image], "x" -> [image], "y" -> [image], "z" -> [image]|>;
maxLen = 7;
wordList = ToLowerCase[Select[WordList[], StringLength[#] ≤ maxLen && LetterQ[#]&]];
SeedRandom[1234];
wordList = RandomSample[StringPadRight[wordList, maxLen]];
dataset = Dataset@Map[<|"Input" -> ImageAssemble[Lookup[imgMap, Characters[#]]], "Output" -> StringTrim[#]|>&, wordList];
Split the dataset into a test and a training set:
{testData, trainData} = TakeDrop[dataset, Ceiling[Length[dataset] / 10]];
Take a RandomSample of the training set:
RandomSample[trainData, 4]
chars = CharacterRange["a", "z"]
The decoder is a beam search decoder with a beam size of 50:
decoder = NetDecoder[{"CTCBeamSearch", chars, "BeamSize" -> 50}]
Define a net that takes an image and then treats the width dimension as a sequence dimension. A matrix whose rows are probability vectors over the width dimension is produced:
ocrNet = NetChain[{ConvolutionLayer[20, 3], BatchNormalizationLayer[], Ramp, PoolingLayer[2], ConvolutionLayer[15, 3], BatchNormalizationLayer[], Ramp, PoolingLayer[2], FlattenLayer[1], TransposeLayer[], GatedRecurrentLayer[19], GatedRecurrentLayer[19], NetMapOperator[LinearLayer[Length[chars] + 1]], SoftmaxLayer[]}, "Input" -> NetEncoder[{"Image", {63, 15}, "Grayscale"}], "Output" -> decoder]
Define a CTCLossLayer with a character NetEncoder attached to the target port:
loss = CTCLossLayer["Target" -> NetEncoder[{"Characters", chars}]]
Train the net using the CTC loss:
trainNet = NetTrain[ocrNet, trainData, LossFunction -> loss, MaxTrainingRounds -> 20, ValidationSet -> testData]
Evaluate the trained net on images from the test set:
testIms = Normal@testData[1 ;; 3, "Input"]
trainNet[testIms]
Obtain the top-5 decodings for an image, along with the negative log likelihood of each decoding:
trainNet[[image], {"TopNegativeLogLikelihoods", 5}]
See Also
SoftmaxLayer CrossEntropyLossLayer BasicRecurrentLayer LongShortTermMemoryLayer GatedRecurrentLayer NetGraph
Net Decoders: CTCBeamSearch
Wolfram Research (2018), CTCLossLayer, Wolfram Language function, https://reference.wolfram.com/language/ref/CTCLossLayer.html.