Wolfram Language & System Documentation Center

LQRegulatorTrain

LQRegulatorTrain[espec,wts,tspec]

trains the regulator that minimizes the quadratic cost with weights wts for the environment specification espec by simulating it over the time specification tspec.

LQRegulatorTrain[espec,wts,g,tspec]

starts with the value g for the regulator gain.

LQRegulatorTrain[…,"prop"]

gives the value of the property "prop".

Details

LQRegulatorTrain trains the regulator by simulation and is useful when the model of the environment is not available or is varying.
The regulator is also known as the agent.
The regulator starts with a stabilizing gain, applies the resulting control action, and updates the gain after observing the state values that the environment produces in response to the inputs commanded by the regulator.

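The regulator minimizes the quadratic cost $\sum_{k=0}^{\infty}\left(x(k)^{\top} q\, x(k) + u(k)^{\top} r\, u(k) + 2\, x(k)^{\top} p\, u(k)\right)$, where the cross-coupling term is absent if p is not given.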
The regulator is computed by iteratively solving for the Q function. The Q function is also known as the quality function or the action-value function.
The Q function is given by $Q(x(k),u(k)) = x(k)^{\top} q\, x(k) + u(k)^{\top} r\, u(k) + 2\, x(k)^{\top} p\, u(k) + V^{*}(x(k+1))$.
The Q function is the cost for taking action $u(k)$ when the environment's state is $x(k)$ at time instant $k$, and from then on taking the optimal action resulting in the optimal cost $V^{*}(x(k+1))$.
The critic in the regulator observes batches of action and state values, computes a better estimate of the Q function, and the actor updates the control action.
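The actor's update can be sketched in closed form. The notation here is an assumption, not taken from this page: the Q function estimate is taken to be a quadratic form in the stacked state and action with a symmetric kernel matrix $H$, partitioned into state and action blocks, and the control action is assumed to follow the convention $u = -\kappa\, x$.

```latex
\begin{align*}
Q(x,u) &=
\begin{pmatrix} x \\ u \end{pmatrix}^{\!\top}
\begin{pmatrix} H_{xx} & H_{xu} \\ H_{ux} & H_{uu} \end{pmatrix}
\begin{pmatrix} x \\ u \end{pmatrix},
\qquad H_{ux} = H_{xu}^{\top},\quad H_{uu} \succ 0 \\[4pt]
\frac{\partial Q}{\partial u} &= 2\,H_{ux}\,x + 2\,H_{uu}\,u = 0
\;\Longrightarrow\;
u = -H_{uu}^{-1} H_{ux}\,x \\[4pt]
\kappa &= H_{uu}^{-1} H_{ux}
\end{align*}
```

Under these assumptions the improved gain depends only on the estimated kernel, which is why no model of the environment is needed once the critic has produced its estimate.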

The environment specification espec is the simulation model of the environment together with the x(k) and u(k) specifications.
The environment specification espec can be specified as a state update function f with x(k+1) = f(x(k),u(k),k):

	f	user-defined function
	Function	pure function
	CompiledFunction	compiled function

As library functions that load an external state update function with signature void f(double* x, double* u, int k), where x and u are pointers to x(k) and u(k):

	LibraryFunction	function loaded from a Wolfram library
	ForeignFunction	function loaded from a C library

As models:

	StateSpaceModel	linear control action and linear state
	AffineStateSpaceModel	linear control action and nonlinear state
	NonlinearStateSpaceModel	nonlinear control action and nonlinear state
	SystemModel	general system model

As a device or a more detailed specification:

	"Device"	external device
	<|…|>	detailed system specification given as an Association

The detailed system specification can have the following keys:

	"FeedbackInputs"	the inputs to use for feedback
	"InitialStateValues"	the initial state values
	"InputModel"	any of the environment forms listed above
	"InputOperatingValues"	operating values of the inputs
	"PertubationSize"	size of the feedback input perturbation
	"SamplingPeriod"	sampling period for devices and continuous-time systems
	"StateOperatingValues"	operating values of the states

The weights wts can have the following forms:

	{q,r}	cost function with no cross-coupling
	{q,r,p}	cost function with cross-coupling matrix p

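As a rough illustration of how the three weights combine, the following sketch writes the per-step cost term by term and as a single quadratic form in the stacked vector of state and input. The numeric weights and the names stageCost, blockCost and w are hypothetical, and the factor 2 on the cross-coupling term is an assumption that follows the usual linear-quadratic convention.

```wolfram
(* hypothetical weights for an environment with two states and one input *)
q = {{2, 0}, {0, 1}};
r = {{3}};
p = {{1/2}, {0}};

(* per-step cost written term by term; the factor 2 is an assumed convention *)
stageCost[x_, u_] := x . q . x + u . r . u + 2 x . p . u

(* the same cost as one quadratic form in the stacked vector {x, u} *)
w = ArrayFlatten[{{q, p}, {Transpose[p], r}}];
blockCost[x_, u_] := Join[x, u] . w . Join[x, u]

(* the difference simplifies to 0 *)
Simplify[stageCost[{x1, x2}, {u1}] - blockCost[{x1, x2}, {u1}]]
```

With p omitted, the off-diagonal blocks of w are zero and the state and input are penalized independently.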
The time specification tspec can have the following forms:

	k_max	number of simulations
	{b,k_max}	specify the batch length b as well

The batch length b specifies the number of simulations after which the control action is updated.
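A minimal sketch of the {b,k_max} form, reusing the environment from the basic example. The batch length 10 and the 60 simulations are arbitrary values chosen for illustration and have not been checked for convergence.

```wolfram
(* environment from the basic example: x(k+1) = 0.55 x(k) + 1.5 u(k) *)
espec = <|"InputModel" -> (0.55 #1 + 1.5 #2 &), "InitialStateValues" -> {5}|>;

(* 60 simulations, with the control action updated after every 10 of them;
   both numbers are illustrative assumptions *)
cd = LQRegulatorTrain[espec, {{{1}}, {{1}}}, {10, 60}];

(* the batch length recorded in the returned data object *)
cd["BatchLength"]
```

Whether a given batch length is long enough can be checked afterward with cd["ConvergedQ"].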
LQRegulatorTrain[…, "Data"] returns a SystemsModelControllerData object cd that can be used to extract additional properties using the form cd["prop"].
LQRegulatorTrain[…, "prop"] can be used to directly give the value of cd["prop"].
Possible values for properties "prop" include:

	"BatchLength"	number of iterations before the gain is updated
	"ConvergedQ"	whether the gain values have converged
	"Design"	type of controller design
	"FeedbackGains"	final gain matrix
	"FeedbackGainsSequence"	sequence of gain matrices
	"FeedbackInputs"	inputs used for feedback
	"FeedbackInputsSequence"	applied feedback input sequence
	"InputCount"	number of inputs
	"InputModel"	input model
	"KernelMatrix"	kernel matrix of the Q function
	"SamplingPeriod"	sampling period
	"SimulationRange"	simulation range
	"StateCount"	number of states
	"StateResponse"	state response

Examples

Basic Examples (1)

An environment specification:

Wolfram Language code: espec = <|"InputModel" -> (0.55#1 + 1.5 #2&), "InitialStateValues" -> {5}|>;

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {{{1}}, {{1}}}, 20]

The state response:

Wolfram Language code: ListStepPlot[cd["StateResponse"], IconizedObject[«opts»]]

The feedback gain sequence:

Wolfram Language code: ListStepPlot[Flatten[cd["FeedbackGainsSequence"], 1], IconizedObject[«opts»]]

Scope (16)

Environments (11)

An environment specified as a function:

Wolfram Language code: env[x_, u_, k_] := x + u

The complete environment specification:

Wolfram Language code: espec = <|"InputModel" -> env, "InitialStateValues" -> {1}|>;

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {{{1}}, {{1}}}, 30]

The state response:

Wolfram Language code: ListStepPlot[cd["StateResponse"], PlotRange -> All, DataRange -> cd["SimulationRange"]]

A multi-state environment:

Wolfram Language code: env[x_, u_, k_] := {{-0.75, -0.2}, {0.5, 0}}.x + {{1}, {-1}}.u

Wolfram Language code: espec = <|"InputModel" -> env, "InitialStateValues" -> {-0.1, 0.1}|>;

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {IdentityMatrix[2], {{2}}}, 25]

The state response:

Wolfram Language code: ListStepPlot[cd["StateResponse"], PlotRange -> {{0, 12}, All}, DataRange -> cd["SimulationRange"]]

An environment specified using a pure function:

Wolfram Language code: env = Function[{x, u, k}, {{0.3, 0.1}, {-0.4, 0}}.x + {{1}, {-1}}.u]

Wolfram Language code: espec = <|"InputModel" -> env, "InitialStateValues" -> {1, -1}|>;

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {IdentityMatrix[2], {{2}}}, 50]

The state response:

Wolfram Language code: ListStepPlot[cd["StateResponse"], PlotRange -> {{0, 10}, All}, DataRange -> cd["SimulationRange"]]

An environment specified using a compiled function:

Wolfram Language code: env = Compile[{{x, _Real, 1}, {u, _Real, 1}, k}, {{0.3, 0.1}, {-0.4, 0}}.x + {{1}, {-1}}.u]

Wolfram Language code: espec = <|"InputModel" -> env, "InitialStateValues" -> {1.0, -1.0}|>;

Wolfram Language code: cd = LQRegulatorTrain[espec, {IdentityMatrix[2], {{2}}}, 50]

Wolfram Language code: ListStepPlot[cd["StateResponse"], PlotRange -> {{0, 10}, All}, DataRange -> cd["SimulationRange"]]

Create a library function to provide the environment:

Wolfram Language code:

Needs["SymbolicC`"]
Needs["CCompilerDriver`"]

The core environment code:

Wolfram Language code: model = ToCCodeString[IconizedObject[«model»]]

The wrapper function:

Wolfram Language code: modelWrapper = ToCCodeString[IconizedObject[«modelwrapper»]]

Compile the code and create a library:

Wolfram Language code: modelObj = CreateObjectFile[model, "model", "TargetDirectory" -> $TemporaryDirectory]

Wolfram Language code: modelWrapperObj = CreateObjectFile[modelWrapper, "modelwrapper", "TargetDirectory" -> $TemporaryDirectory]

Wolfram Language code: modelLib = CreateLibrary[{modelObj, modelWrapperObj}, "modelLib", "TargetDirectory" -> $TemporaryDirectory]

Load the function in the library:

Wolfram Language code: env = LibraryFunctionLoad[modelLib, "model_wrapper", {{Real, 1, "Shared"}, {Real, 1, "Shared"}, Integer}, "Void"]

The complete environment specification:

Wolfram Language code: espec = <|"InputModel" -> env, "InitialStateValues" -> {1.0}|>;

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {{{1}}, {{1}}}, 35]

The state response:

Wolfram Language code: ListStepPlot[cd["StateResponse"], PlotRange -> {{0, 20}, All}, DataRange -> cd["SimulationRange"]]

Create a foreign function to provide the environment:

Wolfram Language code:

Needs["SymbolicC`"]
Needs["CCompilerDriver`"]

The C code:

Wolfram Language code: code = ToCCodeString[SymbolicC`CFunction[...]]

Compile the code and create a library:

Wolfram Language code: modelLib = CreateLibrary[code, "modelLib", "TargetDirectory" -> $TemporaryDirectory]

Load the function in the library:

Wolfram Language code: env = ForeignFunctionLoad[modelLib, "model", {"RawPointer"::["CDouble"], "RawPointer"::["CDouble"], "CInt"} -> "Void"]

The complete environment specification:

Wolfram Language code: espec = <|"InputModel" -> env, "InitialStateValues" -> {2.0}|>;

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {{{1}}, {{1}}}, 20]

The state response:

Wolfram Language code: ListStepPlot[cd["StateResponse"], PlotRange -> {{0, 10}, All}, DataRange -> cd["SimulationRange"]]

An environment with nonzero operating points:

Wolfram Language code:

espec = <|"InputModel" -> (-0.85(#1 - 1) - 1.5 (#2 - 0.5)&), "StateOperatingValues" -> {1}, "InputOperatingValues" -> {0.5}, "InitialStateValues" -> {5}|>;

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {{{1}}, {{1}}}, 30]

The state settles at its operating value of 1:

Wolfram Language code:

Show[ListStepPlot[cd["StateResponse"], PlotRange -> All, DataRange -> cd["SimulationRange"]], Plot[1, {$, 0, 30}, IconizedObject[«opts»]]]

The input settles at its operating value of 0.5:

Wolfram Language code: ListStepPlot[cd["FeedbackInputsSequence"], PlotRange -> All, DataRange -> cd["SimulationRange"]]

An environment specified using a state-space model:

Wolfram Language code: espec = <|"InputModel" -> StateSpaceModel[{{{0.6}}, {{2}}}, SamplingPeriod -> 1], "InitialStateValues" -> {1}|>;

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {{{1}}, {{1}}}, 20]

The state response:

Wolfram Language code: ListStepPlot[cd["StateResponse"], DataRange -> cd["SimulationRange"], PlotRange -> All]

An environment specified using a nonlinear state-space model:

Wolfram Language code:

espec = <|"InputModel" -> NonlinearStateSpaceModel[{{u + 0.1*x + Sin[x]}, {x}}, x, u, SamplingPeriod -> 1], "InitialStateValues" -> {1}|>;

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {{{1}}, {{1}}}, {{1}}, 20]

The state response:

Wolfram Language code: ListStepPlot[cd["StateResponse"], DataRange -> cd["SimulationRange"], PlotRange -> All]

An environment specified using a system model:

Wolfram Language code:

espec = <|"InputModel" -> CreateSystemModel["env", StateSpaceModel[{{{-1}}, {{1}}}]], "InitialStateValues" -> {1}, "SamplingPeriod" -> 0.01|>

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {{{1}}, {{1}}}, 250]

The state response:

Wolfram Language code: ListStepPlot[cd["StateResponse"], PlotRange -> All, DataRange -> 0.01 cd["SimulationRange"]]

Create an environment on a microcontroller:

Wolfram Language code: Needs["MicrocontrollerKit`"]

Wolfram Language code: MicrocontrollerEmbedCode[IconizedObject[«sys»], IconizedObject[«μc»], "/dev/ttyACM0"]

The environment specification:

Wolfram Language code: espec = <|"InputModel" -> "Device", "Connection" -> "Serial", "Port" -> "/dev/ttyACM0", "SamplingPeriod" -> 0.75|>;

Train the regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, {{{1}}, {{1}}}, 40]

The state response:

Wolfram Language code: ListStepPlot[cd["StateResponse"], PlotRange -> All, DataRange -> cd["SimulationRange"]]

Properties (5)

LQRegulatorTrain returns a SystemsModelControllerData object:

Wolfram Language code: LQRegulatorTrain[<|"InputModel" -> (0.1 #1 + #2&), "InitialStateValues" -> {1}|>, {{{1}}, {{1}}}, 15]

The data object can be used to obtain additional properties:

Wolfram Language code: SystemsModelControllerData[…]["FeedbackInputValues"]

Wolfram Language code: SystemsModelControllerData[…]["BatchLength"]

Obtain a list of properties:

Wolfram Language code: SystemsModelControllerData[…][{"FeedbackGains", "SimulationRange", "SimulationModel"}]

Obtain all properties:

Wolfram Language code: SystemsModelControllerData[…]["PropertyDataset"]

Obtain a property directly:

Wolfram Language code: LQRegulatorTrain[<|"InputModel" -> (0.1 #1 + #2&), "InitialStateValues" -> {1}|>, {{{1}}, {{1}}}, 15, "FeedbackGains"]

Applications (1)

Chemical Systems (1)

Train an agent to regulate a CSTR decomposition process:

A simulation model of the CSTR:

Wolfram Language code: IconizedObject[«CSTR»]

Create a library from the simulation model:

Wolfram Language code: Needs["CCompilerDriver`"]

Wolfram Language code: cstrLib = CreateLibrary[IconizedObject[«CSTR»], "cstrLib", "TargetDirectory" -> $TemporaryDirectory]

The environment specification:

Wolfram Language code:

espec = <|"InputModel" -> ForeignFunctionLoad[cstrLib, "update_cstr", {"RawPointer"::["CDouble"], "RawPointer"::["CDouble"], "CInt"} -> "Void"], "InitialStateValues" -> {0, -1}|>

Train a regulator:

Wolfram Language code: cd = LQRegulatorTrain[espec, wts = {{{10^3, 0}, {0, 10^3}}, {{1}}}, {{1, -1}}, 200]

The state response:

Wolfram Language code:

p1 = Table[ListStepPlot[sr, PlotRange -> All, DataRange -> {0, (200 - 1)}0.1], {sr, cd["StateResponse"]}];
GraphicsRow[p1]

The closed-loop system with the regulator computed using an explicit model:

Wolfram Language code: csys = LQRegulatorGains[IconizedObject[«ssm»], wts, "ClosedLoopSystem"]

Compare the simulation- and model-based state responses:

Wolfram Language code:

p2 = Table[Plot[sr, {t, 0, 20}, PlotRange -> All, PlotStyle -> ColorData[116, 2]], {sr, StateResponse[{csys, {0, -1}}, {0, 0}, {t, 0, 20}]}];

Wolfram Language code: Legended[GraphicsRow[{Show[p1[[1]], p2[[1]]], Show[p1[[2]], p2[[2]]]}], IconizedObject[«leg»]]

Properties & Relations (2)

The gains typically converge to the optimal regulator gains:

Wolfram Language code: espec = <|"InputModel" -> Function[{x, u, k}, 0.7x + 0.5 u ], "InitialStateValues" -> {RandomReal[{-1, 1}]}|>;

Wolfram Language code: κlist = LQRegulatorTrain[espec, IconizedObject[«wts»], 30, "FeedbackGainsSequence"]

The optimal gain:

Wolfram Language code: StateSpaceModel[Thread[{x[k + 1]} == espec["InputModel"][{x[k]}, {u[k]}, k]], x[k], u[k], x[k], k]

Wolfram Language code: κ = LQRegulatorGains[%, IconizedObject[«wts»]]

Compare the iteratively computed gain and the optimal gain:

Wolfram Language code: Show[ListStepPlot[Flatten[κlist, 1], IconizedObject[«opts»]], Plot[κ, {$, 0, 30}, IconizedObject[«opts»]]]
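The convergence of the gain sequence can be illustrated with a model-based analogue. This is a sketch of classical policy iteration for a scalar linear environment, not the procedure LQRegulatorTrain uses internally. It assumes the control convention u(k) = -κ x(k) and the weights q = r = 1; the weights in the example above are iconized, so these values are hypothetical, as are the names improveGain and gainSequence.

```wolfram
(* one policy-iteration step for x(k+1) = a x(k) + b u(k) with u(k) = -κ x(k) *)
improveGain[{a_, b_, q_, r_}][κ_] := Module[{ac = a - b κ, p},
  (* cost of keeping the gain κ forever; requires Abs[ac] < 1 *)
  p = (q + r κ^2)/(1 - ac^2);
  (* gain that is greedy with respect to that cost *)
  a b p/(r + b^2 p)]

(* start from the gain 0, which is stabilizing because Abs[0.7] < 1 *)
gainSequence = NestList[improveGain[{0.7, 0.5, 1, 1}], 0., 8]
```

Under these assumptions the sequence settles at about 0.390, which matches the value from LQRegulatorGains for the same model and weights. The simulation-based gains typically follow a similar pattern, with the critic's estimate taking the place of the explicit cost p.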

The Q function can be computed from the kernel matrix:

Wolfram Language code:

{a, b} = {{{0.4}}, {{1}}};
{q, r} = {{{1}}, {{1}}};

Wolfram Language code:

cd = LQRegulatorTrain[<|"InputModel" -> StateSpaceModel[{a, b}, SamplingPeriod -> 1], "InitialStateValues" -> {RandomReal[{-1, 1}]}|>, {q, r}, 20]

The Q function:

Wolfram Language code: Q = {x, u}.cd["KernelMatrix"].{x, u}//Expand

The optimal value function:

Wolfram Language code: Q /. Thread[{u} -> -cd["FeedbackGains"].{x}]

The same result can be obtained using DiscreteRiccatiSolve:

Wolfram Language code: {x}.DiscreteRiccatiSolve[{a, b}, {q, r}].{x}
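When the model is known, the kernel matrix can also be sketched directly from the Riccati solution. The block formula below is the standard linear-quadratic result rather than something stated on this page, and it assumes no cross-coupling and the ordering {x, u} used above. The names ric, kernel, n and gain are introduced only for this illustration.

```wolfram
{a, b} = {{{0.4}}, {{1}}};
{q, r} = {{{1}}, {{1}}};

(* solution of the discrete algebraic Riccati equation *)
ric = DiscreteRiccatiSolve[{a, b}, {q, r}];

(* model-based kernel of the Q function in the stacked variable {x, u} *)
kernel = ArrayFlatten[{
    {q + Transpose[a] . ric . a, Transpose[a] . ric . b},
    {Transpose[b] . ric . a, r + Transpose[b] . ric . b}}];

(* gain obtained from the input block and the input-state block of the kernel *)
n = Length[a];
gain = Inverse[kernel[[n + 1 ;;, n + 1 ;;]]] . kernel[[n + 1 ;;, ;; n]]

(* model-based gain for comparison *)
LQRegulatorGains[StateSpaceModel[{a, b}, SamplingPeriod -> 1], {q, r}]
```

Both gains come out at about 0.208 for these values. A trained cd["KernelMatrix"] that has converged would be expected to be close to kernel.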

Possible Issues (2)

The initial gain must be stabilizing, otherwise the state response will blow up:

Wolfram Language code: espec = <|"InputModel" -> Function[{x, u, k}, -0.9x + 2 u ], "InitialStateValues" -> {RandomReal[{-1, 1}]}|>;

Wolfram Language code: LQRegulatorTrain[espec, IconizedObject[«wts»], {{2}}, IconizedObject[«tspec»], "StateResponse"]

The state response with a stabilizing gain:

Wolfram Language code: LQRegulatorTrain[espec, IconizedObject[«wts»], {{0.1}}, IconizedObject[«tspec»], "StateResponse"]

The regulator may not converge to the optimal solution:

Wolfram Language code: espec = <|"InputModel" -> Function[{x, u, k}, 3 x + u], "InitialStateValues" -> {RandomReal[{-2, 2}]}|>;

Wolfram Language code: LQRegulatorTrain[espec, IconizedObject[«wts»], IconizedObject[«tspec»], {"FeedbackGains", "ConvergedQ"}]

The optimal solution:

Wolfram Language code: LQRegulatorGains[StateSpaceModel[{{{3}}, {{1}}}, SamplingPeriod -> 1], IconizedObject[«wts»]]//N

Try adjusting the initial gain:

Wolfram Language code: LQRegulatorTrain[espec, IconizedObject[«wts»], {{5}}, IconizedObject[«tspec»], {"FeedbackGains", "ConvergedQ"}]
