Optimal Control for One-link pendulum swing-up

Description

Trajectory Optimization

Controller Design

Controller Indexed by Time

Description:
Synthesis:
Result:

Controller Indexed by State

Description:
Synthesis:

Description

One-link pendulum swing-up

In one-link pendulum swing-up, a motor at the base of the pendulum swings a rigid arm from the downward stable equilibrium to the upright unstable equilibrium and balances the arm there. The task is challenging because the one-step cost function penalizes both the amount of torque used and the deviation of the current position from the goal, and the controller must minimize the total cost of the trajectory. The one-step cost function for this example is a weighted sum of the squared position error $\theta$ (the difference between the current angle and the goal angle) and the squared torque $\tau$, $L = 0.1\,\theta^2 T + \tau^2 T$, where 0.1 weights the position error relative to the torque penalty and $T$ is the time step of the simulation (0.01 s). There is no cost associated with the joint velocity.
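The cost described above can be written out as a short sketch. The weight 0.1 and the time step of 0.01 s come from the text; the function names, the choice of measuring the upright goal as 0 rad, and the scaling of both terms by the time step are assumptions made for illustration.

```python
import math

T = 0.01      # simulation time step in seconds (from the text)
W_POS = 0.1   # weight on the squared position error (from the text)
GOAL = 0.0    # assumed convention: the upright goal is measured as 0 rad


def one_step_cost(theta, torque, goal=GOAL, w_pos=W_POS, dt=T):
    """Weighted sum of squared position error and squared torque, scaled by dt.

    Joint velocity does not appear: the text states it carries no cost.
    """
    err = theta - goal
    return (w_pos * err * err + torque * torque) * dt


def trajectory_cost(thetas, torques):
    """Total cost of a trajectory: the sum of its one-step costs."""
    return sum(one_step_cost(th, u) for th, u in zip(thetas, torques))
```

Under the assumed convention the pendulum hanging straight down is at an angle of pi from the goal, so holding it there with zero torque costs `one_step_cost(math.pi, 0.0)` per step, while resting at the goal with zero torque costs nothing.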

Trajectory Optimization

Trajectory optimization in AMPL. Code here
Trajectory optimization in MATLAB. Code here

Controller Design

Controller Indexed by Time

Description:

A controller indexed by time takes the form
$latex2png equation$
where $latex2png equation$ is the driving torque, $latex2png equation$ and $latex2png equation$ form the optimal trajectory, and $latex2png equation$ is the feedforward optimal torque.
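A minimal sketch of one common form such a controller can take: the feedforward torque from the optimal trajectory plus feedback on the deviation from that trajectory. The gain names `p` and `v` are borrowed from the result caption on this page ("p=0 and v=0"); reading them as position and velocity gains, and everything else here (function name, argument names, sampling at a fixed step), is an assumption rather than the page's actual equation.

```python
def time_indexed_torque(t, theta, omega, theta_ref, omega_ref, u_ff,
                        p=0.0, v=0.0, dt=0.01):
    """Torque at time t from a stored trajectory sampled every dt seconds.

    theta_ref, omega_ref: reference position and velocity samples
    u_ff: feedforward torque samples
    p, v: assumed position and velocity feedback gains
    """
    # Pick the sample nearest to t, clamped to the stored trajectory.
    k = int(round(t / dt))
    k = max(0, min(k, len(u_ff) - 1))
    # Feedforward torque plus feedback on the tracking error.
    return u_ff[k] - p * (theta - theta_ref[k]) - v * (omega - omega_ref[k])
```

With `p=0` and `v=0` this reduces to replaying the feedforward torque open loop, whatever the measured state is.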

Synthesis:

Controller's structure

Result:

Dynamic simulation where p=0 and v=0

Controller Indexed by State

Description:

The controller takes the form
$latex2png equation$
where $latex2png equation$ is the driving torque, and $latex2png equation$ and $latex2png equation$ are the current position and velocity. To obtain the optimal control policy, we generate optimal trajectories from a grid of starting points and use the first $latex2png equation$ of each trajectory as the optimal control for the state at its starting point. Each trajectory is locally optimized using SNOPT. Information is exchanged between trajectories to enable convergence to globally optimal trajectories ¹.
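The grid construction described above can be sketched as follows. This only illustrates how the first torque of each trajectory becomes the policy entry for its starting state. The `optimize_trajectory` argument is a hypothetical stand-in for the SNOPT-based optimizer, `toy_optimizer` is a made-up placeholder with no physical meaning, and the exchange of information between trajectories is not modeled.

```python
def build_policy(positions, velocities, optimize_trajectory):
    """Map each grid state to the first torque of its optimized trajectory."""
    policy = {}
    for q in positions:
        for qd in velocities:
            # Hypothetical solver call: returns the torque sequence of the
            # trajectory that starts at state (q, qd).
            torques = optimize_trajectory(q, qd)
            policy[(q, qd)] = torques[0]
    return policy


def lookup(policy, q, qd):
    """Nearest-grid-point lookup (a simplification; the distance mixes
    position and velocity units, and interpolation is another option)."""
    key = min(policy, key=lambda s: (s[0] - q) ** 2 + (s[1] - qd) ** 2)
    return policy[key]


def toy_optimizer(q, qd):
    """Placeholder only: NOT a trajectory optimizer, just returns a list."""
    return [-(q + 0.5 * qd), 0.0]


grid = [-1.0, 0.0, 1.0]
policy = build_policy(grid, grid, toy_optimizer)
```

A call such as `lookup(policy, 0.9, 0.1)` returns the stored torque for the nearest grid state, here `(1.0, 0.0)`. A finer grid, or interpolation between neighboring entries, would give a smoother policy.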

Synthesis:

Optimal policy

Value function

Optimal trajectory and policy

1. Atkeson, C. G., and Stephens, B. J., "Random Sampling of States in Dynamic Programming," IEEE Transactions on Systems, Man, and Cybernetics, Part B, vol. 38, no. 4, pp. 924-929, Aug. 2008.

Last Update: 2008-10-20-22:54:53
