Poster Template - Intelligent Systems Center

Stochastic Optimal Control of Unknown Linear Networked Control
System in the Presence of Random Delays and Packet Losses
Faculty Advisor: Dr. Jagannathan Sarangapani, ECE Department

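Inverted pendulum state matrix (for $\dot{x} = Ax + Bu$):

$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & -0.1818 & 2.6727 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -0.4545 & 31.1818 & 0 \end{bmatrix}$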

Investigate the effects of delays and packet losses on the stability of
the NCS with unknown dynamics

where $r(z_k, u_k) = z_k^T Q_z z_k + u_k^T R_z u_k$, and the right-hand blocks of $H^i$ are $H_{zu}^i = A_z^T P_i B_z$ and $H_{uu}^i = R_z + B_z^T P_i B_z$.

4. Define the update law to tune the H matrix online in the least-squares sense

Networked control can reduce the installation costs and increase
productivity through the use of wireless communication technology

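1) Vectorize the H matrix: $h = \mathrm{vec}(H)$
2) Update law: $h_{i+1} = \arg\min_{h_{i+1}} \int \left| h_{i+1}^T \bar{w}(z_k) - d(\bar{w}(z_k), H^i) \right|^2 dz_k$, where $\bar{w}(z_k)$ is the quadratic basis vector built from $w(z_k) = [z_k^T\ \ u_i^{*T}(z_k)]^T$

1) Stability: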

5. Develop the stochastic suboptimal control: $K_i = (R_z + B_z^T P_i B_z)^{-1} B_z^T P_i A_z = (H_{uu}^i)^{-1} H_{uz}^i$, $u_i(z_k) = -K_i z_k$
6. Convergence: when $i \to \infty$, $Q_i(z, u) \to Q^*(z, u)$, $H^i \to H^*$ and $K_i \to K^*$ at the same time.
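The iteration on the H matrix and the gain $K = (H_{uu})^{-1} H_{uz}$ can be illustrated with a small model-based sketch. It assumes the system matrices are known and deterministic, which the poster's Q-learning scheme does not require; the 2-state example system and all function names are hypothetical and serve only to show how the blocks of H produce the gain.

```python
import numpy as np

def h_matrix(A, B, Qz, Rz, P):
    """Assemble H from P using the block formulas for H_zz, H_zu, H_uz, H_uu."""
    return np.block([
        [Qz + A.T @ P @ A, A.T @ P @ B],
        [B.T @ P @ A, Rz + B.T @ P @ B],
    ])

def gain_from_h(H, n):
    """K = (H_uu)^{-1} H_uz, so that the control is u = -K z."""
    H_uz = H[n:, :n]
    H_uu = H[n:, n:]
    return np.linalg.solve(H_uu, H_uz)

def q_value_iteration(A, B, Qz, Rz, iters=500):
    """Iterate H_i -> K_i -> P_{i+1} starting from P_0 = 0."""
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(iters):
        H = h_matrix(A, B, Qz, Rz, P)
        K = gain_from_h(H, n)
        # Minimizing the quadratic form over u gives the next value matrix:
        # P_{i+1} = H_zz - H_zu (H_uu)^{-1} H_uz
        P = H[:n, :n] - H[:n, n:] @ K
    return H, K, P

# Hypothetical 2-state, 1-input system (not the poster's NCS model).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Qz = np.eye(2)
Rz = np.array([[1.0]])

H, K, P = q_value_iteration(A, B, Qz, Rz)
closed_loop = A - B @ K  # closed-loop matrix under u = -K z
```

In this sketch the converged P satisfies the discrete-time Riccati equation, which is one way to check that $H^i$ and $K_i$ have settled.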

Approximate dynamic programming (ADP) techniques aim to solve
optimal control problems of complex systems forward in time,
without knowledge of the system dynamics.

2. Set up the stochastic Q-function: $Q(z_k, u_k) = E[r(z_k, u_k) + J_{k+1}] = E\{[z_k^T\ \ u_k^T]\, H_k\, [z_k^T\ \ u_k^T]^T\}$

3. Use the adaptive estimator to represent the Q-function: $Q(z_k, u_k) = w_k^T H_k w_k = h_k^T \bar{w}_k$
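The identity between the quadratic form $w^T H w$ and the linear-in-parameters form $h^T \bar{w}$ can be checked numerically. The sketch below assumes one particular convention, in which the basis holds the products $w_i w_j$ for $i \le j$ and the off-diagonal entries of the symmetric H are doubled in h; the poster only states $h = \mathrm{vec}(H)$, so this ordering and the helper names are assumptions.

```python
import numpy as np

def quad_basis(w):
    """Quadratic polynomial basis: all products w_i * w_j with i <= j."""
    l = w.size
    return np.array([w[i] * w[j] for i in range(l) for j in range(i, l)])

def vec_sym(H):
    """Parameter vector matching quad_basis for a symmetric H.

    Diagonal entries appear once; off-diagonal entries are doubled because
    w_i * w_j occurs twice in the quadratic form w^T H w.
    """
    l = H.shape[0]
    return np.array([H[i, j] if i == j else 2.0 * H[i, j]
                     for i in range(l) for j in range(i, l)])

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
H = M @ M.T                      # hypothetical symmetric H, for illustration only
w = rng.standard_normal(4)       # stands in for w_k = [z_k; u_k]

lhs = w @ H @ w                  # w^T H w
rhs = vec_sym(H) @ quad_basis(w) # h^T w_bar
```

With l entries in w, the basis has l(l+1)/2 entries, which is the number of parameters the estimator has to tune.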

Figure 1 The wireless networked control system

The proposed approach to optimal controller design combines
Q-learning with an adaptive estimator (AE), whereas the
suboptimal controller design uses the Q-learning scheme alone.

The delays and packet losses are incorporated in the dynamic
model which will be used for the controller development

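Networked control system representation

$z_{k+1} = A_{zk} z_k + B_{zk} u_k, \quad y_k = C_z z_k$

$A_{zk} = \begin{bmatrix} A_s & Ip_{k-1} B_1^k & \cdots & Ip_{k-i} B_i^k & \cdots & Ip_{k-d+1} B_{d-1}^k \\ 0 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & I_m & \cdots & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & & \vdots \\ 0 & 0 & \cdots & \cdots & I_m & 0 \end{bmatrix}, \quad B_{zk} = \begin{bmatrix} Ip_k B_0^k \\ I_m \\ 0 \\ \vdots \\ 0 \end{bmatrix}$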

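where $z_k = [x_k^T\ \ u_{k-1}^T\ \cdots\ u_{k-d+1}^T]^T$ is the augmented state, and the control input is $u_k = -K_k z_k = -(H_k^{uu})^{-1} H_k^{uz} z_k$.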
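A minimal sketch of how an augmented model of this kind can be assembled: the state stacks the plant state with a fixed number of past inputs, the first block row carries the weighted input matrices, and identity blocks shift the stored inputs by one slot. The function name, the count `p` of stored inputs, and the numerical values are hypothetical; the weights `ip` simply multiply the input matrices as in the first block row of $A_{zk}$.

```python
import numpy as np

def augmented_matrices(A_s, B_list, ip):
    """Build A_z and B_z for z_k = [x_k, u_{k-1}, ..., u_{k-p}].

    B_list = [B_0, B_1, ..., B_p] are the input matrices at step k,
    ip     = [ip_0, ip_1, ..., ip_p] are scalar weights multiplying them.
    """
    n, m = B_list[0].shape
    p = len(B_list) - 1
    A_z = np.zeros((n + p * m, n + p * m))
    B_z = np.zeros((n + p * m, m))
    A_z[:n, :n] = A_s
    # First block row: effect of the stored inputs on the plant state.
    for i in range(1, p + 1):
        cols = slice(n + (i - 1) * m, n + i * m)
        A_z[:n, cols] = ip[i] * B_list[i]
    # Shift register: each stored input moves one slot further back.
    for i in range(1, p):
        rows = slice(n + i * m, n + (i + 1) * m)
        cols = slice(n + (i - 1) * m, n + i * m)
        A_z[rows, cols] = np.eye(m)
    # Current input acts on the plant and is stored in the first slot.
    B_z[:n, :] = ip[0] * B_list[0]
    if p >= 1:
        B_z[n:n + m, :] = np.eye(m)
    return A_z, B_z

# Hypothetical example with n = 2 states, m = 1 input, p = 2 stored inputs.
A_s = np.array([[1.0, 0.1], [0.0, 1.0]])
B_list = [np.array([[0.0], [0.05]]),
          np.array([[0.005], [0.05]]),
          np.array([[0.0], [0.0]])]
ip = [1.0, 1.0, 1.0]
A_z, B_z = augmented_matrices(A_s, B_list, ip)

z = np.array([1.0, -1.0, 0.3, 0.7])  # [x_k, u_{k-1}, u_{k-2}]
u = np.array([0.5])                   # u_k
z_next = A_z @ z + B_z @ u
```

After one step, the slot that held $u_{k-1}$ holds $u_k$ and the older input has moved down, which is what makes the delayed inputs available to the controller as part of the state.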

Figure 2 depicts a block diagram representation:

and $u_k \to u_k^*$

Figure 3 presents the block diagram of the AE-based
stochastic optimal regulator for the NCS.
Block diagram elements: Plant, Sensor, Controller, Wireless Network (delays sc(t) and ca(t), packet losses $Ip(t)$), time-varying $A_{zk}$ and $B_{zk}$, Adaptive Estimator of the $Q(z_k, u_k)$ function with cost function $J_k = w_k^T H_k w_k = h_k^T \bar{w}_k$, and control input $u(z_k) = -(H_k^{uu})^{-1} H_k^{uz} z_k$.

Figure 2 Block diagram of the networked control system

Plot panels (x-axis: Time (Sec); legend: Q-learning suboptimal control, Proposed AE optimal control):
System total costs with Q-learning suboptimal control and proposed AE optimal control with unknown dynamics (y-axis scale: $\times 10^5$)
Control inputs with Q-learning suboptimal control and proposed AE optimal control with unknown dynamics

The proposed Q-learning based suboptimal and AE-based
optimal control designs for NCS with unknown dynamics in the
presence of random delays and packet losses outperform
a traditional controller.

Both Q-learning based suboptimal control and AE-based
optimal control can keep the NCS stable.

The proposed AE-based optimal control is more effective than
the proposed Q-learning based suboptimal control.

Block diagram labels: Actuator; Linear networked control system with unknown dynamics, $z_{k+1} = A_{zk} z_k + B_{zk} u(z_k)$, $z_k \in \mathbb{R}^{n+(d-1)m}$

AE-based Stochastic Optimal Control (2)

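$B_i^k = \int_{\tau_i^k - iT}^{\tau_{i-1}^k - (i-1)T} e^{A(T-s)}\, ds\, B \cdot \mathbf{1}(T + \tau_{i-1}^k - \tau_i^k) \cdot \mathbf{1}(\tau_i^k - iT)$,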

then $J_k \to J_k^*$

CONCLUSIONS

$k \to \infty$, $z_k \to 0$, $\hat{h}_k \to h_k$

As shown in Figure 6-(a), the proposed AE-based optimal controller minimizes the cost-to-go
$J_k = E\left[\sum_{i=k}^{\infty} (z_i^T Q_z z_i + u_i^T R_z u_i)\right]$ better than the proposed Q-learning suboptimal controller.
In Figure 6-(b), the proposed AE-based optimal control forces the NCS states to converge to zero
more quickly than the Q-learning suboptimal control. This indicates that the proposed AE-based
optimal control is more effective than the Q-learning suboptimal control.
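The cost-to-go compared in Figure 6 can be estimated from a recorded trajectory. The sketch below assumes a single finite run and sums the stage costs $z_i^T Q_z z_i + u_i^T R_z u_i$ from each step to the end of the run; the expectation would be approximated by averaging this quantity over many runs. The function name and the toy data are hypothetical.

```python
import numpy as np

def empirical_cost_to_go(z_traj, u_traj, Qz, Rz):
    """Return J_k for every k along one finite trajectory.

    J_k = sum over i >= k of (z_i^T Qz z_i + u_i^T Rz u_i).
    """
    stage = np.array([z @ Qz @ z + u @ Rz @ u
                      for z, u in zip(z_traj, u_traj)])
    # Reverse cumulative sum turns stage costs into tail sums.
    return np.cumsum(stage[::-1])[::-1]

# Hypothetical three-step trajectory.
z_traj = np.array([[1.0, 0.0], [0.5, 0.0], [0.0, 0.0]])
u_traj = np.array([[1.0], [0.0], [0.0]])
Qz = np.eye(2)
Rz = 2.0 * np.eye(1)

J = empirical_cost_to_go(z_traj, u_traj, Qz, Rz)
```

Comparing `J[0]` between two controllers run on the same initial condition gives a total-cost comparison of the kind plotted for the two designs.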

5. Determine the AE stochastic optimal control input

$Ip_{k-i} = \begin{cases} 0, & \text{if } u_{k-i} \text{ was received during } [kT, (k+1)T) \\ 1, & \text{if } u_{k-i} \text{ was lost during } [kT, (k+1)T) \end{cases}$

6. Convergence: when $k \to \infty$, $z_k \to 0$ and $\hat{h}_k \to h_k$, then $J_k \to J_k^*$ and $u_k \to u_k^*$.

$i = 1, 2, \ldots, d-1$

$C_z = [C\ \ 0\ \cdots\ 0]$

Figure 6 Optimal performance

Networked Control System Model

4. Define the update law to tune the approximated H matrix

1) Represent residual error: $e_{hk} = r(z_{k-1}, u_{k-1}) + h_k^T \Delta W_{k-1}$, where $\Delta W_{k-1} = \bar{w}_k - \bar{w}_{k-1}$ and $r(z_{k-1}, u_{k-1}) = z_{k-1}^T Q_z z_{k-1} + u_{k-1}^T R_z u_{k-1}$
2) Update law for the time-varying matrix H: $h_{k+1} = \Delta W_k (\Delta W_k^T \Delta W_k)^{-1} (\alpha_h e_{hk}^T - r^T(z_k, u_k))$, where $\alpha_h$ is a constant and $0 < \alpha_h < 1$
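A minimal sketch of one step of an update of this type, assuming the residual error has the form e = r(previous step) + h^T dW_prev and the new parameter vector is chosen so that the next residual equals a fraction alpha of the current one. The names `ae_update` and `alpha` and all numerical values are hypothetical, and a single regression vector is used per step.

```python
import numpy as np

def ae_update(h, dW_prev, dW, r_prev, r_now, alpha):
    """One parameter update driven by the Bellman residual.

    e      = r_prev + h^T dW_prev
    h_next = dW (dW^T dW)^{-1} (alpha * e - r_now)
    """
    e = r_prev + h @ dW_prev
    h_next = dW * (alpha * e - r_now) / (dW @ dW)
    return h_next, e

# Hypothetical values for illustration.
h = np.array([1.0, -0.5, 2.0])
dW_prev = np.array([0.2, 0.1, -0.3])   # basis difference at the previous step
dW = np.array([-0.1, 0.4, 0.2])        # basis difference at the current step
r_prev, r_now, alpha = 1.5, 0.8, 0.5

h_next, e = ae_update(h, dW_prev, dW, r_prev, r_now, alpha)
e_next = r_now + h_next @ dW           # residual evaluated with the new parameters
```

By construction `e_next` equals `alpha * e`, so with 0 < alpha < 1 the residual shrinks geometrically as long as the regression vector stays nonzero.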

2) Optimality:

where $w_k = [z_k^T\ \ u^T(z_k)]^T \in \mathbb{R}^{l}$ with $l = n + dm$, $h_k = \mathrm{vec}(H_k)$, and $\bar{w}_k = (w_{k1}^2, \ldots, w_{k1} w_{kl}, w_{k2}^2, \ldots, w_{k,l-1} w_{kl}, w_{kl}^2)$ is the Kronecker product quadratic polynomial basis vector

As shown in Figure 5, a PID controller designed without considering delays and packet losses
leaves the NCS unstable (Figure 5-(a)). However, with the proposed Q-learning
suboptimal and AE optimal control, the NCS remains stable (Figure 5-(b),(c)).

1. When random delays and packet losses are considered, the H matrix becomes
time-varying. However, we assume that it changes slowly.

Figure 5 Stability performance (legend: e1, e2, e3, e4)

AE-based Stochastic Optimal Control

Cost terms: $z_k^T Q_z z_k + u_k^T R_z u_k$

Plot panels (legend: e1, e2, e3, e4):
State Regulation Errors with Proposed AE Optimal control of NCS with unknown dynamics
State Regulation Errors with Q-learning suboptimal control of NCS with unknown dynamics

where $d(\bar{w}(z_k), H^i) = z_k^T Q_z z_k + u_i(z_k)^T R_z u_i(z_k) + Q_i(z_{k+1}, u_i(z_{k+1}))$ and $w(z_k) = [z_k^T\ \ u_i^{*T}(z_k)]^T$
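The least-squares fit of the Q-function parameters to the target d can be sketched from sampled data. The example below assumes a deterministic simulator `step` standing in for the unknown plant (the poster's setting is stochastic), random sample points, and a basis convention with doubled off-diagonal terms; all names and the toy system are hypothetical. The learner itself only sees sampled transitions, never the system matrices.

```python
import numpy as np

def quad_basis(w):
    """All products w_i * w_j with i <= j."""
    l = w.size
    return np.array([w[i] * w[j] for i in range(l) for j in range(i, l)])

def h_to_H(h, l):
    """Rebuild the symmetric H from the fitted parameter vector."""
    H = np.zeros((l, l))
    idx = 0
    for i in range(l):
        for j in range(i, l):
            if i == j:
                H[i, i] = h[idx]
            else:
                H[i, j] = H[j, i] = 0.5 * h[idx]
            idx += 1
    return H

def ls_q_iteration(step, n, m, Qz, Rz, iters=200, samples=60, seed=0):
    """Fit h_{i+1} by least squares to d = r(z,u) + Q_i(z', u_i(z'))."""
    rng = np.random.default_rng(seed)
    l = n + m
    H = np.zeros((l, l))      # Q_0 = 0
    K = np.zeros((m, n))      # greedy gain for the current H
    for _ in range(iters):
        Phi, d = [], []
        for _ in range(samples):
            z = rng.standard_normal(n)
            u = rng.standard_normal(m)
            z1 = step(z, u)                       # observed next state
            w1 = np.concatenate([z1, -K @ z1])    # next state with greedy input
            d.append(z @ Qz @ z + u @ Rz @ u + w1 @ H @ w1)
            Phi.append(quad_basis(np.concatenate([z, u])))
        h, *_ = np.linalg.lstsq(np.array(Phi), np.array(d), rcond=None)
        H = h_to_H(h, l)
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return H, K

# Hypothetical plant, used only inside the simulator.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Qz = np.eye(2)
Rz = np.array([[1.0]])

H, K = ls_q_iteration(lambda z, u: A @ z + B @ u, 2, 1, Qz, Rz)
```

Because the target is exactly quadratic in (z, u) for this noise-free toy system, each least-squares fit is exact and the iteration reproduces the model-based H without using A or B in the learner.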

The main challenges in controlling network-based systems are
network delays and packet losses. These effects not only degrade
the performance of the NCS but can also destabilize it.

Performance evaluation of proposed suboptimal and optimal control
State Regulation Error of NCS
with Delay and Packet Losses

Inverted pendulum input matrix (for $\dot{x} = Ax + Bu$):

$B = \begin{bmatrix} 0 & 1.8182 & 0 & 4.5455 \end{bmatrix}^T$
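A quick numerical check of the pendulum model can confirm that the open-loop plant is unstable and controllable, which is why a stabilizing regulator is needed. The sketch assumes the numerical entries 0.1818, 2.6727, 0.4545, 31.1818, 1.8182 and 4.5455 from the poster arranged as in the usual cart-pendulum linearization, with negative signs on the 0.1818 and 0.4545 terms; the signs and the layout are assumptions, since they did not survive in the extracted text.

```python
import numpy as np

# Assumed layout of the linearized cart-pendulum model (signs are an assumption).
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, -0.1818, 2.6727, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, -0.4545, 31.1818, 0.0]])
B = np.array([[0.0], [1.8182], [0.0], [4.5455]])

# Open-loop poles: any eigenvalue with positive real part means instability.
eigvals = np.linalg.eigvals(A)
open_loop_unstable = bool(np.max(eigvals.real) > 0)

# Controllability matrix [B, AB, A^2 B, A^3 B] must have full rank.
ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(4)])
controllable = bool(np.linalg.matrix_rank(ctrb) == 4)
```

Under these assumptions the model has one pole in the right half-plane and is fully controllable from the single input.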

After accounting for the random delays and packet losses introduced by the network, the original
time-invariant system is discretized and represented as a time-varying system
$z_{k+1} = A_{zk} z_k + B_{zk} u_k$, $y_k = C_z z_k$. (Note: because the random delays and packet losses
are considered, the NCS model is time-varying, with matrices that are functions of the time index $k$.)

3. Using the mean values of the delays and packet losses in place of their random
values makes the H matrix time-invariant.

BACKGROUND

$H^i = \begin{bmatrix} H_{zz}^i & H_{zu}^i \\ H_{uz}^i & H_{uu}^i \end{bmatrix}$, with $H_{zz}^i = Q_z + A_z^T P_i A_z$ and $H_{uz}^i = B_z^T P_i A_z$

Develop an adaptive estimator (AE)-based stochastic optimal control

Consider the linear time-invariant inverted pendulum dynamics

2. Define the update law to tune the Q-function: $Q_{i+1}(z_k, u_k) = r(z_k, u_k) + \min_{u_{k+1}} Q_i(z_{k+1}, u_{k+1})$

Develop a Q-learning based stochastic suboptimal controller for an
unknown networked control system (NCS) with random delay and
packet losses;

1. Define the Q-function: $Q(z_k, u_k) = E[z_k^T Q_z z_k + u_k^T R_z u_k + J_{k+1}]$

Simulation Results

Q-learning Stochastic Suboptimal Control

OBJECTIVES

Student: Hao Xu, ECE Department

Figure 3 Stochastic optimal regulator block diagram

FUTURE WORK

Design suboptimal and optimal control for nonlinear
networked control systems (NNCS) with unknown
dynamics in the presence of random delays and packet losses.

Design a novel wireless network protocol to decrease the
effects of random delays and packet losses.

Optimize the NNCS globally across both the control part and
the wireless network part.
