On the ESO based reinforcement learning for pure feedback systems

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The control of pure feedback systems, which are widely used but non-affine, has long been an important and challenging problem. To achieve precise tracking control of pure feedback systems by improving the disturbance rejection ability of existing reinforcement learning algorithms, this paper proposes a reinforcement learning (RL) control strategy based on an extended state observer (ESO). In the proposed method, the pure feedback system is written in an input-output predictor form to overcome the non-causal problem; the ESO then rejects the total disturbance and transforms the system into a cascade integral form. As a result, the continuous RL strategy with an actor-critic (AC) structure does not depend on detailed model information, which makes it practically data-driven. To further improve tracking of a changing reference trajectory, a novel curvature acceleration factor is proposed that adjusts the learning speed of the RL controller according to the curvature of the reference trajectory. Simulation results verify the validity of the proposed algorithm.
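The observer idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the plant, the gains, the bandwidth-parameterized linear ESO, and the fixed PD law that stands in for the actor-critic controller are all assumptions chosen for illustration. The sketch shows how an ESO lumps unknown dynamics, the non-affine input term, and an external disturbance into one extended state `z3`, so that the outer controller only sees an approximate chain of integrators.

```python
import math


def simulate(t_end=5.0, dt=1e-3, omega_o=50.0, b0=1.0, kp=25.0, kd=10.0, ref=1.0):
    """Euler simulation of a hypothetical 2nd-order non-affine plant with a linear ESO.

    Returns the final plant output and the final total-disturbance estimate.
    """
    # Observer gains place all three observer poles at -omega_o (assumed tuning rule).
    l1, l2, l3 = 3.0 * omega_o, 3.0 * omega_o**2, omega_o**3

    y, dy = 0.0, 0.0            # true plant states; only y is measured
    z1, z2, z3 = 0.0, 0.0, 0.0  # ESO estimates of y, dy, and the total disturbance

    for k in range(int(round(t_end / dt))):
        t = k * dt

        # Outer-loop law acting on the compensated double integrator.
        # A PD law is used here as a stand-in for the RL controller.
        u0 = kp * (ref - z1) - kd * z2
        u = (u0 - z3) / b0  # cancel the estimated total disturbance

        # Hypothetical plant, non-affine in u, with an external disturbance.
        ddy = -dy + math.sin(y) + u + 0.3 * math.sin(u) + 0.5 * math.sin(2.0 * t)

        # ESO dynamics, driven by the output estimation error.
        e = y - z1
        dz1 = z2 + l1 * e
        dz2 = z3 + b0 * u + l2 * e
        dz3 = l3 * e

        # Explicit Euler step for plant and observer.
        y += dt * dy
        dy += dt * ddy
        z1 += dt * dz1
        z2 += dt * dz2
        z3 += dt * dz3

    return y, z3


if __name__ == "__main__":
    y_final, f_hat = simulate()
    print(f"final output: {y_final:.4f}, disturbance estimate: {f_hat:.4f}")
```

Because the observer absorbs everything that is not the nominal `b0 * u` term, the outer controller needs only the nominal input gain `b0` and the system order, which is the sense in which such a scheme can be called data-driven.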
Original language: English
Title of host publication: Proceedings of the ASME Design Engineering Technical Conference
Place of Publication: USA
Publisher: American Society of Mechanical Engineers (ASME)
Volume: 9
ISBN (Electronic): 9780791858233
DOIs
State: Published - Jan 1 2017
Event: ASME 2017 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC/CIE 2017 - Cleveland, United States
Duration: Aug 6 2017 to Aug 9 2017

Conference

Conference: ASME 2017 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC/CIE 2017
Country/Territory: United States
City: Cleveland
Period: 08/6/17 to 08/9/17

Keywords

  • Curvature acceleration factor
  • Extended state observer
  • Pure feedback system
  • Reinforcement learning
