2020-02-17

Learning from demonstration

  • a robot can replace a human worker only if it has human-level AI
  • sliding mode = model predictive control
  • instead of calculating the complete forward model, only rewards are measured [1]
  • Phantom Auto: teleoperated car
  • productivity paradox "teleoperation"
  • Model Predictive Control for time delay teleoperation
  • robot programming != teleoperation
  • programming by demonstration -> trajectory planner
  • “multiple demonstrations” trajectory database
  • Multi-task learning, task-parameterized Gaussian mixture model (TP-GMM)
  • parameterized trajectories
  • "Generative Adversarial Networks" "learning from demonstration"
  • learning from demonstration with spline interpolation
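
The last note, learning from demonstration with spline interpolation, can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a uniform Catmull-Rom spline (one of several possible spline types, chosen here because it passes through every recorded waypoint), and the function names `catmull_rom` and `interpolate_demo` as well as the sample waypoints are hypothetical.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Evaluate one uniform Catmull-Rom segment between p1 and p2, t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2 * b
               + (c - a) * t
               + (2 * a - 5 * b + 4 * c - d) * t2
               + (-a + 3 * b - 3 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def interpolate_demo(waypoints, samples_per_segment=10):
    """Turn sparse demonstrated waypoints into a dense, smooth trajectory."""
    if len(waypoints) < 2:
        raise ValueError("need at least two waypoints")
    # Duplicate the end points so the curve starts and ends on the demonstration.
    pts = [waypoints[0]] + list(waypoints) + [waypoints[-1]]
    trajectory = []
    for i in range(len(pts) - 3):
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            trajectory.append(catmull_rom(pts[i], pts[i + 1], pts[i + 2], pts[i + 3], t))
    # Close the trajectory with the final demonstrated waypoint.
    trajectory.append(tuple(float(x) for x in waypoints[-1]))
    return trajectory


# Hypothetical demonstration: three 2D waypoints recorded from a human operator.
demo = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
trajectory = interpolate_demo(demo, samples_per_segment=4)
```

The dense trajectory could then be handed to a trajectory planner or stored in a trajectory database together with other demonstrations.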

Chatbots

  • chatbots: AIML, deep learning. Chatbot types: task-oriented/goal-oriented (e.g. restaurant reservation) or open-domain. [2]
  • AIML = chatbot description
  • QAKIS Question Answering grounded
  • chatbots are working with hypothesis anchoring
  • creating text adventures instead of chatbots
  • It seems that automatic evaluation of a dialogue is the weak point of current chatbots. Perhaps this explains why GPT-2 is using a co-evolution strategy in which the generation of dialogues and the evaluation of speech are trained separately.
  • AIML describes a chatbot corpus
  • chatbot languages: ChatScript (2011), very powerful but a large download; AIML (2001), outdated
  • PDDL chatbot -> dialogue plan: the agent comes up with a plan to solve its task based on the domain description provided.
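
To make the notes on AIML more concrete, here is a minimal sketch of how an AIML-style corpus of categories (a pattern plus a template) can drive a chatbot. It is not a full AIML interpreter: it only handles the `*` wildcard and the `<star/>` tag, and the helper names `load_categories`, `render` and `respond`, the sample categories, and the fallback reply are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# A tiny, hypothetical AIML corpus with two categories.
AIML_SNIPPET = """
<aiml version="1.0">
  <category>
    <pattern>HELLO</pattern>
    <template>Hi there!</template>
  </category>
  <category>
    <pattern>MY NAME IS *</pattern>
    <template>Nice to meet you, <star/>.</template>
  </category>
</aiml>
"""


def load_categories(aiml_text):
    """Parse the corpus into (compiled pattern, template element) pairs."""
    root = ET.fromstring(aiml_text)
    categories = []
    for category in root.iter("category"):
        pattern = category.find("pattern").text.strip().upper()
        # Each "*" becomes a capturing group; other words must match literally.
        words = ["(.+)" if w == "*" else re.escape(w) for w in pattern.split()]
        regex = re.compile("^" + r"\s+".join(words) + "$")
        categories.append((regex, category.find("template")))
    return categories


def render(template, stars):
    """Build the reply text, replacing <star/> with the first wildcard match."""
    parts = [template.text or ""]
    for child in template:
        if child.tag == "star" and stars:
            parts.append(stars[0])
        parts.append(child.tail or "")
    return "".join(parts).strip()


def respond(categories, user_input):
    """Normalize the input and answer with the first matching category."""
    normalized = re.sub(r"[^\w\s]", "", user_input).upper().strip()
    for regex, template in categories:
        match = regex.match(normalized)
        if match:
            return render(template, [g.title() for g in match.groups()])
    return "I do not understand."


categories = load_categories(AIML_SNIPPET)
reply = respond(categories, "my name is alice")
```

The corpus is pure data, which fits the notes above: AIML describes the chatbot, while a separate interpreter does the matching.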

References

  1. Gaina, Raluca D., Simon M. Lucas, and Diego Pérez-Liébana. "Tackling sparse rewards in real-time games with statistical forward planning methods." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019.
  2. Csáky, Richárd. "Deep Learning Based Chatbot Models." 2017.