Learning from demonstration
- a robot can replace a human worker only if it has human-level AI
- sliding mode control vs. model predictive control (related, but not identical)
- instead of calculating the complete forward model, only rewards are measured 
- Phantom Auto: teleoperated car
- productivity paradox "teleoperation"
- Model Predictive Control for time delay teleoperation
- robot programming != teleoperation
- programming by demonstration -> trajectory planner
- “multiple demonstrations” trajectory database
- multi-task learning, task-parameterized Gaussian mixture model (TP-GMM)
- parameterized trajectories
- "Generative Adversarial Networks" "learning from demonstration"
- learning from demonstration with spline interpolation
- chatbots: AIML, deep learning. Chatbot types: task-oriented/goal-oriented (e.g. restaurant reservation) or open-domain.
- AIML = chatbot description
- QAKIS Question Answering grounded
- chatbots work with hypothesis anchoring
- creating text adventures instead of chatbots
- Automatic evaluation of a dialogue seems to be the weak point of current chatbots. Perhaps this explains why GPT-2 is using a co-evolution strategy in which the generation of dialogues and the evaluation of speech are trained separately.
- AIML describes a chatbot corpus
- chatbot languages: ChatScript (2011), very powerful but a large download; AIML (2001), outdated
- PDDL chatbot -> dialogue plan: the agent comes up with a plan to solve its task based on the domain description provided.
- Gaina, Raluca D., Simon M. Lucas, and Diego Pérez-Liébana. "Tackling sparse rewards in real-time games with statistical forward planning methods." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019.
- paper “Richárd Csáky: Deep Learning Based Chatbot Models, 2017”
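
The note on learning from demonstration with spline interpolation can be illustrated with a small sketch. It assumes that a demonstration is recorded as a short list of 2D waypoints and that a uniform Catmull-Rom spline is an acceptable choice of spline; the notes do not name a specific spline type, so this is only one possible reading. The names `catmull_rom` and `interpolate_demonstration` are hypothetical and the waypoints are made-up illustration data.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def catmull_rom(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate one uniform Catmull-Rom segment between p1 and p2, t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (
            2 * b
            + (-a + c) * t
            + (2 * a - 5 * b + 4 * c - d) * t2
            + (-a + 3 * b - 3 * c + d) * t3
        )
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def interpolate_demonstration(
    waypoints: Sequence[Point], samples_per_segment: int = 10
) -> List[Point]:
    """Turn sparse demonstrated waypoints into a dense, smooth trajectory.

    The resulting curve passes through every demonstrated waypoint.
    """
    if len(waypoints) < 2:
        raise ValueError("need at least two waypoints")
    if samples_per_segment < 1:
        raise ValueError("samples_per_segment must be positive")

    # Duplicate the end points so the first and last segment have neighbours.
    padded = [waypoints[0], *waypoints, waypoints[-1]]
    path: List[Point] = []
    for i in range(1, len(padded) - 2):
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            path.append(
                catmull_rom(padded[i - 1], padded[i], padded[i + 1], padded[i + 2], t)
            )
    # Close the trajectory on the final demonstrated waypoint.
    path.append(tuple(float(v) for v in waypoints[-1]))
    return path


if __name__ == "__main__":
    # Hypothetical demonstration: four waypoints recorded from a human operator.
    demo = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0), (3.0, 1.0)]
    for point in interpolate_demonstration(demo, samples_per_segment=5):
        print(point)
```

The dense trajectory could then be handed to a trajectory planner or stored in a trajectory database, as the notes above suggest; averaging several such curves would be one simple way to combine multiple demonstrations.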