Reinforcement Learning 5: Policy-Based Methods
In this video we look at a new group of methods that work directly with the agent's policy. We get acquainted with the REINFORCE method and then examine Actor-Critic algorithms, which combine policy-based methods such as Policy Gradient with value-based methods such as Q-Learning.
00:00:00 Start of video
00:01:05 Deep Q-Network (DQN) method
00:03:26 Policy function
00:05:34 Policy Gradients method
00:17:14 REINFORCE method
00:23:51 Actor-Critic
00:25:05 A2C (Advantage Actor-Critic)
00:34:00 A3C (Asynchronous Advantage Actor-Critic)
00:45:40 Actor-Critic for continuous action spaces
00:53:25 Actor-Critic: Model
00:56:58 Actor-Critic: Policy and Training
01:10:07 Mountain Car Continuous
01:14:42 Actor-Critic: Hyperparameters
Ukrainian IT company. Machine Learning | Data Science | Artificial Intelligence
#artificialintelligence
#MachineLearning #ReinforcementLearning
#ИскусственныйИнтеллект #Машинноеобучение