Reinforcement Learning Resources

Important learning resources

Related blog: http://blog.csdn.net/dark_scope/article/details/8252969

Blog column: http://blog.csdn.net/column/details/deeprl.html

Reinforcement learning course by David Silver (with videos and slides): http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Teaching.html

The best reinforcement learning textbook:

Reinforcement Learning: An Introduction: https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html

Deep learning course (with videos, slides, and assignments): https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/

Deep reinforcement learning lectures, all by David Silver:

ICLR 2015 part 1 https://www.youtube.com/watch?v=EX1CIVVkWdE

ICLR 2015 part 2 https://www.youtube.com/watch?v=zXa6UFLQCtg

UAI 2015 https://www.youtube.com/watch?v=qLaDWKd61Ig

RLDM 2015 http://videolectures.net/rldm2015_silver_reinforcement_learning/

Other courses:

Reinforcement learning

Michael Littman: https://www.udacity.com/course/reinforcement-learning--ud600

AI (includes reinforcement learning, with Pacman experiments)

Pieter Abbeel: https://www.edx.org/course/artificial-intelligence-uc-berkeleyx-cs188-1x-0#.VKuKQmTF_og

Deep Reinforcement Learning:

Pieter Abbeel: http://rll.berkeley.edu/deeprlcourse/

Advanced Robotics:

Pieter Abbeel: http://www.cs.berkeley.edu/~pabbeel/cs287-fa15/

Deep learning courses:

Convolutional Neural Networks for Visual Recognition: http://cs231n.github.io/

Machine Learning

Andrew Ng:

https://www.coursera.org/learn/machine-learning/

http://cs229.stanford.edu/

Neural Networks for Machine Learning (from 2012)

Hinton: https://www.coursera.org/course/neuralnets

Latest robotics specialization from Penn (first offered in 2016): https://www.coursera.org/specializations/robotics

2 Papers

https://github.com/junhyukoh/deep-reinforcement-learning-papers

https://github.com/muupan/deep-reinforcement-learning-papers

Between them, these two collections cover most of the current deep reinforcement learning papers.

3 Leading researchers and groups:

DeepMind: http://www.deepmind.com/publications.html

Pieter Abbeel's group: http://www.eecs.berkeley.edu/~pabbeel/

Satinder Singh: http://web.eecs.umich.edu/~baveja/

CMU work: http://www.cs.cmu.edu/~lerrelp/

Preferred Networks (a Japanese startup)

4 Conferences

Deep Reinforcement Learning Workshop, NIPS 2015: http://rll.berkeley.edu/deeprlworkshop/

Deep Learning Research Summary: Reinforcement Learning Trends and Analysis (Classic Papers)

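I collected the ICLR 2017 papers related to Deep Reinforcement Learning. There are 30 in total (I may have missed some). Most of them come from DeepMind and OpenAI, which shows that DRL is still dominated by these two organizations.

2 DeepMind papers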

[1] LEARNING TO COMPOSE WORDS INTO SENTENCES WITH REINFORCEMENT LEARNING

[2] LEARNING TO NAVIGATE IN COMPLEX ENVIRONMENTS

[3] LEARNING TO PERFORM PHYSICS EXPERIMENTS VIA DEEP REINFORCEMENT LEARNING

[4] PGQ: COMBINING POLICY GRADIENT AND Q-LEARNING

[5] Q-PROP: SAMPLE-EFFICIENT POLICY GRADIENT WITH AN OFF-POLICY CRITIC

[6] REINFORCEMENT LEARNING WITH UNSUPERVISED AUXILIARY TASKS

[7] SAMPLE EFFICIENT ACTOR-CRITIC WITH EXPERIENCE REPLAY

[8] THE PREDICTRON: END-TO-END LEARNING AND PLANNING

3 OpenAI papers (including Sergey Levine's papers)

[9] #EXPLORATION: A STUDY OF COUNT-BASED EXPLORATION FOR DEEP REINFORCEMENT LEARNING

[10] GENERALIZING SKILLS WITH SEMI-SUPERVISED REINFORCEMENT LEARNING

[11] LEARNING INVARIANT FEATURE SPACES TO TRANSFER SKILLS WITH REINFORCEMENT LEARNING

[12] LEARNING VISUAL SERVOING WITH DEEP FEATURES AND TRUST REGION FITTED Q-ITERATION

[13] MODULAR MULTITASK REINFORCEMENT LEARNING WITH POLICY SKETCHES

[14] STOCHASTIC NEURAL NETWORKS FOR HIERARCHICAL REINFORCEMENT LEARNING

[15] THIRD PERSON IMITATION LEARNING

[16] UNSUPERVISED PERCEPTUAL REWARDS FOR IMITATION LEARNING

[17] EPOPT: LEARNING ROBUST NEURAL NETWORK POLICIES USING MODEL ENSEMBLES 

[18] RL²: FAST REINFORCEMENT LEARNING VIA SLOW REINFORCEMENT LEARNING

4 Other papers

[19] COMBATING DEEP REINFORCEMENT LEARNING’S SISYPHEAN CURSE WITH INTRINSIC FEAR

[20] COMMUNICATING HIERARCHICAL NEURAL CONTROLLERS FOR LEARNING ZERO-SHOT TASK GENERALIZATION

[21] DESIGNING NEURAL NETWORK ARCHITECTURES USING REINFORCEMENT LEARNING

[22] LEARNING TO PLAY IN A DAY: FASTER DEEP REINFORCEMENT LEARNING BY OPTIMALITY TIGHTENING

[23] LEARNING TO REPEAT: FINE GRAINED ACTION REPETITION FOR DEEP REINFORCEMENT LEARNING

[24] MULTI-TASK LEARNING WITH DEEP MODEL BASED REINFORCEMENT LEARNING

[25] NEURAL ARCHITECTURE SEARCH WITH REINFORCEMENT LEARNING

[26] OPTIONS DISCOVERY WITH BUDGETED REINFORCEMENT LEARNING

[27] REINFORCEMENT LEARNING THROUGH ASYNCHRONOUS ADVANTAGE ACTOR-CRITIC ON A GPU

[28] SPATIO-TEMPORAL ABSTRACTIONS IN REINFORCEMENT LEARNING THROUGH NEURAL ENCODING

[29] SURPRISE-BASED INTRINSIC MOTIVATION FOR DEEP REINFORCEMENT LEARNING

[30] TUNING RECURRENT NEURAL NETWORKS WITH REINFORCEMENT LEARNING