
Introduction to Reinforcement Learning

 2018-11-12
 memo

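Monte Carlo vs. TD Learning Methods

There are two ways to learn:

  • Collect the rewards at the end of the episode, then calculate the maximum expected future reward: the Monte Carlo method
  • Estimate the reward at each step: temporal difference (TD) learning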
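
The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not code from the post: the function names `mc_update` and `td0_update`, the dictionary-based value table, the step size `alpha`, and the discount `gamma` are all assumptions chosen for the example. The Monte Carlo update waits for the full episode and uses the observed return; the TD(0) update runs after every single step and bootstraps from the current estimate of the next state.

```python
def mc_update(values, episode, alpha=0.1, gamma=1.0):
    """Every-visit Monte Carlo update (illustrative sketch).

    `episode` is a list of (state, reward) pairs, where `reward` is
    the reward received after leaving `state`. The update can only
    run once the episode has finished.
    """
    g = 0.0
    # Walk backwards so that g accumulates the discounted return
    # observed from each step to the end of the episode.
    for state, reward in reversed(episode):
        g = reward + gamma * g
        values[state] += alpha * (g - values[state])
    return values


def td0_update(values, state, reward, next_state,
               alpha=0.1, gamma=1.0, done=False):
    """TD(0) update (illustrative sketch), applied after one step.

    The target is the immediate reward plus the discounted current
    estimate of the next state's value (no bootstrap if terminal).
    """
    target = reward if done else reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values


# A hypothetical two-step episode: A -> B -> terminal, reward 1 at the end.
mc_values = mc_update({"A": 0.0, "B": 0.0}, [("A", 0.0), ("B", 1.0)])

td_values = {"A": 0.0, "B": 0.0}
td0_update(td_values, "A", 0.0, "B")                 # step 1: A -> B
td0_update(td_values, "B", 1.0, None, done=True)     # step 2: B -> terminal
```

In this toy episode the Monte Carlo update credits both states with the final reward at once, whereas TD(0) updates only `B` on the first pass, because the estimate for `A` was computed before `B` had learned anything. The reward reaches `A` on later episodes.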
Reinforcement Learning

Blog content follows the Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License
