Posts by Tags

open source motion tracking

August 30, 2020

Early in my PhD I developed an open-source motion tracking system for mice - now for sale at LABmaker 😄! With the KineMouse Wheel, neuroscientists can reconstruct 3D pose while recording neural activity. The Hackaday protocol describes how to build the system. This supplement contains additional info for motion tracking aficionados. Please be nice to your mice ❤️🐭❤️.

reinforcement learning (4/4): policy gradient

May 14, 2020

In parts 1-3 we found that learning the values of different states (or state-action pairs) made it easy to define good policies; we simply selected high-valued states and actions. Policy gradient methods use a different approach: learn policies directly by optimizing their parameters to maximize reward. These techniques allow us to tackle more interesting problems with large or continuous action and state spaces. The math is a bit heavier, but so is the payoff.
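To make "optimizing policy parameters to maximize reward" concrete, here is a minimal sketch of a REINFORCE-style update on a two-armed bandit. The bandit, the softmax parameterization, and names such as `reinforce_bandit`, `theta`, and `alpha` are assumptions made for illustration; they are not taken from the post itself.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards, steps=2000, alpha=0.1, seed=0):
    """Learn a softmax policy over bandit arms with a REINFORCE-style update.

    `rewards` is a hypothetical list of deterministic rewards, one per arm.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # policy parameters: one preference per arm
    for _ in range(steps):
        probs = softmax(theta)
        # Sample an action from the current policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        # For a softmax policy, d/d theta[a] of log pi(action) is
        # 1[a == action] - pi(a). Step each parameter along reward * gradient.
        for a in range(len(theta)):
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += alpha * reward * grad_log_pi
    return softmax(theta)


probs = reinforce_bandit(rewards=[0.0, 1.0])
print(probs)  # the policy should strongly prefer the rewarded arm
```

Note that no value function is learned here: the parameters `theta` are nudged directly in the direction that makes rewarded actions more likely.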

reinforcement learning (3/4): temporal difference learning

April 25, 2020

In part 1 we discussed dynamic programming and Monte Carlo reinforcement learning algorithms. These appear to be qualitatively different approaches: dynamic programming is model-based and relies on bootstrapping, whereas Monte Carlo is model-free and relies on sampling environment interactions. However, these approaches can be thought of as two extremes on a continuum defined by the degree of bootstrapping vs. sampling. Temporal difference learning is a model-free algorithm that splits the difference between dynamic programming and Monte Carlo approaches by using both bootstrapping and sampling to learn online.
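As a rough sketch of how bootstrapping and sampling combine, the tabular TD(0) update below moves each state's value toward a target built from one sampled transition plus the current estimate of the next state. The two-state chain and the names `td0_update`, `alpha`, and `gamma` are illustrative assumptions, not code from the post.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0, terminal=False):
    """Apply one TD(0) update in place and return the TD error."""
    # Bootstrapping: the target reuses the current estimate of the next state.
    bootstrap = 0.0 if terminal else values[next_state]
    td_target = reward + gamma * bootstrap
    td_error = td_target - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical chain A -> B -> terminal, with reward 1 on the final step.
values = {"A": 0.0, "B": 0.0}
# Sampling: each tuple is one observed transition
# (state, reward, next_state, terminal); no model of the environment is used.
episode = [("A", 0.0, "B", False), ("B", 1.0, None, True)]

for _ in range(200):
    for state, reward, next_state, terminal in episode:
        # Updates happen after every step, i.e. online, not at episode end.
        td0_update(values, state, reward, next_state, terminal=terminal)

print(values)  # both estimates approach 1.0
```

A Monte Carlo method would wait for the full return before updating `A`; here `A` is updated immediately from the current estimate for `B`.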

reinforcement learning (2/4): value function approximation

April 12, 2020

The methods we discussed in part 1 are limited when state spaces are large and/or continuous. Value function approximation addresses this by using functions to approximate the relationship between states and their value. But how can we find the parameters $\mathbf{w}$ of our value function $\hat{v}(s, \mathbf{w})$? Gradient descent works nicely here, which gives us tons of flexibility in how we model value functions.
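For a sense of what finding $\mathbf{w}$ by gradient descent can look like, here is a minimal sketch with a linear $\hat{v}(s, \mathbf{w}) = \mathbf{w}^\top \mathbf{x}(s)$ fit to observed returns. The feature vectors, the toy targets, and the names `v_hat` and `sgd_step` are assumptions for illustration only.

```python
def v_hat(features, w):
    """Linear value estimate: dot product of weights and state features."""
    return sum(wi * xi for wi, xi in zip(w, features))


def sgd_step(w, features, target, alpha=0.05):
    """One stochastic gradient step on the squared error (target - v_hat)^2 / 2."""
    error = target - v_hat(features, w)
    # For a linear model, the gradient of v_hat with respect to w
    # is just the feature vector.
    return [wi + alpha * error * xi for wi, xi in zip(w, features)]


# Hypothetical data: features are [1, position], and the observed return
# for each state happens to be 2 * position + 1.
samples = [([1.0, p], 2.0 * p + 1.0) for p in (0.0, 0.5, 1.0)]

w = [0.0, 0.0]
for _ in range(5000):
    for features, target in samples:
        w = sgd_step(w, features, target)

print(w)  # approaches [1.0, 2.0]
```

Swapping `v_hat` for any other differentiable function (and using its gradient in `sgd_step`) leaves the rest of the loop unchanged, which is where the flexibility comes from.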

reinforcement learning (1/4): overview, dynamic programming, monte carlo

April 02, 2020

While quarantined in NYC I’ve finally worked through the classic text on reinforcement learning. This is the first of a four-part summary of the text, intended for those interested in learning RL who are not interested in staying in their apartment for three months to learn it.

Richard Warren

Posts by Tags

machine learning

open source motion tracking

reinforcement learning (4/4): policy gradient

reinforcement learning (3/4): temporal difference learning

reinforcement learning (2/4): value function approximation

reinforcement learning (1/4): overview, dynamic programming, monte carlo

open source

open source motion tracking

reinforcement learning

reinforcement learning (4/4): policy gradient

reinforcement learning (3/4): temporal difference learning

reinforcement learning (2/4): value function approximation

reinforcement learning (1/4): overview, dynamic programming, monte carlo