Domain Randomization

Domain randomization is a technique used in Reinforcement Learning (RL) for Sim-to-Real transfer: it helps a policy trained in simulation generalize to conditions it did not see exactly during training. The idea is to train the agent in a simulated environment whose properties (for example, physical parameters) are re-sampled during training, so the policy cannot overfit to one specific simulator configuration and becomes robust to variations.
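As a rough illustration of the idea, here is a minimal sketch that draws a fresh set of simulator parameters for each episode. The parameter names (friction, object_mass, motor_gain) and their ranges are hypothetical and chosen only for illustration; a real setup would pass the sampled values to its simulator on reset.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    friction: float     # contact friction coefficient (hypothetical parameter)
    object_mass: float  # object mass in kg (hypothetical parameter)
    motor_gain: float   # scale applied to commanded joint velocities (hypothetical)


# Assumed ranges, for illustration only.
PARAM_RANGES = {
    "friction": (0.5, 1.5),
    "object_mass": (0.1, 1.0),
    "motor_gain": (0.8, 1.2),
}


def sample_sim_params(rng: random.Random) -> SimParams:
    """Draw one independent uniform sample per parameter."""
    values = {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}
    return SimParams(**values)


rng = random.Random(0)
# One new parameter set per training episode.
episode_params = [sample_sim_params(rng) for _ in range(3)]
for i, params in enumerate(episode_params):
    print(f"episode {i}: {params}")
```

Because every episode sees a different parameter set, the policy is rewarded for behavior that works across the whole range rather than for one exact configuration.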

What is Randomized?

Some examples of randomized parameters in domain randomization:

Why is it Useful?

For your dual-arm manipulation task, you could use domain randomization in:

Value Function and Q-Function

In Reinforcement Learning, value functions estimate how good a state or action is, measured as the expected cumulative reward the agent collects from that point onward.

1. Value Function (V-function)

2. Q-Function (Q-value function)

Difference Between V and Q

Concept    | Value Function (V-function)               | Q-Function (Q-function)
-----------+-------------------------------------------+----------------------------------------------------
Definition | Expected reward from state s              | Expected reward from state-action pair (s, a)
Input      | State s                                   | State s, action a
Output     | Expected total reward                     | Expected total reward if taking action a in state s
Used in    | Policy-based methods (e.g., Actor-Critic) | Q-learning, DDPG, SAC (off-policy methods)
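
For reference, the two functions are usually written as follows. The notation is the standard textbook one and is an assumption here, not something defined earlier in this answer: \pi is the policy, \gamma is the discount factor, and r_t is the reward at step t.

```latex
V^{\pi}(s) = \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\middle|\, s_{0} = s \right]

Q^{\pi}(s, a) = \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\middle|\, s_{0} = s,\ a_{0} = a \right]

V^{\pi}(s) = \mathbb{E}_{a \sim \pi(\cdot \mid s)}\left[ Q^{\pi}(s, a) \right]
```

The last line states the link between the two: the value of a state is the Q-value averaged over the actions the policy would take in that state.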

Since your action space is joint velocities, your Q-function will evaluate how good different velocity commands are in a given state.
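
To make this concrete, here is a minimal sketch of scoring candidate velocity commands with a Q-function. Everything in it is an assumption for illustration: toy_q is a hand-written stand-in for a learned critic, the state is treated as a vector of joint position errors, and N_JOINTS = 14 assumes two 7-DoF arms.

```python
import numpy as np

N_JOINTS = 14  # assumption: two 7-DoF arms


def toy_q(state: np.ndarray, action: np.ndarray) -> float:
    """Hypothetical stand-in for a learned critic Q(s, a).

    The state is treated as joint position errors, and a velocity command
    scores higher the closer it is to the command that cancels those errors.
    """
    desired_velocity = -state
    return -float(np.sum((action - desired_velocity) ** 2))


rng = np.random.default_rng(0)
state = rng.uniform(-0.5, 0.5, size=N_JOINTS)

# Score a batch of random candidate joint-velocity commands in this state.
candidates = rng.uniform(-1.0, 1.0, size=(256, N_JOINTS))
q_values = np.array([toy_q(state, a) for a in candidates])

# The command with the highest Q-value is the one this critic prefers.
best = candidates[np.argmax(q_values)]
print("best Q-value among candidates:", q_values.max())
```

In actor-critic methods such as DDPG or SAC the candidates are not sampled at random like this; an actor network proposes the velocity command and the critic's Q-value is used to improve that actor.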



How It Relates to Your Task


No, Domain Randomization and Curriculum Learning are different concepts in reinforcement learning, though they can sometimes be used together.

1. Domain Randomization

2. Curriculum Learning
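
The difference between the two can be sketched in a few lines. This is only an illustration under assumed names and numbers: object mass stands in for a randomized parameter, and goal distance stands in for task difficulty.

```python
import random

MASS_RANGE = (0.1, 1.0)       # hypothetical object-mass range in kg
EASY_GOAL, HARD_GOAL = 0.1, 0.5  # hypothetical goal distances in meters


def randomized_mass(rng: random.Random) -> float:
    """Domain randomization: the sampling distribution is the same
    at every point in training."""
    return rng.uniform(*MASS_RANGE)


def curriculum_goal_distance(progress: float) -> float:
    """Curriculum learning: task difficulty follows a schedule that
    moves from easy to hard as training progress goes from 0 to 1."""
    progress = min(max(progress, 0.0), 1.0)
    return EASY_GOAL + progress * (HARD_GOAL - EASY_GOAL)


rng = random.Random(0)
for progress in (0.0, 0.5, 1.0):
    mass = randomized_mass(rng)
    goal = curriculum_goal_distance(progress)
    print(f"progress={progress:.1f}  mass={mass:.2f} kg  goal distance={goal:.2f} m")
```

The mass varies randomly but its range never changes, while the goal distance changes only with training progress. Using both in the same loop, as above, is one way the two techniques can be combined.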