Modified the potential function, with datasets deliberately occupying fewer samples.
Random policy
Adaptive policy
Note: the adaptive policy uses random noise, but it seems to perform poorly when the noise is sampled from a standard normal, $e \sim N(0,1)$. Further investigation needed.
Couple notes to self:
- When there are errors about "need to feed placeholders", it's likely because you forgot to set var_list in the optimizer.
- These simple models train faster on CPU than GPU.