Questions tagged [tf-agent]
43 questions
6
votes
1 answer
Why is data from the tf-agents buffer in random order?
TL;DR: why do the first two actions/observations I take not line up with the first two items in my replay buffer?
Do tf-agents replay buffers automatically shuffle data around?
By adding these prints I'm able to see what my first 2 steps look…

tgm_learn
- 61
- 7
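
A possible explanation for the question above, assuming the buffer is a uniform replay buffer (for example `TFUniformReplayBuffer`): items are stored in insertion order, but sampling draws them uniformly at random, so the first sampled items need not be the first ones written. The toy class below is a hypothetical pure-Python stand-in, not the TF-Agents implementation; its name and methods are invented for illustration only.

```python
import random
from collections import deque


class ToyUniformReplayBuffer:
    """Hypothetical stand-in for a uniform replay buffer (not the TF-Agents class)."""

    def __init__(self, max_length, seed=None):
        # Oldest items are dropped automatically once max_length is reached.
        self._items = deque(maxlen=max_length)
        self._rng = random.Random(seed)

    def add(self, item):
        self._items.append(item)

    def sample(self, batch_size):
        # Uniform sampling with replacement: insertion order plays no role.
        return [self._rng.choice(self._items) for _ in range(batch_size)]

    def gather_all(self):
        # Returns the items in the order they were written.
        return list(self._items)


buffer = ToyUniformReplayBuffer(max_length=100, seed=0)
for step in range(10):
    buffer.add({"step": step})

in_order = [item["step"] for item in buffer.gather_all()]
sampled = [item["step"] for item in buffer.sample(batch_size=4)]
print("written order:", in_order)
print("sampled order:", sampled)
```

Comparing the two printed lists shows the difference between reading the buffer back in write order and sampling from it; when debugging, it helps to check which of the two the data pipeline is actually doing.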
2
votes
1 answer
Why is my DQN-agent's training so inefficient?
I am trying to train an agent to play tic-tac-toe perfectly as the second player (the first player moves randomly) with the DQN agent from tf-agents, but my training is extremely slow.
For 100_000 steps, the model did not improve its results in any…

Karasic
- 29
- 3
2
votes
2 answers
Tf-agent Actor/Learner: TFUniformReplayBuffer dimensionality issue - invalid shape of Replay Buffer vs. Actor update
I am trying to adapt this tf-agents actor<->learner DQN Atari Pong example to my Windows machine, using a TFUniformReplayBuffer instead of the ReverbReplayBuffer (which only works on Linux machines), but I am facing a dimensionality issue.
[...]
---> 67…

Sch_Stef
- 31
- 4
2
votes
0 answers
Tensorflow, PyEnvironment: Given `time_step` does not match expected `time_step_spec`
I'm trying to set up a custom PyEnvironment and I get a Given 'time_step' does not match expected 'time_step_spec' error. I don't see where the dtype specification is missing.
Here's the environment:
class TicTacToe(py_environment.PyEnvironment):
…

flowerboy
- 81
- 7
2
votes
1 answer
Error while importing tf_agents in google colab
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import abc
import tensorflow as tf
import numpy as np
import pandas as pd
from tf_agents.environments import py_environment
from…

AB Music Box
- 81
- 1
- 6
2
votes
1 answer
py_environment 'time_step' doesn't match 'time_step_spec' - but I can't spot the difference
I'm trying to create a custom tf-agents environment for trading. When I try to validate it by calling utils.validate_py_environment(environment, episodes=1), I get a ValueError: 'time_step' doesn't match 'time_step_spec'.
I've been trying to…

Top Snek
- 71
- 5
2
votes
1 answer
How to pass the batchsize for a custom environment in Tf-agents
I am using the tf-agents library to build a contextual bandit.
For this I am building a custom environment.
I am creating a BanditPyEnvironment and wrapping it in a TFPyEnvironment.
The TFPyEnvironment automatically adds the batch size dimension (in…

tjt
- 620
- 2
- 7
- 17
1
vote
0 answers
How do I use all cores of my CPU in reinforcement learning with TF Agents?
I work with an RL algorithm. I'm using tensorflow and tf-agents and training a DQN. My problem is that only one core of the CPU is used when calculating the 10 episodes in the environment for data collection.
My training function looks like…

masterkey
- 65
- 4
1
vote
1 answer
What controls the second dimension of tf observations/ what a qnet accepts in its place?
Short version: I can't find the variable(s) that control either:
A) The 2nd dimension of a variable in a trajectory, e.g. the 3 in
Trajectory({'action':

tgmcroc
- 41
- 5
1
vote
1 answer
How to store a tf-agents trajectory object in BigQuery from Python and retrieve it back as a trajectory object
I want to save the trajectories from tf-agents into a BigQuery table and retrieve them back into Python as needed.
In the Python dataframe, the trajectories are saved as trajectory objects. But I am not sure how to save these…

tjt
- 620
- 2
- 7
- 17
1
vote
1 answer
Error when saving model with tensorflow-agents
I am trying to save a model with tensorflow-agents. First I define the following:
collect_policy = tf_agent.collect_policy
saver = PolicySaver(collect_policy, batch_size=None)
and then save the model like this:
saver.save('my_directory/')
This…

Enrique
- 9,920
- 7
- 47
- 59
1
vote
1 answer
TF Agent taking the same action for all test states after training in Reinforcement Learning
I am trying to create a custom PyEnvironment so that an agent learns the optimal hour to send notifications to users, based on the rewards received from clicks on notifications sent in the previous 7 days.
After the training is complete,…

Sukrit Mehta
- 11
- 2
1
vote
0 answers
How to write a custom policy in tf_agents
I want to use the contextual bandit agents (the Linear Thompson Sampling agent) in tf_agents.
I am using a custom environment and my rewards are delayed by 3 days. Hence, for training, the observations are generated from the saved historical tables…

tjt
- 620
- 2
- 7
- 17
1
vote
0 answers
Training agent using historical data in TF-agents
I am using a contextual bandit algorithm in TF-Agents.
Is there a way to train the agent using historical data (context, action, reward) stored in a table, instead of using the replay buffer?
The environment provides context and reward. Therefore I can make…

tjt
- 620
- 2
- 7
- 17
1
vote
0 answers
TF-Agents _action_spec: how to define the correct shape for discrete action space?
Scenario 1
My custom environment has the following _action_spec:
self._action_spec = array_spec.BoundedArraySpec(
shape=(highestIndex+1,), dtype=np.int32, minimum=0, maximum=highestIndex, name='action')
Therefore my actions are…

Ling
- 449
- 6
- 21