
python - Tensorflow 2.0 DQN agent issue with custom environment

Reposted. Author: 行者123. Updated: 2023-12-04 10:56:32

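So I've been following the DQN agent example/tutorial and set everything up the way the example does. The only difference is that I built my own custom Python environment and then wrapped it in TensorFlow. However, no matter how I shape my observation and action specs, I can't get it to work: it fails whenever I give it an observation and request an action. Here is the error I get: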

tensorflow.python.framework.errors_impl.InvalidArgumentError: In[0] is not a matrix. Instead it has shape [10] [Op:MatMul]


layer_parameters = (10,)  # one fully connected hidden layer with 10 units

learning_rate = 1e-3 # @param {type:"number"}
train_step_counter = tf.Variable(0)

#instantiate agent

optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=learning_rate)

env = SumoEnvironment(self._num_actions,self._num_states)
env2 = tf_py_environment.TFPyEnvironment(env)
q_net= q_network.QNetwork(env2.observation_spec(),env2.action_spec(),fc_layer_params = layer_parameters)

print("Time step spec")

agent = dqn_agent.DqnAgent(env2.time_step_spec(),
optimizer = optimizer,


class SumoEnvironment(py_environment.PyEnvironment):

    def __init__(self, no_of_Actions, no_of_Observations):

        # observation spec: a vector of 16 float32 values
        self._observation_spec = specs.TensorSpec(shape=(16,),dtype=np.float32,name='observation')
        # action spec: one int32 value, min is 0, max is the number of actions minus 1
        self._action_spec = specs.BoundedArraySpec(shape=(1,),dtype=np.int32,minimum=0,maximum=no_of_Actions-1,name='action')

        self._state = 0
        self._episode_ended = False


tf.Tensor([ 0. 0. 0. 0. 0. 0. 0. 0. -1. -1. -1. -1. 0. 0. 0. -1.], shape=(16,), dtype=float32)

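I have tried experimenting with the shape and depth of the Q-network, and it seems to me that the [10] in the error is related to the shape of my Q-network. Setting its layer parameters to (4,) produces the following error: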

tensorflow.python.framework.errors_impl.InvalidArgumentError: In[0] is not a matrix. Instead it has shape [4] [Op:MatMul]
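
The shape in the message appears to follow the width of the last hidden layer because the failing MatMul is the one feeding the output layer: its first input is the hidden activation, and it arrives as a rank-1 tensor with no batch dimension. Below is a minimal sketch of that failure outside TF-Agents, assuming TensorFlow 2.x in eager mode. The tensor names and the output width of 4 are illustrative and not taken from the question.

```python
import tensorflow as tf

# Hypothetical stand-ins: a hidden activation of width 10 that has lost its
# batch dimension, and an output-layer kernel mapping 10 units to 4 Q-values.
hidden = tf.zeros([10])
kernel = tf.zeros([10, 4])

try:
    tf.matmul(hidden, kernel)  # rank-1 first input: MatMul rejects it
    unbatched_failed = False
except (tf.errors.InvalidArgumentError, ValueError):
    unbatched_failed = True

# With a leading batch dimension the same product is well defined.
batched_out = tf.matmul(tf.expand_dims(hidden, axis=0), kernel)
print(unbatched_failed, batched_out.shape)
```

So the message points at a missing batch dimension rather than at the number or size of the layers.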

1 Answer

In your Python environment, you should define self._observation_spec as a BoundedArraySpec rather than a TensorSpec. After that, tf_py_environment.TFPyEnvironment(env) converts the Python environment into a TensorFlow environment.


Regarding python - Tensorflow 2.0 DQN agent issue with custom environment, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59141439/
