OpenAI Gym: examples from the official site

Tags: openAI_gym

Creating an environment, rendering it, and taking random actions

The example below uses CartPole-v0, which is only one of Gym's environments. Others include MountainCar-v0, MsPacman-v0 (requires the Atari dependency), and Hopper-v1 (requires the MuJoCo dependencies). All environments descend from the Env base class.

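import gym
env = gym.make('CartPole-v0')  # create the environment
env.reset()                    # reset to an initial state before stepping
for _ in range(1000):
    env.render()               # draw the current frame
    env.step(env.action_space.sample())  # take a random action
env.close()                    # close the rendering window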

Resetting the environment and reading the observation, reward, and done flag

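Calling env.step(action) passes an action to the environment and returns what happened once the environment has executed it: the new observation, the reward, a done flag that says whether the episode has ended, and an info dict with diagnostic data. env.reset() starts a new episode and returns the initial observation. This four-value return is the classic Gym API; releases from 0.26 onward return five values from step.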

import gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
env.close()  # close the rendering window once all episodes are done
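
To make the step contract concrete without installing Gym, here is a minimal sketch of a toy environment that follows the same reset/step shape as the loop above. ToyCounterEnv, its reward rule, and run_episode are hypothetical names invented for illustration; this is not a Gym class and it does not subclass gym.Env.

```python
import random


class ToyCounterEnv:
    """Hypothetical stand-in that mimics the classic Gym reset/step shape.

    The state is an integer position starting at 0. Action 1 moves right,
    action 0 moves left. The episode ends when |position| reaches `limit`.
    """

    def __init__(self, limit=3):
        self.limit = limit
        self.position = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.position = 0
        return self.position

    def step(self, action):
        # Apply the action, then report (observation, reward, done, info).
        self.position += 1 if action == 1 else -1
        done = abs(self.position) >= self.limit
        reward = 1.0  # assumed reward rule: one point per step taken
        info = {"limit": self.limit}
        return self.position, reward, done, info


def run_episode(env, max_steps=100, seed=None):
    """Run one episode with random actions; return (steps_taken, total_reward)."""
    rng = random.Random(seed)
    env.reset()
    total_reward = 0.0
    for t in range(max_steps):
        action = rng.choice([0, 1])
        _, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            return t + 1, total_reward
    return max_steps, total_reward


if __name__ == "__main__":
    steps, total = run_episode(ToyCounterEnv(limit=3), seed=0)
    print("Episode finished after {} timesteps, return {}".format(steps, total))
```

The driver loop only relies on reset returning an observation and step returning a four-element tuple, which is why the same loop shape works for any environment that honors that contract.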

Action space and observation space

Print the action space and the observation space:

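Discrete(2) means the environment's action space is discrete, with two valid actions: 0 and 1.
Box(4,) means the observation space is a one-dimensional vector of 4 numbers.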

import gym
env = gym.make('CartPole-v0')
print(env.action_space)
#> Discrete(2)
print(env.observation_space)
#> Box(4,)

You can also read the upper and lower bound of each dimension of the observation space:

print(env.observation_space.high)
#> array([ 2.4       ,         inf,  0.20943951,         inf])
print(env.observation_space.low)
#> array([-2.4       ,        -inf, -0.20943951,        -inf])
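
As a rough illustration of what these bounds mean, the sketch below checks whether an observation lies inside a box described by per-dimension low and high values. The in_box helper is hypothetical and works on plain Python lists rather than Gym's Box; the bounds are copied from the printed output above.

```python
import math

# Bounds copied from the printed output above.
HIGH = [2.4, math.inf, 0.20943951, math.inf]
LOW = [-2.4, -math.inf, -0.20943951, -math.inf]


def in_box(observation, low=LOW, high=HIGH):
    """Return True if every component lies within [low, high] for its dimension."""
    if len(observation) != len(low):
        return False
    return all(lo <= x <= hi for x, lo, hi in zip(observation, low, high))


if __name__ == "__main__":
    print(in_box([0.0, 1.5, 0.05, -3.0]))  # every component is within bounds
    print(in_box([2.5, 0.0, 0.0, 0.0]))    # first component exceeds 2.4
```

An infinite bound simply means that dimension is unconstrained, so only the first and third components can ever fall outside this particular box.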

Gym also lets you construct spaces yourself:

from gym import spaces
space = spaces.Discrete(8) # Set with 8 elements {0, 1, 2, ..., 7}
x = space.sample()
assert space.contains(x)
assert space.n == 8
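
The contract exercised by those assertions can be sketched in plain Python. ToyDiscrete below is a hypothetical stand-in, not Gym's implementation; it only illustrates the idea that a discrete space with n elements holds the integers 0 to n - 1.

```python
import random


class ToyDiscrete:
    """Hypothetical stand-in for a discrete space over {0, 1, ..., n - 1}."""

    def __init__(self, n, seed=None):
        if n <= 0:
            raise ValueError("n must be positive")
        self.n = n
        self._rng = random.Random(seed)

    def sample(self):
        # Draw one element uniformly at random.
        return self._rng.randrange(self.n)

    def contains(self, x):
        # Only integers in [0, n) belong to the space; bool is excluded on purpose.
        return isinstance(x, int) and not isinstance(x, bool) and 0 <= x < self.n


if __name__ == "__main__":
    space = ToyDiscrete(8, seed=0)
    x = space.sample()
    assert space.contains(x)
    assert space.n == 8
    print(x)
```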

All environments that ship with Gym

List every registered environment:

from gym import envs
print(envs.registry.all())
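
That call prints a long collection of spec objects. A small helper such as the hypothetical env_ids below can reduce it to a sorted list of ID strings. It assumes each spec exposes an id attribute, as Gym's EnvSpec objects do; the demo feeds it stand-in objects so that it runs without Gym installed.

```python
from types import SimpleNamespace


def env_ids(specs, prefix=""):
    """Return the sorted IDs of all specs whose ID starts with `prefix`."""
    return sorted(spec.id for spec in specs if spec.id.startswith(prefix))


if __name__ == "__main__":
    # Stand-in specs using IDs mentioned in this post. With classic Gym
    # installed, one would pass envs.registry.all() instead (assumed API).
    fake_specs = [
        SimpleNamespace(id="MountainCar-v0"),
        SimpleNamespace(id="CartPole-v0"),
        SimpleNamespace(id="MsPacman-v0"),
    ]
    print(env_ids(fake_specs))
    print(env_ids(fake_specs, prefix="M"))
```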

Charel_CHEN
