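Reinforcement Learning Environments
This package simplifies RL experimentation by providing easy-to-generate RL environments that can be used to test RL algorithms.
It is still a work in progress. The hope is that it will serve as a useful tool for precise RL experimentation in a reproducible, lightweight, and scientific way.
Getting Started
Installation
Install using PyPI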
pip3 install rlenvs
Install from source
Examples:
Bandit
from rlenvs.bandits import MultiarmBernoulliBandit
env = MultiarmBernoulliBandit(arms=5)
reward, observation, is_finished, internal_state = env.step(0)  # picks arm 0
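The 4-tuple returned by step is enough to drive a simple agent. The following is a minimal sketch of an epsilon-greedy loop, not part of rlenvs: StandInBernoulliBandit, its probabilities argument, and run_epsilon_greedy are hypothetical names introduced here so the snippet runs without the package. Only the return order of step (reward, observation, is_finished, internal_state) is taken from the example above.

```python
import random


class StandInBernoulliBandit:
    """Hypothetical stand-in, NOT the rlenvs class: each arm pays 1.0 with a fixed probability."""

    def __init__(self, probabilities, seed=0):
        self.probabilities = list(probabilities)
        self.rng = random.Random(seed)

    def step(self, action):
        # Same tuple order as the README example:
        # reward, observation, is_finished, internal_state.
        reward = 1.0 if self.rng.random() < self.probabilities[action] else 0.0
        return reward, None, False, None


def run_epsilon_greedy(env, arms, steps=2000, epsilon=0.1, seed=1):
    """Return (pull counts, estimated mean reward per arm) after `steps` pulls."""
    rng = random.Random(seed)
    counts = [0] * arms
    means = [0.0] * arms
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(arms)  # explore: uniform random arm
        else:
            action = max(range(arms), key=lambda a: means[a])  # exploit: best estimate so far
        reward, _observation, _is_finished, _state = env.step(action)
        counts[action] += 1
        means[action] += (reward - means[action]) / counts[action]  # incremental mean update
    return counts, means


env = StandInBernoulliBandit([0.1, 0.2, 0.9, 0.3, 0.4])
counts, means = run_epsilon_greedy(env, arms=5)
```

If the real MultiarmBernoulliBandit returns the same tuple, it should be possible to pass it to run_epsilon_greedy in place of the stand-in.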
Tree MDP
from rlenvs.mdps import BalancedDenseTreeDeterministicMDP
env = BalancedDenseTreeDeterministicMDP(branching=3, depth=5)  # creates a tree with 3 choices each turn and a total of 5 turns.
reward, observation, is_finished, internal_state = env.step(3)  # takes action 3
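To make the shape of such a tree concrete: with 3 choices per turn and 5 turns, an episode lasts 5 steps and there are 3 ** 5 = 243 distinct action sequences. The sketch below walks one random episode on a stand-in tree. StandInTreeMDP, rollout, the 0-based action indexing, and the reward scheme are all assumptions made for illustration; none of them is confirmed by the package.

```python
import random


class StandInTreeMDP:
    """Hypothetical stand-in, NOT the rlenvs class: a balanced tree whose state is the path of actions taken."""

    def __init__(self, branching, depth):
        self.branching = branching
        self.depth = depth
        self.path = ()

    def step(self, action):
        # Assumption: actions are indexed 0 .. branching - 1.
        if not 0 <= action < self.branching:
            raise ValueError("action must be in range(branching)")
        if len(self.path) >= self.depth:
            raise RuntimeError("episode already finished")
        self.path = self.path + (action,)
        is_finished = len(self.path) == self.depth
        # Assumed reward scheme: 1.0 only at the leaf reached by always choosing action 0.
        reward = 1.0 if is_finished and not any(self.path) else 0.0
        return reward, self.path, is_finished, self.path


def rollout(env, rng):
    """Take uniformly random actions until the environment reports is_finished."""
    total, is_finished, steps = 0.0, False, 0
    while not is_finished:
        reward, _observation, is_finished, _state = env.step(rng.randrange(env.branching))
        total += reward
        steps += 1
    return total, steps


env = StandInTreeMDP(branching=3, depth=5)
total_reward, steps = rollout(env, random.Random(0))
leaf_count = env.branching ** env.depth  # 3 ** 5 = 243 distinct action sequences
```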
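The resulting environment looks like this:
Documentation:
Overview:
Overall, this package provides environments whose API closely resembles that of the environments provided by Deepmind and OpenAI (for interoperability).
This is the interface that every environment provides: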
class BaseEnvironment(object):
    """
    Implements the following methods inspired by both OpenAI gym and Deepmind Bsuite (dm_env).
    :initialise() -> observation, resets and initialises the environment and returns first observation:
    :step(action) -> reward(float), observation(Optional[Any]), is_finished(bool), state(Optional[Any]):
    :reset() -> "resets the environment":
    :undo() -> "goes to the previous state of the environment" reward, observation, is_finished(bool), state(Optional[Any]):
    :go_to_state(state) -> "goes to a specific state of the environment" is_finished(bool):
    :seed(int) -> "sets the seed":
    :render() -> "renders the environment":
    :get_specs() -> returns the custom specs of the environment:
    """
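As an illustration of how these methods might fit together, here is a minimal sketch of a toy environment that follows the method list above. CounterEnvironment, its target parameter, and the history stack used for undo are hypothetical choices made for this example; the package's own environments may implement the interface quite differently.

```python
import random
from typing import Any, List, Optional, Tuple

StepResult = Tuple[float, Optional[Any], bool, Optional[Any]]


class CounterEnvironment:
    """Hypothetical toy environment following the method list above; not part of rlenvs."""

    def __init__(self, target: int = 3):
        self.target = target
        self.rng = random.Random()  # unused by the toy dynamics; kept so seed() has something to set
        self.history: List[int] = []
        self.state = 0

    def initialise(self) -> int:
        self.reset()
        return self.state  # first observation

    def reset(self) -> None:
        self.state = 0
        self.history = []

    def step(self, action: int) -> StepResult:
        self.history.append(self.state)  # remember the previous state for undo()
        self.state += action
        return self._result()

    def undo(self) -> StepResult:
        if self.history:
            self.state = self.history.pop()
        return self._result()

    def go_to_state(self, state: int) -> bool:
        self.history.append(self.state)
        self.state = state
        return self.state >= self.target  # is_finished

    def seed(self, seed: int) -> None:
        self.rng.seed(seed)

    def render(self) -> str:
        return f"counter={self.state}/{self.target}"

    def get_specs(self) -> dict:
        return {"target": self.target}

    def _result(self) -> StepResult:
        is_finished = self.state >= self.target
        reward = 1.0 if is_finished else 0.0
        return reward, self.state, is_finished, self.state


env = CounterEnvironment(target=3)
first_observation = env.initialise()
after_step = env.step(2)
after_undo = env.undo()
```

Keeping a stack of previous states is one simple way to support undo; it costs memory proportional to the episode length, which is usually acceptable for small tabular environments like the ones shown here.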
Troubleshooting / FAQ:
Requirements:
In the future, this will hopefully be configurable.
python >= 3.6
networkx
graphviz
...
Copyright (C) Nikolai Rozanov, 2020 to present