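easyagents-v1: Python package on PyPI

Reinforcement learning for practitioners.

Detailed description of the easyagents-v1 Python project

Reinforcement Learning for Practitioners (v1 alpha)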

Status: under active development; breaking changes may occur.

easyagents is a high-level reinforcement learning API, written in Python, that runs on top of OpenAI gym, using tf-Agents and OpenAI baselines.

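Use easyagents if

you are looking for an easy and simple way to get started with reinforcement learning
you have implemented your own environment and want to experiment with it
you want to mix and match different implementations and algorithms

Try it on Colab:

Cartpole on colab (introduction: the classic reinforcement learning example of balancing a stick on a cart)
Berater on colab (example of a custom environment and training; a gym environment based on a routing problem)
LineWorld on colab (implement your own environment; workshop example)

In collaboration with Oliver Zeigermann.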

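Ideas for v1

Guiding principles

Easily train, evaluate, and debug policies for (your own) gym environments, rather than "designing new algorithms"

Simple and consistent over "flexible and powerful"

Inspired by keras:

the same API for all algorithms
support for different implementations of the same algorithm

Scenarios

Simple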

agent = PpoAgent( "LineWorld-v0" )
agent.train( SingleEpisode() )
agent.train()
agent.save(...)
agent.load(...)
agent.play()

Advanced

agent = PpoAgent( "LineWorld-v0", fc_layers=(500,250,50) )
agent.train( train=[Fast(), ModelCheckPoint(), ReduceLROnPlateau(), TensorBoard()],
             play=[JupyterStatistics(), JupyterRender(), Mp4()],
             api=[AgentApi()] )

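Design ideas

Separate the "public API" from the concrete implementation using a frontend/backend architecture (inspired by scikit learn, matplotlib, keras)
Pluggable backends
Extensible through callbacks (inspired by keras), with separate callback types for training, evaluation, and monitoring
Pre-configurable, algorithm-specific train and play loops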
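
The frontend/backend split with pluggable backends and callbacks can be sketched roughly as follows. This is only an illustration of the idea, not the easyagents implementation: every name in it (Agent, Backend, DummyBackendA, DummyBackendB, BACKENDS, TrainCallback, LossLogger) is hypothetical, and the "loss" values are made-up stand-ins for a real training step.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Type


class TrainCallback:
    """Hypothetical training callback; the hook is a no-op by default."""

    def on_iteration_end(self, iteration: int, loss: float) -> None:
        pass


class LossLogger(TrainCallback):
    """Records the loss reported after each training iteration."""

    def __init__(self) -> None:
        self.losses: List[float] = []

    def on_iteration_end(self, iteration: int, loss: float) -> None:
        self.losses.append(loss)


class Backend(ABC):
    """Concrete implementation hidden behind the public front end."""

    @abstractmethod
    def train_iteration(self, iteration: int) -> float:
        """Run one training iteration and return its loss."""


class DummyBackendA(Backend):
    def train_iteration(self, iteration: int) -> float:
        return 1.0 / (iteration + 1)  # stand-in for a real loss value


class DummyBackendB(Backend):
    def train_iteration(self, iteration: int) -> float:
        return 0.5 ** iteration  # a different stand-in "implementation"


# Registry that makes backends pluggable by name (assumed design).
BACKENDS: Dict[str, Type[Backend]] = {"a": DummyBackendA, "b": DummyBackendB}


class Agent:
    """Front end: the same public API regardless of the backend chosen."""

    def __init__(self, env_name: str, backend: str = "a") -> None:
        self.env_name = env_name
        self._backend = BACKENDS[backend]()

    def train(self, callbacks: Optional[List[TrainCallback]] = None,
              num_iterations: int = 3) -> None:
        for i in range(num_iterations):
            loss = self._backend.train_iteration(i)
            # Notify every callback after each iteration.
            for callback in callbacks or []:
                callback.on_iteration_end(i, loss)


logger = LossLogger()
Agent("LineWorld-v0", backend="b").train([logger])
print(logger.losses)  # [1.0, 0.5, 0.25]
```

Switching the backend argument changes the implementation without changing the calling code, which is the point of keeping the public API separate.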

Installation

Install from PyPI using pip:

pip install easyagents-v1

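Vocabulary

Here is a list of terms from the reinforcement learning space, explained in a colloquial way. The explanations are informal and only meant to convey the general idea. If you find that they are wrong or that a term is missing, please let me know. The list only contains terms that are actually used in this project.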

term	explanation
action	A game command to be sent to the environment. Depending on the game engine, actions can be discrete (like left/right/up/down buttons) or continuous (like 'move 11.2 degrees to the right').
batch	A subset of the training examples. Typically the training examples are split into batches of equal size.
episode	1 game played. A sequence of (state, action, reward) from an initial game state until the game ends.
environment (aka game engine)	The game engine, containing the business logic for your problem. RL algorithms create an instance of the environment and play against it to learn a policy.
epoch	1 full training step over all examples. A forward pass followed by a backpropagation for all training examples (batches).
iterations	The number of passes needed to process all batches (= #training_examples / batch_size).
observation (aka game state)	All information needed to represent the current state of the environment.
optimal policy	A policy that 'always' reaches the maximum number of points. Finding good policies for a game about which we know (almost) nothing else is the goal of reinforcement learning. Real-life algorithms typically don't find an optimal policy, but strive for a local optimum.
policy (aka gaming strategy)	The 'stuff' we want to learn. A policy maps the current game state to an action. Policies can be very dumb, like one that randomly chooses an arbitrary action, independent of the current game state. Or they can be clever, like one that maximizes the reward over the whole game.
training example	A state together with the desired output of the neural network. For an actor network that's (state, action), for a value network (state, value).
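
To make a few of these terms concrete, here is a small self-contained sketch. It does not use easyagents or gym: ToyLineEnv, random_policy, always_right_policy, play_episode and split_into_batches are hypothetical names invented for this illustration, and treating the steps of one episode as "training examples" is a simplification.

```python
import random
from typing import Callable, List, Tuple

Step = Tuple[int, int, float]  # (state, action, reward)


class ToyLineEnv:
    """Hypothetical environment: walk along positions 0..size-1.
    The episode ends when the right edge is reached."""

    def __init__(self, size: int = 5) -> None:
        self.size = size
        self.position = 0

    def reset(self) -> int:
        self.position = 0
        return self.position  # the observation is just the position

    def step(self, action: int) -> Tuple[int, float, bool]:
        # Discrete actions: 0 = left, 1 = right.
        move = 1 if action == 1 else -1
        self.position = max(0, min(self.size - 1, self.position + move))
        done = self.position == self.size - 1
        reward = 1.0 if done else -0.1
        return self.position, reward, done


def random_policy(observation: int) -> int:
    """A 'dumb' policy: ignores the game state entirely."""
    return random.choice([0, 1])


def always_right_policy(observation: int) -> int:
    """A policy that happens to be optimal for this toy environment."""
    return 1


def play_episode(env: ToyLineEnv, policy: Callable[[int], int],
                 max_steps: int = 100) -> List[Step]:
    """Play 1 game and return its (state, action, reward) sequence."""
    episode: List[Step] = []
    observation = env.reset()
    for _ in range(max_steps):
        action = policy(observation)
        next_observation, reward, done = env.step(action)
        episode.append((observation, action, reward))
        observation = next_observation
        if done:
            break
    return episode


def split_into_batches(examples: List[Step], batch_size: int) -> List[List[Step]]:
    """Split the examples into consecutive batches of at most batch_size."""
    return [examples[i:i + batch_size]
            for i in range(0, len(examples), batch_size)]


episode = play_episode(ToyLineEnv(size=5), always_right_policy)
total_reward = sum(reward for _, _, reward in episode)
batches = split_into_batches(episode, batch_size=2)
# 4 steps in the episode, so 4 / 2 = 2 iterations per epoch.
print(len(episode), round(total_reward, 2), len(batches))
```

Passing random_policy instead of always_right_policy gives longer episodes with lower total reward on average, which is the difference between a dumb and a clever policy in the sense of the table above.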

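Don't use EasyAgents if

you want to leverage implementation-specific advantages of an algorithm
you want to do distributed or parallel reinforcement learning

Note

This repository is under active development and at an early stage. Therefore anything may (and probably should) change.
If you have any difficulties installing or using easyagents, please let us know. We will do our best to help you.
Any ideas, help, suggestions, comments etc. on python / open source development / reinforcement learning / whatever are more than welcome. Thanks a lot in advance.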

easyagents-v1 1.0a2
