Accuracy-based Learning Classifier Systems with Rule Combining
Detailed description of the xcs-rc Python project
xcs-rc
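Accuracy-based Learning Classifier Systems with a Rule Combining mechanism, XCS-RC for short, implemented in Python 3 and loosely based on Martin Butz's XCS Java code (2001). Read my PhD thesis here for the complete algorithmic description.
Rule Combining is a novel function that employs inductive reasoning and replaces all Darwinian genetic operations such as mutation and crossover. It can handle binary and real-valued inputs, and it reaches a better correctness rate and population size more quickly than several XCS instances. My earlier papers comparing them can be found here and here.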
Relevant links
Installation
pip install xcs-rc
Initialization
import xcs_rc
agent = xcs_rc.Agent()
Classic Reinforcement Learning cycle
from random import randint
# input: binary string, e.g., "100110" or decimal array
state = str(randint(0, 1))
# pick methods: 0 = explore, 1 = exploit, 2 = explore_it
action = agent.next_action(state, pick_method=1)
# determine reward and apply it, e.g.,
reward = agent.maxreward if action == int(state[0]) else 0.0
agent.apply_reward(reward)
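The single-bit reward above is the smallest possible case. For the 6-bit multiplexer, the environment side of the same cycle might look like the sketch below. The helper names `random_state`, `mux_answer` and `mux_reward` are hypothetical and not part of xcs-rc, and the `maxreward` default of 1000.0 is an assumption; in practice one would pass `agent.maxreward`.

```python
from random import choice

def random_state(n_bits=6):
    """Return a random binary string such as "100110"."""
    return "".join(choice("01") for _ in range(n_bits))

def mux_answer(state, n_address=2):
    """Multiplexer: the first n_address bits select one of the remaining data bits."""
    address = int(state[:n_address], 2)
    return int(state[n_address + address])

def mux_reward(state, action, maxreward=1000.0):
    """Full reward for the correct output bit, nothing otherwise (assumed scheme)."""
    return maxreward if action == mux_answer(state) else 0.0

state = "100110"
# address bits "10" -> index 2 of the data bits "0110" -> 1
print(mux_answer(state))  # 1
```

Inside the learning cycle, `mux_reward(state, action, agent.maxreward)` would then replace the one-line reward expression before calling `agent.apply_reward(...)`.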
Partially Observable Markov Decision Process (POMDP) environment
# create env and agent
env = xcs_rc.MarkovEnv('maze4') # maze4 is built-in
env.add_agents(num=1, tcomb=100, xmax=50)
agent = env.agents[0]
for episode in range(8000):
    steps = env.one_episode(pick_method=2)  # returns the number of taken steps
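Step counts from single episodes tend to be noisy, so a running average is a convenient way to judge whether the agent is improving. The sketch below is generic Python, not part of xcs-rc; `moving_average` is a hypothetical helper, and the numbers are stand-ins for the values that `env.one_episode(...)` would return.

```python
from collections import deque

def moving_average(values, window=100):
    """Yield the mean of the last `window` values after each new value."""
    recent = deque(maxlen=window)
    for v in values:
        recent.append(v)
        yield sum(recent) / len(recent)

# stand-in numbers; in the loop above these would be collected from env.one_episode(...)
steps_history = [40, 30, 20, 10]
smoothed = list(moving_average(steps_history, window=2))
print(smoothed)  # [40.0, 35.0, 25.0, 15.0]
```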
Data classification
agent.train(X_train, y_train)
cm = agent.test(X_test, y_test) # returns the confusion matrix
preds, probs = agent.predict(X) # returns lists of predictions and probabilities
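To cross-check the predictions independently of the library, the (actual, predicted) pairs can be tallied by hand. This is a minimal sketch in plain Python; `confusion_counts` is a hypothetical helper, the label lists are made-up stand-ins, and nothing here assumes the layout of the confusion matrix that `agent.test` returns.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(y_true, y_pred))

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]  # stand-in for the predictions returned by agent.predict(X)
counts = confusion_counts(y_true, y_pred)
accuracy = sum(n for (t, p), n in counts.items() if t == p) / len(y_true)
print(counts[(1, 1)], accuracy)  # 2 0.6
```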
Print the population, save it to a CSV file, or use append mode
agent.pop.print(title="Population")
agent.save('xcs_population.csv', title="Final XCS Population")
agent.save('xcs_pop_every_100_cycles.csv', title="Cycle: ###", save_mode='a')
Finally, insert rules into the population
# automatically load the last set (important for append mode)
agent.load("xcs_population.csv", empty_first=True)
agent.pop.add(my_list_of_rules) # from a list of classifiers
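Main parameters
XCS-RC parameters
tcomb: combining period, the number of learning cycles before the next rule combining
predtol: prediction tolerance, the maximum difference in prediction value between rules that may be combined
prederrtol: prediction error tolerance, the threshold for deleting inappropriately combined rules
How to set them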
agent.tcomb = 50 # perform rule combining every 50 cycles
agent.predtol = 20.0 # combines rules whose prediction value differences <= 20.0
agent.prederrtol = 10.0 # remove if error > 10.0, after previously below it
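To make the role of `predtol` concrete, here is a deliberately simplified sketch of combining two binary rules. It is not the library's implementation: the `Rule` class and `try_combine` function are hypothetical, averaging the two predictions is a simplification chosen for illustration, and the actual algorithm described in the thesis involves more checks than this.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    condition: str      # ternary string: '0', '1' or '#' (don't care)
    action: int
    prediction: float

def try_combine(a, b, predtol=20.0):
    """Return a generalized rule covering both a and b, or None if they do not qualify."""
    if a.action != b.action:
        return None
    if abs(a.prediction - b.prediction) > predtol:
        return None
    # keep positions where both rules agree, generalize the rest to '#'
    condition = "".join(x if x == y else "#" for x, y in zip(a.condition, b.condition))
    prediction = (a.prediction + b.prediction) / 2  # simplification for this sketch
    return Rule(condition, a.action, prediction)

r1 = Rule("10", 1, 1000.0)
r2 = Rule("11", 1, 990.0)
r3 = Rule("01", 1, 0.0)
print(try_combine(r1, r2))  # Rule(condition='1#', action=1, prediction=995.0)
print(try_combine(r1, r3))  # None
```

The first pair differs by 10.0 in prediction, which is within the tolerance, so the rules merge into the more general `1#`. The second pair differs by 1000.0 and is left alone.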
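Parameters from the original XCS
Everything related to mutation and crossover has been removed. The other parameters are kept and remain accessible (e.g., agent.alpha = 0.15).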
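For context, in classic XCS `alpha` scales the accuracy of classifiers whose prediction error exceeds a threshold. The sketch below shows that textbook formula. Whether xcs-rc applies it unchanged is an assumption, and `epsilon0` and `nu` are the usual XCS symbols used here as local names, not verified xcs-rc attribute names.

```python
def accuracy(error, epsilon0=10.0, alpha=0.15, nu=5.0):
    """Classic XCS accuracy: 1 below the error threshold, a steep power-law drop above it."""
    if error < epsilon0:
        return 1.0
    return alpha * (error / epsilon0) ** -nu

print(accuracy(5.0))   # 1.0
print(accuracy(20.0))  # alpha * 2**-5, roughly 0.0047
```

A larger `alpha` therefore leaves inaccurate classifiers with more influence, while a smaller one suppresses them faster.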
Results
Classical problems: multiplexer and Markov environment:
Flappy Bird from the PyGame Learning Environment:
YouTube:
CartPole-v0 benchmark from OpenAI Gym: