How do I incorporate prior information in pomegranate? In other words: does pomegranate support incremental learning?

>>> from pomegranate.distributions import BetaDistribution
>>> # suppose a coin generated the following data, where 1 is head and 0 is tail
>>> data1 = [0, 0, 0, 1, 0, 1, 0, 1, 0, 0]
>>> # as usual, we fit a Beta distribution to infer the bias of the coin
>>> model = BetaDistribution(1, 1)
>>> model.summarize(data1)  # compute sufficient statistics
>>> # presume we have seen all the data available so far,
>>> # we can now estimate the parameters
>>> model.from_summaries()
>>> # this results in the following model (so far so good)
>>> model
{
    "class" :"Distribution",
    "name" :"BetaDistribution",
    "parameters" :[
        3.0,
        7.0
    ],
    "frozen" :false
}
>>> # now suppose the coin is flipped a few more times, getting the following data
>>> data2 = [0, 1, 0, 0, 1]
>>> # we would like to update the model parameters accordingly
>>> model.summarize(data2)
>>> # but this fits only data2, overriding the previous parameters
>>> model.from_summaries()
>>> model
{
    "class" :"Distribution",
    "name" :"BetaDistribution",
    "parameters" :[
        2.0,
        3.0
    ],
    "frozen" :false
}
>>> # however I want to get the result that corresponds to the following,
>>> # but ideally without having to "drag along" data1
>>> data3 = data1 + data2
>>> model.fit(data3)
>>> model  # this should be the final model
{
    "class" :"Distribution",
    "name" :"BetaDistribution",
    "parameters" :[
        5.0,
        10.0
    ],
    "frozen" :false
}

1 answer

User

#1 · Posted 2024-10-02 06:25:52

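The problem is actually from_summaries. All of the from_summaries methods are destructive: they turn the accumulated summaries into the distribution's parameters and then reset the summaries (in the case of the Beta distribution: self.summaries = [0, 0]). The summaries can be updated at any time to take in more observations, but the parameters cannot.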
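To make the destructive step concrete, here is a minimal pure-Python sketch. `ToyBeta` is a hypothetical stand-in written for illustration, not pomegranate's actual source; it only assumes the behaviour described above (counts accumulate in `summaries`, and `from_summaries` overwrites the parameters and resets the counts).

```python
class ToyBeta:
    """Hypothetical stand-in for illustration; not pomegranate's real class."""

    def __init__(self, alpha, beta):
        self.alpha = alpha
        self.beta = beta
        self.summaries = [0, 0]  # [count of 1s, count of 0s]

    def summarize(self, data):
        # Accumulate counts only; the parameters are left untouched.
        for x in data:
            if x == 1:
                self.summaries[0] += 1
            else:
                self.summaries[1] += 1

    def from_summaries(self):
        # Overwrite the parameters with the accumulated counts...
        self.alpha, self.beta = float(self.summaries[0]), float(self.summaries[1])
        # ...then throw the counts away. This reset is the destructive step.
        self.summaries = [0, 0]


data1 = [0, 0, 0, 1, 0, 1, 0, 1, 0, 0]
data2 = [0, 1, 0, 0, 1]

model = ToyBeta(1, 1)
model.summarize(data1)
model.from_summaries()
after_first = (model.alpha, model.beta)   # (3.0, 7.0)

model.summarize(data2)
model.from_summaries()
after_second = (model.alpha, model.beta)  # (2.0, 3.0): data1 has been forgotten
```

Under these assumptions the toy class reproduces the two outputs shown in the question.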

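I think this is a poor design. It would be better to treat the summaries as accumulators of observations, and the parameters as cached values derived from them.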
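The alternative design suggested here could look roughly like the sketch below. `AccumulatingBeta` and its `parameters` property are hypothetical names invented for this illustration and are not part of pomegranate; the counts are never reset, and the parameters are recomputed lazily whenever new data has arrived.

```python
class AccumulatingBeta:
    """Hypothetical design sketch; not part of pomegranate."""

    def __init__(self):
        self._counts = [0, 0]  # [count of 1s, count of 0s]; never reset
        self._cached = None    # derived parameters, recomputed on demand

    def summarize(self, data):
        ones = sum(1 for x in data if x == 1)
        self._counts[0] += ones
        self._counts[1] += len(data) - ones
        self._cached = None    # new observations invalidate the cache

    @property
    def parameters(self):
        # Derive the parameters from the counts only when they are stale.
        if self._cached is None:
            self._cached = (float(self._counts[0]), float(self._counts[1]))
        return self._cached


data1 = [0, 0, 0, 1, 0, 1, 0, 1, 0, 0]
data2 = [0, 1, 0, 0, 1]

acc = AccumulatingBeta()
acc.summarize(data1)
first = acc.parameters    # (3.0, 7.0)
acc.summarize(data2)
second = acc.parameters   # (5.0, 10.0), without passing data1 again
```

With this layout, reading the parameters is non-destructive, so further batches can be added at any time.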

If you do this:

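from pomegranate.distributions import BetaDistribution
data1 = [0, 0, 0, 1, 0, 1, 0, 1, 0, 0]
data2 = [0, 1, 0, 0, 1]
model = BetaDistribution(1, 1)
model.summarize(data1)  # accumulate sufficient statistics for the first batch
model.summarize(data2)  # add the second batch to the same summaries
model.from_summaries()  # estimate the parameters once, from both batches
model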

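You will find that this does produce the same result as using model.summarize(data1 + data2) followed by model.from_summaries(), that is, the parameters [5.0, 10.0] the question asks for.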
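The reason the two routes agree is that the summaries in this example are plain counts, and counts add up across batches. The following check uses only built-in Python; the helper `counts` is a hypothetical name introduced for this illustration.

```python
data1 = [0, 0, 0, 1, 0, 1, 0, 1, 0, 0]
data2 = [0, 1, 0, 0, 1]


def counts(data):
    """Return (number of 1s, number of 0s) for a list of coin flips."""
    heads = sum(data)
    return heads, len(data) - heads


h1, t1 = counts(data1)            # (3, 7)
h2, t2 = counts(data2)            # (2, 3)
batchwise = (h1 + h2, t1 + t2)    # summing per-batch counts
combined = counts(data1 + data2)  # counting the concatenated data
```

Both `batchwise` and `combined` come out as (5, 10), which is why summarizing batch by batch and summarizing everything at once lead to the same parameters, as long as from_summaries is called only after the last batch.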
