<p>You can use featuretools to create a custom variable type, which can then be used with a custom primitive to generate the transform feature you want.</p>
<blockquote>
<p>Note: The operation that you want to do is actually a transform primitive, not an aggregation primitive.</p>
</blockquote>
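<p>To make the distinction in the note concrete, here is a minimal plain-Python sketch that does not use featuretools. The rows and the helper names <code>transform_len</code> and <code>aggregate_count</code> are hypothetical, invented only for illustration: a transform returns one value per input row, while an aggregation collapses several rows into one value per group.</p>

```python
from collections import defaultdict

# Hypothetical rows of (customer_id, products), invented for illustration only.
rows = [
    (1, ['a', 'b', 'c']),
    (1, ['a', 'c']),
    (2, ['b']),
]


def transform_len(rows):
    """Transform-style: one output value per input row."""
    return [len(products) for _, products in rows]


def aggregate_count(rows):
    """Aggregation-style: many rows collapse into one value per customer."""
    counts = defaultdict(int)
    for customer_id, _ in rows:
        counts[customer_id] += 1
    return dict(counts)


per_row = transform_len(rows)         # same length as rows
per_customer = aggregate_count(rows)  # one entry per distinct customer_id
print(per_row, per_customer)
```

<p>Taking the length of each list is of the first kind, which is why a transform primitive is the right fit here.</p>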
<p>Using your example, let's create a custom List type:</p>
<pre class="lang-py prettyprint-override"><code>from featuretools.variable_types import Variable


class List(Variable):
    # Custom variable type for columns whose values are Python lists
    type_string = "list"
</code></pre>
<p>Now, let's use the new List type to create a custom transform primitive, and generate features for a simple entityset that contains the List variable type.</p>
<pre class="lang-py prettyprint-override"><code>import pandas as pd
import featuretools as ft
from featuretools.primitives import make_trans_primitive
from featuretools.variable_types import Numeric


def len_list(values):
    # values is a pandas Series of lists; .str.len() returns each list's length
    return values.str.len()


LengthList = make_trans_primitive(function=len_list,
                                  input_types=[List],
                                  return_type=Numeric,
                                  description="length of a list related instance")

# Create a simple entityset containing list data
data = pd.DataFrame({"id": [1, 2, 3],
                     "products": [['a', 'b', 'c'], ['a', 'c'], ['b']]})

es = ft.EntitySet(id="data")
es = es.entity_from_dataframe(entity_id="customers",
                              dataframe=data,
                              index="id",
                              variable_types={
                                  'products': List  # Use the custom List type
                              })

feature_matrix, features = ft.dfs(entityset=es,
                                  target_entity="customers",
                                  agg_primitives=[],
                                  trans_primitives=[LengthList],
                                  max_depth=2)
</code></pre>
<p>You can now view the generated features, which include the feature that uses the custom transform primitive:</p>
<pre class="lang-py prettyprint-override"><code>feature_matrix.head()
</code></pre>
<pre><code>    LEN_LIST(products)
id
1                    3
2                    2
3                    1
</code></pre>
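<p>As a quick sanity check that does not depend on featuretools, the function behind the primitive can be run directly on a plain pandas Series. This is only a sketch: the Series below reuses the sample data from the answer, and the variable names <code>products</code> and <code>lengths</code> are chosen for illustration.</p>

```python
import pandas as pd


def len_list(values):
    # Series.str.len() returns the length of each element, including lists
    return values.str.len()


# Same sample data as the "products" column above, indexed by customer id
products = pd.Series([['a', 'b', 'c'], ['a', 'c'], ['b']], index=[1, 2, 3])

lengths = len_list(products)
print(lengths.tolist())
```

<p>Testing the function on its own this way makes it easier to tell a bug in the function apart from a problem in the entityset or DFS configuration.</p>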