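I'm doing some NLP work and am trying to use groupby with a lambda function to send a POST request and get a JSON object back. Unfortunately, it results in NaN. I need it to add the fields after "exploding" the field.
Custom function: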
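import time

import requests

def posTagger(text):
    # Note: `title` below is the module-level list defined further down, not the `text` argument
    post = { "text": title }
    endpoint = 'http://localhost:8001/api/postagger'
    r = requests.post(endpoint, json=post)
    r = r.json()
    time.sleep(1)
    return {"title": title, "result": r}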
posTagger returns:
[
  {
    "text": "Contemporary Modern Soft Area Rugs Nonslip",
    "terms": [
      {
        "text": "Contemporary",
        "penn": "JJ",
        "tags": [
          "Adjective"
        ]
      },
      {
        "text": "Modern",
        "penn": "NNP",
        "tags": [
          "ProperNoun",
          "Noun",
          "Singular"
        ]
      },
      {
        "text": "Soft",
        "penn": "NNP",
        "tags": [
          "ProperNoun",
          "Noun",
          "Singular"
        ]
      },
      {
        "text": "Area",
        "penn": "NN",
        "tags": [
          "Singular",
          "Noun",
          "ProperNoun"
        ]
      },
      {
        "text": "Rugs",
        "penn": "NNP",
        "tags": [
          "ProperNoun",
          "Noun",
          "Plural"
        ]
      },
      {
        "text": "Nonslip",
        "penn": "NNP",
        "tags": [
          "ProperNoun",
          "Noun",
          "Singular"
        ]
      }
    ]
  }
]
DataFrame:
title = [
'Contemporary Modern Soft Area Rugs Nonslip Velvet Home Room Carpet Floor Mat Rug',
'Traditional Distressed Area Rug 8x10 Large Rugs for Living Room 5x8 Gray Ivory',
'Shaggy Area Rugs Fluffy Tie-Dye Floor Soft Carpet Living Room Bedroom Large Rug'
]
df = pd.DataFrame(title, columns=['title'])
df
# Initial dataframe:
# title
# 0 Contemporary Modern Soft Area Rugs Nonslip...
# 1 Traditional Distressed Area Rug 8x10 Large...
# 2 Shaggy Area Rugs Fluffy Tie-Dye Floor Soft...
Here is my groupby with .apply:
df['result'] = pd.DataFrame(df.groupby(['title']).apply(lambda x: posTagger(x)))
df
# Resulting DataFrame after **.apply**:
# title result
# 0 Contemporary Modern Soft Area Rugs Nonslip Vel... NaN
# 1 Traditional Distressed Area Rug 8x10 Large Rug... NaN
# 2 Shaggy Area Rugs Fluffy Tie-Dye Floor Soft Car... NaN
Here is my groupby with .transform:
df['result'] = pd.DataFrame(df.groupby(['title']).transform(lambda x: posTagger(x)))
df
# Resulting DataFrame after **.transform**:
# title result
# 0 Contemporary Modern Soft Area Rugs Nonslip Vel... {'title': ['Contemporary Modern Soft Area Rugs...
# 1 Traditional Distressed Area Rug 8x10 Large Rug... {'title': ['Contemporary Modern Soft Area Rugs...
# 2 Shaggy Area Rugs Fluffy Tie-Dye Floor Soft Car... {'title': ['Contemporary Modern Soft Area Rugs...
Note that the .transform result sent the same value multiple times. Why? And is .apply or .transform the better way to accomplish this?
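A likely reading of both symptoms, as a minimal sketch rather than a confirmed diagnosis: groupby(...).apply(...) returns a result indexed by the group keys (the titles), while df is indexed 0, 1, 2, so the assignment finds no matching labels and fills NaN. The repeated value in the .transform output is consistent with posTagger reading the module-level title list instead of its argument. The stand-in lambda and the short titles below are assumptions for illustration; no request is sent.

```python
import pandas as pd

df = pd.DataFrame({"title": ["b rug", "a rug", "c rug"]})

# Hypothetical stand-in for posTagger: returns a dict without any HTTP call
per_title = df.groupby("title").apply(lambda g: {"rows": len(g)})

# The result is indexed by title, not by the original 0..2 row labels
print(per_title.index.tolist())

# Column assignment aligns on index labels; none match, so every row is NaN
broken = df.copy()
broken["result"] = per_title

# Looking each title up in the per-title result restores the alignment
fixed = df.copy()
fixed["result"] = fixed["title"].map(per_title)
```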
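I'll discuss apply() here; there are a few things you need to think through. For the current function, to get that result (i.e., a dictionary), you can keep the function as written and change the code that calls it. Unless some of the titles are identical, you are not really grouping by title, so just use apply() without groupby().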
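A minimal sketch of that call, under stated assumptions: fake_pos_tagger is a hypothetical stand-in that mimics the shape of posTagger's return value without the HTTP request, and the titles are shortened. With the real posTagger, the function would also need to send its own argument rather than the module-level title list.

```python
import pandas as pd

def fake_pos_tagger(title):
    # Hypothetical stand-in for posTagger: no HTTP call, just splits on whitespace
    terms = [{"text": word} for word in title.split()]
    return {"title": title, "result": terms}

titles = [
    "Contemporary Modern Soft Area Rugs",
    "Traditional Distressed Area Rug",
    "Shaggy Area Rugs Fluffy",
]
df = pd.DataFrame(titles, columns=["title"])

# Series.apply calls the function once per row value and keeps the original
# index, so the new column lines up row by row
df["result"] = df["title"].apply(fake_pos_tagger)
```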
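This does not explode the dictionary. There are many ways to approach that. Now, if you really do want to use groupby().apply(), then receive the DataFrame group as x, operate on it, and return x. This is untested, but it is one way to think about the problem.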
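One way the groupby().apply() variant could look, as a sketch under assumptions: fake_pos_tagger and tag_group are hypothetical names, and no request is sent. The group key is read from group.name because, depending on the pandas version and the include_groups setting, the grouping column may not be present in the frame passed to the function.

```python
import pandas as pd

def fake_pos_tagger(title):
    # Hypothetical stand-in for posTagger: no HTTP call, just splits on whitespace
    terms = [{"text": word} for word in title.split()]
    return {"title": title, "result": terms}

def tag_group(group):
    # group.name holds the group key, which here is the title string
    tagged = fake_pos_tagger(group.name)
    out = group.copy()
    # One tagger call per distinct title, repeated for every row in the group
    out["result"] = [tagged] * len(out)
    return out

df = pd.DataFrame(
    ["Contemporary Modern Soft", "Traditional Distressed", "Shaggy Area Rugs"],
    columns=["title"],
)

# group_keys=False keeps the original row labels on the returned frame
tagged_df = df.groupby("title", group_keys=False).apply(tag_group)

# Assigning by index puts each result back on its own row
df["result"] = tagged_df["result"]
```

Returning the group with its original row labels is what lets the final assignment align, unlike returning a bare dictionary per group.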