Getting the length of text in a DataFrame in Python

Posted 2024-09-29 23:27:40


I have a DataFrame:

    Text                                             target
    #Coronavirus is a cover for something else. #5...   D
    Crush the One Belt One Road !! \r\n#onebeltonf...   B
    RT @nickmyer: It seems to be, #COVID-19 aka #c...   B
    @Jerusalem_Post All he knows is how to destroy...   B
    @newscomauHQ Its gonna show us all. We will al...   B

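The Text column contains tweets. I am trying to get a count for each string in the Text column and store that count in the DataFrame. This is what I tried: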

d = pd.read_csv('5gCoronaFinal.csv')
d['textlength'] = [len(t) for t in d['Text']]

But it always gives me this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-42-dabcab1de7b2> in <module>
----> 1 d['textlength'] = [len(t) for t in d['Text']]

<ipython-input-42-dabcab1de7b2> in <listcomp>(.0)
----> 1 d['textlength'] = [len(t) for t in d['Text']]

TypeError: object of type 'float' has no len()

I then tried converting t to an integer, like this:

d['textlength'] = [len(int(t)) for t in d['Text']]

But that gave me a different error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-43-9ae56e5f7912> in <module>
----> 1 d['textlength'] = [len(int(t)) for t in d['Text']]

<ipython-input-43-9ae56e5f7912> in <listcomp>(.0)
----> 1 d['textlength'] = [len(int(t)) for t in d['Text']]

ValueError: invalid literal for int() with base 10: '#Coronavirus is a cover for something else. #5g is being rolled out and they are expecting lots to...what? Die from #60ghz +. They look like they are to keep the cold in? #socialdistancing #covid19 #

I need help, thanks.


1 answer
User
#1 · Posted 2024-09-29 23:27:40

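You can use the str accessor for vectorized string operations. In this case, you can chain str.split and str.len, which counts the whitespace-separated words in each tweet: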

df['Text_length'] = df.Text.str.split().str.len()

print(df)

                                                Text target  Text_length
0  #Coronavirus is a cover for something else. #5...      D            8
1  Crush the One Belt One Road !! \r\n#onebeltonf...      B            8
2      RT @nickmyer: It seems to be, #COVID-19 aka #      B            9
3     @Jerusalem_Post All he knows is how to destroy      B            8
4     @newscomauHQ Its gonna show us all. We will al      B            9
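
The TypeError in the question ("object of type 'float' has no len()") suggests that at least one value in Text is a float, most likely NaN from an empty cell in the CSV. That is an assumption, since the file itself is not shown. The sketch below uses a small hypothetical DataFrame (not the real 5gCoronaFinal.csv) to show how the str accessor handles such a row, and how the character count differs from the word count used above. The column names char_length, word_count and char_length_filled are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the real CSV; the NaN row mimics
# an empty cell, which pandas represents as a float.
df = pd.DataFrame({
    "Text": ["Crush the One Belt One Road !!", np.nan, "RT @nickmyer: It seems to be"],
    "target": ["B", "D", "B"],
})

# Number of characters per tweet; a missing value stays NaN instead of raising.
df["char_length"] = df["Text"].str.len()

# Number of whitespace-separated words per tweet, as in the answer above.
df["word_count"] = df["Text"].str.split().str.len()

# Optional: treat missing text as an empty string to get plain integer counts.
df["char_length_filled"] = df["Text"].fillna("").str.len()

print(df)
```

If the goal is the number of characters rather than words, str.len() alone is probably what is wanted. Whether to keep NaN or fill it with an empty string depends on how missing tweets should be treated downstream.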
