I have a data frame:
Text target
#Coronavirus is a cover for something else. #5... D
Crush the One Belt One Road !! \r\n#onebeltonf... B
RT @nickmyer: It seems to be, #COVID-19 aka #c... B
@Jerusalem_Post All he knows is how to destroy... B
@newscomauHQ Its gonna show us all. We will al... B
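The Text column contains tweets. I am trying to get the length of each string in the Text column and store it in a new column of the data frame. I have tried:
d = pd.read_csv('5gCoronaFinal.csv')
d['textlength'] = [len(t) for t in d['Text']]
But it keeps giving me this error: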
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-42-dabcab1de7b2> in <module>
----> 1 d['textlength'] = [len(t) for t in d['Text']]
<ipython-input-42-dabcab1de7b2> in <listcomp>(.0)
----> 1 d['textlength'] = [len(t) for t in d['Text']]
TypeError: object of type 'float' has no len()
I also tried converting t to an integer, like this:
d['textlength'] = [len(int(t)) for t in d['Text']]
But that gave me a different error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-43-9ae56e5f7912> in <module>
----> 1 d['textlength'] = [len(int(t)) for t in d['Text']]
<ipython-input-43-9ae56e5f7912> in <listcomp>(.0)
----> 1 d['textlength'] = [len(int(t)) for t in d['Text']]
ValueError: invalid literal for int() with base 10: '#Coronavirus is a cover for something else. #5g is being rolled out and they are expecting lots to...what? Die from #60ghz +. They look like they are to keep the cold in? #socialdistancing #covid19 #
Any help would be appreciated, thanks.
You can use the .str accessor for vectorized string operations.
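As a minimal sketch of the accessor approach: the method the answer pointed to did not survive in the text, so Series.str.len() below is an assumption, though it is a real pandas method that fits the description. The float in the TypeError is most likely NaN, which is how pandas stores empty cells read from a CSV. The small frame here is a made-up stand-in for 5gCoronaFinal.csv, and the textlength_filled column name is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV in the question; the NaN row mimics an empty cell.
d = pd.DataFrame({
    "Text": ["#Coronavirus is a cover", np.nan, "Crush the One Belt One Road !!"],
    "target": ["D", "B", "B"],
})

# Count missing cells; these are float NaN values, which len() cannot handle.
print(d["Text"].isna().sum())

# Vectorized length via the .str accessor; missing text stays missing (NaN).
d["textlength"] = d["Text"].str.len()

# Alternative: treat missing text as an empty string to get integer counts.
d["textlength_filled"] = d["Text"].fillna("").str.len()

print(d)
```

Whether a missing tweet should count as length 0 or stay missing depends on what the counts are used for. Dropping those rows with d.dropna(subset=['Text']) is another option.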