I have a list of sentences that looks like this:
["Hello all, how are you doing?", "Hi all, wassup", "Namaste", "Bonjour, ca va", "Privet, kak dela?"...]
I want to count the number of words in each sentence and plot a histogram.
When I compute the count for a single item, like this:
seq = []
seq.append(len(X_train[0].split()))
seq
it gives me the expected result. However, when I try it on the whole X_train list of 28 sentences:
seq = [len(sentence.split()) for sentence in X_train]
I get the following error:
AttributeError Traceback (most recent call last)
<ipython-input-100-d9dec14bd2dd> in <module>()
----> 1 num_words = [len(sentence.split()) for sentence in X_train]
2 #pd.Series(seq_len).hist(bins = 30)
<ipython-input-100-d9dec14bd2dd> in <listcomp>(.0)
----> 1 num_words = [len(sentence.split()) for sentence in X_train]
2 #pd.Series(seq_len).hist(bins = 30)
AttributeError: 'float' object has no attribute 'split'
I don't understand why. Could you explain?
Thanks
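The script works fine for the example you posted, but the 28-item X_train list apparently contains a float, and floats have no split() method. I suggest converting each sentence to a string before splitting: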
seq = [len(str(sentence).split()) for sentence in X_train]
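To see which entries are causing the error, you could check the types first. The snippet below is only a sketch: the X_train here is a made-up stand-in, and I am assuming the float is a missing value (NaN), which is a common cause but not something the question confirms. It also shows that str() turns NaN into the text "nan", which is then counted as one word, so you may prefer to count such entries as zero.

```python
# Hypothetical stand-in for X_train: a few sentences plus one missing value (NaN).
X_train = ["Hello all, how are you doing?", "Hi all, wassup", float("nan"), "Namaste"]

# Locate the entries that are not strings; these are the ones .split() fails on.
bad = [(i, x) for i, x in enumerate(X_train) if not isinstance(x, str)]
print(bad)

# The str() workaround: str(nan) is "nan", so a missing value counts as one word.
with_str = [len(str(s).split()) for s in X_train]
print(with_str)

# Alternative: count any non-string entry as zero words.
word_counts = [len(s.split()) if isinstance(s, str) else 0 for s in X_train]
print(word_counts)
```

Which of the two is right depends on whether a missing sentence should show up in your histogram at all. You could also drop those entries before counting.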