Converting multiple lists into a DataFrame

Posted 2024-09-30 02:17:02

I created these lists to store the values I generate in a for loop, which is shown a little further down:

neutralScore = []
lightPosScore = []
middlePosScore = []
heavyPosScore = []
lightNegScore = []
middleNegScore = []
heavyNegScore = []

Here is the loop:

# Convert the stored string scores to floats before bucketing them.
convertStringToInt = get_sent_score_comp_df.apply(float)
for score in convertStringToInt:
    if -0.09 <= score <= 0.09:
        neutralScore.append(score)
    elif 0.091 <= score <= 0.49:
        lightPosScore.append(score)

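I want to save my sentiment scores into these lists and then join them, so that I can convert them into a DataFrame and store that in a MySQL database. Is there an elegant way to do this?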

scores = pandas.DataFrame(data=neutralScore).append(lightPosScore, middlePosScore, heavyPosScore).append(
        lightNegScore,
        middleNegScore,
        heavyNegScore).columns = ['heavyPosScore', 'middlePosScore', 'lightPosScore', 'neutralScore', 'lightNegScore', 'middleNegScore', 'heavyNegScore']

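I know the column list has to be assigned in a separate statement, but this is what the code looks like so far.

This is what I have tried, and it does not work; it raises: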

Can only merge Series or DataFrame objects, a <class 'list'> was passed

That is understandable, but I can't think of a way to solve the problem right now.


1 Answer
User
#1 · Posted 2024-09-30 02:17:02

The question doesn't make entirely clear what you want the final DataFrame to look like, but this is how I would do it:

>>> import numpy as np
>>> import pandas as pd
>>> data = np.random.normal(0, 1, size=10)
>>> df = pd.DataFrame(data, columns=['score'])
>>> df
|    |     score |
|---:|----------:|
|  0 | -0.440304 |
|  1 | -0.597293 |
|  2 | -1.80229  |
|  3 | -1.65654  |
|  4 | -1.14571  |
|  5 | -0.760086 |
|  6 |  0.244437 |
|  7 |  0.828856 |
|  8 | -0.136325 |
|  9 | -0.325836 |

>>> df['neutralScore'] = (df.score <= 0.09) & (df.score >= -0.09)
>>> df['lightPosScore'] = (df.score >= 0.091) & (df.score <= 0.49)

The 'heavyPosScore', 'middlePosScore', 'lightNegScore', 'middleNegScore' and 'heavyNegScore' columns are built the same way.

>>> df
|    |      score | neutralScore   | lightPosScore   |
|---:|-----------:|:---------------|:----------------|
|  0 | -0.475571  | False          | False           |
|  1 |  0.109076  | False          | True            |
|  2 |  0.809947  | False          | False           |
|  3 |  0.595088  | False          | False           |
|  4 |  0.832727  | False          | False           |
|  5 | -1.30049   | False          | False           |
|  6 |  0.245578  | False          | True            |
|  7 |  0.0998278 | False          | True            |
|  8 |  0.20592   | False          | True            |
|  9 |  0.372493  | False          | True            |
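
As a rough sketch of how the remaining five columns could be filled in, the bounds can live in one dictionary and be applied with `Series.between`, which is inclusive on both ends by default. Only the `neutralScore` and `lightPosScore` ranges come from the question; every other bound below, the `BOUNDS` name, and the `add_flag_columns` helper are assumptions made up for illustration.

```python
import pandas as pd

# Only the first two ranges come from the question; the rest are assumed.
BOUNDS = {
    "neutralScore": (-0.09, 0.09),
    "lightPosScore": (0.091, 0.49),
    "middlePosScore": (0.491, 0.79),    # assumed
    "heavyPosScore": (0.791, 1.0),      # assumed
    "lightNegScore": (-0.49, -0.091),   # assumed
    "middleNegScore": (-0.79, -0.491),  # assumed
    "heavyNegScore": (-1.0, -0.791),    # assumed
}


def add_flag_columns(df, bounds=BOUNDS):
    """Return a copy of df with one boolean column per score category."""
    out = df.copy()
    for name, (low, high) in bounds.items():
        # between() includes both endpoints, like the >= / <= pairs above.
        out[name] = out["score"].between(low, high)
    return out


df = pd.DataFrame({"score": [-0.95, -0.6, -0.2, 0.0, 0.3, 0.6, 0.95]})
flags = add_flag_columns(df)
```

Note that with bounds written this way, a score such as 0.0905 falls between two ranges and ends up with every flag set to False.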

You can then easily filter by score type like this:

>>> df[df.lightPosScore]
|    |    score | neutralScore   | lightPosScore   |
|---:|---------:|:---------------|:----------------|
|  0 | 0.415629 | False          | True            |
|  2 | 0.104852 | False          | True            |
|  4 | 0.39739  | False          | True            |
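
If you would still rather keep the separate lists from the question and put them side by side, one possible approach is to wrap each list in a `pd.Series` before building the DataFrame. The lists usually have different lengths, and pandas aligns Series on their index and pads the shorter ones with NaN. The sample values below are made up, and only three of the seven lists are shown.

```python
import pandas as pd

# Made-up values standing in for what the question's loop would collect.
neutralScore = [0.0, -0.05]
lightPosScore = [0.1, 0.3, 0.45]
lightNegScore = [-0.2]

lists = {
    "neutralScore": neutralScore,
    "lightPosScore": lightPosScore,
    "lightNegScore": lightNegScore,
}

# A dict of Series (unlike a dict of plain lists) may have unequal lengths;
# missing positions are filled with NaN.
scores = pd.DataFrame(
    {name: pd.Series(values, dtype="float64") for name, values in lists.items()}
)
```

This gives one column per category, but the rows carry no meaning of their own, which is why a single score column plus a category column is usually easier to store in a database table.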

Edit

To get a single rating column, first define a function that returns the rating for a score, then apply it to the score column:

>>> def get_rating(score):
...     if -0.09 <= score <= 0.09:
...         return 'neutralScore'
...     elif 0.091 <= score <= 0.49:
...         return 'lightPosScore'
...     else:
...         return 'to be implemented'

>>> df['rating'] = df['score'].apply(get_rating)
>>> df
|    |      score | rating            |
|---:|-----------:|:------------------|
|  0 | -0.190816  | to be implemented |
|  1 |  0.495197  | to be implemented |
|  2 | -1.20576   | to be implemented |
|  3 | -0.711516  | to be implemented |
|  4 | -0.0606396 | neutralScore      |
|  5 |  0.0452575 | neutralScore      |
|  6 |  0.154754  | lightPosScore     |
|  7 | -0.506285  | to be implemented |
|  8 | -0.896066  | to be implemented |
|  9 |  0.523198  | to be implemented |
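
As an alternative to the hand-written function, `pd.cut` can assign all seven ratings in one call. This is a different technique from the `apply` approach above, shown only as a sketch: the bin edges other than ±0.09 and 0.49 are assumptions, as are the `EDGES` and `LABELS` names.

```python
import pandas as pd

# Bin edges: only -0.09, 0.09 and 0.49 come from the question; the rest are assumed.
EDGES = [-1.0, -0.79, -0.49, -0.09, 0.09, 0.49, 0.79, 1.0]
LABELS = [
    "heavyNegScore",
    "middleNegScore",
    "lightNegScore",
    "neutralScore",
    "lightPosScore",
    "middlePosScore",
    "heavyPosScore",
]

df = pd.DataFrame({"score": [-0.95, -0.6, -0.2, 0.0, 0.0905, 0.3, 0.6, 0.95]})

# Bins are right-closed, e.g. (-0.09, 0.09]; include_lowest=True also
# admits the lowest edge, -1.0, into the first bin.
df["rating"] = pd.cut(df["score"], bins=EDGES, labels=LABELS, include_lowest=True)
```

Because the bins share their edges, a score such as 0.0905 is rated `lightPosScore` instead of falling between two ranges. Two differences from the function above are worth knowing: a score of exactly -0.09 lands in `lightNegScore` because the bins are closed on the right, and any score outside [-1, 1] gets NaN.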
