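I'm very new to Python (using version 3.5 from Anaconda); my previous experience is with MATLAB. Any help is much appreciated, and if there is a simpler way to do this, please let me know.
I read and cleaned some data from PDF files produced by some lab equipment and appended it to a list: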
>>> print(outputdata)
[[['2.37701'], ['-'], ['-'], ['-'], ['-'], ['18.95276'], ['5.07365e-1']], [['2.75613'], ['-'], ['-'], ['-'], ['-'], ['16.99642'], ['4.10023e-1']], [['1.80527'], ['-'], ['-'], ['-'], ['-'], ['20.75384'], ['4.58238e-1']], [['1.58721'], ['-'], ['-'], ['-'], ['-'], ['18.06942'], ['3.81128e-1']], [['1.98336'], ['-'], ['-'], ['-'], ['-'], ['18.20776'], ['3.64733e-1']], [['1.75710'], ['-'], ['-'], ['-'], ['-'], ['23.03760'], ['4.36234e-1']], [['1.58967'], ['-'], ['-'], ['-'], ['-'], ['21.43884'], ['3.88509e-1']], [['2.37701'], ['-'], ['-'], ['-'], ['-'], ['18.95276'], ['5.07365e-1']]]
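I am trying to extract each element from every element of the list and save it to a new list. I also want to clean the data by stripping the brackets and quotes and keeping only the numbers. I need to do some operations on the result, so I plan to convert it to a numpy array and then add it to a DataFrame for easier export to Excel (I already have the export code). Each column vector corresponds to a specific header: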
Molecule = ["H2", "Ar", "Methane", "Ethane", "Ethylene", "Propane(C3H8)", "Propylene"]
Here is an example of the desired H2 data:
2.37701
2.75613
1.80527
1.58721
1.98336
1.75710
1.58967
2.37701
I first accomplished this with:
outputdatalist = [x[0] for x in outputdata]
which gives the following output:
[['2.37701'], ['2.75613'], ['1.80527'], ['1.58721'], ['1.98336'], ['1.75710'], ['1.58967'], ['2.37701']]
and then:
for row in outputdatalist:
    print(' '.join(row))  # I need to append this on every iteration
My not-so-successful approach was a double (triple?) for loop, as follows:
outputdatalist = []
for counter, elem in enumerate(Molecule):
    for counter1, elem1 in enumerate(outputdata):
        outputdatalist[counter] = [x[counter1] for x in outputdata]
I would then convert each outputdatalist[i] to an np array and pass it to pd.DataFrame, for example:
pd.DataFrame({Molecule[i]: outputdatalist[i]})
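Note that assigning to outputdatalist[counter] on an empty list raises an IndexError, so the loop cannot fill the list that way. A minimal sketch of the NumPy route described above, assuming the '-' placeholders should become NaN (that handling is an assumption, not something stated in the question) and using only two of the rows for brevity:

```python
import numpy as np
import pandas as pd

Molecule = ["H2", "Ar", "Methane", "Ethane", "Ethylene", "Propane(C3H8)", "Propylene"]

# Two rows copied from the question's outputdata, shortened for illustration
outputdata = [
    [['2.37701'], ['-'], ['-'], ['-'], ['-'], ['18.95276'], ['5.07365e-1']],
    [['2.75613'], ['-'], ['-'], ['-'], ['-'], ['16.99642'], ['4.10023e-1']],
]

arr = np.array(outputdata)   # string array of shape (rows, 7, 1)
arr = arr[:, :, 0]           # drop the trailing length-1 axis -> (rows, 7)

# Assumption: '-' means "no value", so map it to NaN before converting to float
arr = np.where(arr == "-", "nan", arr).astype(float)

# One column per molecule, ready for numeric work or export
df = pd.DataFrame(arr, columns=Molecule)
print(df["H2"])
```

Each column of arr (for example arr[:, 0] for H2) is then a plain float vector, much like a MATLAB column.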
You can use a nested list comprehension, which seems to be faster than a solution using apply (timed on a small DataFrame).
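The answer's original code did not survive on this page, so the following is only a sketch of what a nested list comprehension could look like for this data. The helper to_float is hypothetical, and treating '-' as NaN is an assumption; only two rows of the question's data are used.

```python
import pandas as pd

Molecule = ["H2", "Ar", "Methane", "Ethane", "Ethylene", "Propane(C3H8)", "Propylene"]

# Two rows copied from the question's outputdata, shortened for illustration
outputdata = [
    [['2.37701'], ['-'], ['-'], ['-'], ['-'], ['18.95276'], ['5.07365e-1']],
    [['2.75613'], ['-'], ['-'], ['-'], ['-'], ['16.99642'], ['4.10023e-1']],
]


def to_float(text):
    """Hypothetical helper: parse a number, mapping the '-' placeholder to NaN."""
    return float("nan") if text == "-" else float(text)


# Inner comprehension unwraps each one-element list and converts it;
# outer comprehension walks the rows.
values = [[to_float(cell[0]) for cell in row] for row in outputdata]

df = pd.DataFrame(values, columns=Molecule)
print(df)
```

Because the conversion happens while the list is built, the DataFrame is numeric from the start and no apply call is needed afterwards.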