I have the DataFrame below. The actual data is huge and has many variations.
         Date        ID      Code   Value  Value1 Value2    Value3
0  1945-12-30  H0010603  ZZZ008-2  zzz=ID  AAC=10    NaN       NaN
1  1945-12-30  H0010603  ZZZ008-2  zzz=ID  AAC=01    NaN       NaN
2  1945-12-30  H0010603  ZZZ008-2     NaN     NaN  VEC=1       NaN
3  1945-12-30  H0010603  ZZZ008-2     NaN     NaN  VEC=2  AAC= 1 A
4  1945-12-30  H0010603  ZZZ008-2     NaN     NaN  VEC=3  AAC= 1 A
This is the final expected data.
         Date        ID      Code  zzz  AAC  VEC AAC.1
0  1945-12-30  H0010603  ZZZ008-2   ID   10  NaN   NaN
1  1945-12-30  H0010603  ZZZ008-2   ID   01  NaN   NaN
2  1945-12-30  H0010603  ZZZ008-2  NaN  NaN    1   NaN
3  1945-12-30  H0010603  ZZZ008-2  NaN  NaN    2   1 A
4  1945-12-30  H0010603  ZZZ008-2  NaN  NaN    3   1 A
I need to update the column names using the values in the cells.
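import pandas as pd

# xlPath is the path to the workbook (defined elsewhere)
df = pd.read_excel(xlPath, 0)
writer = pd.ExcelWriter(xlPath,
                        engine='xlsxwriter',
                        date_format='mm/dd/yyyy',
                        datetime_format='mm/dd/yyyy')
df = df.fillna('')
for ColumnName, values in df.items():  # iteritems() in older pandas
    for index, value in enumerate(values):
        if '=' in str(value):
            # keep the part after '=' as the cell value
            df.at[index, ColumnName] = str(value).split('=')[1]  # set_value() in older pandas
            # use the part before '=' as the new column name
            NewColumnName = str(value).split('=')[0]
            df.rename(columns={ColumnName: NewColumnName}, inplace=True)
df.to_excel(writer, index=False)
writer.close()  # writer.save() in older pandas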
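But this goes wrong because one column name ends up duplicated. So I thought I could loop through df, take the first valid value in each column, and put it in a list.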
AllColumns = list(df.columns.values)
NewColNameList = []
for ColumnName, values in df.items():  # iteritems() in older pandas
    a = 0
    for index, value in enumerate(values):
        while a < len(values):
            if '=' in str(value):
                if value != '':
                    print(index, values)
                    NewColNameList.append(value)
                    break
            a += 1
print(NewColNameList)
But my while loop does not work the way I imagined. Any help getting the desired DataFrame is appreciated.
IIUC:
Dataset:
Solution:
Result:
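The transformation asked for above can be sketched roughly as follows. This is one possible approach, not necessarily the code the answer originally used. It assumes that every "key=value" cell within a column shares the same key, that such cells are strings, and that a repeated key should get a pandas-style numeric suffix (AAC, AAC.1). The helper names split_cell and rename_from_cells are hypothetical, and the sample frame is rebuilt by hand from the question instead of being read from Excel.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (built by hand, not read from Excel).
df = pd.DataFrame({
    "Date": ["1945-12-30"] * 5,
    "ID": ["H0010603"] * 5,
    "Code": ["ZZZ008-2"] * 5,
    "Value": ["zzz=ID", "zzz=ID", np.nan, np.nan, np.nan],
    "Value1": ["AAC=10", "AAC=01", np.nan, np.nan, np.nan],
    "Value2": [np.nan, np.nan, "VEC=1", "VEC=2", "VEC=3"],
    "Value3": [np.nan, np.nan, np.nan, "AAC= 1 A", "AAC= 1 A"],
})


def split_cell(value, sep="="):
    """Return (key, value) for 'key=value' strings, else (None, value)."""
    if isinstance(value, str) and sep in value:
        key, _, rest = value.partition(sep)
        return key.strip(), rest.strip()
    return None, value


def rename_from_cells(frame, sep="="):
    """Hypothetical helper: name each column after the key found in its cells."""
    columns = {}
    seen = {}  # how many times each final column name has been used so far
    for col in frame.columns:
        pairs = [split_cell(v, sep) for v in frame[col]]
        keys = [k for k, _ in pairs if k is not None]
        # Assumption: the first key found names the column; keep the old
        # name when the column has no 'key=value' cells at all.
        name = keys[0] if keys else col
        count = seen.get(name, 0)
        seen[name] = count + 1
        unique = name if count == 0 else f"{name}.{count}"
        columns[unique] = pd.Series(
            [v for _, v in pairs], index=frame.index, dtype=object
        )
    return pd.DataFrame(columns)


result = rename_from_cells(df)
print(result)
```

Building a new frame column by column avoids renaming columns while iterating over them, which is what produced the duplicated name in the first attempt. The values stay strings, so "01" keeps its leading zero.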