How can I drop certain columns and write the result to another file with pandas?

Posted 2024-09-27 07:22:39


My file is a text file that looks like this:

label dataset sw sf
1H 1H_2
NOESY_F1eF2e.nv
4807.69238281 4803.07373047
600.402832031 600.402832031
1H.L 1H.P 1H.W 1H.B 1H.E 1H.J 1H.U 1H_2.L 1H_2.P 1H_2.W 1H_2.B 1H_2.E 1H_2.J 1H_2.U vol int stat comment flag0 flag8 flag9
0 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
1 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
2 {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
3 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
4 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {2.H1'} 5.90291 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
5 {2.H1'} 5.90291 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
6 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
7 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
8 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
9 {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0

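My code should take the first, second, eighth and ninth columns and write them out to a text file. However, I want columns 1 and 8 merged into one column and columns 2 and 9 merged into another, and then I want all duplicate rows removed. I also want to add a third column that contains "0.3" on every row.

Here is my current code: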

import pandas as pd

result={}
df = pd.read_csv("peaks_ee.xpk", sep=" ", skiprows=5)

shift1 = df["1H.P"]
shift2 = df["1H_2.P"]

mask = ((shift1>5.1) & (shift1<6)) & ((shift2>7) & (shift2<8.25))

result = df[mask]
result = result[["1H.L","1H.P","1H_2.L","1H_2.P"]]

for col in result.columns:
    if col == ("1H.L") or col==( "1H_2.L"):
         result[col]=result[col].str.strip("{} ")

res = pd.lreshape(df, {'atom_name':['1H.L','1H_2.L'], 'ppm':['1H.P','1H_2.P']}).drop_duplicates()
res['new']=0.3

result.drop_duplicates(keep='first',inplace=True)

tclust_atom=open("tclust_ppm.txt","w+")

res.to_string(tclust_atom, header=False)

tclust_atom.close()

The output I want looks like this:

1.H1'  5.82020 0.3
2.H8  7.61004 0.3  
1.H8  8.13712 0.3
2.H1'  5.90291 0.3   
4.H1'  5.74125 0.3   
3.H6  7.53261 0.3
3.H1'  5.54935 0.3   
4.H8  7.49932 0.3
3.H1'  5.54935 0.3  
3.H6  7.53261 0.3 
6.H1'  5.54297 0.3   
5.H6  7.72158 0.3

But with the code as it stands, my output is:

0    0.1  ++  {0.0}  {}  0.05  0.1  ++  {0.0}  {}  0.05  {}  0  0  0  100.0  0  0.0   {1.H1'}  5.82020  0.3
1    0.1  ++  {0.0}  {}  0.05  0.1  ++  {0.0}  {}  0.05  {}  0  0  0  100.0  0  0.0    {2.H8}  7.61004  0.3
2    0.1  ++  {0.0}  {}  0.05  0.1  ++  {0.0}  {}  0.05  {}  0  0  0  100.0  0  0.0    {1.H8}  8.13712  0.3
5    0.1  ++  {0.0}  {}  0.05  0.1  ++  {0.0}  {}  0.05  {}  0  0  0  100.0  0  0.0   {2.H1'}  5.90291  0.3
10   0.1  ++  {0.0}  {}  0.05  0.1  ++  {0.0}  {}  0.05  {}  0  0  0  100.0  0  0.0    {3.H6}  7.53261  0.3
11   0.1  ++  {0.0}  {}  0.05  0.1  ++  {0.0}  {}  0.05  {}  0  0  0  100.0  0  0.0   {4.H1'}  5.74125  0.3
12   0.1  ++  {0.0}  {}  0.05  0.1  ++  {0.0}  {}  0.05  {}  0  0  0  100.0  0  0.0   {3.H1'}  5.54935  0.3
13   0.1  ++  {0.0}  {}  0.05  0.1  ++  {0.0}  {}  0.05  {}  0  0  0  100.0  0  0.0    {4.H8}  7.49932  0.3
26   0.1  ++  {0.0}  {}  0.05  0.1  ++  {0.0}  {}  0.05  {}  0  0  0  100.0  0  0.0    {5.H6}  7.72158  0.3
27   0.1  ++  {0.0}  {}  0.05  0.1  ++  {0.0}  {}  0.05  {}  0  0  0  100.0  0  0.0   {6.H1'}  5.54297  0.3

The last three columns are what I want, but how can I get rid of the other columns? And in this column:

{1.H1'}  
{2.H8}   
{1.H8}  
{2.H1'}  
{4.H1'}    
{3.H6}  
{3.H1'}    
{4.H8}  
{3.H1'}    
{3.H6}   
{6.H1'}     
{5.H6} 

how can I get rid of the curly braces?
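A minimal sketch of both points, assuming a small hand-built frame in place of the real file (the data below is hypothetical; the names `atom_name` and `ppm` are borrowed from the `lreshape` call in the question). The extra columns appear to come from reshaping the full `df` rather than the four selected columns, since columns that are not part of the reshape are carried along. Selecting the wanted columns first avoids that, and `Series.str.strip("{}")` removes the braces.

```python
import pandas as pd

# Hypothetical miniature of the peak table, with only a few of its columns
df = pd.DataFrame({
    "1H.L": ["{1.H1'}", "{2.H8}"],
    "1H.P": [5.82020, 7.61004],
    "1H.W": [0.05, 0.05],
    "1H_2.L": ["{2.H8}", "{1.H1'}"],
    "1H_2.P": [7.61004, 5.82020],
})

# Keep only the label/shift columns, so nothing else is carried along
wanted = df[["1H.L", "1H.P", "1H_2.L", "1H_2.P"]]

# Stack the two (label, shift) pairs under common column names
first = wanted[["1H.L", "1H.P"]].rename(
    columns={"1H.L": "atom_name", "1H.P": "ppm"})
second = wanted[["1H_2.L", "1H_2.P"]].rename(
    columns={"1H_2.L": "atom_name", "1H_2.P": "ppm"})
res = pd.concat([first, second], ignore_index=True)

# Strip the surrounding curly braces from the labels, then drop repeated rows
res["atom_name"] = res["atom_name"].str.strip("{}")
res = res.drop_duplicates().reset_index(drop=True)
```

Note that `str.strip("{}")` treats its argument as a set of characters to remove from both ends, so it only touches the outer braces and leaves the rest of the label intact.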


1 answer
User
#1 · Posted 2024-09-27 07:22:39

You can use this approach:

import pandas as pd

df = pd.read_csv("peaks_ee.xpk", sep=" ", skiprows=5)

# Create two dataframes holding the (label, shift) columns of each dimension
df1 = df[['1H.L', '1H.P']].copy()
df2 = df[['1H_2.L', '1H_2.P']].copy()

# Give the second pair the same column names so the frames stack cleanly
df2 = df2.rename(columns={'1H_2.L': '1H.L', '1H_2.P': '1H.P'})

# Stack the dataframes
df = pd.concat([df1, df2])

# Conditionally delete: keep only shifts between 5 and 6 ppm
# (this also drops the H8/H6 rows; widen or remove the filter to keep them)
df = df[(df['1H.P'] <= 6) & (df['1H.P'] >= 5)].copy()

# Remove curly braces
df['1H.L'] = df['1H.L'].str.strip('{}')

# Add a column of 0.3
df['new'] = 0.3

# Drop duplicates
df = df.drop_duplicates(keep='first')

Hope this helps.
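The answer stops before writing the file, so here is one possible end-to-end sketch. It is an illustration under assumptions, not the answerer's code: the `raw` string is a shortened, hypothetical stand-in for `peaks_ee.xpk`, `stack_peaks` is a made-up helper name, and the pairwise filter is copied from the question's `mask` and applied before stacking so that the H8 rows survive. The shifts are formatted to five decimals to match the desired output.

```python
import io
import pandas as pd

# Hypothetical stand-in for peaks_ee.xpk: five preamble lines, a header, three peaks
raw = """label dataset sw sf
1H 1H_2
NOESY_F1eF2e.nv
4807.69238281 4803.07373047
600.402832031 600.402832031
1H.L 1H.P 1H.W 1H_2.L 1H_2.P 1H_2.W
0 {1.H1'} 5.82020 0.05000 {2.H8} 7.61004 0.05000
1 {2.H8} 7.61004 0.05000 {1.H1'} 5.82020 0.05000
2 {1.H1'} 5.82020 0.05000 {1.H8} 8.13712 0.05000
"""


def stack_peaks(source):
    """Return unique (label, shift, 0.3) rows from a peak list (hypothetical helper)."""
    # Data rows have one more field than the header, so pandas uses
    # the leading peak number as the index
    df = pd.read_csv(source, sep=" ", skiprows=5)

    # Same pairwise filter as the question's mask, applied before stacking
    s1, s2 = df["1H.P"], df["1H_2.P"]
    kept = df[(s1 > 5.1) & (s1 < 6) & (s2 > 7) & (s2 < 8.25)]

    # Stack the second (label, shift) pair underneath the first
    names = {"1H_2.L": "1H.L", "1H_2.P": "1H.P"}
    res = pd.concat(
        [kept[["1H.L", "1H.P"]], kept[["1H_2.L", "1H_2.P"]].rename(columns=names)],
        ignore_index=True,
    )

    res["1H.L"] = res["1H.L"].str.strip("{}")               # remove curly braces
    res = res.drop_duplicates(keep="first").reset_index(drop=True)
    res["1H.P"] = res["1H.P"].map("{:.5f}".format)          # keep five decimals
    res["new"] = 0.3                                        # constant third column
    return res


# Write space-separated rows without header or index; a file path such as
# "tclust_ppm.txt" could be passed to to_csv instead of the in-memory buffer
out = io.StringIO()
stack_peaks(io.StringIO(raw)).to_csv(out, sep=" ", header=False, index=False)
```

With the three sample peaks above, the buffer should hold the lines `1.H1' 5.82020 0.3`, `2.H8 7.61004 0.3` and `1.H8 8.13712 0.3`. Using `to_csv` with `index=False` is one way to avoid the row numbers that `to_string` printed in the question's output.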
