Need help with DataFrame.replace in Python pandas

Posted 2024-10-01 04:49:12

I'm new to Python and am trying to replace strings with pandas DataFrame.replace, but I've run into a problem. My tab-separated text file looks like this:

                            RepStr         KeyStr                ValStr
0                        S Connery      S Connery          Sean Connery
1                       S. Connery     S. Connery          Sean Connery
2                      Connery, S.    Connery, S.          Sean Connery
3        Connery, S; Blofeld, E.S.     Connery, S          Sean Connery
4   Connery; Moore, R.; ES Blofeld        Connery          Sean Connery
5                        R Moore R    Moore Roger                 Moore
6                         R. Moore       R. Moore           Roger Moore
7                        Moore, R.      Moore, R.           Roger Moore
8   ES Blofeld; Connery; Moore, R.     ES Blofeld  Ernst Stavro Blofeld
9            E.S. Blofeld; Connery   E.S. Blofeld  Ernst Stavro Blofeld
10                    E.S. Blofeld   E.S. Blofeld  Ernst Stavro Blofeld
11                         Blofeld        Blofeld  Ernst Stavro Blofeld
12          Blofeld, E.S.; Connery  Blofeld, E.S.  Ernst Stavro Blofeld

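I'm trying to replace matches in the RepStr column, using the KeyStr and ValStr columns as variable Key:Value pairs. It works when I pass literal values that match a whole cell.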

import pandas as pd
pipe_data = pd.read_csv('/content/sample_data/NStd.txt', sep='\t')
NStd = pd.DataFrame(pipe_data)
NStd.replace(to_replace={'RepStr':{'KeyStr': 'ValStr'}}, inplace=True)
NStd

How can I get the result I want?


1 answer
User
#1 · Posted 2024-10-01 04:49:12

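Create a replacement Series s, then use Series.replace with the optional argument regex=True to replace the values in RepStr with the corresponding values from s: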

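# Map each KeyStr to its ValStr (df is the DataFrame from the question, there called NStd)
s = df.set_index('KeyStr')['ValStr']
# Match a key only as a whole entry: at the start or after '; ', and before ';' or the end
s.index = r'(?:(?<=;\s)|(?<=^))' + s.index + r'(?=;|$)'
# Apply every pattern -> value pair to RepStr as a regular expression
df['RepStr'] = df['RepStr'].replace(s, regex=True)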

0                                        Sean Connery
1                                        Sean Connery
2                                        Sean Connery
3                  Sean Connery; Ernst Stavro Blofeld
4     Sean Connery; Roger Moore; Ernst Stavro Blofeld
5                                           R Moore R
6                                         Roger Moore
7                                         Roger Moore
8     Ernst Stavro Blofeld; Sean Connery; Roger Moore
9                  Ernst Stavro Blofeld; Sean Connery
10                               Ernst Stavro Blofeld
11                               Ernst Stavro Blofeld
12                 Ernst Stavro Blofeld; Sean Connery
Name: RepStr, dtype: object
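
For anyone who wants to try this without the input file, here is a minimal self-contained sketch of the same idea. The rows are copied from the table in the question. Two things here are my own additions rather than part of the answer above: the helper name anchored is hypothetical, and each key is passed through re.escape so that characters such as '.' are matched literally instead of as regex wildcards. A plain dict of pattern -> value is used in place of the Series.

```python
import re

import pandas as pd

# Rows copied from the question's table: (RepStr, KeyStr, ValStr).
rows = [
    ("S Connery", "S Connery", "Sean Connery"),
    ("S. Connery", "S. Connery", "Sean Connery"),
    ("Connery, S.", "Connery, S.", "Sean Connery"),
    ("Connery, S; Blofeld, E.S.", "Connery, S", "Sean Connery"),
    ("Connery; Moore, R.; ES Blofeld", "Connery", "Sean Connery"),
    ("R Moore R", "Moore Roger", "Moore"),
    ("R. Moore", "R. Moore", "Roger Moore"),
    ("Moore, R.", "Moore, R.", "Roger Moore"),
    ("ES Blofeld; Connery; Moore, R.", "ES Blofeld", "Ernst Stavro Blofeld"),
    ("E.S. Blofeld; Connery", "E.S. Blofeld", "Ernst Stavro Blofeld"),
    ("E.S. Blofeld", "E.S. Blofeld", "Ernst Stavro Blofeld"),
    ("Blofeld", "Blofeld", "Ernst Stavro Blofeld"),
    ("Blofeld, E.S.; Connery", "Blofeld, E.S.", "Ernst Stavro Blofeld"),
]
df = pd.DataFrame(rows, columns=["RepStr", "KeyStr", "ValStr"])


def anchored(key: str) -> str:
    """Hypothetical helper: match `key` literally, and only as a whole
    entry (at the start or after '; ', and before ';' or the end)."""
    return r"(?:(?<=;\s)|(?<=^))" + re.escape(key) + r"(?=;|$)"


# One regex pattern per KeyStr, mapped to its ValStr replacement.
mapping = {anchored(k): v for k, v in zip(df["KeyStr"], df["ValStr"])}

# Apply all pattern -> value pairs to the RepStr column.
result = df["RepStr"].replace(mapping, regex=True)
print(result.to_string())
```

With this data the escaping does not change the result, since every key that matches does so literally. It only guards against a key like 'S. Connery' accidentally matching some other text such as 'SX Connery'.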