Updating the values in a string with values from a DataFrame

Posted 2024-09-29 00:19:09


Given the following DataFrame:

import pandas as pd

df = pd.DataFrame({'term': ['analys', 'applic', 'architectur', 'assess', 'item', 'methodolog', 'research', 'rs', 'studi', 'suggest', 'test', 'tool', 'viewer', 'work'],
                   'newValue': [0.810419, 0.631963, 0.687348, 0.810554, 0.725366, 0.742715, 0.799152, 0.599030, 0.652112, 0.683228, 0.711307, 0.625563, 0.604190, 0.724763]})

df = df.set_index('term')

print(df)

             newValue
term                 
analys       0.810419
applic       0.631963
architectur  0.687348
assess       0.810554
item         0.725366
methodolog   0.742715
research     0.799152
rs           0.599030
studi        0.652112
suggest      0.683228
test         0.711307
tool         0.625563
viewer       0.604190
work         0.724763

I am trying to replace the number after each "^" in the following string with the corresponding value from the DataFrame.

(analysi analys^0.8046919107437134 studi^0.6034331321716309 framework methodolog^0.7360332608222961 architectur^0.6806665658950806)^0.0625 (recommend suggest^0.6603200435638428 rs^0.5923488140106201)^0.125 (system tool^0.6207902431488037 applic^0.610009491443634)^0.25 (evalu assess^0.7828741073608398 test^0.6444937586784363)^0.5

Each new value should be looked up by the word that precedes the "^", so that I get:

(analysi analys^0.810419 studi^0.652112 framework methodolog^0.742715 architectur^0.687348)^0.0625 (recommend suggest^0.683228 rs^0.599030)^0.125 (system tool^0.625563 applic^0.631963)^0.25 (evalu assess^0.810554 test^0.711307)^0.5

Thanks in advance for your help!


1 answer

Answer 1 · posted 2024-09-29 00:19:09

The best approach I could come up with is to do this in several stages.

First, take the old string and extract every value that needs to be replaced. A regular expression can do this.

old_string = "(analysi analys^0.8046919107437134 studi^0.6034331321716309 framework methodolog^0.7360332608222961 architectur^0.6806665658950806)^0.0625 (recommend suggest^0.6603200435638428 rs^0.5923488140106201)^0.125 (system tool^0.6207902431488037 applic^0.610009491443634)^0.25 (evalu assess^0.7828741073608398 test^0.6444937586784363)^0.5"

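import re

# Matches a word, a caret, and a decimal number, e.g. "analys^0.8046919107437134".
# The group weights such as ")^0.0625" are skipped because ")" is not a word character.
pattern = re.compile(r"(\w+\^(0|[1-9]\d*)(\.\d+)?)")
# The pattern has three capturing groups, so pattern.findall(old_string)
# returns a list of tuples; keep only the outer group (the whole match).
matches = [m[0] for m in pattern.findall(old_string)]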
print("Matches:", matches)
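To illustrate why the `m[0]` step is needed, here is a small sketch on a shortened, made-up sample string (not the string from the question). It compares the pattern above with a variant that uses non-capturing groups, which is my own assumption about a possible simplification and not part of the original answer.

```python
import re

# Shortened, made-up sample in the same format as the question's string.
sample = "(evalu assess^0.78 test^0.64)^0.5"

# Pattern from the answer: three capturing groups, so findall yields tuples.
grouped = re.compile(r"(\w+\^(0|[1-9]\d*)(\.\d+)?)")
grouped_matches = grouped.findall(sample)
print(grouped_matches)

# Variant with non-capturing groups (?:...): findall yields plain strings.
flat = re.compile(r"\w+\^(?:0|[1-9]\d*)(?:\.\d+)?")
flat_matches = flat.findall(sample)
print(flat_matches)
```

With the non-capturing variant the list comprehension is no longer required, at the cost of a slightly noisier pattern.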

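Next, we build two dictionaries. The first maps each prefix (the word part, before the ^) to the whole matched value that is to be replaced. We then use it to build a second dictionary that maps each value to be replaced to its new value (taken from the DataFrame).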

prefix_dict = {}
for m in matches:
    pre, post = m.split('^')
    prefix_dict[pre] = m
print("Prefixes:", prefix_dict)

matches_dict = {}
for i, row in df.iterrows(): # df is the dataframe from the question
    if i in prefix_dict:
        old_val = prefix_dict[i]
        new_val = "%s^%s" % (i, row.newValue)
        matches_dict[old_val] = new_val
print("Matches dict:", matches_dict)
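One detail that may matter here: formatting a float with `%s` drops trailing zeros, so 0.599030 is written as 0.59903 rather than the 0.599030 shown in the question's expected output. The sketch below is a possible variant, assuming the DataFrame column has first been turned into a plain dict (for example with `df['newValue'].to_dict()`); the names `new_values` and the two sample matches are illustrative only.

```python
# Stand-in for df['newValue'].to_dict(); only a few terms are shown.
new_values = {"rs": 0.599030, "suggest": 0.683228, "work": 0.724763}

# Stand-in for the regex matches extracted from the old string.
matches = ["suggest^0.6603200435638428", "rs^0.5923488140106201"]

matches_dict = {}
for m in matches:
    # Split "word^number" at the caret; only the word is needed here.
    word, _, _old_number = m.partition("^")
    if word in new_values:
        # "%.6f" keeps six decimal places, including trailing zeros.
        matches_dict[m] = "%s^%.6f" % (word, new_values[word])

print(matches_dict)
print("%s" % 0.599030)  # shows the shorter form produced by %s
```

Looping over the matches instead of the DataFrame rows also avoids `iterrows`, although for a table this small the difference is unlikely to matter.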

Once that is done, we can loop over the items of the old value -> new value dictionary and replace every old value in the input string.

new_string = old_string
for key, val in matches_dict.items():
    new_string = new_string.replace(key, val)
print("New string:", new_string)
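As an alternative, the stages can probably be collapsed into a single pass with `re.sub` and a replacement function. This is a sketch rather than the answer's method: `update_weights` is a hypothetical helper name, the lookup table is assumed to be a plain dict (for example `df['newValue'].to_dict()`), and the six-decimal formatting is an assumption based on the expected output in the question.

```python
import re

def update_weights(text, new_values):
    """Replace the number after each word^ with the value from new_values."""
    # Group 1 captures the word; the number itself is matched but not captured.
    pattern = re.compile(r"(\w+)\^(?:0|[1-9]\d*)(?:\.\d+)?")

    def repl(match):
        word = match.group(1)
        if word not in new_values:
            # Unknown word: leave the original text untouched.
            return match.group(0)
        return "%s^%.6f" % (word, new_values[word])

    return pattern.sub(repl, text)

# Shortened excerpt of the question's string and the matching lookup values.
old = "(system tool^0.6207902431488037 applic^0.610009491443634)^0.25"
new_values = {"tool": 0.625563, "applic": 0.631963}
result = update_weights(old, new_values)
print(result)  # (system tool^0.625563 applic^0.631963)^0.25
```

Because every token is matched as a whole word and rewritten exactly once, this avoids the risk that a chained `str.replace` call touches text that merely contains one of the old values as a substring.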
