Update values in a string using values from a DataFrame

import pandas as pd

df = pd.DataFrame({
    'term': ['analys', 'applic', 'architectur', 'assess', 'item', 'methodolog', 'research',
             'rs', 'studi', 'suggest', 'test', 'tool', 'viewer', 'work'],
    'newValue': [0.810419, 0.631963, 0.687348, 0.810554, 0.725366, 0.742715, 0.799152,
                 0.599030, 0.652112, 0.683228, 0.711307, 0.625563, 0.604190, 0.724763],
})
df = df.set_index('term')
print(df)

             newValue
term
analys       0.810419
applic       0.631963
architectur  0.687348
assess       0.810554
item         0.725366
methodolog   0.742715
research     0.799152
rs           0.599030
studi        0.652112
suggest      0.683228
test         0.711307
tool         0.625563
viewer       0.604190
work         0.724763

String to update:

(analysi analys^0.8046919107437134 studi^0.6034331321716309 framework methodolog^0.7360332608222961 architectur^0.6806665658950806)^0.0625 (recommend suggest^0.6603200435638428 rs^0.5923488140106201)^0.125 (system tool^0.6207902431488037 applic^0.610009491443634)^0.25 (evalu assess^0.7828741073608398 test^0.6444937586784363)^0.5

Expected result, with each term's weight taken from newValue:

(analysi analys^0.810419 studi^0.652112 framework methodolog^0.742715 architectur^0.687348)^0.0625 (recommend suggest^0.683228 rs^0.599030)^0.125 (system tool^0.625563 applic^0.631963)^0.25 (evalu assess^0.810554 test^0.711307)^0.5

1 answer

User

#1 · Posted 2024-09-29 00:19:09

The best approach I can come up with is to do this in several stages.

First, take the old string and extract every value that needs replacing. A regular expression can do this.

old_string = "(analysi analys^0.8046919107437134 studi^0.6034331321716309 framework methodolog^0.7360332608222961 architectur^0.6806665658950806)^0.0625 (recommend suggest^0.6603200435638428 rs^0.5923488140106201)^0.125 (system tool^0.6207902431488037 applic^0.610009491443634)^0.25 (evalu assess^0.7828741073608398 test^0.6444937586784363)^0.5"

import re

# Match tokens of the form word^number, e.g. "analys^0.8046919107437134".
# The inner groups are non-capturing, so findall() returns each full match
# as a plain string instead of a tuple of groups.
pattern = re.compile(r"\w+\^(?:0|[1-9]\d*)(?:\.\d+)?")
matches = pattern.findall(old_string)
print("Matches:", matches)

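Next, we build two dictionaries. The first maps each prefix (the word part before the ^) to the full token that should be replaced. We use it to build the second, which maps each old token to its new token, with the value taken from the DataFrame.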

prefix_dict = {}
for m in matches:
    pre, post = m.split('^')
    prefix_dict[pre] = m
print("Prefixes:", prefix_dict)

matches_dict = {}
for i, row in df.iterrows(): # df is the dataframe from the question
    if i in prefix_dict:
        old_val = prefix_dict[i]
        new_val = "%s^%s" % (i, row.newValue)
        matches_dict[old_val] = new_val
print("Matches dict:", matches_dict)

With that done, we can loop over the items of the old value -> new value dictionary and replace every old value in the input string.

new_string = old_string
for key, val in matches_dict.items():
    new_string = new_string.replace(key, val)
print("New string:", new_string)
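
As an alternative to the staged approach above, the lookup and the replacement can be folded into a single pass with `re.sub` and a callback. This is only a sketch under some assumptions: `new_values`, `swap_weight` and `update_weights` are hypothetical names, and the plain dict stands in for what `df['newValue'].to_dict()` would give on the DataFrame from the question (only a few terms are copied here).

```python
import re

# Hypothetical stand-in for df['newValue'].to_dict(); values copied from the question.
new_values = {
    'suggest': 0.683228,
    'rs': 0.599030,
    'tool': 0.625563,
    'applic': 0.631963,
}

# Capture the term before "^"; the number after it is matched but not captured.
token = re.compile(r"(\w+)\^(?:0|[1-9]\d*)(?:\.\d+)?")

def swap_weight(match):
    """Return term^new_value if the term is known, else the token unchanged."""
    term = match.group(1)
    if term in new_values:
        return "%s^%s" % (term, new_values[term])
    return match.group(0)

def update_weights(text):
    """Rewrite every term^weight token in text in one pass."""
    return token.sub(swap_weight, text)

sample = "(recommend suggest^0.6603200435638428 rs^0.5923488140106201)^0.125"
print(update_weights(sample))
```

Because each token is rewritten where it is matched, this avoids any risk of `str.replace` touching an old value that happens to appear inside a longer token. Group weights such as `)^0.125` are left alone, since no word character precedes the `^`. Note that floats are formatted without trailing zeros, so 0.599030 is written as 0.59903.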
