Converting printed output into a single DataFrame

Posted 2024-05-20 02:44:34


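I have some code, and its printed output looks odd: every result is printed as its own one-row DataFrame. I would like to fix that.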

*Printed output

     Matching      Score
0  john carry  73.684211
       Matching  Score
0  alex midlane   80.0
       Matching  Score
0  alex midlane   50.0
      Matching      Score
0  robert patt  53.333333
      Matching      Score
0  robert patt  70.588235
      Matching  Score
0  david baker  100.0

*The format I need

  | Matching     |   Score    |
  | ------------ | -----------|
  | john carry   |  73.684211 |
  | alex midlane |  80.0      |
  | alex midlane |  50.0      |
  | robert patt  |  53.333333 |
  | robert patt  |  70.588235 |
  | david baker  |  100.0     |

*My code

import numpy as np
import pandas as pd
from rapidfuzz import process, fuzz
df = pd.DataFrame({
    "NameTest": ["john carry", "alex midlane", "robert patt", "david baker", np.nan, np.nan, np.nan],
    "Name": ["john carrt", "john crat", "alex mid", "alex", "patt", "robert", "david baker"]
})

NameTests = [name for name in df["NameTest"] if isinstance(name, str)]

for Name in df["Name"]:
    if isinstance(Name, str):
        match = process.extractOne(
            Name, NameTests,
            scorer=fuzz.ratio,
            processor=None,
            score_cutoff=10)
        data = {'Matching': [match[0]],
                'Score': [match[1]]}
    df1 = pd.DataFrame(data)

    print(df1)

I have tried many approaches, but I always get the same printed output.

Thanks for any suggestions.


2 answers

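You are creating a new DataFrame on every loop iteration. Instead, store the results in a dict defined outside the loop, and build the DataFrame from that dict once the loop has finished.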

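# Accumulate every result in one dict defined before the loop
data = {'Matching': [], 'Score': []}

for Name in df["Name"]:
    if isinstance(Name, str):
        match = process.extractOne(
            Name, NameTests,
            scorer=fuzz.ratio,
            processor=None,
            score_cutoff=10)
        # extractOne returns None when no choice reaches score_cutoff
        if match is not None:
            data['Matching'].append(match[0])
            data['Score'].append(match[1])

# Build the DataFrame once, after the loop
df1 = pd.DataFrame(data)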
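
For anyone who wants to try the accumulate-then-build pattern without rapidfuzz installed, here is a minimal sketch. The `matches` list is a hypothetical stand-in for the `process.extractOne` results, with values copied from the printed output in the question. Printing with `to_string(index=False)` drops the index column, which seems closer to the requested table, though that is an assumption about what the asker wants.

```python
import pandas as pd

# Stand-in for the process.extractOne results: (matched name, score) pairs
# copied from the printed output in the question (hypothetical variable).
matches = [
    ("john carry", 73.684211),
    ("alex midlane", 80.0),
    ("alex midlane", 50.0),
    ("robert patt", 53.333333),
    ("robert patt", 70.588235),
    ("david baker", 100.0),
]

# One dict of lists, filled inside the loop
data = {"Matching": [], "Score": []}
for name, score in matches:
    data["Matching"].append(name)
    data["Score"].append(score)

# A single DataFrame, built once after the loop
df1 = pd.DataFrame(data)

# index=False hides the 0..n-1 index column when printing
print(df1.to_string(index=False))
```

The key point is that `pd.DataFrame(...)` is called exactly once, so there is one header row and one table instead of six.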

You need an array or list to hold all the rows (I use a list here), because your code creates a new DataFrame on every loop iteration.

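# Collect one dict per row in a list defined before the loop
data = []
for Name in df["Name"]:
    if isinstance(Name, str):
        match = process.extractOne(
            Name, NameTests,
            scorer=fuzz.ratio,
            processor=None,
            score_cutoff=10)
        # extractOne returns None when no choice reaches score_cutoff
        if match is not None:
            print(match[0])
            data.append({'Matching': match[0],
                         'Score': match[1]})

# Build the DataFrame once from the list of row dicts
df1 = pd.DataFrame(data)
print(df1)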

This is the output:

[image: printed df1]
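
If you would rather keep building a one-row DataFrame on each iteration, as the question's loop does, another option is to collect those frames in a list and join them with `pd.concat`. This is only a sketch: the `rows` list is a hypothetical stand-in for the matcher results, using a few values from the question's output.

```python
import pandas as pd

# Hypothetical per-iteration results, standing in for the values that
# process.extractOne returns inside the question's loop.
rows = [("john carry", 73.684211), ("alex midlane", 80.0), ("david baker", 100.0)]

frames = []
for name, score in rows:
    # Same one-row DataFrame the question builds, but stored instead of printed
    frames.append(pd.DataFrame({"Matching": [name], "Score": [score]}))

# ignore_index=True renumbers the rows 0..n-1 instead of repeating index 0
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Building the DataFrame once from a list or dict, as in the two answers, is usually simpler and cheaper than concatenating many small frames, so this is mainly useful when the per-iteration frames already exist.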
