Performing a VLOOKUP-equivalent operation in a merged DataFrame

2 answers

User

Reply 1 · edited 2024-09-29 23:23:19

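Based on your example, I will assume that every new_ID entry is numeric unless no match was found.

So suppose your DataFrame looks like this (I don't know what the second column holds, so I filled it with 0):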

[Table tb1: example DataFrame; contents not preserved]

Next, we can check whether each row has an ID by using str.isnumeric() to test whether the new_ID column contains a number:

has_id = df1.new_ID.str.isnumeric()
>>> has_id

0     True
1     True
2     True
3    False
4     True
Name: new_ID, dtype: bool
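
One caveat is worth sketching here, under the assumption that new_ID might mix real integers with strings (the sample values below are made up for illustration): .str.isnumeric() returns NaN rather than False for elements that are not strings, so casting to str first keeps the mask strictly boolean.

```python
import pandas as pd

# Hypothetical sample: a digit string, a real integer, and a placeholder string.
new_id = pd.Series(["1", 2, "no match"], name="new_ID")

# Non-string elements come back as NaN, which is not a usable boolean mask value.
raw_mask = new_id.str.isnumeric()
print(raw_mask)

# Casting every element to str first yields a clean True/False mask.
has_id = new_id.astype(str).str.isnumeric()
print(has_id)
```

If the column is known to hold only strings, the cast is unnecessary.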

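Finally, we use where(). What does it do? It takes the has_id boolean mask that we pass as its first argument, cond, and checks whether each entry is True or False. Where it is True, the original value is kept; where it is False, the value is taken from the other argument, which in this example is the second column of the DataFrame.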

>>> df1.where(has_id, df1.iloc[:, 1], axis=0)
  new_ID old_df_2
0   1   0
1   2   0
2   3   0
3   4   4
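
The where() step above can be tried end to end with a minimal sketch like the following. The data is hypothetical (digit strings plus a "no match" placeholder, with the column names taken from the output above), and it applies where() to the new_ID column only, which is a variation on the call shown rather than the answer's exact code.

```python
import pandas as pd

# Hypothetical data: new_ID holds digit strings or a placeholder,
# old_df_2 holds the fallback value to use when there is no match.
df1 = pd.DataFrame({
    "new_ID": ["1", "2", "3", "no match"],
    "old_df_2": ["0", "0", "0", "4"],
})

# True where new_ID looks like a number, False for the placeholder.
has_id = df1["new_ID"].str.isnumeric()

# Keep new_ID where has_id is True; otherwise take the value from old_df_2.
df1["new_ID"] = df1["new_ID"].where(has_id, df1["old_df_2"])
print(df1)
```

Calling where() on the single column leaves the other columns untouched, whereas DataFrame.where() with axis=0 replaces every column in the rows where the mask is False.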

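User

Reply 2 · edited 2024-09-29 23:23:19

Welcome to Python. What you are trying to do is a simple task in pandas. Each column of a pandas DataFrame is a Series object, essentially a list of values. You are trying to find which row numbers (that is, index labels) satisfy the condition new_id == "no match was found". This can be done by pulling the column out of the DataFrame and applying a lambda function. I suggest pasting this code into a new file and experimenting to see how it works: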

import pandas as pd

# Create test data frame
df = pd.DataFrame(columns=('new_id','old_id'))
df.loc[0] = (1, None)
df.loc[1] = ("no match", 4)
df.loc[2] = (3, None)
df.loc[3] = ("no match", 4)
print("\nHere is our test dataframe:")
print(df)

print("\nShow the values of the 'new_id' that meet our criteria:")
print(df['new_id'][lambda x: x == "no match"])

# Pull the index labels of the rows that meet our criteria
indices = df['new_id'][lambda x: x == "no match"].index.tolist()
print("\nIndices:\n", indices)

print("\nShow the 'old_id' values of the rows listed in 'indices':")
# Select rows and column in a single .loc call instead of chained indexing
print(df.loc[indices, 'old_id'])

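A few notes on this code:

df.loc[] selects rows of the DataFrame by index label. df.loc[2] refers to the third row here, because a pandas DataFrame's default index starts at zero.
The lambda function here receives the whole column (a Series object) as x, and x == "no match" compares every value at once, producing a Series of True/False values. Passing the lambda inside the brackets [] makes pandas call it on the Series and use the result as a boolean mask, so only the rows marked True are returned.
Appending .index.tolist() after the lambda selection converts the resulting Series into a list of its index labels.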
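
As a rough illustration of the notes above, the sketch below shows that the lambda selection is equivalent to an explicit boolean mask, and then uses that mask to fill the unmatched new_id entries from old_id. The fill step with Series.mask() is an addition for illustration, not part of the answer's code, and the data mirrors the test frame above.

```python
import pandas as pd

# Same shape as the test frame above, built in one call.
df = pd.DataFrame({
    "new_id": [1, "no match", 3, "no match"],
    "old_id": [None, 4, None, 4],
})

# An explicit boolean mask: True where new_id is the placeholder string.
is_unmatched = df["new_id"] == "no match"

# Indexing with a callable and indexing with the mask select the same rows.
via_callable = df["new_id"][lambda s: s == "no match"]
via_mask = df["new_id"][is_unmatched]
print(via_callable.equals(via_mask))

# Index labels of the unmatched rows.
indices = via_mask.index.tolist()
print(indices)

# Replace new_id with old_id wherever the mask is True (Series.mask is the
# inverse of Series.where).
df["new_id"] = df["new_id"].mask(is_unmatched, df["old_id"])
print(df)
```

Working with the mask directly avoids the intermediate list of index labels when the goal is only to overwrite the unmatched values.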
