Filter a DataFrame based on user input in Python

Posted 2024-10-06 04:32:43


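I need to write a Python script that prompts the user to choose one of the ids 19876/20807/13978/49999. It should then take the user's input, match it against the id column, and return the corresponding rows of the DataFrame.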

For example, if the user wants to filter the data for 19876 (or any other id), they type that id, and df2 should then contain only the data for it.

Df1

    movie_ref                              year      id
0   Captain America: The First Avenger     1942      nan
0   Avengers: Age of Ultron                2015      nan
0   Avengers: Infinity War                 2017      nan
0   Avengers: Endgame                      2018      nan

Df2

      id         movie_ref                              year
0    19876       Captain America: The First Avenger     1942
0    20807       Avengers: Age of Ultron                2015
0    13978       Avengers: Infinity War                 2017
0    49999       Avengers: Endgame                      2018

Here is what I tried with Python and the pandas library:

import pandas as pd


d = {'movie_name': ['Captain America: The First Avenger', 'Avengers: Age of Ultron', 'Avengers: Infinity War', 'Avengers: Endgame'],
 'correct_id': [ 'N/A','N/A','N/A', 'N/A'],'year':[1942,2015,2017,2018]}
d1 = {'movie_ref': ['Captain America: The First Avenger','Avengers: Age of Ultron', 'Avengers: Infinity War', 'Avengers: Endgame'], 
'id': ['19876', '20807','13978','49999'],'year':[1942,2015,2017,2018]}

df1 = pd.DataFrame(data=d)
df2 = pd.DataFrame(data=d1)
print(df1)
print(df2)

filter_data = int(input('select movie writing the id: '))

filtered=(df2.loc[df2['id'] == filter_data])
print(filtered)
    

This is the output I get:

select movie writing the id: 49999
Empty DataFrame
Columns: [movie_ref, id, year]
Index: []

Expected output:

     movie_ref          id     year
0     Avengers: Endgame  49999  2018

Then I want to take id 49999 and use it to replace the nan in Df1. Final output:

  movie_ref                              year        id
0   Captain America: The First Avenger     1942      nan
0   Avengers: Age of Ultron                2015      nan
0   Avengers: Infinity War                 2017      nan
0   Avengers: Endgame                      2018      49999

 

2 Answers

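First, prompt the user:

print(f"Select one of {df['id'].values}")
selected_id = int(input())  # named selected_id to avoid shadowing the built-in id()

Then use boolean indexing to get the matching rows:

print(df[df['id'] == selected_id])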
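The two steps above can be put together in a rough sketch. It assumes the id column holds integers, as this answer does; `select_rows` is a hypothetical helper name, and a literal string stands in for the text that `input()` would return in an interactive script.

```python
import pandas as pd

# Sample frame with integer ids (an assumption made by this answer).
df = pd.DataFrame({
    'id': [19876, 20807, 13978, 49999],
    'movie_ref': ['Captain America: The First Avenger', 'Avengers: Age of Ultron',
                  'Avengers: Infinity War', 'Avengers: Endgame'],
    'year': [1942, 2015, 2017, 2018],
})


def select_rows(frame, raw_text):
    """Hypothetical helper: parse the typed text and return the matching rows."""
    try:
        wanted = int(raw_text.strip())
    except ValueError:
        raise ValueError(f"{raw_text!r} is not a number") from None
    if wanted not in frame['id'].values:
        raise ValueError(f"{wanted} is not one of {frame['id'].tolist()}")
    # Boolean indexing: keep only the rows where the mask is True.
    return frame[frame['id'] == wanted]


# In a real script the text would come from input(); a literal is used here.
result = select_rows(df, '20807')
print(result)
```

Rejecting unknown ids up front is a design choice; without it, a mistyped id simply yields an empty DataFrame.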

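Based on the edited code, the id column in df2 holds strings, but you are converting the input to int before comparing. A string never equals an int, so no row matches. You have to change it to: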

filter_data = input('select movie writing the id: ')

filtered=(df2.loc[df2['id'] == filter_data])
print(filtered)

           movie_ref     id  year
3  Avengers: Endgame  49999  2018
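To see the mismatch in isolation, here is a small sketch built from the question's df2. The variable names are only illustrative. It also shows an alternative that the answer does not use: converting the column to integers once with `astype(int)`, which assumes every id is a clean numeric string.

```python
import pandas as pd

df2 = pd.DataFrame({
    'movie_ref': ['Captain America: The First Avenger', 'Avengers: Age of Ultron',
                  'Avengers: Infinity War', 'Avengers: Endgame'],
    'id': ['19876', '20807', '13978', '49999'],  # ids stored as strings
    'year': [1942, 2015, 2017, 2018],
})

# Comparing the string column with an int matches nothing.
int_matches = df2[df2['id'] == 49999]

# Comparing string with string finds the row.
str_matches = df2[df2['id'] == '49999']

# Alternative (assumption: all ids are numeric strings): convert the column,
# after which int(input(...)) would work as in the question.
df2_int = df2.assign(id=df2['id'].astype(int))
converted_matches = df2_int[df2_int['id'] == 49999]

print(len(int_matches), len(str_matches), len(converted_matches))
```

Which side to convert is a judgment call: keeping the ids as strings preserves any leading zeros, while integers make numeric comparisons possible.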

Now, to replace the id, you can do:

df1.loc[df1['movie_name'].eq(filtered['movie_ref']), 'correct_id'] = filtered['id']

print(df1)

                           movie_name correct_id  year
0  Captain America: The First Avenger        N/A  1942
1             Avengers: Age of Ultron        N/A  2015
2              Avengers: Infinity War        N/A  2017
3                   Avengers: Endgame      49999  2018
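Note that `.eq()` between two Series aligns on the index, so the line above works here because the same movie sits at index 3 in both frames. If the row order could differ, one possible variant is to match on the title instead. This is only a sketch, not the answer's method: it assumes movie titles are unique, `id_by_title` is an illustrative name, and df1 is deliberately built in reverse order to show the point.

```python
import pandas as pd

# df1 in a different row order than df2, to show the match is by title.
df1 = pd.DataFrame({
    'movie_name': ['Avengers: Endgame', 'Avengers: Infinity War',
                   'Avengers: Age of Ultron', 'Captain America: The First Avenger'],
    'correct_id': ['N/A', 'N/A', 'N/A', 'N/A'],
    'year': [2018, 2017, 2015, 1942],
})
df2 = pd.DataFrame({
    'movie_ref': ['Captain America: The First Avenger', 'Avengers: Age of Ultron',
                  'Avengers: Infinity War', 'Avengers: Endgame'],
    'id': ['19876', '20807', '13978', '49999'],
    'year': [1942, 2015, 2017, 2018],
})

filtered = df2[df2['id'] == '49999']

# Lookup from movie title to id for the selected rows (titles assumed unique).
id_by_title = filtered.set_index('movie_ref')['id']

# Update only the df1 rows whose title appears in the lookup.
mask = df1['movie_name'].isin(id_by_title.index)
df1.loc[mask, 'correct_id'] = df1.loc[mask, 'movie_name'].map(id_by_title)

print(df1)
```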
