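Here is what I am trying to do with my code: I have an existing CSV file containing tennis players' names, and I want to add new players to it as they appear in the rankings. My script goes through the rankings and builds an array, then imports the names from the CSV file. It should work out which names are not in the latter and then fetch the online data for those names. After that, I just want to append the new rows to the end of the old CSV file. My problem is that the new rows are indexed by the player's name instead of following the old file's index. Any idea why this happens? Also, why is an unnamed column being added?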
def get_all_players():
    # imports names of players currently in the atp rankings
    current_atp_ranking = check_atp_rankings()
    current_player_list = current_atp_ranking['Player']
    # clean up names in case of white spaces
    for i in range(0, len(current_player_list)):
        current_player_list[i] = current_player_list[i].strip()
    # reads the main file and makes a dataframe out of it
    current_file = 'ATP_stats_new.csv'
    df = pd.read_csv(current_file)
    # gets all the names within the main file to see which current ones aren't there
    names_on_file = list(df['Player'])
    # cleans up in case of any white spaces
    for i in range(0, len(names_on_file)):
        names_on_file[i] = names_on_file[i].strip()
    # Removing Nadal for testing purposes
    names_on_file.remove("Rafael Nadal")
    # creating a list of players in current_players_list but not in names_on_file
    new_player_list = [x for x in current_player_list if x not in names_on_file]
    # loop through new_player_list
    for player in new_player_list:
        # delay to avoid stopping
        time.sleep(2)
        # finding the player's atp link for profile based on their name
        atp_link = current_atp_ranking.loc[current_atp_ranking['Player'] == player, 'ATP_Link']
        atp_link = atp_link.iloc[0]
        # make a basic dictionary with just the player's name and link
        player_dict = [{'Name': player, 'ATP_Link': atp_link}]
        # enter the new dictionary into the existing main file
        df.append(player_dict, ignore_index=True)
    # print dataframe to see how it looks before exporting
    print(df)
    # export dataframe into current file
    df.to_csv(current_file)
This is what the file looked like originally:
Unnamed: 0 Player ... Coach Turned_Pro
0 0 Novak Djokovic ... NaN NaN
1 1 Rafael Nadal ... Carlos Moya, Francisco Roig 2001.0
2 2 Roger Federer ... Ivan Ljubicic, Severin Luthi 1998.0
3 3 Daniil Medvedev ... NaN NaN
4 4 Dominic Thiem ... NaN NaN
... ... ... ... ... ...
1976 1976 Brian Bencic ... NaN NaN
1977 1977 Boruch Skierkier ... NaN NaN
1978 1978 Majed Kilani ... NaN NaN
1979 1979 Quentin Gueydan ... NaN NaN
1980 1980 Preston Brown ... NaN NaN
And this is what the new rows look like:
1977 1977.0 ... NaN
1978 1978.0 ... NaN
1979 1979.0 ... NaN
1980 1980.0 ... NaN
Rafael Nadal NaN ... 2001
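Some key parts of your code are missing, and they would be needed to answer the question precisely. Based on what you posted, here are two thoughts: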
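Importing your CSV file
Your earlier CSV file was probably saved together with its index. Make sure the file does not carry the dataframe index as its first column. When saving, pass index=False to df.to_csv so the index is not written out.
When you then load the file with pd.read_csv, index numbers are assigned automatically and there is no duplicate column.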
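A minimal sketch of that round trip may make the effect easier to see. It uses in-memory buffers in place of ATP_stats_new.csv and a reduced two-column frame; both are assumptions made for illustration, not the asker's actual data layout.

```python
import io

import pandas as pd

# Reduced stand-in for the real file (illustrative subset of its columns).
df = pd.DataFrame({"Player": ["Rafael Nadal", "Roger Federer"],
                   "Turned_Pro": [2001, 1998]})

# By default, to_csv writes the index as an extra, unnamed first column.
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
reloaded_bad = pd.read_csv(with_index)

# With index=False the index is left out of the file.
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
reloaded_ok = pd.read_csv(without_index)

print(list(reloaded_bad.columns))  # includes 'Unnamed: 0'
print(list(reloaded_ok.columns))   # only the real columns
```

If an existing file already contains the extra column, reading it once with `pd.read_csv(current_file, index_col=0)` and saving it again with `index=False` should clean it up.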
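Wrong column order
I'm not sure what information atp_link receives, or in what order. From what you posted, it looks like it returns two columns: 'Coach' and 'Turned_Pro'. I suggest that, after pulling the information from atp_link, you build a list (rather than a dict) for each new player you want to add. So if you are adding Nadal, you would build a list holding his values, one per column and in the same order as the dataframe's columns, and then append that list to the dataframe as a new row.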
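One possible way to do this is sketched below. The three-column frame is a reduced stand-in for the real file, and assigning through `df.loc[len(df)]` is a substitute chosen here rather than something taken from the original answer: `DataFrame.append` returned a new frame instead of modifying `df` in place, and it was removed in pandas 2.0. The sketch also assumes the frame still has its default 0..n-1 index.

```python
import pandas as pd

# Reduced stand-in for the real file: only three of its columns.
df = pd.DataFrame({
    "Player": ["Roger Federer"],
    "Coach": ["Ivan Ljubicic, Severin Luthi"],
    "Turned_Pro": [1998.0],
})

# One value per column, in the same order as df.columns.
nadal_info = ["Rafael Nadal", "Carlos Moya, Francisco Roig", 2001.0]

# With the default integer index, len(df) is the next free label,
# so the new row continues the existing numbering.
df.loc[len(df)] = nadal_info

print(df)
```

If you prefer to keep using a dict, its keys need to match the existing column names. The dict in the question uses 'Name' while the file's column is 'Player', so the name would land in a new column instead of the existing one.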
Hope this helps.