Get a subset of columns based on the values in a row of the table

userId  movie1  movie2  movie3  movie4  movie5  movie6
0          4.1     NaN     1.0     NaN     2.1     NaN
1          3.1     1.1     3.4     1.4     NaN     NaN
2          2.8     NaN     1.7     NaN     3.0     NaN
3          NaN     5.0     NaN     2.3     NaN     2.1
4          NaN     NaN     NaN     NaN     NaN     NaN
5          2.3     NaN     2.0     4.0     NaN     NaN

userId  movie1  movie3  movie5
0          4.1     1.0     2.1
1          3.1     3.4     NaN
2          2.8     1.7     3.0
3          NaN     NaN     NaN
4          NaN     NaN     NaN
5          2.3     2.0     NaN

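3 answers

User

#1 · edited 2024-05-05 03:33:39

You can use the .loc accessor to select the row for a specific userId, then call notna on it to build a boolean mask of the columns that are not NaN in that row, and finally use that mask to filter the columns: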

userId = 0 # specify the userid here
df_user = df.loc[:, df.loc[userId].notna()]

Details:

>>> df.loc[userId].notna()

movie1     True
movie2    False
movie3     True
movie4    False
movie5     True
movie6    False
Name: 0, dtype: bool

>>> df.loc[:, df.loc[userId].notna()]

        movie1  movie3  movie5
userId                        
0          4.1     1.0     2.1
1          3.1     3.4     NaN
2          2.8     1.7     3.0
3          NaN     NaN     NaN
4          NaN     NaN     NaN
5          2.3     2.0     NaN
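
To try the mask approach end to end, here is a minimal sketch that rebuilds the table from the question and wraps the selection in a small function. It assumes userId is the index of the DataFrame; the helper name columns_rated_by is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Ratings table from the question, with userId as the index (an assumption).
df = pd.DataFrame(
    {
        "movie1": [4.1, 3.1, 2.8, nan, nan, 2.3],
        "movie2": [nan, 1.1, nan, 5.0, nan, nan],
        "movie3": [1.0, 3.4, 1.7, nan, nan, 2.0],
        "movie4": [nan, 1.4, nan, 2.3, nan, 4.0],
        "movie5": [2.1, nan, 3.0, nan, nan, nan],
        "movie6": [nan, nan, nan, 2.1, nan, nan],
    },
    index=pd.Index(range(6), name="userId"),
)


def columns_rated_by(frame, user_id):
    """Hypothetical helper: keep the columns that are not NaN for user_id."""
    mask = frame.loc[user_id].notna()  # boolean Series indexed by column name
    return frame.loc[:, mask]          # all rows, only the masked columns


df_user = columns_rated_by(df, 0)
print(df_user)
```

For a user with no ratings at all (userId 4 in this table) the mask is all False, so the result keeps every row but has no columns.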

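User

#2 · edited 2024-05-05 03:33:39

Declare the userId you are interested in and use loc to put its row into a new df, keeping only the relevant columns (those that are not NaN).

Then use pd.concat to join that new df with the rows of the other users, restricted to the selected user's columns (movies):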

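import pandas as pd

user = 0  # set your userId

# row of the chosen user, with its NaN columns dropped
a = df.loc[[user]].dropna(axis=1)
# append the remaining users, restricted to the same columns
b = pd.concat([a, df.drop(a.index)[a.columns]])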

which prints:

b
        movie1  movie3  movie5
userId                        
0         4.10    1.00    2.10
1         3.10    3.40     NaN
2         2.80    1.70    3.00
3          NaN     NaN     NaN
4          NaN     NaN     NaN
5         2.30    2.00     NaN

Note that I have set the index to userId, as you specified.
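
One thing worth checking with the concat approach is row order: the selected user's row is placed first, so for any user other than the first one the rows come back reordered. The sketch below rebuilds the question's table (again assuming userId is the index) and compares the concat result with plain column selection, df[a.columns], which keeps the original order.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Ratings table from the question, with userId as the index (an assumption).
df = pd.DataFrame(
    {
        "movie1": [4.1, 3.1, 2.8, nan, nan, 2.3],
        "movie2": [nan, 1.1, nan, 5.0, nan, nan],
        "movie3": [1.0, 3.4, 1.7, nan, nan, 2.0],
        "movie4": [nan, 1.4, nan, 2.3, nan, 4.0],
        "movie5": [2.1, nan, 3.0, nan, nan, nan],
        "movie6": [nan, nan, nan, 2.1, nan, nan],
    },
    index=pd.Index(range(6), name="userId"),
)

user = 2  # a user who is not in the first row

a = df.loc[[user]].dropna(axis=1)
b = pd.concat([a, df.drop(a.index)[a.columns]])  # selected user comes first

# Selecting the same columns directly keeps the original row order.
c = df[a.columns]

print(b.index.tolist())  # [2, 0, 1, 3, 4, 5]
print(c.index.tolist())  # [0, 1, 2, 3, 4, 5]
```

If the original order matters, b.sort_index() restores it, or the direct selection can be used instead.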

User

#3 · edited 2024-05-05 03:33:39

Another approach:

import pandas as pd

user0 = df.iloc[0, :]        # select the first row (by position)
flags = user0.notna()        # flag the non-NaN values
flags = flags.tolist()       # convert the Series to a plain list
newdf = df.iloc[:, flags]    # all rows, and the columns where flags is True
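
Note that iloc selects by position, not by userId label, so iloc[0] is simply the first row. The following sketch uses a small made-up table whose userId labels start at 10 (hypothetical data, not from the question) to show the positional and the label-based selection side by side. It also uses to_numpy() rather than tolist(); both give iloc a plain boolean sequence.

```python
import numpy as np
import pandas as pd

# Hypothetical table whose userId labels do not start at 0.
df = pd.DataFrame(
    {
        "movie1": [4.1, np.nan],
        "movie2": [np.nan, 1.1],
        "movie3": [1.0, 3.4],
    },
    index=pd.Index([10, 11], name="userId"),
)

# By position: the first row, whatever its label is.
flags = df.iloc[0].notna().to_numpy()  # boolean NumPy array
by_position = df.iloc[:, flags]

# By label: the row whose userId is 10 (the same row in this table).
by_label = df.loc[:, df.loc[10].notna()]

print(by_position)
print(by_label)
```

Here both selections keep movie1 and movie3; they would differ only if the row at position 0 were not the user you meant to pick.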
