How do I avoid "NaN" in a pandas DataFrame column?

df_new = {'X': [], 'Y': [], 'Property of X (units)': [], 'Property of Y (units)': []}  # setup dict
df_new = pd.DataFrame.from_dict(df_new)  # dict to df

for df_choice.column in df_choice.columns:  # list columns in previous dataset
    print(df_choice.column)
    # ask if column in previous dataset falls under X or Y
    state = input('Is this sample considered X or Y? Input X or Y, or quit to exit the loop')
    if state == 'quit':
        break
    elif state == 'x':
        df_new['Property of X (units)'] = df_choice[df_choice.column]  # takes data from old dataframe into new
        df_new['X'] = 'df_choice.column'  # fills column X with column name from df_choice
    elif state == 'y':
        df_new['Y'] = 'df_choice.column'
        df_new['Property of Y (units)'] = df_choice[df_choice.column]
    else:
        print('Not a valid response')

df_new  # prints new df

+-----+------------+----------------+----------------+
| X   | Y          | Property of X  | Property of Y  |
+-----+------------+----------------+----------------+
| NaN | Sample123  | 4              | 3              |
| NaN | Sample123  | 5              | 4              |
| NaN | Sample123  | 3              | 6              |
| NaN | Sample123  | 4              | 1              |
+-----+------------+----------------+----------------+

+-----------+------------+----------------+----------------+
| X         | Y          | Property of X  | Property of Y  |
+-----------+------------+----------------+----------------+
| SampleABC | Sample123  | 4              | 3              |
| SampleABC | Sample123  | 5              | 4              |
| SampleABC | Sample123  | 3              | 6              |
| SampleABC | Sample123  | 4              | 1              |
+-----------+------------+----------------+----------------+

+-----------+-------+-----------+
| Sample    | Type  | Property  |
+-----------+-------+-----------+
| SampleABC | X     | 4         |
| SampleABC | X     | 5         |
| SampleABC | X     | 3         |
| SampleABC | X     | 4         |
| Sample123 | Y     | 3         |
| Sample123 | Y     | 4         |
| Sample123 | Y     | 6         |
| Sample123 | Y     | 1         |
+-----------+-------+-----------+

1 Answer

User

#1 · Posted 2024-09-25 16:21:54

When you insert into an empty DataFrame, pass the column name as a vector rather than as a single string.

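Think of it this way: by assigning a single string to an empty DataFrame, you are creating a column in a DataFrame that has no rows. How would pandas know how long that column should be? A scalar is broadcast across the rows that already exist, and at that point there are none, so nothing is filled in.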
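To make the point above concrete, here is a minimal sketch. The sample values and the variable names `rows_after_scalar` and `x_is_all_nan` are assumptions made up for illustration, not part of the question.

```python
import pandas as pd

# Empty frame with its columns declared up front, as in the question.
df_new = pd.DataFrame({"X": [], "Property of X (units)": []})

# A scalar is broadcast over the existing rows; there are none yet,
# so this creates a zero-length column.
df_new["X"] = "SampleABC"
rows_after_scalar = len(df_new)

# Assigning a list-like to a frame with no rows creates the rows.
# The earlier scalar assignment is not re-applied, so "X" is filled with NaN.
df_new["Property of X (units)"] = pd.Series([4, 5, 3, 4])
x_is_all_nan = bool(df_new["X"].isna().all())

print(rows_after_scalar, len(df_new), x_is_all_nan)
```

This suggests why the order of the two assignments matters with a scalar, and why a vector of the right length avoids the problem.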

The variable in the for loop also has a confusing name: the dot in "df_choice.column" makes it look like you are accessing the DataFrame.
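As a small illustration of why that name is risky, the sketch below uses made-up data (the `df_choice` contents are an assumption). A dotted loop target is an attribute assignment, so the loop stores each column name on the DataFrame object itself.

```python
import pandas as pd

# Hypothetical stand-in for the question's df_choice.
df_choice = pd.DataFrame({"SampleABC": [4, 5], "Sample123": [3, 4]})

# Each iteration assigns the column name to the attribute df_choice.column.
for df_choice.column in df_choice.columns:
    pass

# The attribute outlives the loop and holds the last column name.
leftover = df_choice.column
# The actual columns are untouched in this case.
columns_after = list(df_choice.columns)

print(leftover, columns_after)
```

A plain name such as `colname` keeps the loop variable separate from the DataFrame.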

Putting it all together:

for colname in df_choice.columns:

    #...#
    elif state == 'x':
        # takes data from old dataframe into new
        df_new['Property of X (units)'] = df_choice[colname]

        # fills column X with column name from df_choice
        df_new['X'] = np.repeat(colname, df_choice.shape[0])

    elif state == 'y':
        df_new['Y'] = np.repeat(colname, df_choice.shape[0])
        df_new['Property of Y (units)'] = df_choice[colname]

Note that I also replaced the lines for the "Y" variable, in case it comes before "X".

To use np.repeat, import the library:

import numpy as np
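For reference, here is a self-contained sketch of the whole approach that runs without user input. The contents of `df_choice` and the `choices` mapping (which stands in for the answers typed at the `input()` prompt) are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's df_choice.
df_choice = pd.DataFrame({"SampleABC": [4, 5, 3, 4], "Sample123": [3, 4, 6, 1]})

# Hypothetical stand-in for the interactive input() answers.
choices = {"SampleABC": "x", "Sample123": "y"}

df_new = pd.DataFrame(
    {"X": [], "Y": [], "Property of X (units)": [], "Property of Y (units)": []}
)

for colname in df_choice.columns:
    state = choices[colname]
    if state == "x":
        # Copy the data, then fill X with one copy of the name per row.
        df_new["Property of X (units)"] = df_choice[colname]
        df_new["X"] = np.repeat(colname, df_choice.shape[0])
    elif state == "y":
        df_new["Y"] = np.repeat(colname, df_choice.shape[0])
        df_new["Property of Y (units)"] = df_choice[colname]

print(df_new)
```

Because `np.repeat` produces an array with one entry per row of `df_choice`, the name column has a definite length regardless of which assignment runs first.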
