Pandas: merging, summing, and excluding columns in Python

Posted 2024-09-30 20:33:26


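I'm trying to merge duplicate rows with this Python script. I join one column into a comma-separated string, sum the remaining columns, and finally use pandas to drop the duplicates, but I need to exclude some columns from the sum. For example, I don't want POLY_AREA and total_area to be summed. How can I do that?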

import arcpy
import pandas as pd

output = r'C:dummy'

fieldlist = ["FID", "total_area", "POLY_AREA", "PERCENTAGE", "C5_3", "M1_4", "M1_4_R6A", "M1_4_R6B", "M1_4_R7A", "M1_5_R10",
             "M1_5_R7_3", "M1_5_R9", "M1_6_R10", "PARK", "R6A", "R6B", "R7A"]

# Create dataframe from cursor
df = pd.DataFrame.from_records(data=arcpy.da.SearchCursor('calculations', fieldlist), columns=fieldlist)

# Create a new dataframe of FIDs and comma-separated percentages
df1 = df.groupby("FID")["PERCENTAGE"].apply(lambda x: ", ".join(x.astype(str))).reset_index()

# Create a new dataframe of sums per FID
df2 = df.groupby("FID").sum()
df2.drop("PERCENTAGE", axis=1, inplace=True)

# Merge/join them together and export as csv
df1.merge(df2, left_on="FID", right_index=True).to_csv(path_or_buf=output, index=False)

2 answers

Replace the corresponding part of your script with this:

# Create a new dataframe of FIDs and comma-separated percentages
df1 = df.groupby(["FID", "total_area", "POLY_AREA"])["PERCENTAGE"].apply(lambda x: ", ".join(x.astype(str))).reset_index()

# Create a new dataframe of sums per FID
df2 = df.groupby("FID").sum()
df2.drop(["total_area", "POLY_AREA", "PERCENTAGE"], axis=1, inplace=True)
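
As a rough sketch of how this behaves, the snippet below swaps the arcpy cursor for a small hand-made DataFrame; the values and the reduced column set are made up purely for illustration. It assumes total_area and POLY_AREA are constant within each FID. If they vary, the groupby that builds df1 yields more than one row per FID.

```python
import pandas as pd

# Hypothetical sample data standing in for the 'calculations' table.
df = pd.DataFrame({
    "FID": [1, 1, 2],
    "total_area": [100.0, 100.0, 50.0],
    "POLY_AREA": [40.0, 40.0, 50.0],
    "PERCENTAGE": [25.0, 15.0, 100.0],
    "C5_3": [1.0, 2.0, 3.0],
    "PARK": [0.5, 0.5, 2.0],
})

# Grouping on the area columns keeps them as plain key columns in df1,
# so they are carried through unchanged instead of being summed.
df1 = (
    df.groupby(["FID", "total_area", "POLY_AREA"])["PERCENTAGE"]
    .apply(lambda x: ", ".join(x.astype(str)))
    .reset_index()
)

# Sum everything per FID, then discard the columns that must not be summed.
df2 = df.groupby("FID").sum()
df2 = df2.drop(columns=["total_area", "POLY_AREA", "PERCENTAGE"])

# Join the per-FID sums back onto the comma-separated percentages.
result = df1.merge(df2, left_on="FID", right_index=True)
print(result)
```

The area columns reach the output only through df1, so each FID keeps its original total_area and POLY_AREA values.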

When creating df2, you can select a subset of the columns so that the unwanted ones are excluded. Specifically, try creating df2 like this:

# PERCENTAGE is excluded too: it is already joined into a string in df1
df2_cols = [col for col in fieldlist if col not in ['FID', 'total_area', 'POLY_AREA', 'PERCENTAGE']]
df2 = df.groupby("FID")[df2_cols].sum()
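
A minimal sketch of the column-subset approach, again using made-up sample data in place of the arcpy cursor. The `excluded` list is a name introduced here for illustration; it also leaves out PERCENTAGE, on the assumption that this column is handled separately as a comma-separated string.

```python
import pandas as pd

# Hypothetical, shortened field list and sample data for illustration only.
fieldlist = ["FID", "total_area", "POLY_AREA", "PERCENTAGE", "C5_3", "PARK"]
df = pd.DataFrame(
    [
        [1, 100.0, 40.0, 25.0, 1.0, 0.5],
        [1, 100.0, 40.0, 15.0, 2.0, 0.5],
        [2, 50.0, 50.0, 100.0, 3.0, 2.0],
    ],
    columns=fieldlist,
)

# Columns that must not be summed (FID is the grouping key itself).
excluded = ["FID", "total_area", "POLY_AREA", "PERCENTAGE"]
df2_cols = [col for col in fieldlist if col not in excluded]

# Only the selected columns are aggregated; the rest never enter df2.
df2 = df.groupby("FID")[df2_cols].sum()
print(df2)
```

Selecting the columns before calling sum() means the excluded columns are never aggregated, so no later drop is needed.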

Alternatively, you can drop the unwanted columns after creating the merged DataFrame.
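
Dropping after the merge might look roughly like the following, with the same kind of made-up sample data standing in for the real table. Note that this removes the summed total_area and POLY_AREA columns from the output entirely rather than preserving their original values.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "FID": [1, 1, 2],
    "total_area": [100.0, 100.0, 50.0],
    "POLY_AREA": [40.0, 40.0, 50.0],
    "PERCENTAGE": [25.0, 15.0, 100.0],
    "C5_3": [1.0, 2.0, 3.0],
    "PARK": [0.5, 0.5, 2.0],
})

# Comma-separated percentages per FID.
df1 = (
    df.groupby("FID")["PERCENTAGE"]
    .apply(lambda x: ", ".join(x.astype(str)))
    .reset_index()
)

# Sums per FID; the numeric PERCENTAGE sum is discarded as in the question.
df2 = df.groupby("FID").sum().drop(columns="PERCENTAGE")

# Merge first, then drop the columns whose sums are not wanted.
merged = df1.merge(df2, left_on="FID", right_index=True)
merged = merged.drop(columns=["total_area", "POLY_AREA"])
print(merged)
```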
