How to handle NaN values between columns in a Python script

Published 2024-07-03 02:37:12


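I am working on a script that lists the months between two columns. The script works fine, but it raises an error whenever either field is empty.

If a row contains NaN values, the script should skip it and move on to the next row.

How can I fix this error?

Input data: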

Month1    Month2     Month_list
Mar2020   Dec2020
Nov2020   Jan2021
NaN       NaN
Sep2020   Feb2021
Oct2020   Dec2020
NaN       NaN
Dec2020   Mar2021

Expected output:

Month1    Month2     Month_list

Mar2020   Sep2020    Mar2020,Apr2020,May2020,Jun2020,Jul2020,Aug2020,Sep2020
Nov2020   Jan2021    Nov2020,Dec2020,Jan2021
NaN       NaN        NaN
Sep2020   Feb2021    Sep2020,Oct2020,Nov2020,Dec2020,Jan2021,Feb2021
Oct2020   Dec2020    Oct2020,Nov2020,Dec2020
NaN       NaN        NaN
Dec2020   Mar2021    Dec2020,Jan2021,Feb2021,Mar2021

Code:

def get_date_list(x):
    return ",".join(
        item.strftime("%b %Y")
        for item in pd.date_range(x['Month1'], x['Month2'], freq="MS")
    )
    
df['Month_list'] = df.apply(lambda x: get_date_list(x), axis=1)

Error: ValueError: Neither `start` nor `end` can be NaT


3 Answers

You need to exclude the rows that contain NaN values. One way to do this:

pd.concat([df, df.dropna().apply(lambda x: get_date_list(x), axis=1).to_frame('Months_List')], axis=1)

Output:

Out[169]: 
    Month1      Month2                                        Months_List
0  Mar2020     Dec2020  Mar 2020,Apr 2020,May 2020,Jun 2020,Jul 2020,A...
1  Nov2020     Jan2021                         Nov 2020,Dec 2020,Jan 2021
2      NaN         NaN                                                NaN
3  Sep2020     Feb2021  Sep 2020,Oct 2020,Nov 2020,Dec 2020,Jan 2021,F...
4  Oct2020     Dec2020                         Oct 2020,Nov 2020,Dec 2020
5      NaN         NaN                                                NaN
6  Dec2020     Mar2021                Dec 2020,Jan 2021,Feb 2021,Mar 2021
7  Dec2020     Mar2021                Dec 2020,Jan 2021,Feb 2021,Mar 2021
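
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It relies on the fact that dropna() keeps the original index labels, so a plain column assignment aligns on the index and leaves NaN in the skipped rows without needing pd.concat. The sample frame is rebuilt from the question's input, and parsing with format="%b%Y" is my assumption about how the "Mar2020" strings should be read.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; NaN marks the empty rows.
df = pd.DataFrame({
    "Month1": ["Mar2020", "Nov2020", np.nan, "Sep2020", "Oct2020", np.nan, "Dec2020"],
    "Month2": ["Dec2020", "Jan2021", np.nan, "Feb2021", "Dec2020", np.nan, "Mar2021"],
})

def get_date_list(row):
    # Assumption: values look like "Mar2020", so parse them with an explicit format.
    start = pd.to_datetime(row["Month1"], format="%b%Y")
    end = pd.to_datetime(row["Month2"], format="%b%Y")
    return ",".join(d.strftime("%b %Y") for d in pd.date_range(start, end, freq="MS"))

# dropna() keeps the original index labels (0, 1, 3, 4, 6), so the assignment
# aligns on the index and the skipped rows (2 and 5) end up as NaN.
df["Month_list"] = df.dropna(subset=["Month1", "Month2"]).apply(get_date_list, axis=1)
print(df)
```

Passing subset= limits the NaN check to the two month columns, which matters if the frame has other columns that may legitimately be empty.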

Try:

df["Month_list"] = df.loc[
    df[["Month1", "Month2"]].notna().all(axis=1), ["Month1", "Month2"]
].apply(lambda x: get_date_list(x), axis=1)
print(df)

Prints:

    Month1   Month2                                         Month_list
0  Mar2020  Dec2020  Mar 2020,Apr 2020,May 2020,Jun 2020,Jul 2020,A...
1  Nov2020  Jan2021                         Nov 2020,Dec 2020,Jan 2021
2      NaN      NaN                                                NaN
3  Sep2020  Feb2021  Sep 2020,Oct 2020,Nov 2020,Dec 2020,Jan 2021,F...
4  Oct2020  Dec2020                         Oct 2020,Nov 2020,Dec 2020
5      NaN      NaN                                                NaN
6  Dec2020  Mar2021                Dec 2020,Jan 2021,Feb 2021,Mar 2021
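
To see what the mask in this answer does, the small sketch below builds it on its own. The data is hypothetical (it is not from the question): row 1 is missing only Month2, which shows why .all(axis=1) is used, since a row is kept only when both endpoints are present. The explicit format="%b%Y" parsing is also my assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 is missing only Month2, row 2 is missing both.
df = pd.DataFrame({
    "Month1": ["Mar2020", "Nov2020", np.nan],
    "Month2": ["May2020", np.nan, np.nan],
})

# True only where both endpoints are present.
complete = df[["Month1", "Month2"]].notna().all(axis=1)
print(complete.tolist())  # [True, False, False]

def get_date_list(row):
    # Assumption: values look like "Mar2020", so parse them with an explicit format.
    start = pd.to_datetime(row["Month1"], format="%b%Y")
    end = pd.to_datetime(row["Month2"], format="%b%Y")
    return ",".join(d.strftime("%b %Y") for d in pd.date_range(start, end, freq="MS"))

# Only the complete rows reach pd.date_range; the rest stay NaN after alignment.
df["Month_list"] = df.loc[complete].apply(get_date_list, axis=1)
print(df)
```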

IIUC, your function needs an if/else block:

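import numpy as np
import pandas as pd

def get_date_list(x):
    # Return NaN when either endpoint is missing, so pd.date_range never receives NaT
    if pd.isna(x['Month1']) or pd.isna(x['Month2']):
        return np.nan
    return ",".join(
        item.strftime("%b %Y")
        for item in pd.date_range(x['Month1'], x['Month2'], freq="MS")
    )

df['Month_list'] = df.apply(get_date_list, axis=1)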

print(df)

    Month1   Month2                                         Month_list
0  Mar2020  Dec2020  Mar 2020,Apr 2020,May 2020,Jun 2020,Jul 2020,A...
1  Nov2020  Jan2021                         Nov 2020,Dec 2020,Jan 2021
2      NaN      NaN                                                NaN
3  Sep2020  Feb2021  Sep 2020,Oct 2020,Nov 2020,Dec 2020,Jan 2021,F...
4  Oct2020  Dec2020                         Oct 2020,Nov 2020,Dec 2020
5      NaN      NaN                                                NaN
6  Dec2020  Mar2021                Dec 2020,Jan 2021,Feb 2021,Mar 2021
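
Note that the outputs above contain a space ("Mar 2020"), while the expected output in the question has none ("Mar2020"). A possible variation, offered as a sketch rather than the asker's actual code, is a helper that takes the two values directly and uses the same "%b%Y" pattern for both parsing and formatting. The name month_list and its fmt parameter are made up for this illustration, and the three-row frame is a shortened sample.

```python
import numpy as np
import pandas as pd

def month_list(start, end, fmt="%b%Y"):
    """Comma-separated months from start to end, or NaN if either value is missing."""
    if pd.isna(start) or pd.isna(end):
        return np.nan
    # Assumption: both values follow fmt, e.g. "Mar2020".
    first = pd.to_datetime(start, format=fmt)
    last = pd.to_datetime(end, format=fmt)
    return ",".join(d.strftime(fmt) for d in pd.date_range(first, last, freq="MS"))

# Shortened sample based on the question's rows.
df = pd.DataFrame({
    "Month1": ["Oct2020", np.nan, "Dec2020"],
    "Month2": ["Dec2020", np.nan, "Mar2021"],
})

# Iterating over the two columns with zip avoids a row-wise apply.
df["Month_list"] = [month_list(s, e) for s, e in zip(df["Month1"], df["Month2"])]
print(df)
```

Because the guard lives inside the helper, no rows have to be filtered out beforehand, and the result list always has the same length as the frame.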
