String splitting: doing something else when an error occurs

Posted 2024-06-25 06:05:25



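import pandas as pd

df = pd.DataFrame(['BERGEPAINT20FEB550PE', 'BANKNIFTY2020631300CE', 'BANKNIFTY2020631300PE'], columns=list('A'))
# Split on digit runs (kept by the capture group) and take the piece at index 3.
# This raises IndexError for rows whose split yields fewer than four pieces.
df['StrikePrice'] = df.A.str.split(r'(\d+)').apply(lambda x: x[3])
df['CallPut'] = df.A.str[-2:]
print(df.head())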

I want to split the strings in the DataFrame above as follows:

BERGEPAINT20FEB550PE -> BERGEPAINT, 550, PE

BANKNIFTY2020631300CE -> BANKNIFTY, 31300, CE

BANKNIFTY2020631300PE -> BANKNIFTY, 31300, PE

But I get an error.


3 answers

Maybe this is what you want:

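# Keep the name, the last 5 characters of the final digit run, and the CE/PE suffix.
s = df['A'].str.split(r'(\d+)').apply(lambda x: [x[0], x[-2][-5:], x[-1]])
# Expand each list into columns and give the columns meaningful names.
s.apply(lambda x: pd.Series(x)).rename(columns={0: 'A', 1: 'StrikePrice', 2: 'CallPut'})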

            A StrikePrice CallPut
0  BERGEPAINT         550      PE
1   BANKNIFTY       31300      CE
2   BANKNIFTY       31300      PE
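
To see why the indexing in the question fails, it can help to look at what the split returns for each string. The sketch below uses re.split from the standard library, which should behave the same way as the pandas call for this pattern. The name strike_from_parts is a hypothetical helper, and its fallback rule (take the last five digits of the final digit run) is an assumption borrowed from the answer above, not a general rule for these symbols.

```python
import re

symbols = ['BERGEPAINT20FEB550PE', 'BANKNIFTY2020631300CE']

# The capture group in the pattern keeps the digit runs in the result.
parts = [re.split(r'(\d+)', s) for s in symbols]
# parts[0] == ['BERGEPAINT', '20', 'FEB', '550', 'PE']  (5 pieces, index 3 exists)
# parts[1] == ['BANKNIFTY', '2020631300', 'CE']         (3 pieces, index 3 does not)


def strike_from_parts(pieces):
    """Hypothetical helper: try index 3 first, fall back if it is missing."""
    try:
        return pieces[3]
    except IndexError:
        # Assumed fallback: the strike is the last 5 digits of the digit run.
        return pieces[-2][-5:]


strikes = [strike_from_parts(p) for p in parts]
print(strikes)
```

The try/except keeps the original indexing for strings that split into five pieces and only switches to the fallback when index 3 is missing.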

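Try this with the given data. It splits using a regex alternation ("or"), on runs of either 5 digits or 2 digits. Note that for the first row it yields 55 rather than the expected 550, because \d{2} consumes only the first two digits of 550: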

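import pandas as pd

df = pd.DataFrame(['BERGEPAINT20FEB550PE', 'BANKNIFTY2020631300CE', 'BANKNIFTY2020631300PE'], columns=list('A'))
# Split on runs of 5 digits or 2 digits; the second-to-last piece is taken as the strike.
df['StrikePrice'] = df.A.str.split(r'(\d{5}|\d{2})').str[-2]
df['CallPut'] = df.A.str[-2:]
# Everything before the first digit run is the name.
df['Name'] = df.A.str.split(r'(\d+)').str[0]
print(df.head())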

Output:

                       A StrikePrice CallPut        Name
0   BERGEPAINT20FEB550PE          55      PE  BERGEPAINT
1  BANKNIFTY2020631300CE       31300      CE   BANKNIFTY
2  BANKNIFTY2020631300PE       31300      PE   BANKNIFTY
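
As an alternative sketch, a single pattern with Series.str.extract can pull out all three fields at once and recover 550 for the first row. The pattern is an assumption about the symbol layout based only on the three sample strings: a name in capital letters, a 5-character expiry that is either 2 digits plus 3 letters (20FEB) or 5 digits (20206), a numeric strike, and CE or PE. Symbols with a different layout would not match.

```python
import pandas as pd

df = pd.DataFrame({'A': ['BERGEPAINT20FEB550PE',
                         'BANKNIFTY2020631300CE',
                         'BANKNIFTY2020631300PE']})

# Assumed layout: NAME + 5-character expiry + STRIKE + CE/PE.
# The expiry group is non-capturing because it is not wanted in the result.
pattern = (r'^(?P<Name>[A-Z]+)'
           r'(?:\d{2}[A-Z]{3}|\d{5})'
           r'(?P<StrikePrice>\d+)'
           r'(?P<CallPut>CE|PE)$')

# str.extract returns one column per named group; rows that do not
# match the pattern get NaN instead of raising an error.
parsed = df['A'].str.extract(pattern)
print(parsed)
```

Because non-matching rows come back as NaN, they can be found afterwards with parsed['StrikePrice'].isna() and handled separately.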

Assuming the parts you don't want ("20FEB", "20206", "20206") all start with 20 and are 5 characters long, you can use:

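import pandas as pd

df = pd.DataFrame(['BERGEPAINT20FEB550PE', 'BANKNIFTY2020631300CE', 'BANKNIFTY2020631300PE'], columns=list('A'))
# Everything before the first "20" is the name.
df['Toto'] = df.A.apply(lambda x: x[:x.index("20")])
# Skip the 5-character part starting at "20" and drop the 2-character suffix.
df['StrikePrice'] = df.A.apply(lambda x: x[x.index("20")+5:-2])
df['CallPut'] = df.A.str[-2:]
print(df)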

Output:

                       A        Toto StrikePrice CallPut
0   BERGEPAINT20FEB550PE  BERGEPAINT         550      PE
1  BANKNIFTY2020631300CE   BANKNIFTY       31300      CE
2  BANKNIFTY2020631300PE   BANKNIFTY       31300      PE
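
One caveat with this approach is that str.index raises ValueError when "20" does not appear in the string. A minimal sketch of a guarded version follows, using str.find, which returns -1 instead of raising. The function name split_symbol, its parameters, and the second sample symbol are hypothetical and only serve the illustration.

```python
def split_symbol(symbol, marker="20", skip=5):
    """Hypothetical helper: return (name, strike, call_put), or None if no marker."""
    pos = symbol.find(marker)  # find returns -1 where index would raise
    if pos == -1:
        return None  # "do something else": here, signal that nothing matched
    # Name is everything before the marker; the strike follows the
    # assumed 5-character part and stops before the 2-character suffix.
    return symbol[:pos], symbol[pos + skip:-2], symbol[-2:]


found = split_symbol('BERGEPAINT20FEB550PE')
missing = split_symbol('NIFTY21JAN100CE')  # made-up symbol without "20"
print(found)
print(missing)
```

This still relies on the same assumption as the answer: the first "20" in the string marks the start of the unwanted part, so a name that itself contains "20" would be split in the wrong place.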
