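I have data like the SampleDf below, and I am trying to write code that extracts the first "Avg", "Sum", or "Count" from each string and puts it in a new column, "Agg". My code below almost does this, but it has a built-in hierarchy: if Count comes before Sum in a string, it still puts Sum in the Agg column. The OutputDf below shows what I am hoping to get.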
Sample Data:
import pandas as pd
SampleDf=pd.DataFrame([['tom',"Avg(case when Value1 in ('Value2') and [DateType] in ('Value3') then LOS end)"],['bob',"isnull(Sum(case when XferToValue2 in (1) and DateType in ('Value3') and [Value1] in ('HM') then Count(LOS) end),0)"]],columns=['ReportField','OtherField'])
Sample Output:
OutputDf=pd.DataFrame([['tom',"Avg(case when Value1 in ('Value2') and [DateType] in ('Value3') then LOS end)",'Avg'],['bob',"isnull(Sum(case when XferToValue2 in (1) and DateType in ('Value3') and [Value1] in ('HM') then Count(LOS) end),0)",'Sum']],columns=['ReportField','OtherField','Agg'])
Code:
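import numpy as np
# Nested np.where checks Sum first, then Count, then Avg, so the result
# follows this fixed order rather than the order of appearance in the string.
SampleDf['Agg'] = np.where(SampleDf.OtherField.str.contains("Sum"),"Sum",
                  np.where(SampleDf.OtherField.str.contains("Count"),"Count",
                  np.where(SampleDf.OtherField.str.contains("Avg"),"Avg","Nothing")))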
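A quick and dirty attempt at this problem is to write a function that returns:
- whichever of the terms of interest, i.e. ["Avg", "Sum", "Count"], appears first in the string, if any of them is present
- or None, if none of them is present
The function can then be checked against a string that contains the terms and against a string that contains none of them.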
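The function described above can be sketched as follows. This is a minimal sketch and not necessarily the answer's original code: the name `first_agg` and the regex-based approach are assumptions chosen for illustration. It relies on `re.search` returning the leftmost match, so the order of the terms in the list does not matter. Matching is case-sensitive and works on substrings, so a word such as "Summary" would also count as "Sum".

```python
import re

# Terms of interest, taken from the question.
TERMS = ["Avg", "Sum", "Count"]

# One alternation pattern. re.search returns the leftmost match, so the
# term that appears first in the string wins, whatever the list order.
_TERM_RE = re.compile("|".join(map(re.escape, TERMS)))


def first_agg(text):
    """Return whichever of TERMS occurs first in text, or None if none occurs.

    Hypothetical helper name, used here for illustration only.
    """
    match = _TERM_RE.search(text)
    return match.group(0) if match else None


# Check with terms in the string: Sum appears before Count here.
with_terms = (
    "isnull(Sum(case when XferToValue2 in (1) and DateType in ('Value3') "
    "and [Value1] in ('HM') then Count(LOS) end),0)"
)
print(first_agg(with_terms))  # Sum

# Check with no term in the string.
print(first_agg("max(LOS)"))  # None
```

To fill the new column, something like `SampleDf['Agg'] = SampleDf['OtherField'].apply(first_agg)` should work. A vectorized alternative would be `SampleDf['OtherField'].str.extract('(Avg|Sum|Count)', expand=False)`, which yields NaN instead of None when nothing matches.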