I sorted my CSV file so that I could run some calculations on it. I'm using Python 2.7.
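import pandas as pd

# Load the raw data, then sort so rows for the same client/domain/site/country are adjacent.
df = pd.read_csv('Cliente_x_Pais_Sitio.csv', sep=',')
df1 = df.sort_values(by=['Cliente', 'Auth_domain', 'Sitio', 'Country'])
# index=False keeps the row index out of the file, matching the columns shown below.
df1.to_csv('test.csv', index=False)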
CSV data (test.csv):
Cliente,Fecha,Auth_domain,Sitio,Country,ECPM_medio
FF,15/12/2017,@ff,ff_Color,Afganistán,0.53
FF,15/01/2018,@ff,ff_Color,Afganistán,0.5
FF,15/01/2017,@ff,ff_Color,Alemania,0.34
FF,15/12/2017,@ff,ff_Color,Alemania,0.38
FF,15/01/2018,@ff,ff_Color,Alemania,0.37
What I need is:
if (15/12/2017 ECPM) ≤ (15/01/2018 ECPM):
    if ((15/12/2017 ECPM)*0.8) ≥ (15/01/2017 ECPM):
        r = (15/01/2017 ECPM)
    else:
        r = ((15/12/2017 ECPM)*0.8)
else:
    if (15/01/2018 ECPM) ≥ (15/01/2017 ECPM):
        r = (15/01/2017 ECPM)
    else:
        r = (15/01/2018 ECPM)
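The rule above can be written as a small Python function. This is only a sketch of the pseudocode as stated; the function name `recommend_ecpm` and its parameter names are made up for illustration and do not come from the original post.

```python
def recommend_ecpm(jan_2017, dec_2017, jan_2018):
    """Return r for one group, following the nested if/else rule literally."""
    if dec_2017 <= jan_2018:
        discounted = dec_2017 * 0.8
        # Keep the January 2017 value unless it exceeds the discounted December value.
        return jan_2017 if discounted >= jan_2017 else discounted
    # December 2017 is higher than January 2018.
    return jan_2017 if jan_2018 >= jan_2017 else jan_2018


# Alemania values from test.csv: 15/01/2017 = 0.34, 15/12/2017 = 0.38, 15/01/2018 = 0.37
print(recommend_ecpm(0.34, 0.38, 0.37))
```

For the Alemania rows this takes the else branch (0.38 is not ≤ 0.37) and returns the 15/01/2017 value, 0.34.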
Filling in the real data, the first two rows give:
if 0.53 ≤ 0.5:
if 0.5 ≥ 0: #if we don't have the cell value I would like to add a 0 True
r = 0.5
Keep in mind that I have more than 10,000 rows, so I need this applied across the whole table.
The new CSV should show:
Cliente,Auth_domain,Sitio,Country,Recomendation_ECPM
FF,@ff,ff_Color,Afganistán,0.5
FF,@ff,ff_Color,Alemania,0.34
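One possible way to get from test.csv to this output is to pivot the three dates into columns and apply the rule once per group. The sketch below is an assumption-laden illustration, not the accepted answer: the names `KEYS`, `DATES`, `wide` and `recommend` are hypothetical, the three dates are hard-coded, and it is written against a recent pandas on Python 3 rather than Python 2.7. It also leaves a missing date as NaN instead of 0. Any comparison with NaN is False, which is what yields 0.5 for Afganistán; filling the gap with 0 and following the rule literally would give 0 instead.

```python
import io

import pandas as pd

# Sample rows copied from test.csv above.
CSV_TEXT = u"""Cliente,Fecha,Auth_domain,Sitio,Country,ECPM_medio
FF,15/12/2017,@ff,ff_Color,Afganistán,0.53
FF,15/01/2018,@ff,ff_Color,Afganistán,0.5
FF,15/01/2017,@ff,ff_Color,Alemania,0.34
FF,15/12/2017,@ff,ff_Color,Alemania,0.38
FF,15/01/2018,@ff,ff_Color,Alemania,0.37
"""

KEYS = ['Cliente', 'Auth_domain', 'Sitio', 'Country']   # grouping columns
DATES = ['15/01/2017', '15/12/2017', '15/01/2018']      # assumed fixed dates


def recommend(row):
    """Apply the nested rule to one pivoted row (one column per date)."""
    jan17, dec17, jan18 = row[DATES[0]], row[DATES[1]], row[DATES[2]]
    if dec17 <= jan18:
        discounted = dec17 * 0.8
        return jan17 if discounted >= jan17 else discounted
    return jan17 if jan18 >= jan17 else jan18


df = pd.read_csv(io.StringIO(CSV_TEXT))
# One row per group, one column per date; dates absent for a group become NaN.
wide = df.pivot_table(index=KEYS, columns='Fecha', values='ECPM_medio')
wide = wide.reindex(columns=DATES)  # guarantee all three date columns exist
result = wide.apply(recommend, axis=1).rename('Recomendation_ECPM').reset_index()
print(result.to_csv(index=False))
```

On the sample rows this produces one line per Cliente/Auth_domain/Sitio/Country group, with 0.5 for Afganistán and 0.34 for Alemania. To write the file, pass a path to `to_csv` instead of printing.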
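I'm not sure I have the return-value logic in setval or compare_val right, but either way this pipeline uses sorting, grouping, and transform. Because the edge rows end up being compared against nan (first via shift(-1), and last via shift(1)), we have to drop them at the end.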
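The answer's own code (including setval and compare_val) is not reproduced in this text, so the following is not that pipeline. It is only a minimal sketch of why a grouped shift leaves NaN at the edges of each group and how those rows can be dropped afterwards. The column names `prev` and `next` are invented for the example, and only the Country column is used as the grouping key to keep it short.

```python
import pandas as pd

# ECPM values in the order they appear in test.csv.
df = pd.DataFrame({
    'Country': [u'Afganistán', u'Afganistán', u'Alemania', u'Alemania', u'Alemania'],
    'ECPM_medio': [0.53, 0.5, 0.34, 0.38, 0.37],
})

grouped = df.groupby('Country')['ECPM_medio']
df['prev'] = grouped.shift(1)    # NaN on the first row of each group
df['next'] = grouped.shift(-1)   # NaN on the last row of each group

# Rows at a group edge have no neighbour on one side, so drop them.
complete = df.dropna(subset=['prev', 'next'])
print(complete)
```

Only rows with a neighbour on both sides survive the `dropna`, which is why a group with just two rows (Afganistán here) disappears entirely under this approach.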