突出显示基于多个条件的值

2024-09-29 19:27:49 发布

您现在位置:Python中文网/ 问答频道 /正文

我想对等于或超过指导值的值(单元格)进行着色处理(当前在字典中,键(化学符号):值(指导限值)。我希望它能够处理多组指导原则,并根据超出的最高值指导原则对单元格进行着色。下表显示了所需的输出,指导原则指定了颜色,每个指导原则的超出值指定了相同的颜色

enter image description here

这显示了三种不同指南(ANZECC、WHO和美国环境保护局)对铜、镉、SO4和锌的限值。结果如下(现场1-3),并根据其突出显示(如有)它们超过了指导原则。因此,对于第一个参数,Cu,场地2仅超过美国EPA值,因此为黄色阴影。场地1所有指导原则中最高的是WHO值,因此为蓝色阴影。场地3也超过了所有指导原则,最大值为WHO,因此为蓝色阴影。场地1的Cd超过美国EPA和ANZECC(两者相同)同样是绿色阴影(但可能是耶洛斯阴影,因为美国环境保护局的限制是相同的)。场地2没有超过任何指导方针,因此没有阴影。场地3超过所有指导方针,Cd的最高值是WHO,蓝色阴影也是。以此类推

我可以使用下面的代码根据一条准则对超标进行着色

#example dataframe
df=pd.DataFrame(({'Cu':[0.004, 0.0017, 0.1],  'Cd': ['0.001','0.0005',1], 'SO4': [700,"450",1500],'Zn': ['0.15','0.1','0.25'],}))
cols=df.columns

给予

Out[7]:
      Cd      Cu   SO4    Zn
0   0.001  0.0040   700  0.15
1  0.0005  0.0017   450   0.1
2       1  0.1000  1500  0.25

#guideline values
WHO=  {'Cd' :0.002 ,'Cu' :0.003 ,'SO4':"NaN",'Zn' :0.1}
ANZECC=  {'Cd' :0.001 ,'Cu' :0.002 ,'SO4':1000.0,'Zn' :0.2}
US_EPA=  {'Cd' :0.001 ,'Cu' :0.0015 ,'SO4':500.0,'Zn' :0.01}

#highlight the relevant cells that are greater than the guideline
def fill_exceedances(x):
    color = 'orange'
    #get columns which are in keys of dict
    c = x.columns.intersection(ANZECC.keys())
    #filter columns and rename by dict
    df2 = x[c].rename(columns=ANZECC)
    #create boolean mask only for matched columns and compare
    mask = df2.astype(float).values < df2.columns[None,:].values
    #new DataFrame filled by no color
    df1 = pd.DataFrame('', index=x.index, columns=c)
    #set color by mask and add missing non matched columns names by reindex
    df1 = (df1.where(mask, 'background-color: {}'.format(color))
              .reindex(columns=x.columns, fill_value=''))

    return df1

df.style.apply(fill_exceedances, axis=None).to_excel('styled.xlsx', engine='openpyxl')

这为澳新银行ECC指南提供了:

enter image description here

我不知道如何应用连续的样式,因为如果我重复这个过程,以前的样式就会丢失。我觉得我用了错误的方法来处理这个问题,从最低限度开始的连续样式将不起作用

更新:我已经把指南放在一本字典里,可以根据使用中的指南对它们进行排序并指定颜色,但我仍然无法确定如何继续

guidelines={"CEPA_FW":  {'Sulphate':1000,'Cd' :0.01 ,'Cu' :1.0 ,'Pb' :0.005 ,'Zn' :5.0}, 
 "ANZECC_Stock":{'Sulphate':1000,'Cd' :0.01,'Cu' :0.4, 'Zn' :20},
 "ANZECC_FW":  {'Sulphate':1000,'Cd' :0.0002 ,'Cu' :0.0014 ,'Pb' :0.0034 ,'Zn' :0.008}}


df=pd.DataFrame.from_records(guidelines).T
df

给出:

Out[3]:
                  Cd      Cu      Pb  Sulphate      Zn
ANZECC_FW     0.0002  0.0014  0.0034    1000.0   0.008
ANZECC_Stock  0.0100  0.4000     NaN    1000.0  20.000
CEPA_FW       0.0100  1.0000  0.0050    1000.0   5.000

然后对指南进行排序并指定颜色,包括:

fill_color={'ANZECC_FW':'blue',
'CEPA_FW' :'green',
'ANZECC_Stock': 'yellow'}

for i in df.columns:
    col=df[i]
    col=col.sort_values(ascending=True)
    #print col,i, col.idxmax()
    #print i
    #print col
    count=0
    for val in col:

        if val>0:

            print i,val,col.index[count],fill_color[col.index[count]]

            count+=1
        else:
            continue

给出:


Cd 0.0002 ANZECC_FW blue
Cd 0.01 ANZECC_Stock yellow
Cd 0.01 CEPA_FW green
Cu 0.0014 ANZECC_FW blue
Cu 0.4 ANZECC_Stock yellow
Cu 1.0 CEPA_FW green
Pb 0.0034 ANZECC_FW blue
Pb 0.005 CEPA_FW green
Sulphate 1000.0 ANZECC_FW blue
Sulphate 1000.0 ANZECC_Stock yellow
Sulphate 1000.0 CEPA_FW green
Zn 0.008 ANZECC_FW blue
Zn 5.0 CEPA_FW green
Zn 20.0 ANZECC_Stock yellow

所以我有所有的信息,我只需要循环这些信息,以便数据中的每一列(如果cell value>;=list value,指定填充颜色)都可以工作,但是我一直在研究如何应用它


Tags: columnsdfstockcdcolcolor指导阴影
2条回答

首先,你的问题中有几个问题:

  1. 您的数据不一致:为生成df而提供的代码包含一些混合了浮点数的字符串。如果数据中确实存在这种情况,则首先应将all转换为float:

    df=df.应用(pd.到数值)

  2. 你的guidelines在整个帖子中都会发生变化。例如:

你从指导方针开始

WHO=  {'Cd' :0.002 ,'Cu' :0.003 ,'SO4':"NaN,'Zn' :0.1}       # string `NaN` again?
ANZECC=  {'Cd' :0.001 ,'Cu' :0.002 ,'SO4':1000.0,'Zn' :0.2}
US_EPA=  {'Cd' :0.001 ,'Cu' :0.0015 ,'SO4':500.0,'Zn' :0.01}

然后,稍后构建guidelines数据帧,您一定也错误地将其命名为df,如下所示:

# no string `NaN`
# Sulphate or SO4
# different standard names
guidelines={"CEPA_FW":  {'Sulphate':1000,'Cd' :0.01 ,'Cu' :1.0 ,'Pb' :0.005 ,'Zn' :5.0}, 
 "ANZECC_Stock":{'Sulphate':1000,'Cd' :0.01,'Cu' :0.4, 'Zn' :20},
 "ANZECC_FW":  {'Sulphate':1000,'Cd' :0.0002 ,'Cu' :0.0014 ,'Pb' :0.0034 ,'Zn' :0.008}}

这与前者完全不同


也就是说,让我们根据您的图片构建一个新的guidelinesfill_color

guidelines = {'WHO' :  {'Cd' :0.002 ,'Cu' :0.003 ,'SO4':np.NaN,'Zn' :0.1},
              'ANZECC' : {'Cd' :0.001 ,'Cu' :0.002 ,'SO4':1000.0,'Zn' :0.2},
              'US_EPA' :  {'Cd' :0.001 ,'Cu' :0.0015 ,'SO4':500.0,'Zn' :0.01}
              }
guidelines = pd.DataFrame(guidelines).T

fill_color = {'WHO':'teal',
              'US_EPA' :'yellow',
              'ANZECC': 'green'}

下面是函数:

def hightlight(col):
    name = col.name

    # extract the threshold and sort decreasingly
    thresh = guidelines[name].sort_values(ascending=False)

    # compare each value in column to each threshold
    compare = np.greater_equal.outer(col.values, thresh.values)

    # if any threshold is exceeded
    exceed_thresh = compare.any(1)

    # and where it is exceeded
    exceed_idx = np.argmax(compare, axis=1)

    # extract the standards that has is passed
    standards = np.where(exceed_thresh, thresh.index[exceed_idx], '')

    # format strings
    return [f'background-color:{fill_color[s]}' if s else '' for s in standards ]

df.style.apply(hightlight)

输出:

enter image description here

看来你可以这样做:

编辑:

import pandas as pd, numpy as np
data=pd.DataFrame(({'Cd': [0.001,0.0005,1],'Cu':[0.004, 0.0017, 0.1],  'SO4': [700,450,1500],'Zn': [0.15,0.1,0.25],}))

guidelines = {'WHO' :  {'Cd' :0.002 ,'Cu' :0.003 ,'SO4':np.NaN,'Zn' :0.1},
              'ANZECC' : {'Cd' :0.001 ,'Cu' :0.002 ,'SO4':1000.0,'Zn' :0.2},
              'US_EPA' :  {'Cd' :0.001 ,'Cu' :0.0015 ,'SO4':500.0,'Zn' :0.01}
              }
guidelines = pd.DataFrame(guidelines).T

fill_color = {'WHO':'teal','ANZECC': 'red','US_EPA' :'yellow'}

def highlight(x):
  if x.name in guidelines.columns:
    style = []
    guide = guidelines[x.name]
    for i in x:
      condition = (guide<=i)
      _guide = guide[condition]
      try:
        match = _guide.index[np.argmax(np.array(_guide))]
        color = fill_color[match]
        style += [f"background-color: {color}"]
      except:
        style += ['']
    return style
  else:
    return ['']*len(x)

data.style.apply(highlight)

看起来是这样的(您可以继续添加进一步的格式/更改颜色等): pandas styling

相关问题 更多 >

    热门问题