<p>This is what I am trying to do:</p>
<pre><code>import pandas as pd
from io import StringIO  # Python 3 location; the standalone StringIO module is Python 2 only
strDf = """Genes,Sub-Gene,Type,Reference
1,SG1,type3,0
1,SG1,type1,1
1,SG2,type7,0
1,SG2,type3,0
1,SG2,type9,0
1,SG2,type9,1
2,SG1,type3,1
2,SG1,type7,0"""
data = pd.read_csv(StringIO(strDf))
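# For each (Genes, Sub-Gene) group, take the Type of the row flagged Reference == 1
pp = data.groupby(['Genes','Sub-Gene']).apply(lambda x:(x[x['Reference']==1])['Type'])
# pp is keyed by (Genes, Sub-Gene, original row label); items() replaces the removed iterkv()
for k,v in pp.items():
    data.loc[(data['Genes']==k[0]) & (data['Sub-Gene']==k[1]),'TrueType']=v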
</code></pre>
<p>which results in:</p>
<pre><code>   Genes Sub-Gene   Type  Reference TrueType
0      1      SG1  type3          0    type1
1      1      SG1  type1          1    type1
2      1      SG2  type7          0    type9
3      1      SG2  type3          0    type9
4      1      SG2  type9          0    type9
5      1      SG2  type9          1    type9
6      2      SG1  type3          1    type3
7      2      SG1  type7          0    type3
</code></pre>
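<p>For comparison, the same result can probably be reached without the explicit loop. The following is only a sketch, assuming every (Genes, Sub-Gene) group has exactly one row with Reference == 1; the names <code>csv_text</code> and <code>ref_type</code> are made up for illustration. It blanks out Type on non-reference rows and then broadcasts the first non-null value within each group:</p>

```python
from io import StringIO

import pandas as pd

# Same sample data as in the post above.
csv_text = """Genes,Sub-Gene,Type,Reference
1,SG1,type3,0
1,SG1,type1,1
1,SG2,type7,0
1,SG2,type3,0
1,SG2,type9,0
1,SG2,type9,1
2,SG1,type3,1
2,SG1,type7,0"""
data = pd.read_csv(StringIO(csv_text))

# Keep Type only on the reference rows; every other row becomes missing.
ref_type = data["Type"].where(data["Reference"] == 1)

# "first" returns the first non-null value per group, and transform
# broadcasts it back to every row of that group.
# Assumption: one reference row per group. With several, the first one wins;
# with none, the group gets a missing value.
data["TrueType"] = ref_type.groupby(
    [data["Genes"], data["Sub-Gene"]]
).transform("first")

print(data)
```

<p>If a group could have several reference rows, or none, the intended behaviour would need to be decided first; this sketch does not check for either case.</p>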