擅长:python、mysql、java
<p>我不太清楚为什么第一排的计数应该是2。你能试着绕过这个问题吗:</p>
<pre><code>import pandas as pd
feature = ["1_1_1"]*6 +["1_1_2"]*3
gene = ["NRAS"]*3+["KRAS"]*3+["NRAS"]*3
target = ["AATTGG","TTGGCC", "AATTGG"]+ ["GGGGTT"]*3 + ["CCTTAA", "GGGGTT", "AATTGG"]
pos = [60,6,20,0,0,0,2,8,60]
df = pd.DataFrame({"feature":feature,
"gene":gene,
"target":target,
"pos":pos})
df.groupby(["feature", "gene"])\
.apply(lambda x:len(x.drop_duplicates(["target", "pos"])))
</code></pre>