I have a dataframe column containing strings such as:
'TCCTGTAAATCAAAGGCCAAGRG'
,'GNGCNCCNGAYATRGCNTTYCC'
,'GATTTCTCTYCCTGTTCTTGCA'
I have a dictionary mapping letters to their possible replacements:
SNPs={}
SNPs["Y"] = ['C', 'T']
SNPs["R"] = ['A', 'G']
SNPs["N"] = ['C', 'G', 'A', 'T']
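Every R needs to be replaced with A or G, and likewise for the other letters.
For example, TCCTGTAAATCAAAGGCCAAGRG
changes to TCCTGTAAATCAAAGGCCAAGAG
and TCCTGTAAATCAAAGGCCAAGGG.
I want all the permutations and combinations, with the result in another column.
Please help me do this.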
import re, itertools
text = "GNGCNCCNGAYATRGCNTTYCC"
def getList(dict):
    return list(dict.keys())
lsources = getList(SNPs)
ldests = []
for source in lsources:
    ldests.append(SNPs[source])
#print(ldests)
# Generate the various pairings
for lproduct in itertools.product(*ldests):
    #print(lproduct)
    for i in text:
        output = i
        for src, dest in zip(lsources, lproduct):
            # Replace each term (you could optimise this using a single re.sub)
            output = output.replace("%s" % src, dest)
        print(output)
This is my code, but I am not getting the desired output.
Try this:
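A minimal sketch of one way to do it, assuming every ambiguous letter should be resolved independently at each position where it occurs. The helper name expand is my own and not from the question. It builds one list of choices per character and lets itertools.product enumerate the combinations, rather than replacing a letter everywhere at once:

```python
import itertools

# Ambiguity letters from the question; each maps to the bases it can stand for.
SNPs = {
    "Y": ["C", "T"],
    "R": ["A", "G"],
    "N": ["C", "G", "A", "T"],
}

def expand(seq, codes=SNPs):
    """Return every sequence obtained by resolving each ambiguous letter."""
    # One list of choices per position; ordinary bases have a single choice.
    choices = [codes.get(base, [base]) for base in seq]
    # The Cartesian product picks one choice per position.
    return ["".join(combo) for combo in itertools.product(*choices)]

print(expand("TCCTGTAAATCAAAGGCCAAGRG"))
# ['TCCTGTAAATCAAAGGCCAAGAG', 'TCCTGTAAATCAAAGGCCAAGGG']
```

Note that the number of results grows quickly: GNGCNCCNGAYATRGCNTTYCC contains four N, two Y and one R, so it expands to 4**4 * 2**2 * 2 = 2048 sequences.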
Update (running on a dataframe):
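A sketch of the same idea applied to a pandas column with Series.apply. The column names sequence, variants and n_variants are assumptions for illustration, since the question does not name its columns:

```python
import itertools
import pandas as pd

SNPs = {"Y": ["C", "T"], "R": ["A", "G"], "N": ["C", "G", "A", "T"]}

def expand(seq, codes=SNPs):
    """Return every sequence obtained by resolving each ambiguous letter."""
    choices = [codes.get(base, [base]) for base in seq]
    return ["".join(combo) for combo in itertools.product(*choices)]

# Assumed column name "sequence"; the strings are the ones from the question.
df = pd.DataFrame({"sequence": [
    "TCCTGTAAATCAAAGGCCAAGRG",
    "GNGCNCCNGAYATRGCNTTYCC",
    "GATTTCTCTYCCTGTTCTTGCA",
]})

# Each cell of the new column holds the list of expanded sequences.
df["variants"] = df["sequence"].apply(expand)
# Count the variants per row as a quick sanity check.
df["n_variants"] = df["variants"].apply(len)
print(df[["sequence", "n_variants"]])
```

If one row per variant is preferable to a list in each cell, df.explode("variants") should give that layout.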
Output: