Problem adding DataFrame columns in a loop

Posted 2024-05-02 21:06:15


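I wrote a program that creates 100 variants of an original sequence. Its main goal is to produce 100 variants of a DNA sequence that is 100 nucleotides long. To create a variant, we take the original sequence (which has a clone for further modification) and apply a random number of mutations (between 1 and 5) to it. For example, if the original strand is 1,2,3,4, a variant could be 1,3,3,4 or 1,4,2,3; you get the idea. I also added an if-elif structure that prevents the mutated nucleotide from being the same as the original one. I assume you have some basic knowledge of DNA, so I won't over-explain, but I'm happy to answer any questions.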

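I want to make 100 variants and put each variant sequence in its own DataFrame column. My problem is that the program creates one variant, and the other 99 are identical to it. I don't understand why. I would appreciate any suggested changes to the code that avoid the duplicates. The code and its output are below.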

Code:

import pandas
import random
import pandas as pd


someshit = [4, 4, 2, 4, 4, 1, 3, 2, 1, 1, 2, 3, 4, 3, 3, 4, 1, 3, 4, 4, 3, 2, 4, 4, 
        4, 3, 3, 1, 3, 1, 4, 1, 3, 4, 2, 4, 2, 3, 3, 3, 1, 2, 1, 3, 2, 3, 2, 4, 
        4, 3, 4, 4, 4, 3, 2, 1, 4, 3, 4, 4, 2, 2, 2, 1, 2, 2, 1, 1, 4, 2, 1, 4, 3,
        3, 2, 4, 4, 1, 1, 2, 1, 4, 1, 4, 4, 3, 3, 1, 3, 2, 3, 3, 1, 4, 1, 2, 2, 3, 2, 4]  
r = 0
nucs = [1,2,3,4]
list2 = []
list3 = []
i = 0
j = 0
n = len(someshit)   #LENGTH OF THE SEQUENCE
 #RANDOM MUTATION COUNT

result = None
df = pd.DataFrame({ 0 : someshit })   
    
shitass = someshit #shitass = someshit clone
onehundo = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,
            38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,
            73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99]
hundred = onehundo #hundred  = onehundo clone
while i < 99:
    x = random.randint(1,5) #RANDOM MUTATION COUNT
    while j < x:
        y = random.choice(hundred)
    
        if shitass[y] == 4:
            nucs.remove(4)
            z = random.choice(nucs)  #z = the new mutated nucleotide
            shitass[y] = z

        elif shitass[y] == 3:
            nucs.remove(3)
            z = random.choice(nucs)  #z = the new mutated nucleotide
            shitass[y] = z

        elif shitass[y] == 2:
            nucs.remove(2)
            z = random.choice(nucs)  #z = the new mutated nucleotide
            shitass[y] = z

        elif shitass[y] == 1:
            nucs.remove(1)
            z = random.choice(nucs)  #z = the new mutated nucleotide
            shitass[y] = z

        nucs = [1,2,3,4] #re-establishes the nucs after every mutation.

        list2.append(y)
        list3.append(z)
        hundred.remove(y)
        #------------------------------------#------------------------------------#------------------------------------
        j = j+1
        
    for r in range(100):
        r = r+1
        result = shitass
        df[r] = result
        hundred = onehundo #re-establishment of the clone
        shitass = someshit #re-establishment of the clone
        r = r+1
        
    i = i+1

    

    
    
print(x)
print(y)
print(list2)
print(list3)
print(shitass)
print(df)

The code is really messy, apologies in advance. This is the result I get:

2
62
[31, 82, 60, 3, 62]
[4, 3, 1, 1, 1]
[4, 4, 2, 1, 4, 1, 3, 2, 1, 1, 2, 3, 4, 3, 3, 4, 1, 3, 4, 4, 3, 2, 4, 4, 4, 3, 3, 1, 3, 1, 4, 4, 3, 4, 2, 4, 2, 3, 3, 3, 1, 2, 1, 3, 2, 3, 2, 4, 4, 3, 4, 4, 4, 3, 2, 1, 4, 3, 4, 4, 1, 2, 1, 1, 2, 2, 1, 1, 4, 2, 1, 4, 3, 3, 2, 4, 4, 1, 1, 2, 1, 4, 3, 4, 4, 3, 3, 1, 3, 2, 3, 3, 1, 4, 1, 2, 2, 3, 2, 4]
    0    1    2    3    4    5    6    7    8    9    ...  91   92   93   94   \
0     4    4    4    4    4    4    4    4    4    4  ...    4    4    4    4   
1     4    4    4    4    4    4    4    4    4    4  ...    4    4    4    4   
2     2    2    2    2    2    2    2    2    2    2  ...    2    2    2    2   
3     4    1    1    1    1    1    1    1    1    1  ...    1    1    1    1   
4     4    4    4    4    4    4    4    4    4    4  ...    4    4    4    4   
..  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...   
95    2    2    2    2    2    2    2    2    2    2  ...    2    2    2    2   
96    2    2    2    2    2    2    2    2    2    2  ...    2    2    2    2   
97    3    3    3    3    3    3    3    3    3    3  ...    3    3    3    3   
98    2    2    2    2    2    2    2    2    2    2  ...    2    2    2    2   
99    4    4    4    4    4    4    4    4    4    4  ...    4    4    4    4   

    95   96   97   98   99   100  
0     4    4    4    4    4    4  
1     4    4    4    4    4    4  
2     2    2    2    2    2    2  
3     1    1    1    1    1    1  
4     4    4    4    4    4    4  
..  ...  ...  ...  ...  ...  ...  
95    2    2    2    2    2    2  
96    2    2    2    2    2    2  
97    3    3    3    3    3    3  
98    2    2    2    2    2    2  
99    4    4    4    4    4    4  

[100 rows x 101 columns]

1 Answer
User
#1 · Posted 2024-05-02 21:06:15

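One of your problems is the statement shitass = someshit #shitass = someshit clone. That statement does not create a copy or clone as you expect. Here is an illustration of what I mean: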

la = [1,2,3,1,4,5,6]
lb = la

Then:

print(f"LA = {la}\nLB= {lb}") 

Yields:

LA = [1, 2, 3, 1, 4, 5, 6]
LB= [1, 2, 3, 1, 4, 5, 6] 

Now, if you modify the contents of lb like this:

lb[2] = 9

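and repeat the earlier print statement, you will see that la and lb have the same values, because in Python lb = la merely gives a second name to the same object. As shown here: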

lb[2] = 9
print(f"LA = {la}\nLB= {lb}")

Yields:

LA = [1, 2, 9, 1, 4, 5, 6]
LB= [1, 2, 9, 1, 4, 5, 6]
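
One way to confirm that both names refer to a single list is an identity check. This is a small sketch using the same la and lb as above; same_object is a name introduced here only for illustration.

```python
la = [1, 2, 3, 1, 4, 5, 6]
lb = la  # binds a second name to the same list; nothing is copied

# `is` compares object identity, not contents
same_object = lb is la
print(same_object)        # True
print(id(la) == id(lb))   # True: both names point at one object
```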

To get a clone of la that can be changed independently, use the list's copy method, as with lc here:

lc = la.copy()
lc[2] = 9
print(f"LA = {la}\nLC= {lc}")

This yields:

LA = [1, 2, 3, 1, 4, 5, 6]
LC= [1, 2, 9, 1, 4, 5, 6]
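
Applied to the question, the same idea might look like the sketch below. It is not the asker's code: make_variant, max_mutations and alphabet are hypothetical names, and the short original list stands in for the 100-nucleotide sequence. Each variant starts from its own copy, so mutations never leak back into the original or into other variants. Building every variant from scratch in one loop also appears to sidestep two other things in the posted code: j is never reset to 0 after the first pass, and the inner for r loop writes the same list into every column.

```python
import random

def make_variant(original, max_mutations=5, alphabet=(1, 2, 3, 4)):
    """Return a mutated copy of `original`; the input list is left untouched."""
    variant = original.copy()  # independent copy, not a second name
    n_mutations = random.randint(1, max_mutations)
    # random.sample picks distinct positions, so no position is mutated twice
    for pos in random.sample(range(len(variant)), n_mutations):
        # exclude the current nucleotide so the mutation always changes it
        choices = [n for n in alphabet if n != variant[pos]]
        variant[pos] = random.choice(choices)
    return variant

# Assumption: a shortened stand-in for the 100-nucleotide sequence
original = [4, 4, 2, 4, 4, 1, 3, 2, 1, 1]

columns = {0: original}
for k in range(1, 101):
    columns[k] = make_variant(original)  # a fresh variant per column

# pd.DataFrame(columns) would then give one variant per column
```

Collecting the columns in a dict first and constructing the DataFrame once at the end avoids assigning 100 columns one at a time.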
