Indexing a DataFrame with a for loop

date             fruit         quantity
4/5/2014 13:34   Apples        73
4/5/2014 3:41    Cherries      85
4/6/2014 12:46   Pears         14
4/8/2014 8:59    Oranges       52
4/10/2014 2:07   Apples        152
4/10/2014 18:10  Bananas       23
4/10/2014 2:40   Strawberries  98

date             fruitid  quantity
4/5/2014 13:34   fruit0   73
4/5/2014 3:41    fruit1   85
4/6/2014 12:46   fruit2   14
4/8/2014 8:59    fruit3   52
4/10/2014 2:07   fruit0   152
4/10/2014 18:10  fruit4   23
4/10/2014 2:40   fruit5   98

    import pandas as pd
    import numpy
    df = pd.read_csv('example2.csv', header=0, dtype='unicode')
    df_count = df['fruit'].value_counts()
    df.sort_values(['fruit'], ascending=True, inplace=True) #sorting the column fruit
    df.reset_index(drop=True, inplace=True)
    #print(df)
    x = 0 #starting my counter values or position in the column
    #old_fruit = df.fruit[x]
    #new_fruit = df.fruit[x+1]
    df.loc[:,'NewCol'] = 0 # to create the new column
    print(df)
    for x in range(0, len(df)):
        old_fruit = df.fruit[x] #Starting fruit
        new_fruit = old_fruit[x+1] #next fruit to compare with
        if old_fruit == new_fruit:
            #print(x)
            #print(old_fruit, new_fruit)
            df.NewCol[x] = 'fruit' + str(x) #if they are the same, put fruit[x] or fruit0 in the current row
        else:
            print("Not the Same")
            #print(x)
            #print(old_fruit, new_fruit)
            df.NewCol[x+1] = 'fruit' +str(x+1) #if they are the same, put fruit[x+1] or fruit1 in the current row
    print(df)

2 answers

User

Answer #1 · edited 2024-10-03 21:35:42

New answer

Use factorize

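import numpy as np

# np.char.add is the same function as np.core.defchararray.add,
# reached through the public np.char namespace
df.assign(
    NewCol=np.char.add('Fruit', df.fruit.factorize()[0].astype(str))
)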

              date         fruit  quantity  NewCol
0   4/5/2014 13:34        Apples        73  Fruit0
1    4/5/2014 3:41      Cherries        85  Fruit1
2   4/6/2014 12:46         Pears        14  Fruit2
3    4/8/2014 8:59       Oranges        52  Fruit3
4   4/10/2014 2:07        Apples       152  Fruit0
5  4/10/2014 18:10       Bananas        23  Fruit4
6   4/10/2014 2:40  Strawberries        98  Fruit5

Not a one-liner, but better. The same answer, this time updating df:

f, u = pd.factorize(df.fruit.values)
n = np.char.add('Fruit', f.astype(str))
df = df.assign(NewCol=n)
# Equivalent to
# df['NewCol'] = n
df

              date         fruit  quantity  NewCol
0   4/5/2014 13:34        Apples        73  Fruit0
1    4/5/2014 3:41      Cherries        85  Fruit1
2   4/6/2014 12:46         Pears        14  Fruit2
3    4/8/2014 8:59       Oranges        52  Fruit3
4   4/10/2014 2:07        Apples       152  Fruit0
5  4/10/2014 18:10       Bananas        23  Fruit4
6   4/10/2014 2:40  Strawberries        98  Fruit5
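
To see what factorize returns on its own, here is a minimal sketch built on the fruit column from the question. The inline DataFrame and the lowercase 'fruit' prefix (chosen to match the fruitid values the question asks for) are assumptions for illustration, and the labels are built with plain pandas string concatenation rather than NumPy's character functions.

```python
import pandas as pd

# Sample data copied from the question (assumed here instead of reading the CSV).
df = pd.DataFrame({
    "fruit": ["Apples", "Cherries", "Pears", "Oranges",
              "Apples", "Bananas", "Strawberries"],
    "quantity": [73, 85, 14, 52, 152, 23, 98],
})

# factorize returns one integer code per row, numbered in order of first
# appearance, plus the unique values those codes refer to.
codes, uniques = pd.factorize(df["fruit"])

# Prefix each code with 'fruit' to get the label column.
df["fruitid"] = "fruit" + pd.Series(codes, index=df.index).astype(str)

print(codes.tolist())   # [0, 1, 2, 3, 0, 4, 5]
print(df)
```

Because the codes follow order of first appearance, no sorting is needed and both Apples rows share fruit0.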

Old answer

@SeaMonkey found the reason you are seeing the error.

That said, I am guessing at what you are trying to do.
I appended a cumcount to fruit.

df.assign(NewCol=df.fruit + df.groupby('fruit').cumcount().astype(str))

              date         fruit  quantity         NewCol
0   4/5/2014 13:34        Apples        73        Apples0
1    4/5/2014 3:41      Cherries        85      Cherries0
2   4/6/2014 12:46         Pears        14         Pears0
3    4/8/2014 8:59       Oranges        52       Oranges0
4   4/10/2014 2:07        Apples       152        Apples1
5  4/10/2014 18:10       Bananas        23       Bananas0
6   4/10/2014 2:40  Strawberries        98  Strawberries0
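
For comparison with factorize, the following sketch shows what cumcount produces by itself. The short fruit Series is a made-up example, not the question's data: cumcount numbers repeated occurrences of the same value, so it answers "which occurrence is this?" rather than "which fruit is this?".

```python
import pandas as pd

# Hypothetical sample with a repeated value.
fruit = pd.Series(["Apples", "Cherries", "Apples", "Pears", "Apples"])

# cumcount numbers the rows inside each group: 0 for the first occurrence
# of a value, 1 for the second, and so on.
occurrence = fruit.groupby(fruit).cumcount()

# Append the occurrence number to the fruit name.
labels = fruit + occurrence.astype(str)

print(occurrence.tolist())  # [0, 0, 1, 0, 2]
print(labels.tolist())
```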

User

Answer #2 · edited 2024-10-03 21:35:42

I think your for loop simply runs one index too far.

Try:

for x in range(0, len(df)-1):

instead of looping over range(0, len(df)).
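
The off-by-one can be sketched with a plain Python list standing in for the sorted fruit column (the list contents are an assumption for illustration). A list raises IndexError when read past its end; a pandas Series with a default index would report the missing label as a KeyError instead, but the cause is the same.

```python
# Stand-in for the sorted fruit column.
fruits = ["Apples", "Apples", "Bananas", "Cherries"]

# The loop body reads position x + 1, so x must stop at len(fruits) - 2.
# range(0, len(fruits) - 1) does exactly that and visits every adjacent pair.
pairs = [(fruits[x], fruits[x + 1]) for x in range(0, len(fruits) - 1)]
print(pairs)

# Looping over the full range would ask for fruits[len(fruits)] on the
# final pass, and that position does not exist.
try:
    fruits[len(fruits)]
    overrun = None
except IndexError as exc:
    overrun = exc
print(type(overrun).__name__)
```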

Edit: It makes sense that:

new_fruit = old_fruit[x+1]

does not give the expected result: old_fruit is not a list but a string. I think what you want is:

new_fruit = df.fruit[x+1]
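
A small sketch of why indexing old_fruit goes wrong, assuming the value read from the column is the string "Apples": indexing a string selects a single character, and it fails once the index reaches the string's length.

```python
old_fruit = "Apples"  # what df.fruit[x] hands back: one string, not a column

# Indexing a string picks out one character, not the next row's fruit.
second_char = old_fruit[1]
print(second_char)  # p

# "Apples" has 6 characters (positions 0 to 5), so position 6 is out of range.
try:
    old_fruit[6]
    error = None
except IndexError as exc:
    error = exc
print(type(error).__name__)
```

Comparing old_fruit with one of its own characters can never be true for a multi-character name, which is why the original loop always took the else branch before failing.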

Edit (2):

You should add: df.NewCol[x+1] = 'fruit' + str(x)

My working script is:

    import pandas as pd
    import numpy
    df = pd.read_csv('data.csv', header=0, dtype='unicode')
    df_count = df['fruit'].value_counts()
    # Sort by fruit so identical fruits end up in adjacent rows
    df.sort_values(['fruit'], ascending=True, inplace=True)
    df.reset_index(drop=True, inplace=True)
    df['NewCol'] = ''  # create the new column (empty strings, since the labels are strings)
    print(df)
    # Stop one row early, because the loop body reads row x+1
    for x in range(0, len(df)-1):
        old_fruit = df.fruit[x]    # current fruit
        new_fruit = df.fruit[x+1]  # next fruit to compare with
        if old_fruit == new_fruit:
            # Same fruit: both rows get the label of the current row.
            # df.loc avoids the chained assignment df.NewCol[x] = ...
            df.loc[x, 'NewCol'] = 'fruit' + str(x)
            df.loc[x+1, 'NewCol'] = 'fruit' + str(x)
        else:
            print("Not the Same")
            # Different fruit: the next row starts a new label
            df.loc[x+1, 'NewCol'] = 'fruit' + str(x+1)
    print(df)
