如何根据索引转换值？

**Document type** **Invoicenumber Invoicedate** description quantity unit price line amount Invoice 123 28-08-2020 0 NaN 17-09-2020 test 1,5 5 20 0 NaN 16-04-2020 test2 1,5 5 20 Invoice 456 02-03-2020 0 NaN NaN test3 21 3 64 0 0 NaN test3 21 3 64 0 0 NaN test3 21 3 64

**Document type** **Invoicenumber Invoicedate** description#1 description#2 quantity quantity#2 unit price unit price #2 line amount line amount #2 Invoice 123 28-08-2020 test test2 1,5 1,5 5 5 20 20

{'Document Type': {0: 'IngramMicro.AccountsPayable.Invoice', 1: 0, 2: 0, 3: 'IngramMicro.AccountsPayable.Invoice', 4: 0, 5: 0, 6: 0}, 'Factuurnummer': {0: '0.78861803', 1: 'NaN', 2: 'NaN', 3: '202130534', 4: 'NaN', 5: 'NaN', 6: 'NaN'}, 'Factuurdatum': {0: '2021-05-03', 1: nan, 2: nan, 3: '2021-09-03', 4: nan, 5: nan, 6: nan}, 'description': {0: nan, 1: 'TM 300 incl onderstel 3058C003 84433210 4549292119381', 2: 'ESP 5Y 36 inch 7950A539 00000000 4960999794266', 3: nan, 4: 'Basistarief A3 Office', 5: 'Toeslag 100 km enkele reis Leveren installeren Xerox VL C7020 05-03-2021', 6: 'Toeslag 100 km enkele reis Leveren installeren Xerox VL C7020 05-03-2021'}, 'quantity': {0: nan, 1: 1.0, 2: 1.0, 3: nan, 4: 1.0, 5: 1.0, 6: 2.0}, 'unit price': {0: nan, 1: 1211.63, 2: 742.79, 3: nan, 4: 260.0, 5: 30.0, 6: 30.0}, 'line amount': {0: nan, 1: 21.0, 2: 21.0, 3: nan, 4: 260.0, 5: 30.0, 6: 30.0}}

1条回答

网友

1楼 · 发布于 2024-09-29 06:33:23

一种不聪明但可行的方法是通过迭代原始dataframe的行来生成具有所需列的新dataframe

import ast
import sys
import pandas as pd
from io import StringIO

TESTDATA = StringIO("""Document type;Invoicenumber;Invoicedate;description;quantity;unit price;line amount
Invoice;123;28-08-2020
0;NaN;17-09-2020;test;1,5;5;20
0;NaN;16-04-2020;test2;1,5;5;20
Invoice;456;02-03-2020
0;NaN;NaN;test3;21;3;64
0;0;NaN;test3;21;3;64
0;0;NaN;test3;21;3;64
    """)

df = pd.read_csv(TESTDATA, delimiter=";")

df_new = pd.DataFrame(columns=df.columns)

first_type_index = 0
type_count = 0

for index, row in df.iterrows():
    if row['Document type'] == '0':
        type_count += 1

        df_new.loc[first_type_index, f'description#{type_count}'] = row['description']
        df_new.loc[first_type_index, f'quantity#{type_count}'] = row['quantity']
        df_new.loc[first_type_index, f'unit price#{type_count}'] = row['unit price']
        df_new.loc[first_type_index, f'line amount#{type_count}'] = row['line amount']

        row['Document type']
        row['Invoicenumber']
    else:
        first_type_index = index
        type_count = 0
        df_new.loc[first_type_index] = row

df_new.drop(['description', 'quantity', 'unit price', 'line amount'], axis=1, inplace=True)

|   | Document type | Invoicenumber | Invoicedate | description#1 | quantity#1 | unit price#1 | line amount#1 | description#2 | quantity#2 | unit price#2 | line amount#2 | description#3 | quantity#3 | unit price#3 | line amount#3 |
| 0 | Invoice       |         123.0 |  28-08-2020 | test          | "1,5"      |          5.0 |          20.0 | test2         | "1,5"      |          5.0 |          20.0 |               |            |              |               |
| 3 | Invoice       |         456.0 |  02-03-2020 | test3         | 21         |          3.0 |          64.0 | test3         | 21         |          3.0 |          64.0 | test3         |         21 |          3.0 |          64.0 |

要使用列名，可以使用

column_to_swap = df_new.columns[3:]
len_swap = len(column_to_swap)

column_swap = [None] * len_swap

j = 0

# 4 is the count of duplicated columns
# description, quantity, unit price, line amount
for i in range(4):
    fast_i = i

    while fast_i < len_swap:
        column_swap[j] = column_to_swap[fast_i]
        fast_i += 4
        j += 1

df_new = df_new.reindex(columns=list(df_new.columns.values[0:3])+column_swap)

|   | Document type | Invoicenumber | Invoicedate | description#1 | description#2 | description#3 | quantity#1 | quantity#2 | quantity#3 | unit price#1 | unit price#2 | unit price#3 | line amount#1 | line amount#2 | line amount#3 |
| 0 | Invoice       |         123.0 |  28-08-2020 | test          | test2         |               | "1,5"      | "1,5"      |            |          5.0 |          5.0 |              |          20.0 |          20.0 |               |
| 3 | Invoice       |         456.0 |  02-03-2020 | test3         | test3         | test3         | 21         | 21         |         21 |          3.0 |          3.0 |          3.0 |          64.0 |          64.0 |          64.0 |

相关问题更多 >

编程相关推荐

热门问题

热门文章