Python: how do I use a loop over a CSV file to find the difference between two rows of the same column?

Published 2024-10-01 15:33:51


 Date   Revenue
9-Jan   $943,690.00 
9-Feb   $1,062,565.00 
9-Mar   $210,079.00 
9-Apr   -$735,286.00
9-May   $842,933.00 
9-Jun   $358,691.00 
9-Jul   $914,953.00 
9-Aug   $723,427.00 
9-Sep   -$837,468.00
9-Oct   -$146,929.00
9-Nov   $831,730.00 
9-Dec   $917,752.00 
10-Jan  $800,038.00 
10-Feb  $1,117,103.00 
10-Mar  $181,220.00 
10-Apr  $120,968.00 
10-May  $844,012.00 
10-Jun  $307,468.00 
10-Jul  $502,341.00 

# This is what I did so far...

# Dependencies
import csv

# Files to load (Remember to change these)
file_to_load = "raw_data/budget_data_2.csv"


totalrev = 0
count = 0

# Read the csv and convert it into a list of dictionaries
with open(file_to_load) as revenue_data:
    reader = csv.reader(revenue_data)

    next(reader)  



    for row in reader:

        count += 1
        revenue = float(row[1])     
        totalrev += revenue

    for i in range(1,revenue):
         revenue_change = (revenue[i+1] - revenue[i])

avg_rev_change = sum(revenue_change)/count

print("avg rev change: ", avg_rev_change)         

print ("budget_data_1.csv")
print ("---------------------------------")
print ("Total Months: ", count)
print ("Total Revenue:", totalrev)





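I have the data above in a CSV file. I am having trouble computing the revenue change, that is, the revenue of row 1 minus row 0, row 2 minus row 1, and so on. In the end I want the sum of all the revenue changes. I tried using a loop, but I think I made some silly mistake. Please suggest code so that I can compare it against my own and find my errors. I am new to Python and to coding.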


2 Answers

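It is not clear whether you are allowed to use third-party packages such as pandas, but pandas is very good at this kind of operation. I suggest using its built-in functionality instead of iterating row by row.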

df is a pandas.DataFrame object. Use pandas.read_csv to load the data into a DataFrame.

>>> df
      Date        Revenue
0    9-Jan    $943,690.00
1    9-Feb  $1,062,565.00
2    9-Mar    $210,079.00
3    9-Apr   -$735,286.00
4    9-May    $842,933.00
5    9-Jun    $358,691.00
6    9-Jul    $914,953.00
7    9-Aug    $723,427.00
8    9-Sep   -$837,468.00
9    9-Oct   -$146,929.00
10   9-Nov    $831,730.00
11   9-Dec    $917,752.00
12  10-Jan    $800,038.00
13  10-Feb  $1,117,103.00
14  10-Mar    $181,220.00
15  10-Apr    $120,968.00
16  10-May    $844,012.00
17  10-Jun    $307,468.00
18  10-Jul    $502,341.00
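
A minimal sketch of the loading step, assuming the file has a `Date,Revenue` header and that the formatted revenue values are quoted (they contain commas). The sample text and `io.StringIO` stand in for the real file and are only illustrative; with a real file you would pass its path to `pandas.read_csv` instead.

```python
import io
import pandas as pd

# Hypothetical sample: the first rows of the question's data written as CSV text.
# Values that contain commas must be quoted so they stay in a single column.
csv_text = '''Date,Revenue
9-Jan,"$943,690.00"
9-Feb,"$1,062,565.00"
9-Mar,"$210,079.00"
9-Apr,"-$735,286.00"
'''

# read_csv accepts a path or any file-like object; StringIO stands in for the file.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)                # number of rows and columns loaded
print(df['Revenue'].iloc[0])   # still text at this point, not a number
```

Because of the dollar signs and thousands separators, the Revenue column is read as text and still has to be converted to numbers before any arithmetic.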

# Remove the dollar sign and thousands separators, but keep the minus sign
# so that negative revenues stay negative
>>> df['Revenue'] = [float(''.join(c for c in row if c in '-.1234567890')) for row in df['Revenue']]
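
The same cleanup can also be done without a Python-level loop. The following is a sketch under the assumption that every value looks like the ones shown in the question (an optional minus sign, a dollar sign, and comma separators); the sample Series is illustrative only.

```python
import pandas as pd

# Hypothetical sample values in the same format as the question's Revenue column.
revenue_text = pd.Series(['$943,690.00', '$1,062,565.00', '-$735,286.00'])

# Drop the dollar sign and the thousands separators; digits, '.' and '-' remain.
revenue = revenue_text.str.replace(r'[$,]', '', regex=True).astype(float)

print(revenue.tolist())
```

Removing only the unwanted characters, rather than keeping an allow-list, makes it harder to drop the minus sign by accident.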

Use pandas.Series.shift to align the previous month's value with the current month's value, then subtract the two.
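
The shift-and-subtract step can be sketched roughly as follows. This is not the answer's original code; the small DataFrame and the `Change` column name are assumptions made for illustration, using the first four months of the question's data after conversion to numbers.

```python
import pandas as pd

# Hypothetical sample: the first four months from the question, already numeric.
df = pd.DataFrame({
    'Date': ['9-Jan', '9-Feb', '9-Mar', '9-Apr'],
    'Revenue': [943690.0, 1062565.0, 210079.0, -735286.0],
})

# shift(1) moves every value down one row, so each row is lined up
# with the previous month's revenue.
df['Change'] = df['Revenue'] - df['Revenue'].shift(1)

# The first row has no previous month, so its change is NaN;
# sum() and mean() skip NaN by default.
total_change = df['Change'].sum()
average_change = df['Change'].mean()

print(df)
print(total_change, average_change)
```

`df['Revenue'].diff()` computes the same differences in a single call, if the intermediate shifted column is not needed.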
import csv

# Files to load (Remember to change these)
file_to_load = "raw_data/budget_data_2.csv"


# Read the CSV; csv.reader yields each row as a list of strings
with open(file_to_load) as revenue_data:
    reader = csv.reader(revenue_data)

    # use of next to skip first title row in csv file
    next(reader) 
    revenue = []
    date = []
    rev_change = []

    # Collect the month (column 0) and the revenue (column 1) of every row
    for row in reader:

        revenue.append(float(row[1]))
        date.append(row[0])

    print("Financial Analysis")
    print("                 -")
    print("Total Months:", len(date))
    print("Total Revenue: $", sum(revenue))


    # Month-over-month change: each row's revenue minus the previous row's
    for i in range(1, len(revenue)):
        rev_change.append(revenue[i] - revenue[i-1])

    # Compute the summary values once, after all changes are collected
    avg_rev_change = sum(rev_change)/len(rev_change)
    max_rev_change = max(rev_change)
    min_rev_change = min(rev_change)

    # rev_change[k] is the change going into month k+1, hence the +1 offset
    max_rev_change_date = str(date[rev_change.index(max_rev_change) + 1])
    min_rev_change_date = str(date[rev_change.index(min_rev_change) + 1])


    print("Average Revenue Change: $", round(avg_rev_change))
    print("Greatest Increase in Revenue:", max_rev_change_date,"($", max_rev_change,")")
    print("Greatest Decrease in Revenue:", min_rev_change_date,"($", min_rev_change,")")


The output I get:

^{pr2}$
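
For comparison with the index-based loop in the code above, the consecutive differences can also be formed by pairing each value with its successor using `zip`. This is only a sketch: `parse_revenue` is a hypothetical helper that assumes the values are formatted as displayed in the question, and the six rows are a sample of that data, not the full file.

```python
# Hypothetical helper: turn a formatted string such as "-$735,286.00" into a float.
def parse_revenue(text):
    return float(text.strip().replace('$', '').replace(',', ''))

# First six rows of the question's data as (date, revenue text) pairs.
rows = [
    ('9-Jan', '$943,690.00'),
    ('9-Feb', '$1,062,565.00'),
    ('9-Mar', '$210,079.00'),
    ('9-Apr', '-$735,286.00'),
    ('9-May', '$842,933.00'),
    ('9-Jun', '$358,691.00'),
]

dates = [date for date, _ in rows]
revenue = [parse_revenue(text) for _, text in rows]

# zip pairs each month with the following one: one change per consecutive pair.
changes = [later - earlier for earlier, later in zip(revenue, revenue[1:])]

total_change = sum(changes)
average_change = total_change / len(changes)

# changes[k] is the change going into dates[k + 1].
largest = max(changes)
largest_date = dates[changes.index(largest) + 1]

print(total_change, average_change)
print(largest_date, largest)
```

Since the differences telescope, the total change always equals the last revenue minus the first, which gives a quick sanity check on the loop.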
