Plotting Benford's law in Python 3.5.2

Posted 2024-10-06 16:22:20


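I've been given an assignment that asks me to plot a data set read from a CSV file...

...as a bar chart that follows Benford's law, like this:

(Image: example Benford bar chart)

Here is the code I have so far: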

import matplotlib.pyplot as plt
import math
import csv
import locale

with open("immigrants.csv", newline='') as csvfile:
    immidata = csv.reader(csvfile)
    X_labels = []
    Y = []
    for row in immidata:
        X_labels.append(row[0])
        Y.append(locale.atoi(row[1]))

numbers = [float(n) for n in range(1, 10)]
benford = [math.log10(1 + 1 / d) for d in numbers]
plt.plot(numbers, benford, 'ro', label = "Benford's Law")
plt.bar(numbers, range(1, 11), align = 'left', normed = True, 
    rwidth = 0.7, label = "Actual data")
plt.bar(benford, range(1, 11), align = 'left', normed = True, 
    rwidth = 0.7, label = "Predicted data")
plt.title("Immigrants in countries")
plt.xlabel("Digit")
plt.ylabel("Probability")
plt.grid(True)
plt.xlim(0, 10)
plt.xticks(numbers)
plt.legend()
plt.show()

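Here is some of the information in the CSV file, which shows the number of immigrants in each country (country, number of immigrants, percentage of the world's total immigrants, and immigrants as a percentage of the national population):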

^{pr2}$

My current output:

line 19, in <module>
  Y.append(locale.atoi(row[1]))
line 321, in atoi
  return int(delocalize(string))
ValueError: invalid literal for int() with base 10: 'Number of immigrants'

Process finished with exit code 1

I'm still fairly new to this, so any advice that helps me get an output is much appreciated!

Thank you!

Output

The output needs to look like the example.


2 answers

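Here is a solution using pandas:

(Image: Benford's law bar chart of the number of immigrants)

Edit: Your file probably has a header row, as indicated by the string "Number of immigrants" in the num_immigrants column. Remove the header=None option from the line that reads the data.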

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# set the width of the bars, you're gonna have to massage this
width = 0.35

immi = pd.read_csv('immigrants.csv')

# name columns
immi.columns = ['country', 'num_immigrants', 'perc_world', 'perc_nat_pop']

# convert num_immigrants to float
immi.num_immigrants = immi.num_immigrants.str.replace(',', '').apply(float)
total = immi.num_immigrants.sum()

# scale the immigration to between 0 and 1
immi['immi_scaled'] = immi['num_immigrants'].apply(lambda x: x/total)

indx = np.arange(1, len(immi) + 1)
benford = [np.log10(1 + (1.0 / d)) for d in indx]

plt.bar(indx, benford, width, color='r', label="Benford's Law")
# these bars are the scaled values from the file, i.e. the actual data
plt.bar(indx + width, immi.immi_scaled, width, color='b', label="Actual data")
# center the xtick labels
ax = plt.gca()
ax.set_xticks(indx + width / 2)
ax.set_xticklabels((indx))

# limit the  number of bars if you have more data
plt.xlim(1, 9)
plt.title("Immigrants in countries")
plt.ylabel("Probability")
plt.grid(True)
plt.legend()
plt.show()
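
Note that the script above scales each country's count by the total, whereas Benford's law describes how often each digit 1 to 9 appears as the leading digit of the values. The following is a minimal sketch of that tally, not part of the original answer. The helper names (leading_digit, benford_expected, observed_frequencies) are hypothetical, and powers of two stand in for the immigrant counts because the real file contents are not shown here.

```python
import math
from collections import Counter


def leading_digit(value):
    """Return the first digit of a positive integer."""
    return int(str(value)[0])


def benford_expected():
    """Probability of each leading digit 1..9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def observed_frequencies(values):
    """Share of values whose leading digit is d, for d in 1..9.

    Assumes at least one positive integer in values; zeros and
    negatives are skipped because they have no leading digit 1..9.
    """
    counts = Counter(leading_digit(v) for v in values if v > 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Stand-in data (an assumption): powers of two, which are known to
# follow Benford's law closely. Replace with the parsed counts.
values = [2 ** n for n in range(1, 101)]

expected = benford_expected()
observed = observed_frequencies(values)

for d in range(1, 10):
    print(d, round(expected[d], 3), round(observed[d], 3))
```

The two dictionaries can then be passed to plt.bar as two side-by-side series over the digits 1 to 9, in the same way the script above offsets the second series by width.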

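Reading the data:

import csv
import locale

with open("immigrants.csv", newline='') as csvfile:
    immidata = csv.reader(csvfile)  # defaults are fine!
    next(immidata)  # skip the header row ('Number of immigrants' is not a number)
    X_labels = []
    Y = []
    for row in immidata:
        X_labels.append(row[0])
        # locale.atoi only strips thousands separators once locale.setlocale()
        # has selected a locale that uses them
        Y.append(locale.atoi(row[1]))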

This gives you X_labels and Y (converted to int).
Note: there is no need to call close(); the with block does this automatically.
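
As a self-contained illustration of the reading step, here is a small sketch that parses an in-memory sample instead of the real file. The sample rows, the "Country" header name, and the read_counts helper are assumptions made for the example; only the "Number of immigrants" header text comes from the traceback in the question. It strips thousands separators with str.replace rather than locale.atoi, so it does not depend on the locale settings.

```python
import csv
import io

# Hypothetical sample standing in for immigrants.csv; the real rows
# are not shown here.
SAMPLE = '''Country,Number of immigrants
Country A,"1,234"
Country B,987
'''


def read_counts(csvfile):
    """Return (labels, counts) from an open CSV file object."""
    reader = csv.reader(csvfile)
    next(reader, None)  # skip the header row, if there is one
    labels, counts = [], []
    for row in reader:
        labels.append(row[0])
        # drop thousands separators before converting to int
        counts.append(int(row[1].replace(",", "")))
    return labels, counts


labels, counts = read_counts(io.StringIO(SAMPLE))
print(labels, counts)
```

With the real file, the same function could be called as read_counts(csvfile) inside the with open(...) block shown above.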

Good luck with the rest. By the way: digits is not defined in the code you shared; you should make every effort to turn it into an MCVE.
