How do I handle strings and integers in a file?

Posted 2024-09-30 08:22:49


I have a text file that lists students' names, their grade level, and the score each student got on an exam. It is formatted like this:

John Doe 3 87
Jane Doe 4 89
Bob Smith 5 84

I need to find the average score of all the students in the 3rd, 4th, and 5th grade. This is what I have so far:

    inFile = open("input.txt", "r", encoding = "utf8")
    counter5 = 0
    counter4 = 0
    counter3 = 0
    total5 = 0
    total4 = 0
    total3 = 0
    for line in inFile:
        if "5" in line:
            total5 += int(line[-3:-1])
            counter5 += 1
        elif "4" in line:
            total4 += int(line[-3:-1])
            counter4 += 1
        elif "3" in line:
            total3 += int(line[-3:-1])
            counter3 += 1
    print(total5/counter5)
    print(total4/counter4)
    print(total3/counter3)

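The problem, of course, is that my if statements can also match a "3", "4", or "5" that appears in the exam score, not just in the grade level. I'm sure there is a simpler way to do this. Thanks in advance for any help.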


3 Answers

As suggested, this can be done with pandas.

Here is how to solve the problem using pandas.

Input file (input.txt):

John Doe 3 87
Jane Doe 4 89
Bob Smith 5 84
Chris Cruse5 3 85
Karen Cane4 4 93
Rob Green3 5 94
Babe Ruth4 3 78
Step Curry1 4 79
Leb James4 5 77

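import pandas as pd

# The file has no header row; fields are separated by single spaces
df = pd.read_csv('input.txt', sep=" ", header=None)
df.columns = ['First', 'Last', 'Grade', 'Score']
print(df)
# Average score per grade, rounded to two decimals
print(df.groupby('Grade')['Score'].mean().round(2))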

The data is stored in a pandas DataFrame, as shown below:

   First    Last  Grade  Score
0   John     Doe      3     87
1   Jane     Doe      4     89
2    Bob   Smith      5     84
3  Chris  Cruse5      3     85
4  Karen   Cane4      4     93
5    Rob  Green3      5     94
6   Babe   Ruth4      3     78
7   Step  Curry1      4     79
8    Leb  James4      5     77

The average for each grade is:

Grade
3    83.33
4    87.00
5    85.00

You can also use:

print(df.groupby('Grade').agg({'Score': ['mean']}).round(2))

       Score
        mean
Grade       
3      83.33
4      87.00
5      85.00

I need to find the average of all the students in the 3rd, 4th, and 5th grade

Collect the grades and scores into a dict, then loop over it and compute the statistics.

See below.

from collections import defaultdict
import statistics

data = defaultdict(list)
with open('in.txt') as f:
    lines = [line.strip() for line in f.readlines()]
    for line in lines:
        fields = line.split()
        fields[-1] = int(fields[-1])
        data[fields[2]].append(fields[-1])
for grade, scores in data.items():
    print(f'{grade}  > {statistics.mean(scores)}')

in.txt

John Doe 3 87
Jane Doe 4 89
Bob Smith 5 84
John Doe12 3 88
Jane Doe12 4 90
Bob Smith12 5 85

Output

3  > 87.5
4  > 89.5
5  > 84.5

You can split the line into tokens:

>>> line = 'Jane Doe 3 87'
>>> line.split(' ')
['Jane', 'Doe', '3', '87']

Note that these are strings, so you need to convert them to the type you need:

>>> float(line.split(' ')[-1])
87.0
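
Names can contain more than two words, so counting fields from the left is fragile. As a minimal sketch (the helper name `parse_line` is made up for illustration, and it assumes the last two fields are always the grade and the score), `str.rsplit` with at most two splits peels those fields off the right:

```python
def parse_line(line):
    """Split 'First Last ... grade score' into (name, grade, score).

    Assumes the last two whitespace-separated fields are the grade and
    the score; everything before them is treated as the name.
    """
    # rsplit(None, 2) splits on runs of whitespace, at most twice, from the right.
    name, grade, score = line.rsplit(None, 2)
    return name, int(grade), float(score)


print(parse_line('Jane Doe 3 87'))
print(parse_line('Mary Ann Smith5 4 93\n'))
```

Because the grade is read from a fixed position, a digit in the name or in the score no longer affects which grade the line is counted under.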

A more complete version:

# Running student counts and score totals for each grade
counter5 = 0
counter4 = 0
counter3 = 0
total5 = 0
total4 = 0
total3 = 0
with open("input.txt", "r", encoding="utf8") as inFile:
    for line in inFile:
        fields = line.split()
        score = float(fields[-1])  # last field is the exam score
        grade = int(fields[-2])    # second-to-last field is the grade
        if grade == 5:
            total5 += score
            counter5 += 1
        elif grade == 4:
            total4 += score
            counter4 += 1
        elif grade == 3:
            total3 += score
            counter3 += 1
print(total5/counter5)
print(total4/counter4)
print(total3/counter3)
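
The version above divides by a counter that stays at zero if a grade has no students, which raises `ZeroDivisionError`. One way around this, sketched here under the assumption that every non-blank line ends with a grade and a score, is to keep the totals and counters in dictionaries keyed by grade. The function name `average_by_grade` is hypothetical:

```python
def average_by_grade(lines):
    """Return {grade: average score} for the grades that actually occur."""
    totals = {}    # grade -> sum of scores
    counters = {}  # grade -> number of students
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        grade = int(fields[-2])
        score = float(fields[-1])
        totals[grade] = totals.get(grade, 0.0) + score
        counters[grade] = counters.get(grade, 0) + 1
    # Only grades that were seen are present, so no count is ever zero.
    return {grade: totals[grade] / counters[grade] for grade in totals}


sample = ['John Doe 3 87', 'Jane Doe 4 89', 'Bob Smith 5 84', 'Chris Cruse5 3 85']
for grade, average in sorted(average_by_grade(sample).items()):
    print(grade, average)
```

An open file can be passed in place of the list, since iterating over a file yields its lines.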
