Trying to combine values and sort a .txt file

Posted 2024-09-29 23:24:37

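I'm trying to apply what I've learned so far to a real-world problem, but I still seem to be missing some pieces, so I'd like to ask for your help.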

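I have a file called "debug.txt". It contains doctors' names and their number of reviews. Some doctors' names appear more than once, so what I want to do is add up the reviews for the same doctor and then sort the names, starting with the doctor who has the most reviews.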

Umberto Napoli 11
Prof.Giancarlo Pecorari 33
Edro Colombini 3
Fabrizio malandrinate 18
Nicola Kafalas 1
Maxillo Facciale 11
Ottaviani 4
Luca Cravero 2
Egle Muti 2
Massimiliano Garzaro 25
Salvatore Carlucci 34
Savino Bufo 185
Andrea Milanese 11
VincenzoDel Gaudio 221
Marco Marchetti 6
Andrea Nizza 9
Cosmer Torino 10
Mariafranca Maietta 14
Massimiliano Giuliano 24
Vito Contreas 23
Bellone Donato 69
AndreaRossi Todde 135
Franco Maniglia 33
Francesco Leva 11
MariaLuisa Pozzuoli 81
LaCliniqueof Switzerland
Luca Cravero 59
TheSwiss Clinic 9
GiulioMaria maggi 173
Umberto Napoli 55
Benoit Menye 243
Cristina Sartorio 6
Amisano Massimo 3
Massimo Dolcet 25
AlessandroMaria Caboni 236
Stefano Karoschitz 31
Alessandro Ticozzi 21
Francisco Malatesta 36
Massimo Dolcet 39
Corrado Adorno 8
Umberto Napoli 5
Mariarosa Romeo 117
Francesco Leva 17
Francesco Malatesta 23
Daniele Bollero 32
Paolo Tagliabue 2
Salvatore Carlucci 2
Gianluca Beninca 12
Paolo Gottarelli 23
Salvatore Carlucci 21
Dr.Massimo Dolcet
Salvatore Carlucci 20
Emanuele Zavattero 1
Luca Cravero 55
Marco Marchetti 3
Ssa Pozzuoli 51
Fabrizio Malandrinò 12
Enrico Donde 11
Alessandro Rivolin 8
Daniele Bollero 120
Nicola Vione 2
Egle Muti 6
Luigi Cursio 82
Salvatore Carlucci 21
Luca Cravero 1
Massimo Dolcet 7
GiulioMaria Maggi 35
Enrico Giachero
Dott Arturi 5
Marcello Cavallero 6
Stefano bruschi 4
Paolo Balocco 3

Unfortunately, I have only managed to clean up the txt a little. This is the code I'm using:

name = "debugging.txt"
handle = open(name)
for line in handle:
    line = line.rstrip()
    line = line.split()
    try:
        linezero = line[0]
        lineone = line[1]
        name = linezero + " " + lineone
        review = line[2]
    except:
        continue
    print(name,review)

3 answers

Here is one way to do it:

import collections

# Counter adds up the reviews of doctors that appear on more than one line
review_counter = collections.Counter()

with open('debugging.txt', 'r') as file:   # reading the file
    for line in file:
        a = line.split()
        if not a:            # skip blank lines
            continue
        if a[-1].isnumeric():
            reviews = int(a.pop())   # last field is the review count
        else:
            reviews = 0              # no review count on this line

        doctor_name = " ".join(a)

        # add to the running total instead of overwriting it
        review_counter[doctor_name] += reviews

# most_common() sorts in descending order; a dict keeps that order (Python 3.7+)
counter = dict(review_counter.most_common())
print('List of doctors:- ')
print(list(counter))           # printing the names of the doctors in descending order

If you want to print the doctors' names together with their reviews, replace the last line, i.e. print(list(counter)), with print(dict(counter)).
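
As a small aside on why Counter fits here, the sketch below shows that update() adds to existing counts rather than replacing them, and that most_common() returns the pairs already sorted. The numbers are a hand-picked subset of the question's data, not the whole file, so the totals are only illustrative.

```python
from collections import Counter

totals = Counter()
# update() with a mapping adds to the existing counts instead of replacing them
totals.update({"Umberto Napoli": 11, "Luca Cravero": 2})
totals.update({"Umberto Napoli": 55, "Luca Cravero": 59})
totals.update({"Umberto Napoli": 5})

# most_common(n) returns the n highest (name, total) pairs;
# without an argument it returns all of them, highest first
top = totals.most_common(1)
print(top)
print(totals.most_common())
```

This only works if the counts go into the Counter line by line; storing them in a plain dict first would keep just the last value seen for each name.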

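You have already separated the names from the reviews. You will probably want to use int() to convert each review count from str to int.
For storing and sorting, you can use a defaultdict. See the example below: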

>>> from collections import defaultdict
>>> sample_list = [('name1', 100), ('name2', 400), ('name2', 5), ('name1',50), ('name3', 10)]
>>> review_map = defaultdict(int)
>>> for name_score in sample_list:
...     review_map[name_score[0]] += name_score[1]
... 
>>> review_map
defaultdict(<class 'int'>, {'name1': 150, 'name2': 405, 'name3': 10})
>>> doctor_list = sorted(review_map, key=lambda x: review_map[x], reverse=True)
>>> doctor_list
['name2', 'name1', 'name3']
>>> 
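
To connect this with the lines of the file, here is a minimal sketch of the same defaultdict idea applied to raw text lines. The function name rank_doctors and the sample list are made up for illustration, and it assumes that lines without a trailing number should simply be skipped, which may not be what you want.

```python
from collections import defaultdict

def rank_doctors(lines):
    """Sum the review counts per name and return (name, total) pairs, highest first."""
    review_map = defaultdict(int)
    for line in lines:
        parts = line.split()
        # Expect at least one name word followed by a number; skip anything else
        if len(parts) < 2 or not parts[-1].isdigit():
            continue
        review_map[" ".join(parts[:-1])] += int(parts[-1])
    return sorted(review_map.items(), key=lambda item: item[1], reverse=True)

# Hypothetical in-memory sample taken from the question's data
sample = [
    "Massimo Dolcet 25\n",
    "Dr.Massimo Dolcet\n",
    "Massimo Dolcet 39\n",
    "Benoit Menye 243\n",
    "\n",
]
result = rank_doctors(sample)
print(result)
```

For the real file you could pass the open file object instead of the list, for example inside `with open("debugging.txt") as handle:`, since iterating over a file yields its lines.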

I suggest checking what each line is doing and which tools Python offers:

def get_rev_srt(filename):
    # This is a dict(); it is used to pair data as 'key: value'.
    # The file contains a simple match of a string and an int.
    # Dictionaries don't offer the same methods as lists, and
    # they don't require the keys or the values to share one data type.
    # Adding a value is as simple as an assignment such as data['key'] = value;
    # if 'key' already exists, its value will be replaced.
    # For example, data = {'Alex': 999} would start with one entry;
    # here we start empty so that only names from the file are counted.
    data = {}
    # A context manager is used for ease of use: 'with' handles the
    # value returned by 'open()' and closes the file automatically when
    # the block ends. 'with' does not create a new scope, which means that
    # any variable created inside it is not deleted and can be
    # accessed by the code that follows.
    with open(filename, 'r') as file:
        # It is possible to find out how many lines the file has, but it is
        # much easier to loop until there are no more lines, and
        # it has no impact on performance
        while True:
            # Here I'm chaining calls: the output of one function is
            # passed straight to the next, and you get the return value
            # of the last one: variable <- rsplit() <- readline()
            cnt = file.readline().rsplit(' ', 1)
            # readline() returns '' only at the end of the file (a blank
            # line is '\n'), so this breaks the loop when the file is done
            if cnt == ['']:
                break
            # Using try is not recommended for every occasion, but,
            # as before, it is really easy to just do it and see whether it
            # worked rather than check each time
            try:
                # [4:] means 'get all the values as a new list starting
                # from index 4'; negative values count from the end,
                # so [0:-1] would mean 'get from the beginning up to
                # the value before the last one'. Slices can be chained
                # and nested. Here strip() is used instead of [0:-1] to drop
                # the newline, because the last line may not end with one
                num = int(cnt[1].strip())
                name = cnt[0]
                # This one is quite obvious: check if an element is
                # in the dict; it also works with lists and tuples
                if name in data:
                    data[name] += num
                else:
                    data[name] = num
            # It is recommended to name the exception types, but not required.
            # IndexError covers lines that contain no space at all
            except (ValueError, IndexError):
                # The new thing here is .join(); it works as the inverse of
                # split: it glues a list or tuple into one string.
                # .format() is explained at the end of the file
                print('Bad formatting "{0}"'.format(
                    ''.join(cnt).replace('\n', ' %n'))
                )

    # Simple selection sort: it works by picking, one by one, the biggest value
    # from a shrinking list of values. A conversion to 2 lists is required
    # because dicts don't allow access by index, only by key
    d_keys = list(data.keys())
    d_values = list(data.values())
    # l1 is the position to swap with the biggest remaining value
    # range() produces a sequence of increasing numbers
    for l1 in range(len(d_keys) - 1):
        # Start by assuming the value at l1 is the biggest
        mxi, mxn = l1, d_values[l1]
        # Loop over the remaining values, check which one
        # is the biggest and save it
        for l2 in range(l1 + 1, len(d_keys)):
            if d_values[l2] > mxn:
                mxn = d_values[l2]
                mxi = l2

        # This just swaps 2 elements in parallel in both lists
        tmn, tmv = d_values[l1], d_keys[l1]
        d_values[l1], d_keys[l1] = d_values[mxi], d_keys[mxi]
        d_values[mxi], d_keys[mxi] = tmn, tmv

    # And pair the lists back up into a dict (dicts keep insertion order
    # on Python 3.7+); this isn't that common, don't worry about how it works
    return dict(zip(d_keys, d_values))


# Some functions can take multiple arguments and use them
# or change their behavior. Functions have 2 special parameters,
# a tuple and a dict that group all positional and keyword arguments;
# they are marked by the use of * and **, not by the name

def undefined_args(*args, **kwargs):
    # this function returns a tuple with how many positional
    # and keyword arguments it received
    return len(args), len(kwargs)


# This condition is a guard: it ensures that the code below runs
# only when this file is executed directly, which just means that
# this file is the execution origin (and not an import)
if __name__ == '__main__':
    pp_d = get_rev_srt('debugging.txt')
    # A tuple is like a list, but the drawback is
    # that its values cannot be changed, only read
    names = tuple(pp_d.keys())
    # Print the top three doctors (or fewer if the file is short)
    for i in range(min(3, len(names))):
        # .format() works similarly to join: it is a method of a string
        # and "renders" like jinja would do. {} means 'value goes here',
        # {0} inserts the first argument and {0:<25} inserts the
        # first argument left-aligned and padded to a width of 25.
        # Depending on the version the syntax may vary; the most modern
        # is an f-string such as f'{names[i]}'
        print('Doctor : {0:<25}  {1:<5} reviews'.format(
            names[i], pp_d[names[i]])
        )
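
The hand-written sort above is useful for learning, but the built-in sorted() can do the same job in one line. The sketch below is one possible shortcut, not part of the answer's code: the helper name sort_by_reviews is made up, and the sample dict holds a few totals worked out by hand from the question's data.

```python
def sort_by_reviews(data):
    """Return a new dict ordered by review count, highest first."""
    # sorted() is stable, so names with equal counts keep their original order
    ordered = sorted(data.items(), key=lambda item: item[1], reverse=True)
    # dicts preserve insertion order on Python 3.7+
    return dict(ordered)

# Hypothetical sample of already-summed totals
sample = {"Egle Muti": 8, "Savino Bufo": 185, "Nicola Vione": 2, "Benoit Menye": 243}
ranked = sort_by_reviews(sample)
for name, total in ranked.items():
    print('Doctor : {0:<25}  {1:<5} reviews'.format(name, total))
```

The key function receives each (name, total) pair and returns the total, so the sort compares review counts and ignores the names.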
