How do I convert all percentages to decimals and write them to a CSV?

Posted 2024-10-03 02:40:21


The goal is to convert every value from a percentage to its decimal form. Here is the code:

import requests
from bs4 import BeautifulSoup
import lxml


FIU = open('C://Users//joey//Desktop//response.txt','r').read()
#soup = BeautifulSoup(FIU, "html.parser")


soup = BeautifulSoup(FIU, "lxml")

tables = soup.find_all('table')

for table in tables:
    rows = table.find_all("tr")
    for row in rows:
        cells = row.find_all("td")
        if len(cells) == 7:  # this filters out rows with 'Term', 'Instructor Name' etc.
            for cell in cells:
                print(cell.text + "\t", end="")  # \t is a Tab character, and end="" prevents a newline between cells
            print("")  # newline after each row



def p2f(x): return float(x.strip('%'))/100
percentage_list = []
for cell in cells:
    if '%' in cell.text:
        percentage_list.append(p2f(cell.text))

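At the bottom you can see the function I wrote to strip the percent sign and divide by 100 so that each number becomes a decimal. However, it has no effect on the output: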

Description of course objectives and assignments    0.0%    68.4%   10.5%   15.8%   5.3%    0.0%    
Communication of ideas and information  0.0%    52.6%   26.3%   10.5%   10.5%   0.0%    
Expression of expectations for performance in this class    0.0%    68.4%   15.8%   10.5%   0.0%    5.3%    
Availability to assist students in or out of class  0.0%    57.9%   31.6%   10.5%   0.0%    0.0%    
Respect and concern for students    0.0%    47.4%   42.1%   10.5%   0.0%    0.0%    
Stimulation of interest in course   0.0%    47.4%   26.3%   21.1%   0.0%    5.3%    
Facilitation of learning    0.0%    52.6%   26.3%   10.5%   10.5%   0.0%    
Overall assessment of instructor    0.0%    52.6%   31.6%   10.5%   0.0%    5.3%

What code can I use to fix this?


2 Answers

Call the p2f function inside the loop, at the point where each cell is printed:

def p2f(x): 
    return float(x.strip('%'))/100    
for table in tables:
    rows = table.find_all("tr")
    for row in rows:
        cells = row.find_all("td")
        if len(cells) == 7:
            for cell in cells:
                if '%' in cell.text:
                    print(str(p2f(cell.text)) + "\t", end="")
                else:
                    print(cell.text + "\t", end="")
            print("")  # newline after each row
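The loop above only prints the converted values. To get them into a CSV file, as the question title asks, one option is to pass each row's cell texts through `csv.writer`. The following is a minimal sketch, assuming the cell texts have already been pulled out with something like `[cell.text for cell in cells]`. The helper names `convert_row` and `write_rows` are made up for illustration, and an in-memory buffer stands in for a real file.

```python
import csv
import io


def p2f(x):
    """Convert a percentage string such as '68.4%' to a float fraction."""
    return float(x.strip().rstrip('%')) / 100


def convert_row(texts):
    """Convert percentage cells to decimals; leave other cells (the label) as text."""
    return [p2f(t) if '%' in t else t.strip() for t in texts]


def write_rows(rows, csv_file):
    """Write each converted row as one CSV line, one value per column."""
    writer = csv.writer(csv_file)
    for texts in rows:
        writer.writerow(convert_row(texts))


# Sample row copied from the output shown in the question.
sample_rows = [
    ["Facilitation of learning", "0.0%", "52.6%", "26.3%", "10.5%", "10.5%", "0.0%"],
]

buffer = io.StringIO()  # stands in for open('output.csv', 'w', newline='')
write_rows(sample_rows, buffer)
print(buffer.getvalue())
```

When writing to a real file, opening it with `newline=''` is what the `csv` module documentation recommends, so that no extra blank lines appear on Windows.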

I found a way to put the data into a dictionary, but the result is not as clean as I had hoped:

import csv

one_table = {}
for row in rows:
    cells = row.find_all("td")
    name = cells[0].text
    one_table[name] = []  # creates an (empty) entry for every row, including non-percentage rows
    if all('%' in cell.text for cell in cells[1:]):
        one_table[name] = []  # Create dictionary entry if this is a percentage row
    else:
        continue  # Otherwise, move on to the next row
    for cell in cells[1:]:
        one_table[name].append(p2f(cell.text))


with open('dict.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    for key, value in one_table.items():
        writer.writerow([key, value])

print(one_table)

The result is:

{'Term: 1171 - Spring 2017': [], 'Instructor Name: Austin, Lathan Craig': [], 'Course: TRA   4721  ': [], 'Enrolled: 27': [], '\xa0': [], 'Question': [], 'Description of course objectives and assignments': [0.0, 0.684, 0.105, 0.158, 0.053, 0.0], 'Communication of ideas and information': [0.0, 0.526, 0.263, 0.105, 0.105, 0.0], 'Expression of expectations for performance in this class': [0.0, 0.684, 0.158, 0.105, 0.0, 0.053], 'Availability to assist students in or out of class': [0.0, 0.579, 0.316, 0.105, 0.0, 0.0], 'Respect and concern for students': [0.0, 0.474, 0.42100000000000004, 0.105, 0.0, 0.0], 'Stimulation of interest in course': [0.0, 0.474, 0.263, 0.21100000000000002, 0.0, 0.053], 'Facilitation of learning': [0.0, 0.526, 0.263, 0.105, 0.105, 0.0], 'Overall assessment of instructor': [0.0, 0.526, 0.316, 0.105, 0.0, 0.053]}
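Values such as 0.42100000000000004 in this result are ordinary binary floating-point artifacts of dividing by 100. If cleaner values are wanted in the CSV, one option is to round after dividing, and another is to use `decimal.Decimal`. A small sketch follows; the names `p2f_rounded` and `p2d` and the default of 4 digits are assumptions chosen for illustration.

```python
from decimal import Decimal


def p2f_rounded(x, digits=4):
    """Convert '42.1%' to a float fraction, rounded to hide float noise."""
    return round(float(x.strip().rstrip('%')) / 100, digits)


def p2d(x):
    """Convert '42.1%' to an exact decimal fraction."""
    return Decimal(x.strip().rstrip('%')) / 100


print(p2f_rounded('42.1%'))
print(p2d('42.1%'))
```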

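So it does write a CSV file, but I lost 99% of my data in the process of converting it to a dictionary. There is only one table now, where before I had several hundred. The CSV output is not ideal either: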

[Image: Not ideal]

So if I can find a way to include all of my data and have it separated by commas the way it should be, that may be enough.
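Two things likely explain these symptoms. If `rows` is left over from the earlier loop, it holds only the rows of the last table, so every other table is skipped. And `writer.writerow([key, value])` writes the whole list into a single cell. A dictionary keyed by question text would also overwrite repeated questions from different tables. The sketch below walks every table and writes one value per column. It assumes the tables have first been reduced to nested lists of cell texts, for example with `[[[c.text for c in r.find_all("td")] for r in t.find_all("tr")] for t in soup.find_all("table")]`. The function name `percentage_records` and the sample data are hypothetical.

```python
import csv
import io


def p2f(x):
    """Convert a percentage string such as '68.4%' to a float fraction."""
    return float(x.strip().rstrip('%')) / 100


def percentage_records(tables):
    """Yield [table_index, label, value, ...] for every percentage row in every table."""
    for index, rows in enumerate(tables):
        for texts in rows:
            if not texts:  # e.g. a header row with no <td> cells
                continue
            label, values = texts[0], texts[1:]
            # Keep only rows whose non-label cells are all percentages.
            if values and all('%' in v for v in values):
                yield [index, label.strip(), *(p2f(v) for v in values)]


# Hypothetical nested lists standing in for the parsed tables.
tables = [
    [["Question", "A", "B"], ["Facilitation of learning", "52.6%", "26.3%"]],
    [["Term: 1171 - Spring 2017"], ["Facilitation of learning", "10.5%", "0.0%"]],
]

buffer = io.StringIO()  # stands in for open('dict.csv', 'w', newline='')
writer = csv.writer(buffer)
writer.writerows(percentage_records(tables))
print(buffer.getvalue())
```

Keeping the table index in the first column is one way to tell apart identical question labels from different tables; a course or term identifier taken from the header rows would serve the same purpose.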
