Read an Excel file and turn the bold rows into columns, with each row's values placed under its bold row?

|-----------|----------------------|------------------|------------------|-------------------------|-------------------------|
| Article   | Transmitter          | Receiver         | Logistic data:   | BSCI / SA8000           | Features                |
|-----------|----------------------|------------------|------------------|-------------------------|-------------------------|
| headphone | 2.4GHZ               | diameter         | 20'              |                         |                         |
|           | 5GHZ                 | impedance        | 40'              |                         |                         |
|           | Power LED            | Mic Jack         |                  |                         |                         |
|           | - 1 x 3.5mm mic jack |                  |                  |                         |                         |
|-----------|----------------------|------------------|------------------|-------------------------|-------------------------|
| keyboard  |                      |                  | Qty              | BSCI / SA8000 certified | Display (LCD, or LED)   |
|           |                      |                  | Carton           | Certificate validity    | Sync                    |
|           |                      |                  | 20'              |

2 answers

User

#1 · edited 2024-09-30 12:16:58

This works only for .xlsx files.

from openpyxl import load_workbook

path = "test.xlsx"
book = load_workbook(path)
sheet = book.worksheets[0]  # first sheet of test.xlsx

for row in range(1, 201):  # check the first 200 rows
    cell = sheet.cell(row=row, column=1)  # column 1 = A, so A1, A2, A3, ...
    if cell.value is not None and cell.font.b:  # skip empty cells, keep bold ones
        print(cell.coordinate, cell.value)  # placeholder: do something with the bold cell
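Detecting bold cells is only half of what the question asks for. One way to get the requested layout is to treat each bold cell as a column name and collect the non-empty cells beneath it until the next bold cell. The sketch below assumes everything of interest sits in column A; `group_under_bold` and `read_column_a` are hypothetical helper names, not part of openpyxl, and the sample data is made up.

```python
from typing import Any, Dict, Iterable, List, Tuple


def group_under_bold(cells: Iterable[Tuple[Any, bool]]) -> Dict[str, List[Any]]:
    """Map each bold value to the list of non-bold values that follow it."""
    grouped: Dict[str, List[Any]] = {}
    current = None  # header being filled; None until the first bold cell is seen
    for value, is_bold in cells:
        if value is None:
            continue  # skip empty cells
        if is_bold:
            current = str(value)
            grouped.setdefault(current, [])
        elif current is not None:
            grouped[current].append(value)
    return grouped


def read_column_a(path: str) -> List[Tuple[Any, bool]]:
    """Return (value, is_bold) pairs for column A of the first sheet (.xlsx only)."""
    from openpyxl import load_workbook  # lazy import: group_under_bold works without it

    sheet = load_workbook(path).worksheets[0]
    return [(row[0].value, bool(row[0].font.bold))
            for row in sheet.iter_rows(min_col=1, max_col=1)]


# Hypothetical sample standing in for read_column_a("test.xlsx")
sample = [("Transmitter", True), ("2.4GHZ", False), ("5GHZ", False),
          (None, False), ("Receiver", True), ("diameter", False)]
print(group_under_bold(sample))
# {'Transmitter': ['2.4GHZ', '5GHZ'], 'Receiver': ['diameter']}
```

Values that appear before the first bold cell are dropped, since there is no header to file them under.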

User

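#2 · edited 2024-09-30 12:16:58

To detect styles, you can use an external package such as styleframe. Repeat steps 1 and 2 for each example file.

Step 1: Read the example file and collect the indices of the rows whose style is bold.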

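from styleframe import StyleFrame

# read_style=True keeps each cell's style; header=None treats every row as data
# (the original passed headers=None, which is not a pandas read_excel keyword)
sf = StyleFrame.read_excel('Example-1.xlsx', read_style=True, use_openpyxl_styles=False, header=None)

indices = []  # positions of rows that contain at least one bold cell
for i in range(len(sf)):
    for val in sf.iloc[i]:
        if val.style.bold:
            indices.append(i)
            break  # record each row only once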

Step 2: Find the values between those indices.

import pandas as pd

df = pd.read_excel("Example-1.xlsx", header=None)
df = df.astype(str)  # cast to str so the values can be joined below

columns = []  # first-column value of each bold row
values = []   # rows found between one bold row and the next
for i, start in enumerate(indices):
    columns.append(df.iloc[start].values[0])
    # the values run up to the next bold row, or to the end of the sheet for the last one
    end = indices[i + 1] if i + 1 < len(indices) else len(df)
    values.append(list(df.iloc[start + 1:end].values))

# join the first-column values of each group into a single string
values = [" ".join(row[0] for row in group) for group in values]
temp_dict = dict(zip(columns, values))
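The slicing in step 2 boils down to computing, for every bold row, where its values start and where they stop. A minimal sketch of just that arithmetic, on plain integers so it can be checked without an Excel file (`spans_between` is a hypothetical helper name, and the row layouts are made up):

```python
from typing import List, Tuple


def spans_between(indices: List[int], n_rows: int) -> List[Tuple[int, int, int]]:
    """For each bold row index, return (header_row, first_value_row, end_row_exclusive)."""
    spans = []
    for pos, header_row in enumerate(indices):
        # Values run until the next bold row, or to the end of the sheet for the last one
        end = indices[pos + 1] if pos + 1 < len(indices) else n_rows
        spans.append((header_row, header_row + 1, end))
    return spans


# Hypothetical layout: 6 rows, bold rows at positions 0 and 3
print(spans_between([0, 3], 6))  # [(0, 1, 3), (3, 4, 6)]
# A bold row at the very end gets an empty span (start == end)
print(spans_between([0, 5], 6))  # [(0, 1, 5), (5, 6, 6)]
```

An empty span simply yields no values for that header, which matches the case where a bold row is the last row of the sheet.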

The following code builds the final DataFrame in the required shape:

records = [temp_dict]  # one dict per example file; append more when processing several files
final_df = pd.DataFrame(records)
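Because the steps are repeated for each example file, the list handed to the DataFrame ends up holding one dict per file, and different files may have different bold headers. The pure-Python sketch below shows the row and column alignment that results, without needing pandas; `to_table` is a hypothetical helper and the two dicts are made-up sample data. pandas does the equivalent alignment itself, filling missing entries with NaN rather than an empty string.

```python
from typing import Dict, List, Tuple


def to_table(records: List[Dict[str, str]], fill: str = "") -> Tuple[List[str], List[List[str]]]:
    """Align a list of dicts into (columns, rows); keys keep first-seen order."""
    columns: List[str] = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    rows = [[record.get(col, fill) for col in columns] for record in records]
    return columns, rows


# Made-up results for two files, shaped like the dict produced in step 2
file_1 = {"Transmitter": "2.4GHZ 5GHZ", "Receiver": "diameter impedance"}
file_2 = {"Receiver": "Mic Jack", "Features": "Sync"}

columns, rows = to_table([file_1, file_2])
print(columns)  # ['Transmitter', 'Receiver', 'Features']
print(rows)     # [['2.4GHZ 5GHZ', 'diameter impedance', ''], ['', 'Mic Jack', 'Sync']]
```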

The example file must contain an extra header row to reduce ambiguity.
