Correctly formatting data from a text file with Python

Posted 2024-06-26 01:57:13


I have the following data in my .txt file:

productname1
7,64
productname2
6,56
4.73
productname3
productname4
12.58
10.33

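Here is how the data is laid out. A product name comes first, with its price on the next line. Some products, such as the second one, have both an original price and a discounted price, and some, such as the third one, have no price at all. In addition, the prices sometimes use "." and sometimes "," as the decimal separator for cents. I want to format the data like this: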

Product         o_price   d_price
productname1    7.64      -
productname2    6.56      4.73
productname3    -         -
productname4    12.58     10.33
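For reference, the layout above could be produced with plain string formatting once the data has been parsed. This is only a minimal sketch: the `rows` list and the `format_table` helper are hypothetical names, and it assumes a missing price is stored as `None`.

```python
# Hypothetical, already-parsed rows: (product, original price, discounted price).
# None marks a missing price.
rows = [
    ("productname1", "7.64", None),
    ("productname2", "6.56", "4.73"),
    ("productname3", None, None),
    ("productname4", "12.58", "10.33"),
]


def format_table(rows):
    """Return the rows as left-aligned, fixed-width text columns."""
    header = ("Product", "o_price", "d_price")
    # Replace missing prices with "-" so every cell is a string
    table = [header] + [
        tuple(cell if cell is not None else "-" for cell in row) for row in rows
    ]
    # Width of each column is the length of its longest cell
    widths = [max(len(row[i]) for row in table) for i in range(3)]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    )


print(format_table(rows))
```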

My current approach is a bit naive, but it works in 98% of cases:

import pandas as pd

data = {}
tempKey = []
with open("myfile.txt", encoding="utf-8") as file:
    arr_content = file.readlines()
    for val in arr_content:
        val = ' '.join(val.split())  # Remove extra spaces and the trailing newline
        if not val:
            continue  # Skip blank lines
        if not val[0].isdigit():  # Check whether the first character is a digit or text
            data.update({val: []})  # Add the key to the dict and initialize it with a list that will hold the prices
            tempKey.append(val)  # Keep track of the last key added
        else:
            data[tempKey[-1]].append(val.replace(',', '.'))  # Append the price to the last added key, with "," normalized to "."

df = pd.DataFrame(list(data.items()), columns=['Product', 'Pricelist'])
df[['o_price', 'd_price']] = pd.DataFrame([x for x in df.Pricelist])
df = df.drop('Pricelist', axis=1)

The problem is that this technique fails when a product name starts with a digit. Any suggestions for a better approach?
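To make the failure concrete, here is a small sketch of the first-character test applied to a few lines. The name "3M tape" is an invented example, not taken from the real file, and `looks_like_price_naive` is a hypothetical helper that mirrors the `val[0].isdigit()` check in the code above.

```python
# Hypothetical sample lines; "3M tape" is an invented product name.
lines = ["productname1", "7,64", "3M tape", "4.73"]


def looks_like_price_naive(line):
    """Mirror the check used above: a line is a price if it starts with a digit."""
    return line[0].isdigit()


for line in lines:
    kind = "price" if looks_like_price_naive(line) else "name"
    # "3M tape" is reported as a price, which is the bug described above
    print(f"{line!r} -> {kind}")
```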


1 answer

User

#1 · Posted 2024-06-26 01:57:13

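Use a regular expression to check whether the line contains only digits, periods, and/or commas (the sample data uses both "." and "," as decimal separators):

import re

# val is the current line from the loop in the question
if re.match(r"^[0-9.,]+$", val.strip()):
    pass  # This is a product price
else:
    pass  # This is a product name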
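Putting the regex check together with the grouping logic from the question might look roughly like the sketch below. It is one possible reading, not a definitive implementation: `PRICE_RE` and `parse_products` are hypothetical names, the pattern is slightly stricter than the one above (digits with at most one "." or ","), and the line "3M tape" is an invented test case.

```python
import re

# A price line: digits, optionally followed by "." or "," and more digits.
PRICE_RE = re.compile(r"^\d+(?:[.,]\d+)?$")


def parse_products(lines):
    """Group price lines under the most recent product name.

    Returns a list of dicts with keys Product, o_price and d_price;
    a missing price is represented by "-".
    """
    products = []  # (name, [prices]) pairs in file order
    for raw in lines:
        line = " ".join(raw.split())  # collapse whitespace, drop the newline
        if not line:
            continue  # skip blank lines
        if PRICE_RE.match(line) and products:
            # Normalize the decimal comma so "7,64" becomes "7.64"
            products[-1][1].append(line.replace(",", "."))
        else:
            # Anything else (or a price-like line before any name) starts a new product
            products.append((line, []))
    rows = []
    for name, prices in products:
        padded = (prices + ["-", "-"])[:2]  # keep at most two prices, pad with "-"
        rows.append({"Product": name, "o_price": padded[0], "d_price": padded[1]})
    return rows


sample = """productname1
7,64
productname2
6,56
4.73
productname3
productname4
12.58
10.33
3M tape
1.99
""".splitlines()

for row in parse_products(sample):
    print(row)
```

If a DataFrame is wanted, the returned list of dicts can be passed to `pd.DataFrame(rows)`. One limitation remains: a product whose name consists only of digits (for example "2024") would still be read as a price, so that case would need extra information, such as the line's position, to resolve.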
