Formatting a multi-line string with Python

Published 2024-10-01 07:40:40


I need to format the multi-line string shown below with Python. I have tried many approaches, but none of them gave a good result.

AMAZON
IPHONE: 700
SAMSUNG: 600

=============

WALMART
IPHONE: 699

===========

ALIBABA
SONY: 500

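The data above represents online stores and the prices of the phone brands they sell. I need to add these to a database, so the result should look like this: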

-------------------
AMAZON | IPHONE | 700
-------------------
AMAZON | SAMSUNG | 600
-------------------
WALMART | IPHONE | 699
-------------------
ALIBABA | SONY | 500
-------------------

I need to format the text above and store it in a database table.

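What have I tried? I tried splitting the string into lines and building a JSON-like dictionary, but that did not work out: it only handled a single line. If there is a simpler way, please share it with me. Please help me with this.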


2 answers

The answer submitted by @scito is sufficient, but I am adding this one just in case. You can use a regex; here is a working example:

strng = """
AMAZON
IPHONE: 700
SAMSUNG: 600

=============

WALMART
IPHONE: 699

===========

ALIBABA
SONY: 500

======
"""

import re

multistrng = strng.split("\n")  # split the text into individual lines

market_re = re.compile(r"([a-zA-Z]+)")  # matches a marketplace name
phone_re = re.compile(r"([a-zA-Z]+):\s(\d+)")  # matches a phone and its price

js = []  # one sub-list per marketplace: [name, (phone, price), ...]

for line in multistrng:
    phone = phone_re.findall(line)
    if phone:  # the line holds a phone and its price
        # attach it to the most recently found marketplace
        # (assumes a marketplace name always appears before its phones)
        js[-1].append(phone[0])
        continue
    market = market_re.findall(line)
    if market:  # the line holds a marketplace name
        js.append([market[0]])
    # blank lines and "====" separators match neither pattern and are skipped

# the data is now structured; you can print it or add it to the database

for market in js:
    for product in market[1:]:
        print("          -")
        print("{} | {} | {}".format(market[0], product[0], product[1]))

print("          -")

Output:

          -
AMAZON | IPHONE | 700
          -
AMAZON | SAMSUNG | 600
          -
WALMART | IPHONE | 699
          -
ALIBABA | SONY | 500
          -

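The data is stored in the js list. When you iterate over js, the first element of each sub-list is the marketplace, and the remaining elements are that marketplace's products: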

[['AMAZON', ('IPHONE', '700'), ('SAMSUNG', '600')], ['WALMART', ('IPHONE', '699')], ['ALIBABA', ('SONY', '500')]]
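
Since the question asks for the data to end up in a database table, the js structure can be flattened into one row per product and inserted. Below is a minimal sketch using Python's built-in sqlite3 module. The table name prices, its column names, and the in-memory database are assumptions made for illustration; the question does not say which database or schema is used.

```python
import sqlite3

# Parsed data in the same shape as the js list shown above
js = [
    ["AMAZON", ("IPHONE", "700"), ("SAMSUNG", "600")],
    ["WALMART", ("IPHONE", "699")],
    ["ALIBABA", ("SONY", "500")],
]

# Flatten to one (store, brand, price) row per product
rows = [
    (market[0], brand, int(price))
    for market in js
    for brand, price in market[1:]
]

# Hypothetical schema: the table and column names are assumptions
conn = sqlite3.connect(":memory:")  # pass a file path for a persistent database
conn.execute("CREATE TABLE prices (store TEXT, brand TEXT, price INTEGER)")
conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
conn.commit()

# Read the rows back to confirm what was stored
stored = conn.execute("SELECT store, brand, price FROM prices").fetchall()
for store, brand, price in stored:
    print("{} | {} | {}".format(store, brand, price))
```

The `?` placeholders let sqlite3 handle quoting, which is safer than building the INSERT statement with string formatting.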

I made a few assumptions:

  • The vendor name always comes before its products
  • At least === serves as the separator between vendor entries
  • Blank lines can be ignored

Working code:

text = """
AMAZON
IPHONE: 700
SAMSUNG: 600

=============

WALMART
IPHONE: 699

===========

ALIBABA
SONY: 500
"""

new_entry = True  # the next non-blank line is a vendor name
print("         -")
for line in text.split("\n"):
    if not line.strip():
        continue  # ignore blank lines
    elif "===" in line:
        new_entry = True  # separator: a new vendor entry follows
    elif new_entry:
        # assuming the first line of each entry is always the vendor name
        vendor = line.strip()
        new_entry = False
    else:
        product = line.split(":")
        print("{} | {} | {}".format(vendor, product[0].strip(), product[1].strip()))
        print("         -")

The output is:

         -
AMAZON | IPHONE | 700
         -
AMAZON | SAMSUNG | 600
         -
WALMART | IPHONE | 699
         -
ALIBABA | SONY | 500
         -

Alternative approach: the vendor name can also be detected as a text line that contains no colon.
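
A minimal sketch of that alternative, assuming separator lines consist only of "=" characters and every product line has the form NAME: PRICE. The function name parse_prices is hypothetical and chosen only for this example.

```python
def parse_prices(text):
    """Return (vendor, product, price) tuples parsed from the text."""
    rows = []
    vendor = None  # stays None until the first vendor line is seen
    for line in text.splitlines():
        line = line.strip()
        if not line or set(line) == {"="}:
            continue  # skip blank lines and "====" separators
        if ":" in line:
            # a line with a colon is a product belonging to the current vendor
            product, price = line.split(":", 1)
            rows.append((vendor, product.strip(), price.strip()))
        else:
            vendor = line  # a text line without a colon names the vendor
    return rows


sample = """
AMAZON
IPHONE: 700
SAMSUNG: 600

=============

WALMART
IPHONE: 699

===========

ALIBABA
SONY: 500
"""

for row in parse_prices(sample):
    print(" | ".join(row))
```

With this approach no new_entry flag is needed, because the presence of a colon alone decides whether a line is a vendor or a product.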
