Python: using rsplit on each line to convert a string into a table

Posted 2024-10-03 11:25:32


I'm new to Python and need some help with a string that looks like this:

string='Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75\n'

I need to convert it into a table that looks more like this:

Category   Dish                Price
Starters   Salad with Greens   14.00
Starters   Salad Goat Cheese   12.75
Mains      Pizza               12.75
Mains      Pasta               12.75

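What is the best way to achieve this?

I wanted to apply string.rsplit(" ", 2), but couldn't figure out how to make it run line by line. I also don't know how to repeat the heading in a separate column. Any help would be greatly appreciated.

Thanks in advance!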


3 Answers

You can use a class-based solution in Python 3 and use operator overloading to get additional ways of accessing the data:

import re
import itertools
class MealPlan:
    def __init__(self, string, headers):
        self.headers = headers
        # Alternating groups: [category line], [item lines], [category line], ...
        self.grouped_data = [d for c, d in [(a, list(b)) for a, b in itertools.groupby(string.split('\n'), key=lambda x:x in ['Starters', 'Mains'])]]
        # Pair each category name with the list of item lines that follows it
        self.final_grouped_data = list(map(lambda x:[x[0][0], x[-1]], [self.grouped_data[i:i+2] for i in range(0, len(self.grouped_data), 2)]))
        # Split each item line before the price: [category, dish, price]
        self.final_data = [[[a, *list(filter(None, re.split(r'\s(?=\d)', i)))] for i in b] for a, b in self.final_grouped_data]
        # Drop rows produced by empty lines (they contain only the category)
        self.final_data = [list(filter(lambda x:len(x) > 1, i)) for i in self.final_data]
    def __getattr__(self, column):
        # Only called for names that are not regular attributes, e.g. meal.Category
        if column not in self.headers:
            raise AttributeError("'{}' not found".format(column))
        transposed = [dict(zip(self.headers, i)) for i in itertools.chain.from_iterable(self.final_data)]
        return [x[column] for x in transposed]
    def __getitem__(self, row):
        # meal['Starters'] -> {'Dish': (...), 'Price': (...)}
        new_grouped_data = {a:dict(zip(self.headers[1:], zip(*[i[1:] for i in list(b)]))) for a, b in itertools.groupby(list(itertools.chain(*self.final_data)), key=lambda x:x[0])}
        return new_grouped_data[row]
    def __repr__(self):
        return ' '.join(self.headers)+'\n'+'\n'.join('\n'.join(' '.join(c) for c in i) for i in self.final_data)

string='Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75\n' 
meal = MealPlan(string, ['Category', 'Dish', 'Price'])
print(meal)
print([i for i in meal.Category])
print(meal['Starters'])

Output:

Category Dish Price
Starters Salad with Greens 14.00
Starters Salad Goat Cheese 12.75
Mains Pizza 12.75
Mains Pasta 12.75
['Starters', 'Starters', 'Mains', 'Mains']
{'Dish': ('Salad with Greens', 'Salad Goat Cheese'), 'Price': ('14.00', '12.75')}

Not that I would use this in production, but for the sake of the academic challenge:

import re

string = """Starters
Salad with Greens 14.00
Salad Goat Cheese 12.75
Mains
Pizza 12.75
Pasta 12.75"""

rx = re.compile(r'^(Starters|Mains)', re.MULTILINE)

result = "\n".join(["{}\t{}".format(category, line)
                for parts in [[part.strip() for part in rx.split(string) if part]]
                for category, dish in zip(parts[0::2], parts[1::2])
                for line in dish.split("\n")])
print(result)

This produces:

Starters    Salad with Greens 14.00
Starters    Salad Goat Cheese 12.75
Mains   Pizza 12.75
Mains   Pasta 12.75
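
For comparison, here is a minimal sketch of the same idea as a plain loop that uses rsplit, as the question suggested, so that the price ends up in its own column. The function name parse_menu and its categories parameter are made up for illustration, and the sketch assumes the category names are known in advance and that every item line ends with a space followed by the price:

```python
def parse_menu(text, categories=("Starters", "Mains")):
    """Return (category, dish, price) tuples; `categories` is assumed to be known."""
    rows = []
    current = ""  # category seen most recently
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. from a trailing newline
        if line in categories:
            current = line
            continue
        # Split once from the right: everything before the last space is the dish
        dish, price = line.rsplit(" ", 1)
        rows.append((current, dish, price))
    return rows


string = 'Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75\n'
rows = parse_menu(string)
for row in rows:
    print("\t".join(row))
```

Note the maxsplit of 1 rather than 2: with rsplit(" ", 2) the last word of the dish name would be split off as well.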

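I think you have to decide how to tell categories and items apart. I assume that an item always has a price. This code checks whether the line contains a dot, but you should probably use a regexp.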

s = 'Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75'
items = s.split('\n')
# ['Starters', 'Salad with Greens 14.00', 'Salad Goat Cheese 12.75', 'Mains', 'Pizza 12.75', 'Pasta 12.75']

category = ''
menu = {}
for item in items:
    print(item)
    if '.' in item:
        menu[category].append(item)
    else:
        category = item
        menu[category] = []
print(menu)

# {'Starters': ['Salad with Greens 14.00', 'Salad Goat Cheese 12.75'], 'Mains': ['Pizza 12.75', 'Pasta 12.75']}
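
To get from this dictionary to the table in the question, something along these lines might work. It is only a sketch: the helper name menu_to_table is made up, it assumes every item string ends with a space followed by the price, and it relies on dicts keeping insertion order (Python 3.7+):

```python
def menu_to_table(menu, headers=("Category", "Dish", "Price")):
    """Render a {category: ["dish price", ...]} dict as aligned text columns."""
    rows = [tuple(headers)]
    for category, items in menu.items():
        for item in items:
            # The price is whatever follows the last space
            dish, price = item.rsplit(" ", 1)
            rows.append((category, dish, price))
    # Width of each column = length of its longest cell
    widths = [max(len(row[i]) for row in rows) for i in range(len(headers))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    )


menu = {'Starters': ['Salad with Greens 14.00', 'Salad Goat Cheese 12.75'],
        'Mains': ['Pizza 12.75', 'Pasta 12.75']}
table = menu_to_table(menu)
print(table)
```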

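UPD: you can replace

if '.' in item:

with

if re.match(r".*\d\.\d\d$", item):

This looks for strings that end in something like 1.11 (useful if a category name contains an abbreviation). Remember to import re first.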
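
As a rough illustration of why a price check is more robust than looking for a dot, here is a small self-contained sketch. The category name "St. Starters" is invented to show the abbreviation case, and the pattern assumes prices always have exactly two decimals at the very end of the line:

```python
import re

# A line counts as an item if it ends in digits, a literal dot, and two digits
PRICE_AT_END = re.compile(r"\d+\.\d\d$")


def is_item(line):
    return PRICE_AT_END.search(line.strip()) is not None


# "St. Starters" is a made-up category containing a dot
lines = ["St. Starters", "Salad with Greens 14.00", "Mains", "Pizza 12.75"]

menu = {}
category = ""
for line in lines:
    if is_item(line):
        # setdefault also covers items that appear before any category
        menu.setdefault(category, []).append(line)
    else:
        category = line
        menu[category] = []
print(menu)
```

With the plain '.' check, "St. Starters" would have been treated as an item and menu[category] would raise a KeyError, because no category had been seen yet.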
