Identifying a pattern in strings and updating a DataFrame

|chocolate#|chocolateName|
|---|---|
|chocolate1|a|
|chocolate1|b|
|chocolate1|c|
|chocolate1|d|
|icecream|e|
|icecream|f|
|icecream|g|
|icecream|h|
|icecream|i|
|icecream|j|
|cookie|k|
|cookie|l|
|cookie|m|
|cookie|n|

new_text = []
for line in text.splitlines():
    if len(line.split()) == 0 or len(line.split()) == 1:
        continue
    else:
        new_text.append(line)

for i in new_text[13:]:
    if ';' not in i:
        title_index = new_text.index(i)
        print(title_index)
        break

1 answer

User

#1 · Posted 2024-10-03 04:38:14

Try this:

import pandas as pd

# Create a pandas dataframe from list
text =  ['chocolate1','a;b;','c;d','icecream','e;f;','g;h', 'i;j', 'cookie', 'k;l', 'm;n']
s = pd.Series(text)
df = s.to_frame(name='letters')

# Create new column food where strings do not have ;
df['food'] = df.loc[~df['letters'].str.contains(';'), 'letters']
df['food'] = df['food'].ffill()

# Remove rows whose letters value does not contain ';' (the header rows)
df = df[df['letters'].str.contains(';')].copy()

# Explode letters into rows of dataframe
df['letters'] = df['letters'].str.split(';')
df_out = df.explode('letters')

# Eliminate rows with blank for letters
df_out = df_out[df_out['letters'] != '']

print(df_out)

Output:

  letters        food
1       a  chocolate1
1       b  chocolate1
2       c  chocolate1
2       d  chocolate1
4       e    icecream
4       f    icecream
5       g    icecream
5       h    icecream
6       i    icecream
6       j    icecream
8       k      cookie
8       l      cookie
9       m      cookie
9       n      cookie
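
The same grouping idea can also be sketched in plain Python, which may help when checking what the pandas steps above are doing. This is only an illustrative sketch, not part of the original answer: the function name `pair_items_with_headers` and its `sep` parameter are hypothetical, and it assumes that any line without a `;` is a header and every other line holds items for the most recent header.

```python
def pair_items_with_headers(lines, sep=";"):
    """Return (header, item) pairs; a line without `sep` starts a new group."""
    rows = []
    current = None  # most recent header line seen so far
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if sep not in line:
            current = line  # header line, e.g. 'chocolate1'
            continue
        # Item line: split on the separator and drop empty pieces ('a;b;' -> a, b)
        for item in line.split(sep):
            item = item.strip()
            if item:
                rows.append((current, item))
    return rows


text = ['chocolate1', 'a;b;', 'c;d', 'icecream', 'e;f;', 'g;h', 'i;j', 'cookie', 'k;l', 'm;n']
rows = pair_items_with_headers(text)
print(rows[:4])
```

If a DataFrame is needed afterwards, the resulting list of tuples can be passed to `pd.DataFrame(rows, columns=['food', 'letters'])`.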
