Need help dynamically adding to a DataFrame using a variable number of columns

import pandas as pd

# metricsByToken2 (token lists plus value columns) and stop_words are defined elsewhere
TokensTable = pd.DataFrame({'Token': [], 'Value1': [], 'Value2': [], 'Value3': []})
counter = 0
for index, row in metricsByToken2.iterrows():  # each row holds a token list and its values
    for token in row.iloc[0]:  # each token in the row's token list
        if not token.isalpha():  # skip tokens containing punctuation or digits
            continue
        if token.lower() in stop_words:  # skip stop words (compared in lowercase)
            continue
        # append the token together with the row's values to the new df
        TokensTable.loc[counter] = [token, row.iloc[1], row.iloc[2], row.iloc[3]]
        counter += 1  # move to the next row in the new df

╔══════════════════╦════════╦════════╦════════╗
║ Text             ║ Value1 ║ Value2 ║ Value3 ║
╠══════════════════╬════════╬════════╬════════╣
║ ['A','B','C']    ║ 1      ║ 3      ║ 7      ║
║ ['A1','B1','C1'] ║ 2      ║ 4      ║ 8      ║
║ ['A2','B2','C2'] ║ 3      ║ 5      ║ 9      ║
╚══════════════════╩════════╩════════╩════════╝

╔═══════╦════════╦════════╦════════╗
║ Token ║ Value1 ║ Value2 ║ Value3 ║
╠═══════╬════════╬════════╬════════╣
║ A     ║ 1      ║ 3      ║ 7      ║
║ B     ║ 1      ║ 3      ║ 7      ║
║ C     ║ 1      ║ 3      ║ 7      ║
║ A1    ║ 2      ║ 4      ║ 8      ║
║ B1    ║ 2      ║ 4      ║ 8      ║
║ C1    ║ 2      ║ 4      ║ 8      ║
║ A2    ║ 3      ║ 5      ║ 9      ║
║ B2    ║ 3      ║ 5      ║ 9      ║
║ C2    ║ 3      ║ 5      ║ 9      ║
╚═══════╩════════╩════════╩════════╝

1 Answer

User

#1 · Posted on 2024-09-28 23:16:11

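Thanks for the link, @HS星云; it led me to the answer I needed. In the end I used a loop to clean up the aggregated tokens, but to un-nest them I used the following: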

TokensTable = metricsByToken2.apply(lambda x: pd.Series(x['Token']),axis=1).stack().reset_index(level=1, drop=True)
TokensTable.name = 'Token'
TokensTable = metricsByToken2.drop('Token', axis=1).join(TokensTable)
TokensTable = TokensTable.reset_index(drop=True)
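
For comparison, here is a minimal sketch of the same un-nesting done with `DataFrame.explode` (available in pandas 0.25 and later), followed by a vectorized version of the cleaning step instead of a loop. The sample frame mirrors the question's input table but names the list column `Token`, as the answer's code does; the `stop_words` set and the `exploded` variable are assumptions made up for illustration, not part of the original post.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's input table.
metricsByToken2 = pd.DataFrame({
    "Token": [["A", "B", "C"], ["A1", "B1", "C1"], ["A2", "B2", "C2"]],
    "Value1": [1, 2, 3],
    "Value2": [3, 4, 5],
    "Value3": [7, 8, 9],
})
stop_words = {"a"}  # assumed stop-word set, for illustration only

# One row per token; the value columns are repeated for each token.
exploded = metricsByToken2.explode("Token").reset_index(drop=True)

# Keep purely alphabetic tokens whose lowercase form is not a stop word,
# matching the filter in the question's loop.
tokens = exploded["Token"].astype(str)
mask = tokens.str.isalpha() & ~tokens.str.lower().isin(stop_words)
TokensTable = exploded[mask].reset_index(drop=True)
```

Note that `isalpha()` also rejects tokens containing digits, so with this sample data tokens such as `A1` are filtered out; drop that part of the mask if such tokens should be kept.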
