Extracting a subarray from an array based on a condition in Python

Published 2024-06-25 23:01:21

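I want to extract some rows and columns (something like a subarray) based on a condition.

Here is an example of the input and the desired output: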

[["00:00:01","data_update","data1",10.5,"blabla"],
 ["00:00:02","proc_call","xxx","xxx","blalla"],
 ["00:00:15","data_update","data2",34.5,"blabla"],
 ["00:00:25","proc_call","xxx","xxx","blalla"]]

Desired output (keep only the "data_update" rows, with columns 0, 2 and 3):

[["00:00:01","data1",10.5],
 ["00:00:15","data2",34.5]]

Is there a simple way to do this in Python?


3 answers
result = filter(lambda x: "data_update" in x, a)
result = [[item[0],item[2],item[3]] for item in result]

The first line finds every row that contains "data_update". The second line rebuilds the result from the 3 required columns.
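A minimal self-contained sketch of the two lines above, assuming `a` holds the rows from the question (the answer itself does not define `a`). Note that `"data_update" in x` matches the string in any column of the row; if only column 1 should be checked, `x[1] == "data_update"` would be the stricter test.

```python
# `a` is assumed to be the list of rows from the question.
a = [
    ["00:00:01", "data_update", "data1", 10.5, "blabla"],
    ["00:00:02", "proc_call", "xxx", "xxx", "blalla"],
    ["00:00:15", "data_update", "data2", 34.5, "blabla"],
    ["00:00:25", "proc_call", "xxx", "xxx", "blalla"],
]

# Step 1: keep rows that contain "data_update" in any column.
# In Python 3, filter() returns a lazy iterator, not a list.
rows = filter(lambda x: "data_update" in x, a)

# Step 2: rebuild each kept row from columns 0, 2 and 3.
# The list comprehension consumes the iterator and produces a real list.
result = [[item[0], item[2], item[3]] for item in rows]

print(result)
```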

For the input:

l = [["00:00:01","data_update","data1",10.5,"blabla"],
 ["00:00:02","proc_call","xxx","xxx","blalla"],
 ["00:00:15","data_update","data2",34.5,"blabla"],
 ["00:00:25","proc_call","xxx","xxx","blalla"]]

cols = (0, 2, 3)

Do:

result = map(lambda sub: [sub[i] for i in cols], filter(lambda sub: "data_update" in sub, l))
print(list(result))

Output:

[['00:00:01', 'data1', 10.5], ['00:00:15', 'data2', 34.5]]

You can use a for loop, like this:

reduced_array = []

# Iterate over the rows themselves rather than over integer indices
for row in full_array:
  if row[1] == 'data_update':
    reduced_array.append([row[0], row[2], row[3]])

Or with a list comprehension:

reduced_array = [[i[0],i[2],i[3]] for i in full_array if i[1] == 'data_update']

If you need to handle more columns, you can also use:

cols = [0,2,3]
reduced_array = [[i[col] for col in cols] for i in full_array if i[1] == 'data_update']
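As an alternative sketch that is not part of the answer, the standard library's `operator.itemgetter` can pick the columns in one call. This assumes `cols` has at least two entries, because `itemgetter` with a single index returns the bare item rather than a tuple. The `full_array` data below is copied from the question for illustration.

```python
from operator import itemgetter

# Sample rows copied from the question.
full_array = [
    ["00:00:01", "data_update", "data1", 10.5, "blabla"],
    ["00:00:02", "proc_call", "xxx", "xxx", "blalla"],
    ["00:00:15", "data_update", "data2", 34.5, "blabla"],
    ["00:00:25", "proc_call", "xxx", "xxx", "blalla"],
]
cols = [0, 2, 3]

# itemgetter(0, 2, 3)(row) returns the tuple (row[0], row[2], row[3]).
pick = itemgetter(*cols)

# Convert each tuple back to a list to match the desired output format.
reduced_array = [list(pick(row)) for row in full_array if row[1] == "data_update"]

print(reduced_array)
```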

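Regarding adnanmuttaleb's answer: in the timing below, the lambda-based version appears much faster than the list comprehension I proposed. Be aware, though, that in Python 3 `map` and `filter` are lazy, so that timing covers only the creation of the iterator and not the filtering itself. The lambda version is also harder to read for anyone unfamiliar with the concept. For completeness, and without wanting to take credit for his answer, I include it here: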

reduced_array = map(lambda sub: [sub[i] for i in cols], filter(lambda sub: "data_update" in sub, full_array))

Runtime comparison:

import random as rd
import time

full_array = [[rd.random(),"data_update" if rd.random()< 0.2 else "no",rd.random(),rd.random()] for i in range(1000000)]
cols = [0,2,3]

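start1 = time.time()
# Note: in Python 3, map() and filter() are lazy, so no rows are processed here;
# this times only the creation of the map object
reduced_array = map(lambda sub: [sub[i] for i in cols], filter(lambda sub: "data_update" in sub, full_array))
print(time.time()-start1)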

start2 = time.time()
reduced_array2 = [[i[col] for col in cols] for i in full_array if i[1] == 'data_update']
print(time.time()-start2)

This gives:

#Lambda function:
0.004003286361694336
#List comprehension
0.254199743270874
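A like-for-like comparison needs both versions to do the full work. The sketch below is one way to set that up, not the answer's original benchmark: it wraps the `map`/`filter` chain in `list(...)` so it is actually evaluated, and uses `timeit` instead of `time.time()`. The row count, the seed, and the helper names `with_map_filter` and `with_comprehension` are arbitrary choices made for illustration. Timings depend on the machine, so none are quoted here.

```python
import random
import timeit

# Smaller data set than in the answer so the sketch runs quickly (arbitrary size).
random.seed(0)
full_array = [
    [random.random(),
     "data_update" if random.random() < 0.2 else "no",
     random.random(),
     random.random()]
    for _ in range(100_000)
]
cols = [0, 2, 3]


def with_map_filter():
    # list(...) forces the lazy map/filter chain to process every row.
    return list(map(lambda sub: [sub[i] for i in cols],
                    filter(lambda sub: "data_update" in sub, full_array)))


def with_comprehension():
    return [[row[c] for c in cols] for row in full_array if row[1] == "data_update"]


# Both approaches should select exactly the same rows.
assert with_map_filter() == with_comprehension()

# Total time in seconds for 5 runs of each approach.
print("map/filter:        ", timeit.timeit(with_map_filter, number=5))
print("list comprehension:", timeit.timeit(with_comprehension, number=5))
```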
