Performance tuning: comparing two tables using sets or DataFrames ends with the final message "Killed"

Posted 2024-06-28 18:44:40


I have several tables, located in different databases, that I need to compare. Here is a sample of what I am trying:

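# Rows for the selected employees, loaded into a set of tuples
EmplTbl = cur.execute("select A, B, C from EmployeeTable where EmplName in ('A','B')")
emp_entries = set(cur)

# All department rows, also loaded into a set
DeptTbl = cur.execute("select A, B, C from DeptTable")
dept_entries = set(cur)

# Rows present in EmployeeTable but not in DeptTable
print(emp_entries.difference(dept_entries))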

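In this example I only included 3 columns for the comparison, but in my real case I have 30-40 columns. When I try to take the difference between the sets, use a 'for' loop, or compare with a DataFrame join, the script runs very slowly and the last message I get is "Killed".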

In the code below, I try to do an inner join to get the exact matches:

EmplTbl = cur.execute("select A, B, C from EmployeeTable where EmplName in ('A','B')")
emp_entries = set(cur)

DeptTbl = cur.execute("select A, B, C from DeptTable")

# Print each DeptTable row that also appears among the employee rows
for dept_row in cur:
    if dept_row in emp_entries:
        print(dept_row)

Number of records: I may have around 10 million.

Is there any way to improve the performance and make this run faster? I have a Linux server with a 4-node configuration. Please advise.


1 answer

User

#1 · Posted 2024-06-28 18:44:40

You can use a difference query directly:

Select col1, col2, col3 from table1
Minus
Select col1, col2, col3 from table2;
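
To illustrate how the difference query above could be driven from Python, here is a minimal sketch. It assumes both tables are reachable from one connection, and it uses an in-memory SQLite database as a stand-in for the real server, so the tables, sample rows, and the `iter_rows` helper are all hypothetical. SQLite spells the operator `EXCEPT` where Oracle uses `MINUS`. The database computes the difference, and the script fetches the result in batches instead of building two large Python sets, which is the likely cause of the "Killed" message if the process ran out of memory.

```python
import sqlite3

# Hypothetical in-memory database standing in for the real server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE table1 (col1, col2, col3)")
cur.execute("CREATE TABLE table2 (col1, col2, col3)")
cur.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                [(1, "a", "x"), (2, "b", "y"), (3, "c", "z")])
cur.executemany("INSERT INTO table2 VALUES (?, ?, ?)",
                [(2, "b", "y"), (4, "d", "w")])

# SQLite uses EXCEPT for the set difference; Oracle uses MINUS.
diff_sql = """
    SELECT col1, col2, col3 FROM table1
    EXCEPT
    SELECT col1, col2, col3 FROM table2
"""


def iter_rows(cursor, sql, batch_size=10_000):
    """Run `sql` and yield rows, fetching `batch_size` rows at a time."""
    cursor.execute(sql)
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield from batch


# Collecting into a list is fine for this tiny demo; with millions of rows,
# handle each row as it arrives (for example, write it to a file) instead.
diff_rows = list(iter_rows(cur, diff_sql))
print(diff_rows)
```

The batch size is an arbitrary choice for the sketch; the point is only that the comparison runs inside the database and the client never holds either full table.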

Or, to find the rows of table1 that have an exact match in table2:

Select col1, col2, col3 from table1 t1
Where exists 
(Select 1 from table2 t2
Where t1.col1 = t2.col1
And t1.col2 = t2.col2
And t1.col3 = t2.col3)
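
Both queries above assume that a single connection can see both tables. The question mentions tables in different databases, so if no cross-database query is available, one possible alternative (not part of this answer) is to let each database sort its own rows and compare the two sorted streams in Python, holding only one row from each side at a time. The sketch below is hedged accordingly: the two in-memory SQLite databases, the table `t`, and the helpers `make_db`, `sorted_rows`, and `compare_sorted` are hypothetical, and it assumes the columns contain no NULLs and that the database sort order agrees with Python's tuple comparison.

```python
import sqlite3


def make_db(rows):
    """Create a hypothetical in-memory database holding one table `t`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b TEXT, c TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    return conn


def sorted_rows(conn):
    """Return a cursor that streams distinct rows in sorted order."""
    return conn.execute("SELECT DISTINCT a, b, c FROM t ORDER BY a, b, c")


def compare_sorted(left, right):
    """Merge two sorted row streams.

    Yields ("left_only", row) for rows found only in `left` and
    ("both", row) for rows found in both. Rows found only in `right`
    are skipped.
    """
    left_row = next(left, None)
    right_row = next(right, None)
    while left_row is not None:
        if right_row is None or left_row < right_row:
            yield ("left_only", left_row)
            left_row = next(left, None)
        elif left_row == right_row:
            yield ("both", left_row)
            left_row = next(left, None)
            right_row = next(right, None)
        else:
            # The right stream is behind; advance it.
            right_row = next(right, None)


emp_db = make_db([(1, "a", "x"), (2, "b", "y"), (3, "c", "z")])
dept_db = make_db([(2, "b", "y"), (4, "d", "w")])

results = list(compare_sorted(sorted_rows(emp_db), sorted_rows(dept_db)))
for status, row in results:
    print(status, row)
```

The "left_only" rows correspond to the difference query and the "both" rows to the exact-match query. Memory use stays roughly constant regardless of table size, at the cost of a sort on each side.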

Cheers!!
