os.walk is very slow. Is there a way to optimize it?

User

Reply #1 · edited 2024-09-24 20:53:42

One way to speed this up on Python 2.7 is to replace os.walk() with scandir.walk(). It takes exactly the same arguments.

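import scandir  # third-party backport; must be installed on Python 2.7

directory = "/tmp"
# scandir.walk() yields the same (dirpath, dirnames, filenames) tuples as os.walk()
for item in scandir.walk(directory):
    print(item)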

PS: As @reconp pointed out in the comments, scandir has to be installed before it can be used on Python 2.7.
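If the same script has to run on both old and new interpreters, one option is to choose the implementation at import time. This is only a sketch: it assumes the third-party scandir package is installed wherever the interpreter is older than 3.5, and count_files is a made-up helper used purely for illustration.

```python
import os
import sys

# os.walk() has been built on os.scandir() since Python 3.5, so the
# backport is only needed on older interpreters.
if sys.version_info >= (3, 5):
    walk = os.walk
else:
    from scandir import walk  # assumption: installed via "pip install scandir"


def count_files(top):
    """Hypothetical helper: count the non-directory entries found under top."""
    return sum(len(filenames) for _dirpath, _dirnames, filenames in walk(top))
```

The rest of the code calls walk() and does not need to know which implementation it got.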

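User

Reply #2 · edited 2024-09-24 20:53:42

os.walk is currently very slow because it first lists the directory and then calls stat on every entry to find out whether it is a directory or a file.

PEP 471 proposes an improvement that should arrive soon in Python 3.5. In the meantime, you can use the scandir package to get the same benefit on Python 2.7.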
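To make the difference concrete, here is a rough sketch of the two ways to split one directory into subdirectories and files. It assumes Python 3.6 or later, because os.scandir() is used as a context manager. Both function names are invented for this example and are not part of any library.

```python
import os


def split_with_listdir(path):
    """Old pattern: one listdir() call plus one stat() per entry."""
    dirs, files = [], []
    for name in os.listdir(path):
        # os.path.isdir() performs a stat() call for every single name.
        if os.path.isdir(os.path.join(path, name)):
            dirs.append(name)
        else:
            files.append(name)
    return sorted(dirs), sorted(files)


def split_with_scandir(path):
    """scandir pattern: the entry type usually comes with the listing."""
    dirs, files = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            # is_dir() normally needs no extra system call, since the
            # directory listing already carries the entry type.
            if entry.is_dir():
                dirs.append(entry.name)
            else:
                files.append(entry.name)
    return sorted(dirs), sorted(files)
```

Both functions return the same result; only the number of system calls behind them differs.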

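User

Reply #3 · edited 2024-09-24 20:53:42

Yes: use Python 3.5 (it is still an RC at the moment, but should be out momentarily). In Python 3.5, os.walk was rewritten to be more efficient.

This work was done as part of PEP 471.

Excerpt from the PEP: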

Python's built-in os.walk() is significantly slower than it needs to be, because -- in addition to calling os.listdir() on each directory -- it executes the stat() system call or GetFileAttributes() on each file to determine whether the entry is a directory or not.
But the underlying system calls -- FindFirstFile / FindNextFile on Windows and readdir on POSIX systems -- already tell you whether the files returned are directories or not, so no further system calls are needed. Further, the Windows system calls return all the information for a stat_result object on the directory entry, such as file size and last modification time.
In short, you can reduce the number of system calls required for a tree function like os.walk() from approximately 2N to N, where N is the total number of files and directories in the tree. (And because directory trees are usually wider than they are deep, it's often much better than this.)
In practice, removing all those extra system calls makes os.walk() about 8-9 times as fast on Windows, and about 2-3 times as fast on POSIX systems. So we're not talking about micro-optimizations. See more benchmarks here.
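As a small illustration of what the excerpt describes, the sketch below sums file sizes in a tree by calling os.scandir() directly. It assumes Python 3.6 or later for the with statement, and tree_size is just a name chosen for this example. It shows the idea only and is not a benchmark.

```python
import os


def tree_size(path):
    """Return the total size in bytes of the regular files under path."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # Recurse without a separate stat() just to learn the type.
                total += tree_size(entry.path)
            elif entry.is_file(follow_symlinks=False):
                # On Windows the size comes from the directory listing itself;
                # on POSIX this costs one stat() call per file.
                total += entry.stat(follow_symlinks=False).st_size
    return total
```

Symbolic links are skipped here on purpose so that a link cycle cannot cause endless recursion.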
