<ul>
<li>Splitting a path into all of its parts and then joining them back together appears to be inefficient.</li>
<li><p>Finding the index of the last '/' and slicing the string at that point is much faster:</p>
<pre><code>def remove_tail(path):
    index = path.rfind('/')  # index of the last '/', or -1 if not present
    return path[:index] if index != -1 else '.'  # return '.' for the parent directory
.
.
.
subFolderList = list(set([remove_tail(path) for path in tempImageList]))
</code></pre></li>
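<li><p>Verified on the AWA2 dataset folder (50 folders and 37322 images).</p></li>
<li>The new method was observed to be roughly 3x faster.</li>
<li>Uses a list comprehension to improve readability.</li>
<li>Handles the case where the parent directory itself contains images (which would cause the existing implementation to raise an error).</li>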
</ul>
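<p>The parent-directory case from the last point can be illustrated with a minimal sketch. The sample paths below are hypothetical and not taken from the dataset, and <code>split_join_parent</code> is only an illustrative name for the existing split-and-join expression. For a path without any '/', the split leaves nothing to join, so <code>os.path.join</code> is called with no arguments and raises a <code>TypeError</code>, while <code>remove_tail</code> returns '.'.</p>
<pre><code>import os


def split_join_parent(path):
    # Existing approach: split on '/', drop the file name, re-join the rest.
    return os.path.join(*path.split('/')[:-1])


def remove_tail(path):
    # Proposed approach: slice at the last '/'; '.' if there is none.
    index = path.rfind('/')
    return path[:index] if index != -1 else '.'


# Hypothetical sample paths (assumptions for illustration only).
nested = 'JPEGImages/antelope/antelope_10001.jpg'
top_level = 'antelope_10001.jpg'

print(remove_tail(nested))
print(remove_tail(top_level))

try:
    split_join_parent(top_level)
except TypeError as exc:
    # os.path.join() needs at least one argument.
    print('split/join failed:', exc)
</code></pre>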
<p>Code used for verification:</p>
<pre><code>import os
import time

from treeHandler import treeHandler


def remove_tail(path):
    # Slice at the last '/'; '.' means the file sits in the parent directory.
    index = path.rfind('/')
    return path[:index] if index != -1 else '.'


th = treeHandler()
# tempImageList is the list of paths of all files with the '.jpg' extension.
tempImageList = th.getFiles('JPEGImages', ['jpg'])
print(len(tempImageList))

# The filtering step below is the line that requires optimisation.
start = time.time()
originalSubFolderList = list(set(map(lambda x: os.path.join(*x.split('/')[:-1]), tempImageList)))
print("Current method takes", time.time() - start)

start = time.time()
newSubFolderList = list(set([remove_tail(path) for path in tempImageList]))
print("New method takes", time.time() - start)

# Compare as sets: the order of a list built from a set is not guaranteed.
print("Do the outputs match:", set(originalSubFolderList) == set(newSubFolderList))
</code></pre>
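<p>For anyone without the dataset or <code>treeHandler</code> at hand, the comparison can be approximated with a self-contained sketch. It is not the measurement reported above: the path list is synthetic (50 hypothetical folders with 100 images each), <code>current_method</code> and <code>new_method</code> are illustrative names, and <code>timeit.repeat</code> is swapped in for the single <code>time.time()</code> measurement to reduce noise. The outputs are expected to match on a POSIX system, where <code>os.path.join</code> also uses '/'.</p>
<pre><code>import os
import timeit


def remove_tail(path):
    index = path.rfind('/')
    return path[:index] if index != -1 else '.'


# Synthetic stand-in for tempImageList (hypothetical folder and file names).
paths = ['JPEGImages/class_%02d/img_%04d.jpg' % (c, i)
         for c in range(50) for i in range(100)]


def current_method():
    # Split every path, drop the file name, and join the parts again.
    return set(os.path.join(*p.split('/')[:-1]) for p in paths)


def new_method():
    # Slice each path at its last '/'.
    return set(remove_tail(p) for p in paths)


# Compare as sets, so ordering does not affect the result.
print('Outputs match:', current_method() == new_method())

# Take the best of several repeats of 10 calls each.
print('Current method:', min(timeit.repeat(current_method, number=10, repeat=5)))
print('New method:', min(timeit.repeat(new_method, number=10, repeat=5)))
</code></pre>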