LeetCode问题的运行时差异原因(打开锁)

2024-09-27 00:20:45 发布

您现在位置:Python中文网/ 问答频道 /正文

LeetCod问题链接: https://leetcode.com/problems/open-the-lock/(第screenshot页)

下面是代码#1从源代码和目标代码执行bfs:

from collections import deque

class Solution:
    def openLock(self, deadends: List[str], target: str) -> int:
        deadend = set(deadends)
        
        if target in deadend or '0000' in deadend:
            return -1
        
        if target == '0000':
            return 0
        
        def getNextCombi(combi: str) -> List[str]:
            res = []
            for i in range(4):
                nextCombi1 = combi[:i] + str((int(combi[i:i + 1]) + 1) % 10) + combi[i + 1:]
                nextCombi2 = combi[:i] + str((int(combi[i:i + 1]) - 1) % 10) + combi[i + 1:]
                if nextCombi1 not in deadend:
                    res.append(nextCombi1)
                if nextCombi2 not in deadend:
                    res.append(nextCombi2)
            return res
        
        sourceQueue = deque(['0000'])
        targetQueue = deque([target])
        
        sourceSeen = {'0000': 0}
        targetSeen = {target: 0}
        
        while len(sourceQueue) != 0 and len(targetQueue) != 0:
            sourceCombi = sourceQueue.popleft()
            targetCombi = targetQueue.popleft()
            
            for nextCombi in getNextCombi(sourceCombi):
                if nextCombi not in sourceSeen:
                    sourceSeen[nextCombi] = sourceSeen[sourceCombi] + 1
                    sourceQueue.append(nextCombi)
                    if nextCombi in targetSeen:
                        return sourceSeen[nextCombi] + targetSeen[nextCombi]
            
            for nextCombi in getNextCombi(targetCombi):
                if nextCombi not in targetSeen:
                    targetSeen[nextCombi] = targetSeen[targetCombi] + 1
                    targetQueue.append(nextCombi)
                    if nextCombi in sourceSeen:
                        return sourceSeen[nextCombi] + targetSeen[nextCombi]
        
        return -1

下面是代码#2从源代码执行bfs:

from collections import deque

class Solution:
    def openLock(self, deadends: List[str], target: str) -> int:
        deadend = set(deadends)
        
        if target in deadend or '0000' in deadend:
            return -1
        
        if target == '0000':
            return 0
        
        def getNextCombi(combi: str) -> List[str]:
            res = []
            for i in range(4):
                nextCombi1 = combi[:i] + str((int(combi[i:i + 1]) + 1) % 10) + combi[i + 1:]
                nextCombi2 = combi[:i] + str((int(combi[i:i + 1]) - 1) % 10) + combi[i + 1:]
                if nextCombi1 not in deadend:
                    res.append(nextCombi1)
                if nextCombi2 not in deadend:
                    res.append(nextCombi2)
            return res
        
        sourceQueue = deque(['0000'])
        
        sourceSeen = {'0000': 0}
        
        while len(sourceQueue) != 0:
            sourceCombi = sourceQueue.popleft()
            
            for nextCombi in getNextCombi(sourceCombi):
                if nextCombi not in sourceSeen:
                    sourceSeen[nextCombi] = sourceSeen[sourceCombi] + 1
                    sourceQueue.append(nextCombi)
                    if nextCombi == target:
                        return sourceSeen[nextCombi]

代码#1在LeetCode上给出了大约120毫秒的速度, 代码#2在LeetCode上提供了大约640ms的速度。 我试了好几次,所以我相信算法本身有很大的不同

为什么1比2快得多? 我看不出在时间复杂性方面有什么不同。 是否仅仅因为LeetCode中使用的示例在#1上更快

我对这两种代码的分析: 我认为它们具有相同的时间复杂度O(1),因为在最坏的情况下,它会遍历所有可能的组合,即4^9。我不确定我的恒定时间复杂性,但我认为我做bfs源代码是正确的->;目标,并执行bfs源->;满足<;-目标,是一样的


Tags: intargetreturnifnotresintappend
1条回答
网友
1楼 · 发布于 2024-09-27 00:20:45
< >我实现C++程序来测量两个算法中的运行码和行数。为什么C++,而不是Python?不是因为C++速度,而是因为只有Python C API提供足够的功能来精细地编码代码跟踪函数,以将其设置为测量操作码和行。p>

下一个C++代码的测量结果(输出)为:

Traced 'algo1': time 3441700 nanoseconds, 1455 lines, 9304 opcodes,
    2365 avg ns/line, 369 avg ns/opcode.
Traced 'algo2': time 18011100 nanoseconds, 22678 lines, 155640 opcodes,
    794 avg ns/line, 115 avg ns/opcode.

使用下一个测试输入(取自LeetCode示例)测量两个算法:

deadends = ['0201', '0101', '0102', '1212', '2002'], target = '0202'

因此,algo1比algo2快得多,algo1只执行9304个操作码,而algo2执行155640个操作码,即16倍多的操作码。它是什么意思opcode-它是Python中最小的操作,与汇编程序中的指令相同,但这里的操作码是Python虚拟机中的指令。尽管某些操作码可能比其他操作码花费更长的时间,但它仍然是一个很好的定量度量(操作码的数量)

这意味着即使您的两个算法都是O(1),每个算法中执行的操作数仍然是非常不同的。还要记住,任何复杂性O(f(N))实际上意味着运行时间是<= C * f(N),其中C是每个算法的固定常数。例如,第一个算法可能有C = 1000,而第二个算法可能有C = 9999,两者都有O(1)复杂度,但运行时间不同

此外,运行的行数也非常不同。这个数字表示在执行算法时代码中已传递了多少行。它们也是很好的速度/时间测量工具

为了简单起见,您还可以只运行Python profiler(从标准库),使用以下代码行:

    import cProfile, profile
    with cProfile.Profile() as prof:
        Solution().openLock(deadends = ['0201', '0101', '0102', '1212', '2002'], 
            target = '0202')
    stats = pstats.Stats(prof).sort_stats('ncalls')
    stats.print_stats()

对于algo1:

         397 function calls in 0.003 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      161    0.001    0.000    0.001    0.000 {method 'append' of 'list' objects}
      101    0.000    0.000    0.000    0.000 {method 'append' of 'collections.deque' objects}
       22    0.000    0.000    0.000    0.000 {method 'popleft' of 'collections.deque' objects}
       22    0.000    0.000    0.000    0.000 {built-in method builtins.hasattr}
       22    0.000    0.000    0.000    0.000 {built-in method builtins.len}
       21    0.001    0.000    0.002    0.000 C:\dev\tmp_code\main.py:17(getNextCombi)
        1    0.001    0.001    0.003    0.003 C:\dev\tmp_code\main.py:7(openLock)
        1    0.000    0.000    0.000    0.000 C:\bin\Python39\lib\typing.py:256(inner)
        1    0.000    0.000    0.000    0.000 C:\bin\Python39\lib\cProfile.py:117(__exit__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

对于algo2:

         6261 function calls in 0.055 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     3093    0.011    0.000    0.011    0.000 {method 'append' of 'list' objects}
      821    0.003    0.000    0.003    0.000 {method 'append' of 'collections.deque' objects}
      391    0.001    0.000    0.001    0.000 {built-in method builtins.hasattr}
      390    0.021    0.000    0.040    0.000 C:\dev\tmp_code\main.py:17(getNextCombi)
      390    0.001    0.000    0.001    0.000 {method 'popleft' of 'collections.deque' objects}
      390    0.001    0.000    0.001    0.000 {built-in method builtins.len}
        1    0.009    0.009    0.055    0.055 C:\dev\tmp_code\main.py:7(openLock)
        1    0.000    0.000    0.000    0.000 C:\bin\Python39\lib\typing.py:256(inner)
        1    0.000    0.000    0.000    0.000 C:\bin\Python39\lib\cProfile.py:117(__exit__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

从上面两个分析器的输出中,您还可以看到,algo2比algo1执行更多的函数调用

下一个C++代码可以使用任何编译器,如MsVC/CLAN/GCC在Windows /Linux/MACOS上编译。在我的例子中,我使用64位Windows 10上的CLang进行编译。我使用了next命令行来编译它C:\bin\llvm\bin\clang++.exe -g -m64 -O3 -std=c++20 -Ic:/bin/python39/include main.cpp c:/bin/python39/libs/python39.lib,这里您必须提供正确的CLang和Python安装路径。如果您想安装与我相同的CLang,请转到LLVM releases并下载LLVM-11.1.0-win64.exe

这里是C++代码本身:

Try it online!

#include <stdexcept>
#include <iostream>
#include <string>
#include <cstdint>
#include <chrono>

#include <Python.h>
struct _frame;

#define ASSERT(cond) { if (!(cond)) throw std::runtime_error("Assertion (" #cond ") failed at line " + std::to_string(__LINE__) + "!"); }

class Tracer {
public:
    Tracer(std::string const & name)
        : name_(name) {
        PyEval_SetTrace(&Tracer::TraceFunc, nullptr);
        lines_cnt_ = 0;
        opcodes_cnt_ = 0;
        tb_ = std::chrono::high_resolution_clock::now();
    }
    ~Tracer() {
        auto te = std::chrono::high_resolution_clock::now();
        PyEval_SetTrace(nullptr, nullptr);
        auto tp = std::chrono::duration_cast<std::chrono::nanoseconds>(te - tb_).count();
        std::cout << "Traced '" << name_ << "': time " << tp << " nanoseconds, " << lines_cnt_ << " lines, " << opcodes_cnt_ << " opcodes, "
            << uint64_t(double(tp) / double(lines_cnt_)) << " avg ns/line, "            
            << uint64_t(double(tp) / double(opcodes_cnt_)) << " avg ns/opcode." << std::endl;
    }
    static int TraceFunc(PyObject * obj, _frame * frame, int what, PyObject * arg) {
        if (what == PyTrace_LINE)
            ++lines_cnt_;
        else if (what == PyTrace_OPCODE)
            ++opcodes_cnt_;
        return 0;
    }
private:
    static inline uint64_t opcodes_cnt_ = 0, lines_cnt_ = 0;
    std::string name_;
    std::chrono::high_resolution_clock::time_point tb_;
};

void Measure() {
    {
        ASSERT(PyRun_SimpleString(R"(
from collections import deque
from typing import List
import inspect
        )") == 0);
    }
    {
        Tracer tracer("algo1");
        ASSERT(PyRun_SimpleString(R"(
try:
    from collections import deque
    from typing import List
    import inspect

    class Solution:
        def openLock(self, deadends: List[str], target: str) -> int:
            inspect.currentframe().f_trace_opcodes = True
            deadend = set(deadends)
            
            if target in deadend or '0000' in deadend:
                return -1
            
            if target == '0000':
                return 0
            
            def getNextCombi(combi: str) -> List[str]:
                inspect.currentframe().f_trace_opcodes = True
                res = []
                for i in range(4):
                    nextCombi1 = combi[:i] + str((int(combi[i:i + 1]) + 1) % 10) + combi[i + 1:]
                    nextCombi2 = combi[:i] + str((int(combi[i:i + 1]) - 1) % 10) + combi[i + 1:]
                    if nextCombi1 not in deadend:
                        res.append(nextCombi1)
                    if nextCombi2 not in deadend:
                        res.append(nextCombi2)
                return res
            
            sourceQueue = deque(['0000'])
            targetQueue = deque([target])
            
            sourceSeen = {'0000': 0}
            targetSeen = {target: 0}
            
            while len(sourceQueue) != 0 and len(targetQueue) != 0:
                sourceCombi = sourceQueue.popleft()
                targetCombi = targetQueue.popleft()
                
                for nextCombi in getNextCombi(sourceCombi):
                    if nextCombi not in sourceSeen:
                        sourceSeen[nextCombi] = sourceSeen[sourceCombi] + 1
                        sourceQueue.append(nextCombi)
                        if nextCombi in targetSeen:
                            return sourceSeen[nextCombi] + targetSeen[nextCombi]
                
                for nextCombi in getNextCombi(targetCombi):
                    if nextCombi not in targetSeen:
                        targetSeen[nextCombi] = targetSeen[targetCombi] + 1
                        targetQueue.append(nextCombi)
                        if nextCombi in sourceSeen:
                            return sourceSeen[nextCombi] + targetSeen[nextCombi]
            
            return -1

    Solution().openLock(deadends = ['0201', '0101', '0102', '1212', '2002'], target = '0202')
except:
    import traceback
    traceback.print_exc()
    print(flush = True)
        )") == 0);
    }
    {
        Tracer tracer("algo2");
        ASSERT(PyRun_SimpleString(R"(
try:
    from collections import deque
    from typing import List
    import inspect

    class Solution:
        def openLock(self, deadends: List[str], target: str) -> int:
            inspect.currentframe().f_trace_opcodes = True
            deadend = set(deadends)
            
            if target in deadend or '0000' in deadend:
                return -1
            
            if target == '0000':
                return 0
            
            def getNextCombi(combi: str) -> List[str]:
                inspect.currentframe().f_trace_opcodes = True
                res = []
                for i in range(4):
                    nextCombi1 = combi[:i] + str((int(combi[i:i + 1]) + 1) % 10) + combi[i + 1:]
                    nextCombi2 = combi[:i] + str((int(combi[i:i + 1]) - 1) % 10) + combi[i + 1:]
                    if nextCombi1 not in deadend:
                        res.append(nextCombi1)
                    if nextCombi2 not in deadend:
                        res.append(nextCombi2)
                return res
            
            sourceQueue = deque(['0000'])
            
            sourceSeen = {'0000': 0}
            
            while len(sourceQueue) != 0:
                sourceCombi = sourceQueue.popleft()
                
                for nextCombi in getNextCombi(sourceCombi):
                    if nextCombi not in sourceSeen:
                        sourceSeen[nextCombi] = sourceSeen[sourceCombi] + 1
                        sourceQueue.append(nextCombi)
                        if nextCombi == target:
                            return sourceSeen[nextCombi]

    Solution().openLock(deadends = ['0201', '0101', '0102', '1212', '2002'], target = '0202')
except:
    import traceback
    traceback.print_exc()
    print(flush = True)
        )") == 0);
    }
}

int main() {
    try {
        Py_SetProgramName(L"main.py");
        Py_Initialize();
        Measure();
        Py_FinalizeEx();
        return 0;
    } catch (std::exception const & ex) {
        std::cout << "Exception: " << ex.what() << std::endl;
        return -1;
    }
}

相关问题 更多 >

    热门问题