<p>Methods ordered by <code>%%timeit</code> results</p>
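<p>I timed all of the suggested methods, plus a few more, on two DataFrames. Here are the timing results for the suggested methods (thanks @meW and @jezrael). If I missed one, or you have another, let me know and I'll add it.</p>
<p>Two timings are shown for each method: the first for the 3 rows in the example df, the second for 57K rows in another df. Timings may differ on other systems. Solutions that include <code>TEST['dot']</code> in the concatenation need that column in the df: add it with <code>TEST['dot'] = '.'</code>.</p>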
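<p>For anyone who wants to reproduce the setup, here is a minimal sketch. The example df is not shown in this answer, so the three rows below are hypothetical stand-ins; only the column names <code>job_number</code>, <code>task_number</code> and <code>dot</code> come from the snippets in this post.</p>

```python
import pandas as pd

# Hypothetical sample data; the real values in the example df are not shown here.
TEST = pd.DataFrame({
    "job_number": [1001, 1002, 1003],
    "task_number": [1, 2, 3],
})

# Helper column required only by the variants that concatenate TEST['dot'].
TEST["dot"] = "."

# Target output format: "<job_number>.<task_number>" as a string column.
TEST["filename"] = TEST["job_number"].astype(str) + "." + TEST["task_number"].astype(str)
print(TEST["filename"].tolist())
```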
<p>The original method (still the fastest):</p>
<p><strong>.astype(str), +, '.'</strong></p>
<pre><code>%%timeit
TEST['filename'] = TEST['job_number'].astype(str) + '.' + TEST['task_number'].astype(str)
# 553 µs ± 6.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) on 3 rows
# 69.6 ms ± 876 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) on 57K rows
</code></pre>
<p>The suggested methods and a few permutations:</p>
<p><strong>.astype(int).astype(str), +, '.'</strong></p>
<pre><code>%%timeit
TEST['filename'] = TEST['job_number'].astype(int).astype(str) + '.' + TEST['task_number'].astype(int).astype(str)
# 553 µs ± 6.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) on 3 rows
# 70.2 ms ± 739 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) on 57K rows
</code></pre>
<p><strong>.values.astype(int).astype(str), +, TEST['dot']</strong></p>
<pre><code>%%timeit
TEST['filename'] = TEST['job_number'].values.astype(int).astype(str) + TEST['dot'] + TEST['task_number'].values.astype(int).astype(str)
# 221 µs ± 5.93 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) on 3 rows
# 82.3 ms ± 743 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) on 57K rows
</code></pre>
<p><strong>.values.astype(str), +, TEST['dot']</strong></p>
<pre><code>%%timeit
TEST["filename"] = TEST['job_number'].values.astype(str) + TEST['dot'] + TEST['task_number'].values.astype(str)
# 221 µs ± 5.93 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) on 3 rows
# 92.8 ms ± 1.21 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) on 57K rows
</code></pre>
<p><strong>'.'.join(), list comprehension, .values.astype(str)</strong></p>
<pre><code>%%timeit
TEST["filename"] = ['.'.join(i) for i in TEST[["job_number",'task_number']].values.astype(str)]
# 743 µs ± 19.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) on 3 rows
# 147 ms ± 532 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) on 57K rows
</code></pre>
<p><strong>f-string, list comprehension, .values.astype(str)</strong></p>
<pre><code>%%timeit
TEST["filename2"] = [f'{i}.{j}' for i,j in TEST[["job_number",'task_number']].values.astype(str)]
# 642 µs ± 27.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) on 3 rows
# 167 ms ± 3.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) on 57K rows
</code></pre>
<p><strong>'.'.join(), zip, list comprehension, .map(str)</strong></p>
<pre><code>%%timeit
TEST["filename"] = ['.'.join(i) for i in
zip(TEST["job_number"].map(str), TEST["task_number"].map(str))]
# 512 µs ± 5.74 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) on 3 rows
# 181 ms ± 4.17 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) on 57K rows
</code></pre>
<p><strong>.apply(lambda, str(x[2]), +, '.')</strong></p>
<pre><code>%%timeit
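# x[2] and x[10] are positional lookups; presumably job_number and task_number
# sit at column positions 2 and 10 in the df used for timing.
TEST['filename'] = TEST.T.apply(lambda x: str(x[2]) + '.' + str(x[10]))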
# 735 µs ± 13.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) on 3 rows
# 2.69 s ± 18.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) on 57K rows
</code></pre>
<p>If you find an improved method, let me know and I'll add it to the list!</p>
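<p>To try a new candidate outside a notebook, where <code>%%timeit</code> is not available, something like the sketch below may help. It uses the standard-library <code>timeit</code> module on a randomly generated 57K-row frame, which is an assumption standing in for the real data, so absolute numbers will not match the ones above. The function names are made up for illustration; the bodies are three of the methods listed in this post.</p>

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical stand-in for the 57K-row df: random integers, not the original data.
rng = np.random.default_rng(0)
n_rows = 57_000
TEST = pd.DataFrame({
    "job_number": rng.integers(1, 10_000, size=n_rows),
    "task_number": rng.integers(1, 100, size=n_rows),
})


def astype_concat(df):
    # .astype(str), +, '.'
    return df["job_number"].astype(str) + "." + df["task_number"].astype(str)


def join_listcomp(df):
    # '.'.join(), list comprehension, .values.astype(str)
    return [".".join(i) for i in df[["job_number", "task_number"]].values.astype(str)]


def zip_map(df):
    # '.'.join(), zip, list comprehension, .map(str)
    return [".".join(i) for i in zip(df["job_number"].map(str), df["task_number"].map(str))]


candidates = {
    "astype_concat": astype_concat,
    "join_listcomp": join_listcomp,
    "zip_map": zip_map,
}

# Check that every candidate builds the same strings before comparing speed.
expected = list(astype_concat(TEST))
for name, fn in candidates.items():
    assert list(fn(TEST)) == expected, name

# Best-of-3 repeats, 3 calls per repeat; report seconds per call.
calls = 3
results = {
    name: min(timeit.repeat(lambda: fn(TEST), number=calls, repeat=3)) / calls
    for name, fn in candidates.items()
}

for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds * 1000:.1f} ms per call")
```

<p>Checking that the outputs agree before timing keeps a fast but wrong candidate from ending up in the list.</p>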