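<p>Based on the code sample you added, the question you are trying to answer is how to replace <code>' '</code> with <code>', '</code> in every row of a <code>pandas dataframe</code>.</p>
<p>Here is one way to do it:</p>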
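<pre><code>import pandas as pd

# Read the file without treating the first row as a header
sampletxt = pd.read_csv('teste.csv', header=None)
# Replace every run of whitespace with ', ' (raw string avoids escape warnings)
output = sampletxt.replace(r'\s+', ', ', regex=True)
print(output)
</code></pre>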
<p><strong>Example:</strong></p>
<pre><code>In [24]: l
Out[24]:
['input phrase of the file to exemplify',
'input phrase of the file to exemplify 2',
'input phrase of the file to exemplify 4']
In [25]: sampletxt = pd.DataFrame(l)
In [26]: sampletxt
Out[26]:
0
0 input phrase of the file to exemplify
1 input phrase of the file to exemplify 2
2 input phrase of the file to exemplify 4
In [27]: output = sampletxt.replace(r'\s+', ', ', regex=True)
In [28]: output
Out[28]:
0
0 input, phrase, of, the, file, to, exemplify
1 input, phrase, of, the, file, to, exemplify, 2
2 input, phrase, of, the, file, to, exemplify, 4
</code></pre>
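<p>If only one column needs the treatment, or the values may carry leading or trailing whitespace, a per-column variant could look like the sketch below. The column name <code>text</code> and the sample rows are made up for illustration; they are not from the original question.</p>

```python
import pandas as pd

# Hypothetical sample data; the column name "text" is an assumption.
df = pd.DataFrame({"text": ["  input phrase  of the file ", "to exemplify 2"]})

# Strip the edges first, then collapse each whitespace run into ", ".
# Without the strip, leading/trailing whitespace would also become ", ".
df["text"] = df["text"].str.strip().str.replace(r"\s+", ", ", regex=True)

print(df["text"].tolist())
# ['input, phrase, of, the, file', 'to, exemplify, 2']
```

<p>Unlike <code>DataFrame.replace(..., regex=True)</code>, this only touches the selected column, which may be preferable when other columns hold text that should stay unchanged.</p>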
<hr/>
<p><strong>Old answer</strong></p>
<p>You can also use <code>re.sub(..)</code>, as follows:</p>
<pre><code>In [3]: import re
In [4]: st = "input phrase of the file to exemplify"
In [5]: re.sub(' ',', ', st)
Out[5]: 'input, phrase, of, the, file, to, exemplify'
</code></pre>
<hr/>
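<p>Note that <code>re.sub(...)</code> is slower than <code>str.replace(..)</code> for this simple case (about 1.74 µs versus 257 ns per call in the timings below):</p>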
<pre><code>In [6]: timeit re.sub(' ',', ', st)
100000 loops, best of 3: 1.74 µs per loop
In [7]: timeit st.replace(' ',', ')
1000000 loops, best of 3: 257 ns per loop
</code></pre>
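<p>The timings above come from an interactive session. To reproduce the comparison outside IPython, a rough sketch with the standard <code>timeit</code> module might look like this. The iteration count is an arbitrary choice, and absolute numbers will differ by machine and Python version.</p>

```python
import re
import timeit

st = "input phrase of the file to exemplify"

# For single-space input both approaches give the same string.
assert re.sub(" ", ", ", st) == st.replace(" ", ", ")

n = 10_000  # arbitrary number of repetitions, chosen only for illustration
t_re = timeit.timeit(lambda: re.sub(" ", ", ", st), number=n)
t_str = timeit.timeit(lambda: st.replace(" ", ", "), number=n)

# Report the average time per call in nanoseconds.
print(f"re.sub:      {t_re / n * 1e9:.0f} ns per call")
print(f"str.replace: {t_str / n * 1e9:.0f} ns per call")
```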
<hr/>
<p>If two words are separated by more than one space, every answer based on <code>str.replace(' ',',')</code> produces incorrect output. For example:</p>
<pre><code>In [15]: st
Out[15]: 'input phrase of the file to  exemplify'
In [16]: re.sub(' ',', ', st)
Out[16]: 'input, phrase, of, the, file, to, , exemplify'
In [17]: st.replace(' ',', ')
Out[17]: 'input, phrase, of, the, file, to, , exemplify'
</code></pre>
<p>To fix this, use a regular expression that matches one or more whitespace characters, as follows:</p>
<pre><code>In [22]: st
Out[22]: 'input phrase of the file to  exemplify'
In [23]: re.sub(r'\s+', ', ', st)
Out[23]: 'input, phrase, of, the, file, to, exemplify'
</code></pre>
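<p>One caveat with the regex approach: whitespace at the very start or end of the string also matches and turns into a stray <code>', '</code>. A different technique that avoids this, shown here only as an alternative sketch, is to split on whitespace and join the pieces. The sample string with padded edges is made up for illustration.</p>

```python
import re

# Hypothetical input with leading/trailing whitespace and a double space.
st = "  input phrase of the file to  exemplify "

# The regex also replaces the runs at the edges of the string.
with_regex = re.sub(r"\s+", ", ", st)
print(repr(with_regex))
# ', input, phrase, of, the, file, to, exemplify, '

# str.split() with no arguments splits on whitespace runs and
# discards empty strings at the edges, so no stray separators appear.
joined = ", ".join(st.split())
print(repr(joined))
# 'input, phrase, of, the, file, to, exemplify'
```

<p>Calling <code>st.strip()</code> before <code>re.sub</code> would give the same result as the split-and-join version.</p>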