<p>My current data looks like this:</p>
<pre><code>+-------+----------------------------+-------------------+-----------------------+
| Index | 0 | 1 | 2 |
+-------+----------------------------+-------------------+-----------------------+
| 0 | Reference Curr | Daybook / Voucher | Invoice Date Due Date |
| 1 | V50011 Tech Comp | nan | Phone:0177222222 |
| 2 | Regis Place | nan | Fax:017757575789 |
| 3 | Catenberry | nan | nan |
| 4 | Manhattan, NY | nan | nan |
| 5 | V7484 Pipe | nan | Phone: |
| 6 | Japan | nan | nan |
| 7 | nan | nan | nan |
| 8 | 4543.34GBP (British Pound) | nan | nan |
+-------+----------------------------+-------------------+-----------------------+
</code></pre>
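<p>I am trying to create a new column, <code>df['Company']</code>, that contains the value from <code>df[0]</code> when that value starts with "V" and <code>df[2]</code> contains "Phone". When the condition is not met, the value can be <code>nan</code>. This is the result I am looking for:</p>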
<pre><code>+-------+----------------------------+-------------------+-----------------------+------------+
| Index | 0 | 1 | 2 | Company |
+-------+----------------------------+-------------------+-----------------------+------------+
| 0 | Reference Curr | Daybook / Voucher | Invoice Date Due Date | nan |
| 1 | V50011 Tech | nan | Phone:0177222222 |V50011 Tech |
| 2 | Regis Place | nan | Fax:017757575789 | nan |
| 3 | Catenberry | nan | nan | nan |
| 4 | Manhattan, NY | nan | nan | nan |
| 5 | V7484 Pipe | nan | Phone: | V7484 Pipe |
| 6 | Japan | nan | nan | nan |
| 7 | nan | nan | nan | nan |
| 8 | 4543.34GBP (British Pound) | nan | nan | nan |
+-------+----------------------------+-------------------+-----------------------+------------+
</code></pre>
<p>I am trying the script below, but I get the error <code>ValueError: Wrong number of items passed 1420, placement implies 1</code>:</p>
<pre class="lang-py prettyprint-override"><code>df['Company'] = pd.np.where(df[2].str.contains("Ph"), df[0].str.extract(r"(^V[A-Za-z0-9]+)"),"stop")
</code></pre>
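<p>I used "stop" as the else branch because I don't know how to make Python use <code>nan</code> when the condition is not met.</p>
<p>I would also like to extract only part of <code>df[0]</code>, for example just the <code>V50011</code> part, without the rest of the cell contents. I tried something similar based on AMC's answer, but it raised an error:</p>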
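<p>For reference, here is a minimal sketch of one way to get <code>nan</code> in the else branch, assuming the columns are labelled with the integers <code>0</code>, <code>1</code> and <code>2</code> as in the table above. It uses <code>Series.where</code>, which leaves <code>NaN</code> wherever the condition is false, and passes <code>na=False</code> so that missing cells count as "no match". The sample frame is rebuilt by hand from the table and is only illustrative.</p>

```python
import numpy as np
import pandas as pd

# Hand-built copy of the sample table (illustrative only).
df = pd.DataFrame({
    0: ["Reference Curr", "V50011 Tech Comp", "Regis Place", "Catenberry",
        "Manhattan, NY", "V7484 Pipe", "Japan", np.nan,
        "4543.34GBP (British Pound)"],
    1: ["Daybook / Voucher"] + [np.nan] * 8,
    2: ["Invoice Date Due Date", "Phone:0177222222", "Fax:017757575789",
        np.nan, np.nan, "Phone:", np.nan, np.nan, np.nan],
})

# na=False makes missing cells evaluate to False instead of NaN.
starts_with_v = df[0].str.startswith("V", na=False)
has_phone = df[2].str.contains("Phone", na=False)

# Series.where keeps the value where the mask is True and fills NaN elsewhere.
df["Company"] = df[0].where(starts_with_v & has_phone)
print(df[[0, 2, "Company"]])
```

<p>A likely cause of the <code>ValueError</code> is that <code>str.extract</code> returns a one-column DataFrame by default, which <code>np.where</code> then broadcasts against the one-dimensional condition into a two-dimensional result.</p>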
<pre class="lang-py prettyprint-override"><code>df.loc[df[0].str.startswith('V') & df[2].str.contains('Phone'), 'Company'] = df[0].str.extract(r"(^V[A-Za-z0-9]+)")
</code></pre>
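<p>A possible way to keep only the leading code, sketched under the assumption that the code is the run of letters and digits right after the initial "V": passing <code>expand=False</code> makes <code>str.extract</code> return a Series for a single capture group, which can then be masked with the same condition. The four-column frame below is a shortened, hand-built stand-in for the real data.</p>

```python
import numpy as np
import pandas as pd

# Shortened, hand-built stand-in for the real data (illustrative only).
df = pd.DataFrame({
    0: ["Reference Curr", "V50011 Tech Comp", "Regis Place", "V7484 Pipe", np.nan],
    2: ["Invoice Date Due Date", "Phone:0177222222", "Fax:017757575789",
        "Phone:", np.nan],
})

# Rows where column 0 starts with "V" and column 2 mentions "Phone".
mask = df[0].str.startswith("V", na=False) & df[2].str.contains("Phone", na=False)

# expand=False returns a Series (not a DataFrame) for one capture group.
code = df[0].str.extract(r"^(V[A-Za-z0-9]+)", expand=False)

# Keep the extracted code only on matching rows; other rows become NaN.
df["Company"] = code.where(mask)
print(df)
```

<p>With this pattern, <code>V50011 Tech Comp</code> is reduced to <code>V50011</code>; the regular expression would need adjusting if the company name should be kept as well.</p>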
<p>Thanks, everyone.</p>