<p>Start by specifying "^" as the delimiter and supplying arbitrary column names.</p>
<pre><code>import pandas as pd

# A single-character separator is read literally, so '^' needs no escaping
df = pd.read_csv('data.csv', delimiter='^', names=['A', 'B'])
print(df)
               A             B
0  level country         layla
1   hello sandra  organization
2   hello people         layla
3    hello samar  organization
</code></pre>
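<p>To try this without a file on disk, here is a minimal sketch that reads the same rows from an in-memory string. The contents of <code>raw</code> are an assumption, reconstructed from the printed frame above rather than taken from the real data.csv.</p>

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv, rebuilt from the printed output.
raw = (
    "level country^layla\n"
    "hello sandra^organization\n"
    "hello people^layla\n"
    "hello samar^organization\n"
)

# names= supplies the column labels, so the first row is treated as data.
df = pd.read_csv(io.StringIO(raw), sep="^", names=["A", "B"])
print(df)
```

<p>Because the separator is a single character, pandas should treat it literally and stay on the default parser; a longer separator would be interpreted as a regular expression.</p>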
<p>Then we split the column to get the value we want. I believe the expand argument is new as of pandas 0.16.</p>
<pre><code>df['A'] = df['A'].str.split(' ', expand=True)[1]
print(df)
         A             B
0  country         layla
1   sandra  organization
2   people         layla
3    samar  organization
</code></pre>
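<p>The split above keeps only the second word, which is fine when every value has exactly two words. As a rough sketch of what happens otherwise, the example below compares three variants on a small Series; the third value ("hello new york") is made up for illustration and does not come from the original data.</p>

```python
import pandas as pd

# The last value is a hypothetical three-word case.
s = pd.Series(["level country", "hello sandra", "hello new york"])

# Column 1 of the expanded frame: the second word only.
second = s.str.split(" ", expand=True)[1]

# n=1 splits once, so column 1 holds everything after the first space.
rest = s.str.split(" ", n=1, expand=True)[1]

# .str[-1] picks the last word without building a wide frame.
last = s.str.split(" ").str[-1]

print(second.tolist())
print(rest.tolist())
print(last.tolist())
```

<p>Which variant is right depends on what the first word means in your data, so treat these as options rather than a recommendation.</p>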
<p>Then we group by column B and apply the tuple function. Note: we reset the index so that we can use it later.</p>
<pre><code>g = df.groupby('B')['A'].apply(tuple).reset_index()
print(g)
              B                  A
0         layla  (country, people)
1  organization    (sandra, samar)
</code></pre>
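<p>To see why the index is reset, the sketch below rebuilds the frame by hand (the literal values are copied from the output above) and inspects the result before and after <code>reset_index()</code>.</p>

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["country", "sandra", "people", "samar"],
    "B": ["layla", "organization", "layla", "organization"],
})

# Without reset_index the result is a Series indexed by the group keys.
grouped = df.groupby("B")["A"].apply(tuple)
print(grouped.index.tolist())

# reset_index moves the keys into column B and restores a 0..n-1 index.
g = grouped.reset_index()
print(g.index.tolist())
print(g.columns.tolist())
```

<p>The integer index is what a later step can turn into a running number; the group keys themselves would not serve that purpose.</p>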
<p>Create a new column from the string "item" and the index.</p>
<pre><code>g['item'] = 'item' + g.index.astype(str)
print(g[['item', 'A']])
    item                  A
0  item0  (country, people)
1  item1    (sandra, samar)
</code></pre>
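<p>If the end goal is a plain mapping from item label to tuple (an assumption on my part, since that is not stated here), one possible last step is to zip the two columns into a dict. The frame below is rebuilt by hand from the output above so the snippet runs on its own; the name <code>items</code> is just illustrative.</p>

```python
import pandas as pd

g = pd.DataFrame({
    "B": ["layla", "organization"],
    "A": [("country", "people"), ("sandra", "samar")],
})

# Label each row by its position in the default 0..n-1 index.
g["item"] = "item" + g.index.astype(str)

# Assumed end goal: a plain dict keyed by the item label.
items = dict(zip(g["item"], g["A"]))
print(items)
```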