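<p>There are many ways to do this; this is just one of them. Assuming your two DataFrames are stored as df1 and df2:</p>
<p>First, normalize the district_id column in df1 so that every value splits into the same number of parts (five here):</p>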
<pre><code># Pad each string with ';' so that it always splits into 5 parts
def return_full_string(text):
    n_parts = len(text.split(';'))
    for _ in range(5 - n_parts):
        text = f"{text};"
    return text

df1['district_id'] = df1['district_id'].apply(return_full_string)
</code></pre>
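<p>The width of 5 is hard-coded in the function. If the maximum number of IDs per row is not known in advance, it could be derived from the data instead. A minimal sketch in plain Python; <code>pad_ids</code> and <code>samples</code> are hypothetical names and values, not part of the original data:</p>

```python
def pad_ids(text, width, sep=";"):
    """Append separators until `text` splits into exactly `width` parts."""
    missing = width - len(text.split(sep))
    # max(..., 0) leaves strings that are already wide enough untouched
    return text + sep * max(missing, 0)


# Hypothetical sample values standing in for df1['district_id']
samples = ["1;2", "3", "4;5;6"]

# The widest row decides how many parts every row must have
width = max(len(s.split(";")) for s in samples)
padded = [pad_ids(s, width) for s in samples]
print(padded)
```

<p>With a DataFrame, the same function could be passed to <code>apply</code> with <code>width=...</code> as a keyword argument.</p>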
<p>Then split the text column into separate columns and drop the original column:</p>
<pre><code># Split the district IDs into separate columns, one per position
district_columns = [f"district_name{n+1}" for n in range(5)]
df1[district_columns] = list(df1['district_id'].str.split(';'))
# drop() targets rows by default, so name the column explicitly
df1.drop(columns='district_id', inplace=True)
</code></pre>
<p>Then build a mapping from each ID in df2 to its name, and use that mapping to replace the values in the new columns:</p>
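<pre><code># Keys are converted to str so they match the split string values
id_to_name = {str(ii): nn for ii, nn in zip(df2['district_id'], df2['district_name'])}
for col in district_columns:
    # Padded (empty) entries have no match, so .get returns None for them
    df1[col] = df1[col].apply(id_to_name.get)
</code></pre>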
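<p>The lookup relies on <code>dict.get</code> returning <code>None</code> for keys it does not know, which is what happens to the empty strings left by the padding. A small plain-Python illustration, using made-up district names:</p>

```python
# Made-up mapping; in the answer this is built from df2
id_to_name = {"1": "North", "2": "South"}

# One row after padding and splitting: two real IDs, three empty slots
row = ["1", "2", "", "", ""]

# Unknown keys (here the empty strings) come back as None
names = [id_to_name.get(value) for value in row]

# A second argument to .get supplies a different placeholder
names_blank = [id_to_name.get(value, "") for value in row]
print(names)
print(names_blank)
```

<p>If <code>None</code> is not wanted in the result, a small wrapper function that calls <code>.get</code> with a default could be passed to <code>apply</code> instead.</p>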
<p>As I said, I'm sure there are other ways to do this, but this should work.</p>
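<p>One such alternative, as a rough sketch rather than a definitive solution: <code>str.split</code> with <code>expand=True</code> produces the columns directly, so no padding is needed, and <code>Series.map</code> does the lookup. The sample frames are invented for illustration; only the column names <code>district_id</code> and <code>district_name</code> come from the question, and the <code>user</code> column is an assumption.</p>

```python
import pandas as pd

# Invented sample data; the real df1/df2 will look different
df1 = pd.DataFrame({"user": ["a", "b"], "district_id": ["1;2", "3"]})
df2 = pd.DataFrame(
    {"district_id": [1, 2, 3], "district_name": ["North", "South", "East"]}
)

# ID (as a string) -> name
id_to_name = dict(zip(df2["district_id"].astype(str), df2["district_name"]))

# expand=True returns one column per position; shorter rows are filled
# with missing values, so the number of columns follows the data
parts = df1["district_id"].str.split(";", expand=True)
parts.columns = [f"district_name{n + 1}" for n in parts.columns]

# Replace every ID with its name; unmatched or missing entries become NaN
names = parts.apply(lambda col: col.map(id_to_name))

result = df1.drop(columns="district_id").join(names)
print(result)
```

<p>Unlike the version above, the number of name columns here is not fixed at five but follows the longest row in the data.</p>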