<p>Let's use <a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.str.replace.html" rel="nofollow noreferrer"><code>str.replace</code></a> and <a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.str.split.html" rel="nofollow noreferrer"><code>str.split</code></a> to turn the headers into a usable MultiIndex, then <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.stack.html" rel="nofollow noreferrer"><code>stack</code></a> to go from wide format to long. Finally, <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.GroupBy.cumcount.html" rel="nofollow noreferrer"><code>groupby cumcount</code></a> creates the <code>BaseYear</code> column.</p>
<pre><code># Move refnum into the index so it is kept out of the column reshaping
df = df.set_index('refnum')
# Move the number in each column name to the end, then split on '/'
# to build a two-level MultiIndex (name, number)
df.columns = (
    df.columns.str.replace(r'^(.*?)(\d+)(.*)$', r'\1\3/\2', regex=True)
      .str.split('/', expand=True)
)
# Wide format to long (stack the number level), then rename the columns
df = df.stack().droplevel(-1).reset_index().rename(
    columns={'y': 'Year', 'ygp': 'GP', 'yrev': 'REV'}
)
# Add the BaseYear column: a running count of rows within each refnum
df['BaseYear'] = "BaseYear+" + df.groupby('refnum').cumcount().astype(str)
# df['BaseYear'] = df.groupby('refnum').cumcount()  # (int version)
</code></pre>
<p><code>df</code>:</p>
<pre><code>   refnum  Year   GP  REV    BaseYear
0   10001  2021  200  300  BaseYear+0
1   10001  2022  600  100  BaseYear+1
2   10001  2023  300  300  BaseYear+2
3   10002  2020  200  300  BaseYear+0
4   10002  2021  500  200  BaseYear+1
5   10002  2022  300  300  BaseYear+2
6   10003  2021  200  300  BaseYear+0
7   10003  2022  500  500  BaseYear+1
8   10003  2023  300  300  BaseYear+2
</code></pre>
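<p>To see what the header rewrite does on its own, here is a minimal sketch. The wide input is an assumption: the column names (<code>y1</code>, <code>y1gp</code>, <code>y1rev</code>, ...) are hypothetical and the values are reconstructed from the output shown above, so adjust them to your actual frame.</p>

```python
import pandas as pd

# Hypothetical wide input (column names assumed, values taken from the output above)
wide = pd.DataFrame({
    'refnum': [10001, 10002, 10003],
    'y1': [2021, 2020, 2021], 'y1gp': [200, 200, 200], 'y1rev': [300, 300, 300],
    'y2': [2022, 2021, 2022], 'y2gp': [600, 500, 500], 'y2rev': [100, 200, 500],
    'y3': [2023, 2022, 2023], 'y3gp': [300, 300, 300], 'y3rev': [300, 300, 300],
}).set_index('refnum')

# Step 1: move the digits to the end of each name, e.g. 'y1gp' becomes 'ygp/1'
renamed = wide.columns.str.replace(r'^(.*?)(\d+)(.*)$', r'\1\3/\2', regex=True)
print(list(renamed))

# Step 2: split on '/' so each column becomes a (name, number) pair
wide.columns = renamed.str.split('/', expand=True)
print(wide.columns.tolist())

# Step 3: stack the number level to get one row per refnum and number
long = wide.stack().droplevel(-1).reset_index()
print(long)
```

<p>Once the number lives in its own column level, <code>stack</code> can pivot it into rows, which is why the rename has to happen first.</p>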