<p>If the data does not change often, an inverse index is probably the most efficient solution:</p>
<pre><code>import numpy as np
import pandas as pd
df = pd.DataFrame({
    "Country": ["USA", "India"],
    "State": ["Texas", "Maharashtra"],
    "Population": [100_000, 200_000],
})
# Create an inverse index (cell value to column position) - must be done only once.
# pd.concat is used because Series.append was removed in pandas 2.0.
inverse_map = pd.concat([
    pd.Series(np.repeat(idx, len(df[column])), index=df[column])
    for idx, column in enumerate(df.columns)
])
# This should be fast - even for many queries:
df.columns[inverse_map.loc["Maharashtra"]]
# Output: 'State'
</code></pre>
<p>I store the column positions in the inverse map rather than the column names to save memory.</p>
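<p>If a value might be missing, or might occur in more than one column, the lookup can be wrapped in small helpers. The following is only a sketch under those assumptions: <code>build_inverse_map</code> and <code>find_columns</code> are hypothetical names introduced here for illustration, not part of pandas.</p>

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["USA", "India"],
    "State": ["Texas", "Maharashtra"],
    "Population": [100_000, 200_000],
})


def build_inverse_map(frame: pd.DataFrame) -> pd.Series:
    """Map each distinct cell value to the position of its column."""
    parts = [
        # One entry per distinct value; the scalar position is broadcast.
        pd.Series(pos, index=pd.Index(frame[col].unique(), dtype=object))
        for pos, col in enumerate(frame.columns)
    ]
    return pd.concat(parts)


def find_columns(frame: pd.DataFrame, inverse_map: pd.Series, value) -> list:
    """Return the names of all columns containing value (empty if none)."""
    if value not in inverse_map.index:
        return []
    # A list indexer always yields a Series, even for a single match.
    positions = inverse_map.loc[[value]]
    return list(frame.columns[positions.to_numpy()])


inverse_map = build_inverse_map(df)  # build once, query many times
print(find_columns(df, inverse_map, "Maharashtra"))
print(find_columns(df, inverse_map, "Atlantis"))
```

<p>Returning a list keeps the result type the same whether the value matches zero, one, or several columns. As with the original approach, the map has to be rebuilt whenever the data changes.</p>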