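<p>Note that I have renamed your first Head1 column to Header (your example had duplicate column names).</p>
<p>My DataFrame setup differs from yours, but it is close enough. I did not fill in the columns that are irrelevant to the question.</p>
<p>Here is my setup code:</p>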
<pre><code>import pandas as pd
df = pd.DataFrame([],
                  columns=["Header",
                           "LongHeader",
                           "Head0",
                           "Strand0",
                           "Distance0",
                           "Head1",
                           "Strand1",
                           "Distance1",
                           "Head2",
                           "Strand2",
                           "Distance2"])
df["Header"] = ["ABC", "EFG", "HIJ", "KLM", "SOS"]
df["LongHeader"] = ["1", "2", "3", "4", "5"]
df["Head0"] = ["SAP", "HES3", "CORT", "AAD", "MFA"]
df["Strand0"] = ["+", "-", "-", "-", "-"]
df["Distance0"] = ["115590", "6350", "19440", "25488", "11174"]
df["Head1"] = ["GRN", "CMT", "API", "DH", "13A2"]
df["Strand1"] = ["+", "-", "-", "-", "-"]
df["Distance1"] = ["426250", "1902", "177", "1341", "19763"]
df["Head2"] = ["None", "None", "None", "DSQ", "None"]
df["Strand2"] = ["+", "-", "-", "-", "-"]
df["Distance2"] = ["None", "None", "None", "120001", "None"]
print(df)
</code></pre>
<p>This produces data similar to your example:</p>
<pre><code> Header LongHeader Head0 Strand0 Distance0 Head1 Strand1 Distance1 Head2
0 ABC 1 SAP + 115590 GRN + 426250 None
1 EFG 2 HES3 - 6350 CMT - 1902 None
2 HIJ 3 CORT - 19440 API - 177 None
3 KLM 4 AAD - 25488 DH - 1341 DSQ
4 SOS 5 MFA - 11174 13A2 - 19763 None
</code></pre>
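<p>Here is the code that does the job. The main idea is to extract each pair of HeadX and DistanceX columns and simply stack the pairs on top of one another. Then convert Distance to int and keep only the rows where Distance is at least 100000.</p>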
<pre><code>frames_to_concat = []
for col in df:
    if col.startswith("Dis"):
        dis_num = col[-1]  # Extract the # from a column like Distance# or Dis# (single digit only)
        frame_to_concat = df[["Header", "Head" + dis_num, "Distance" + dis_num]].copy()
        frame_to_concat.columns = ["Header", "Head", "Distance"]
        frames_to_concat.append(frame_to_concat)
stacked_columns = pd.concat(frames_to_concat)
# Drop the placeholder "None" strings before converting; copy to avoid SettingWithCopyWarning
stacked_columns = stacked_columns[stacked_columns["Distance"] != "None"].copy()
stacked_columns["Distance"] = stacked_columns["Distance"].astype(int)
# Keep rows whose Distance is at least 100000
result = stacked_columns[stacked_columns["Distance"] >= 100000]
print(result)
</code></pre>
<p>This gives:</p>
<pre><code># Output:
Header Head Distance
0 ABC SAP 115590
0 ABC GRN 426250
3 KLM DSQ 120001
</code></pre>
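<p>If your real data has ten or more Head/Distance pairs, taking only the last character of the column name will no longer work. Below is a rough sketch of the same stacking idea wrapped in a function that matches the numeric suffix with a regular expression and uses <code>pd.to_numeric</code> to turn the "None" strings into NaN. The function name <code>stack_head_distance</code> and its parameters are my own invention for illustration, and the small frame is just a trimmed copy of the setup data above.</p>

```python
import re

import pandas as pd


def stack_head_distance(df, id_col="Header", min_distance=100000):
    """Stack every (HeadN, DistanceN) pair and keep rows with a large Distance.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    frames = []
    for col in df.columns:
        # Match Distance0, Distance1, ..., Distance12, etc.
        match = re.fullmatch(r"Distance(\d+)", col)
        if match is None:
            continue
        suffix = match.group(1)
        part = df[[id_col, "Head" + suffix, col]].copy()
        part.columns = [id_col, "Head", "Distance"]
        frames.append(part)

    stacked = pd.concat(frames, ignore_index=True)
    # Non-numeric entries such as the string "None" become NaN
    stacked["Distance"] = pd.to_numeric(stacked["Distance"], errors="coerce")
    stacked = stacked.dropna(subset=["Distance"])
    stacked = stacked.assign(Distance=stacked["Distance"].astype(int))
    return stacked[stacked["Distance"] >= min_distance].reset_index(drop=True)


# Trimmed copy of the setup data (Strand and LongHeader columns omitted)
sample = pd.DataFrame({
    "Header": ["ABC", "EFG", "HIJ", "KLM", "SOS"],
    "Head0": ["SAP", "HES3", "CORT", "AAD", "MFA"],
    "Distance0": ["115590", "6350", "19440", "25488", "11174"],
    "Head1": ["GRN", "CMT", "API", "DH", "13A2"],
    "Distance1": ["426250", "1902", "177", "1341", "19763"],
    "Head2": ["None", "None", "None", "DSQ", "None"],
    "Distance2": ["None", "None", "None", "120001", "None"],
})

filtered = stack_head_distance(sample)
print(filtered)
```

<p>Unlike the version above, this one resets the index, so the original row labels (0, 0, 3) are not preserved. Drop the <code>ignore_index</code> and <code>reset_index</code> calls if you need them.</p>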
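<p>Next time you ask a question, go easier on potential answerers. Provide setup code!!!</p>
<p>You will have to modify this solution slightly to match your actual column names. Because of the duplicate column names, I am not sure what they should really be called. Hope this helps!</p>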
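<p>As for the duplicate column name itself, one possible way to do the rename I described at the top is to relabel only the first occurrence by position. This is a minimal sketch that assumes your frame really does contain two columns both called Head1, with the first one acting as the row identifier. The one-row frame <code>raw</code> is made up purely to demonstrate the rename.</p>

```python
import pandas as pd

# Made-up one-row frame with a duplicated "Head1" column name
raw = pd.DataFrame([["ABC", "1", "GRN"]],
                   columns=["Head1", "LongHeader", "Head1"])

cols = list(raw.columns)
# list.index returns the position of the first match only
cols[cols.index("Head1")] = "Header"
raw.columns = cols

print(raw.columns.tolist())
```

<p>Renaming by position avoids <code>df.rename(columns=...)</code>, which would rename both Head1 columns at once.</p>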