回答此问题可获得 20 贡献值,回答如果被采纳可获得 50 分。
<p>我创建了一个代码,用一个<code>excel</code>文件将同名但在不同表中的列连接起来</p>
<p>这是我的最新代码</p>
<pre><code>import os
import pandas as pd
import numpy as np
mydir = (os.getcwd()).replace('\\', '/') + '/'
cluster_cell = pd.read_excel(r'' + mydir + 'Cluster_Cell.xlsx', usecols=['CELLNAME'], encoding='latin-1')
GUL_EP_GSM = pd.read_excel(r'' + mydir + 'GU EP20191016171519.xlsx', sheet_name="GSM_CDDData", usecols=['CELLNAME'],
encoding='latin-1')
print(GUL_EP_GSM)
GUL_EP_UMTS = pd.read_excel(r'' + mydir + 'GU EP20191016171519.xlsx', sheet_name="UMTS_CDDData", usecols=['CELLNAME'],
encoding='latin-1')
print(GUL_EP_UMTS)
GUL_EP_LTE = pd.read_excel(r'' + mydir + 'GU EP20191016171519.xlsx', sheet_name="LTE_CDDData", usecols=['CELLNAME'],
encoding='latin-1')
print(GUL_EP_LTE)
appended_cell_name = pd.DataFrame({'CELLNAME': cluster_cell['CELLNAME'] + GUL_EP_GSM['CELLNAME'] + GUL_EP_UMTS['CELLNAME'] + GUL_EP_LTE['CELLNAME']})
mask1 = appended_cell_name['CELLNAME'].isin(GUL_EP_GSM['CELLNAME'])
mask2 = appended_cell_name['CELLNAME'].isin(GUL_EP_UMTS['CELLNAME'])
mask3 = appended_cell_name['CELLNAME'].isin(GUL_EP_LTE['CELLNAME'])
cond=[mask1,mask2,mask3]
values=['GSM','UMTS','LTE']
appended_cell_name['system']=np.select(cond,values)
GUL_EP_GSM['CELLNAME']
GUL_EP_UMTS['CELLNAME']
GUL_EP_LTE['CELLNAME']
appended_cell_name['system'].unique()
print(appended_cell_name)
appended_cell_name.to_excel("Cluser_Cell_Final.xlsx", index=False)
</code></pre>
<p>所以现在我需要为我创建的名为'System<code>contains</code>values<code>related to</code>pandas'的连接对象创建一个列</p>
<p>使它深入。。。。
为了示例:-
对于<code>GUL_EP_GSM</code>对象,我想用值<code>GSM</code>向新创建的<code>column</code>添加值;对于<code>GUL_EP_UMTS</code>,我想向新创建的<code>column</code><code>UMTS</code>添加值;对于<code>GUL_EP_LTE</code>,我想向新创建的<code>column</code><code>LTE</code>添加值</p>
<p>如下图所示</p>
<p>这是以前的,只是我的代码</p>
<p><a href="https://i.stack.imgur.com/BoxyE.png" rel="nofollow noreferrer">before</a></p>
<p>这是之后我想做的</p>
<p><a href="https://i.stack.imgur.com/6tQfJ.png" rel="nofollow noreferrer">after</a></p>
<p>你知道怎么做吗?你知道吗</p>
<p>我需要把我的代码简化成最简单的方式。。。。你知道吗</p>
<p>错误</p>
<pre><code>Traceback (most recent call last):
File "C:/Users/mwx825326/PycharmProjects/MyExcelCombine/MyCopyAndPaste.py", line 11, in <module>
appended_cell_name=pd.DataFrame({'CELLNAME':cluster_cell + GUL_EP_GSM + GUL_EP_UMTS + GUL_EP_LTE})
File "C:\Users\mwx825326\PycharmProjects\MyExcelCombine\venv\lib\site-packages\pandas\core\frame.py", line 411, in __init__
mgr = init_dict(data, index, columns, dtype=dtype)
File "C:\Users\mwx825326\PycharmProjects\MyExcelCombine\venv\lib\site-packages\pandas\core\internals\construction.py", line 257, in init_dict
return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
File "C:\Users\mwx825326\PycharmProjects\MyExcelCombine\venv\lib\site-packages\pandas\core\internals\construction.py", line 77, in arrays_to_mgr
index = extract_index(arrays)
File "C:\Users\mwx825326\PycharmProjects\MyExcelCombine\venv\lib\site-packages\pandas\core\internals\construction.py", line 358, in extract_index
raise ValueError("If using all scalar values, you must pass an index")
ValueError: If using all scalar values, you must pass an index
</code></pre>
<h2>打印输出</h2>
<p><code>appended_cell_name</code></p>
<pre><code>
CELLNAME system
0 3129648.0 0
1 3129652.0 0
2 3129656.0 0
3 3141617.0 0
4 3141621.0 0
... ... ...
2347 NaN 0
2348 NaN 0
2349 NaN 0
2350 NaN 0
2351 NaN 0
</code></pre>
<p><code>GUL_EP_GSM</code></p>
<pre><code> CELLNAME
0 10169
1 10170
2 10171
3 10349
4 10350
.. ...
124 54237
125 54996
126 54997
127 54998
128 55966
</code></pre>
<p><code>GUL_EP_UMTS</code></p>
<pre><code> CELLNAME
0 1106382
1 1106383
2 1106384
3 1019810
4 1019811
.. ...
262 1102891
263 1102892
264 1018393
265 1018394
266 1018395
</code></pre>
<p><code>GUL_EP_LTE</code></p>
<pre><code> CELLNAME
0 2002388
1 2002389
2 2002390
3 2100118
4 2100119
.. ...
538 2304376
539 2304377
540 2205890
541 2205891
542 2205892
</code></pre>
<p><code>mask1</code>输出</p>
<pre><code>0 False
1 False
2 False
3 False
4 False
...
2347 False
2348 False
2349 False
2350 False
2351 False
Name: CELLNAME, Length: 2352, dtype: bool
</code></pre>
<p><code>mask2</code>输出</p>
<pre><code>0 False
1 False
2 False
3 False
4 False
...
2347 False
2348 False
2349 False
2350 False
2351 False
Name: CELLNAME, Length: 2352, dtype: bool
</code></pre>
<p><code>mask3</code>输出</p>
<pre><code>0 False
1 False
2 False
3 False
4 False
...
2347 False
2348 False
2349 False
2350 False
2351 False
Name: CELLNAME, Length: 2352, dtype: bool
</code></pre>
<p>的输出</p>
<pre><code>print(GUL_EP_LTE.info())
print(GUL_EP_UMTS.info())
print(GUL_EP_LTE.info())
</code></pre>
<p>是</p>
<pre><code>[2352 rows x 2 columns]
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 543 entries, 0 to 542
Data columns (total 1 columns):
CELLNAME 543 non-null int64
dtypes: int64(1)
memory usage: 4.3 KB
None
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 267 entries, 0 to 266
Data columns (total 1 columns):
CELLNAME 267 non-null int64
dtypes: int64(1)
memory usage: 2.1 KB
None
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 543 entries, 0 to 542
Data columns (total 1 columns):
CELLNAME 543 non-null int64
dtypes: int64(1)
memory usage: 4.3 KB
None
</code></pre>
<p><code>print(appended_cell_name. info())</code></p>
<pre><code><class 'pandas.core.frame.DataFrame'>
Int64Index: 2352 entries, 0 to 2351
Data columns (total 2 columns):
CELLNAME 129 non-null float64
system 2352 non-null object
dtypes: float64(1), object(1)
memory usage: 45.9+ KB
None
</code></pre>