<p>I want to merge two lists in Python and then filter the resulting list.</p>
<p>I have the following DataFrame <code>df</code>:</p>
<pre><code>+---+---+---+
| v1| v2|  v|
+---+---+---+
|  2|  4| 24|
|  4|  2| 42|
|  1|  1| 11|
|  1|  3| 13|
|  2|  2| 22|
+---+---+---+
</code></pre>
<p>I have two broadcast variables (built with collectAsMap):</p>
<ul>
<li>t1:<code>{'3': ['4'], '1': ['2', '4', '3'], '2': ['3', '4']}</code></li>
<li>t2:<code>{'3': ['4'], '5': ['6'], '1': ['2']}</code></li>
</ul>
<p>To merge and filter the lists, I tried the following:</p>
<pre><code>from pyspark.sql.functions import udf
from pyspark.sql.types import ArrayType, StringType
merge_udf = udf(merge, ArrayType(StringType()))
df = df.distinct().withColumn('MergeList', merge_udf(df.v1, df.v2))
</code></pre>
<p>where:</p>
<pre><code>def merge2List(listA, listB):
    """Merge two lists into one list."""
    merge = [(itemA + itemB) for itemA in listA for itemB in listB]
    return merge


def merge(x, y):
    """Merge the entries of two DataFrame columns."""
    listA = t1.value.get(x)
    if listA is None:
        # No entry in t1: fall back to the value itself
        listA = []
        listA.append(x)
    listB = t2.value.get(y)
    if listB is None:
        # No entry in t2: fall back to the value itself
        listB = []
        listB.append(y)
    m = merge2List(listA, listB)
    return m
</code></pre>
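<p>To make the merge step easier to check without a Spark session, here is a minimal plain-Python sketch of the same logic. The names <code>T1</code>, <code>T2</code>, <code>merge_pairs</code> and <code>merge_local</code> are hypothetical; the dicts simply stand in for <code>t1.value</code> and <code>t2.value</code>, and the column values are assumed to be strings.</p>

```python
# Assumption: plain dicts stand in for the broadcast values t1.value and t2.value.
T1 = {'3': ['4'], '1': ['2', '4', '3'], '2': ['3', '4']}
T2 = {'3': ['4'], '5': ['6'], '1': ['2']}


def merge_pairs(list_a, list_b):
    """Concatenate every item of list_a with every item of list_b."""
    return [a + b for a in list_a for b in list_b]


def merge_local(x, y, t1=T1, t2=T2):
    """Local stand-in for merge(): a missing key falls back to the key itself."""
    list_a = t1.get(x, [x])
    list_b = t2.get(y, [y])
    return merge_pairs(list_a, list_b)


# The (v1, v2) pairs from the DataFrame above, assumed to be strings.
rows = [('2', '4'), ('4', '2'), ('1', '1'), ('1', '3'), ('2', '2')]
for v1, v2 in rows:
    print(v1, v2, merge_local(v1, v2))
```

<p>Using <code>dict.get(key, [key])</code> builds a fresh fallback list on each call, so the lists stored in the dicts are never modified.</p>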
<p>The result is:</p>
<pre><code>+---+---+------------+
| v1| v2|   MergeList|
+---+---+------------+
|  2|  4|    [34, 44]|
|  4|  2|        [42]|
|  1|  1|[22, 42, 32]|
|  1|  3|[24, 44, 34]|
|  2|  2|    [32, 42]|
+---+---+------------+
</code></pre>
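<p>I also have a broadcast variable <code>t3</code>, for which <code>print(list(t3.value.keys()))</code> gives <code>['24', '42', '11', '13', '22']</code>.</p>
<p>Now I want to filter the elements of each list in the MergeList column, keeping only those that are keys of <code>t3</code>. To do that, I wrote the following function and updated merge2List:</p>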
<pre><code>def filterList(v):
    vert = list(t3.value.keys())
    if v in vert:
        return True
    return False


def merge2List(listA, listB):
    """Merge two lists into one list."""
    merge = [(itemA + itemB) for itemA in listA for itemB in listB]
    filteredList = filter(filterList, merge)
    return filteredList
</code></pre>
<p>This raises the following exception:</p>
<pre><code>_pickle.PicklingError: Can't pickle &lt;function filterList at 0x2b2fb1aa6840&gt;: attribute lookup filterList on __main__ failed
</code></pre>
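<p>For reference, the filtering I am after can be sketched in plain Python as follows. This only illustrates the intended result and does not by itself explain the pickling error. <code>T3_KEYS</code> and <code>merge_and_filter</code> are hypothetical names, and the set stands in for <code>t3.value.keys()</code>.</p>

```python
# Assumption: a plain set stands in for the keys of the broadcast value t3.value.
T3_KEYS = {'24', '42', '11', '13', '22'}


def merge_and_filter(list_a, list_b, allowed=T3_KEYS):
    """Concatenate all pairs, then keep only the items found in `allowed`."""
    merged = [a + b for a in list_a for b in list_b]
    # A list comprehension returns a real list; in Python 3, filter()
    # returns a lazy iterator instead.
    return [item for item in merged if item in allowed]


print(merge_and_filter(['2', '4', '3'], ['2']))
print(merge_and_filter(['3', '4'], ['4']))
```

<p>With the data above, the first call keeps <code>'22'</code> and <code>'42'</code> and drops <code>'32'</code>; the second call returns an empty list because neither <code>'34'</code> nor <code>'44'</code> is a key of <code>t3</code>.</p>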
<p>Can anyone help me figure out what I am doing wrong?</p>