<p><a href="https://docs.python.org/3/library/functools.html#functools.reduce" rel="nofollow noreferrer"><code>functools.reduce</code></a> could be useful here:</p>
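<pre><code># Assumes `spark` is an existing SparkSession (e.g. the one the pyspark shell provides).
df = spark.createDataFrame(
    [(0, 1, 1, 2, 1), (0, 0, 1, 0, 1), (1, 0, 1, 1, 1)],
    ['a', 'b', 'c', 'd', 'e'],
)
cols = ['a', 'b', 'd']  # columns to test for non-zero values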
</code></pre>
<p>Use <code>reduce</code> to build the filter expression:</p>
<pre><code>from functools import reduce
# OR together one "column != 0" condition per column in cols
predicate = reduce(lambda a, b: a | b, [df[x] != 0 for x in cols])
print(predicate)
# Column<b'(((NOT (a = 0)) OR (NOT (b = 0))) OR (NOT (d = 0)))'>
</code></pre>
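<p>To see why the printed predicate is nested the way it is, here is a minimal sketch of the same fold in plain Python, with no Spark involved. The strings are only stand-ins for the <code>Column</code> objects, and the name <code>expression</code> is introduced purely for illustration:</p>

```python
from functools import reduce

cols = ['a', 'b', 'd']

# Stand-ins for the Spark Column conditions: each one is just a string here.
conditions = [f"(NOT ({c} = 0))" for c in cols]

# reduce combines the first two items, then combines that result with the
# third item, and so on, which produces the left-nested parentheses.
expression = reduce(lambda a, b: f"({a} OR {b})", conditions)
print(expression)
# (((NOT (a = 0)) OR (NOT (b = 0))) OR (NOT (d = 0)))
```

<p>The real predicate is built the same way, except that <code>|</code> on two <code>Column</code> objects returns a new <code>Column</code> rather than a string.</p>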
<p>Then <code>filter</code> using the <code>predicate</code> (<code>where</code> is an alias for <code>filter</code>):</p>
<pre><code>df.where(predicate).show()
+---+---+---+---+---+
|  a|  b|  c|  d|  e|
+---+---+---+---+---+
|  0|  1|  1|  2|  1|
|  1|  0|  1|  1|  1|
+---+---+---+---+---+
</code></pre>
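<p>As a sanity check, the same rule ("keep a row if any of the chosen columns is non-zero") can be sketched on plain tuples without Spark. This only illustrates the logic and is not how Spark evaluates the filter; <code>names</code>, <code>keep</code>, and <code>kept</code> are hypothetical names used for the example:</p>

```python
from functools import reduce
from operator import or_

rows = [(0, 1, 1, 2, 1), (0, 0, 1, 0, 1), (1, 0, 1, 1, 1)]
names = ['a', 'b', 'c', 'd', 'e']
cols = ['a', 'b', 'd']

def keep(row):
    """Return True if any of the selected columns is non-zero in this row."""
    record = dict(zip(names, row))
    # Same fold as the Spark predicate, but over plain booleans.
    return reduce(or_, [record[c] != 0 for c in cols])

kept = [row for row in rows if keep(row)]
print(kept)
# [(0, 1, 1, 2, 1), (1, 0, 1, 1, 1)]
```

<p>The second row is dropped because <code>a</code>, <code>b</code>, and <code>d</code> are all zero there, which matches the two rows in the Spark output.</p>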