<p>Converting the bytearray to an array of integers with a UDF may help:</p>
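<pre><code>import pandas as pd
import pyspark.sql.functions as f
from pyspark.sql import SparkSession
from pyspark.sql.types import ArrayType, IntegerType

spark = SparkSession.builder.getOrCreate()  # reuses the active session if one exists
# Iterating over a bytearray yields one int (0-255) per byte
byte_to_int_udf = f.udf(lambda z: [int(y) for y in z], ArrayType(IntegerType()))
df = pd.DataFrame({'content': [bytearray(b'\x01%\xeb\x8cH\x89')]})
df1 = spark.createDataFrame(df)
# Convert the binary column to an array, then explode it into one row per byte
df1.withColumn("content_array", byte_to_int_udf(f.col('content'))).select(f.explode(f.col('content_array'))).show()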
</code></pre>
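<p>To see what the UDF does to each value without starting Spark, the conversion can be sketched in plain Python. The function name <code>bytes_to_ints</code> is hypothetical and used only for illustration; it applies the same per-byte logic as the lambda above.</p>

```python
def bytes_to_ints(data):
    """Return one int (0-255) per byte, mirroring the UDF's lambda."""
    # Iterating over bytes or bytearray in Python 3 yields ints directly.
    return [int(b) for b in data]


sample = bytearray(b'\x01%\xeb\x8cH\x89')
values = bytes_to_ints(sample)
print(values)  # [1, 37, 235, 140, 72, 137]
```

<p>Each element of this list is what should end up as one row after <code>explode</code> in the Spark version.</p>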