A function that operates on a Spark Column passed as an argument

Posted 2024-09-30 14:22:50


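Edit: I finally figured it out myself. I had been calling select() on the column inside the function, which is why it did not work. I have added my solution as a comment in the code of the original question, in case it is useful to others.

I am taking an online course in which I am supposed to complete the following function: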

# TODO: Replace <FILL IN> with appropriate code

# Note that you shouldn't use any RDD operations or need to create custom user defined functions (udfs) to accomplish this task

from pyspark.sql.functions import regexp_replace, trim, col, lower

def removePunctuation(column):
    """Removes punctuation, changes to lower case, and strips leading and trailing spaces.

    Note:
        Only spaces, letters, and numbers should be retained.  Other characters should be
        eliminated (e.g. it's becomes its).  Leading and trailing spaces should be removed after
        punctuation is removed.

    Args:
        column (Column): A Column containing a sentence.

    Returns:
        Column: A Column named 'sentence' with clean-up operations applied.
    """

    # EDIT: MY SOLUTION
    # column = lower(column)
    # column = regexp_replace(column, r'([^a-z\d\s])+', r'')
    # return trim(column).alias('sentence')

    return <FILL IN>

sentenceDF = sqlContext.createDataFrame([('Hi, you!',),
                                         (' No under_score!',),
                                         (' *      Remove punctuation then spaces  * ',)], ['sentence'])
sentenceDF.show(truncate=False)
(sentenceDF
 .select(removePunctuation(col('sentence')))
 .show(truncate=False))

I have already written code that produces the desired output when it operates on the DataFrame itself:

[code block missing in the source]

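I just do not know how to put this code into my function, because the function does not operate on a DataFrame but only on the given column. I have tried different approaches. One was to use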

[...]
df = sqlContext.createDataFrame(column, ['sentence'])
[...]

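inside the function, but that does not work: TypeError: Column is not iterable. Other approaches, which tried to operate on column directly inside the function, always resulted in TypeError: 'Column' object is not callable.

I started with (Py)Spark only a few days ago and still have conceptual problems with how to work with just rows and columns. I would really appreciate any help with this problem.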


1 answer

Answer #1 · Posted 2024-09-30 14:22:50

You can do it in one line.

return re.sub(r'[^a-z0-9\s]','',text.lower().strip()).strip()
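Note that the one-liner above works on a plain Python string, not on a Spark Column, and it needs `import re`. The following is a minimal sketch of it as a complete function, checked against the three sample sentences from the question. The function name `remove_punctuation_str` is hypothetical and not part of the course material. As far as I can tell, calling `text.lower()` on a Column is one way to trigger the "'Column' object is not callable" error mentioned in the question, so for the exercise itself the column-level functions `lower`, `regexp_replace` and `trim` from the asker's own solution are the ones to use.

```python
import re


def remove_punctuation_str(text):
    """Plain-string version of the clean-up (hypothetical helper, not Spark code).

    Lowercases the text, drops every character that is not a lowercase
    letter, a digit, or whitespace, and strips leading/trailing spaces.
    """
    return re.sub(r'[^a-z0-9\s]', '', text.lower().strip()).strip()


# The three sample sentences from the question's sentenceDF.
samples = ['Hi, you!',
           ' No under_score!',
           ' *      Remove punctuation then spaces  * ']
cleaned = [remove_punctuation_str(s) for s in samples]
print(cleaned)
```

This is only useful for checking the regular expression locally. Applying it to a DataFrame would require a UDF, which the exercise explicitly rules out.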
