How to use a Python UDF as inline code, without a script file

Posted 2024-09-28 21:55:03


I am using the Python package impyla to connect to Hive programmatically; I am not using the Hive CLI. I am trying to use a UDF written in Python.

All the tutorials I have seen do it like this:

ADD FILE myscript.py;
...
SELECT TRANSFORM (cols...)
USING 'python myscript.py'
AS ...

I assume the USING part can be any executable that does the right thing. So I figured I could pass the script on the fly as a string, like this:

USING 'python -c "import sys; ..."'

This would nicely avoid having to deal with transferring files to Hadoop. However, I am having trouble getting it to work.
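For illustration, here is a minimal sketch of how such an inline USING clause might be assembled and checked locally before it is sent to Hive. The names `ONE_LINER`, `build_transform_query` and `run_locally` are hypothetical, as is the upper-casing logic; the sketch assumes only that TRANSFORM passes rows to the command as tab-separated lines on stdin and reads tab-separated lines back from stdout. It has not been run against Hive or impyla, so it does not show that impyla accepts the resulting statement.

```python
import subprocess
import sys

# Hypothetical one-liner: upper-cases every row it receives on stdin.
# A generator expression is used because a `for` statement cannot follow
# `import sys;` on the same line.
ONE_LINER = "import sys; sys.stdout.writelines(l.upper() for l in sys.stdin)"


def build_transform_query(columns, table, one_liner=ONE_LINER):
    """Build a TRANSFORM query whose USING clause carries the code inline."""
    # The snippet is nested inside double quotes (for the shell) inside
    # single quotes (for HiveQL), so this sketch simply forbids quotes in it.
    if "'" in one_liner or '"' in one_liner:
        raise ValueError("one_liner must not contain quote characters")
    return (
        "SELECT TRANSFORM ({cols}) "
        "USING 'python -c \"{code}\"' "
        "AS ({cols}) FROM {table}"
    ).format(cols=", ".join(columns), code=one_liner, table=table)


def run_locally(one_liner, rows):
    """Feed tab-separated rows to the one-liner, roughly as TRANSFORM would."""
    text = "".join("\t".join(row) + "\n" for row in rows)
    result = subprocess.run(
        [sys.executable, "-c", one_liner],
        input=text,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout
```

Calling `run_locally(ONE_LINER, [("a", "b")])` exercises the snippet without a cluster, and `build_transform_query(["x", "y"], "t")` returns the statement text that would then be passed to the cursor. Forbidding quote characters is a deliberate simplification; anything more complex would need careful escaping at both the shell and the HiveQL level.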

After my real code did not work, I reduced it to this dummy code

USING 'python -c "print 3"'

just for debugging. The error I get is

E           impala.error.HiveServer2Error: Error while compiling statement: FAILED: IllegalArgumentException Can not create a Path from an empty string

More details below:

test_hive_udf_example.py:77: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../src/sunny/sql/hive.py:80: in read
    return super().read(sql, configuration=config)
../../src/sunny/sql/sql.py:136: in read
    self._execute(sql, **kwargs)
../../src/sunny/sql/sql.py:130: in _execute
    self._cursor.execute(sql, **kwargs)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:302: in execute
    configuration=configuration)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:343: in execute_async
    self._execute_async(op)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:362: in _execute_async
    operation_fn()
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:340: in op
    async=True)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:1027: in execute
    return self._operation('ExecuteStatement', req)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:957: in _operation
    resp = self._rpc(kind, request)
/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py:925: in _rpc
    err_if_rpc_not_ok(response)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

resp = TExecuteStatementResp(status=TStatus(statusCode=3, infoMessages=None, sqlState='42000', errorCode=40000, errorMessage=...mpiling statement: FAILED: IllegalArgumentException Can not create a Path from an empty string'), operationHandle=None)

    def err_if_rpc_not_ok(resp):
        if (resp.status.statusCode != TStatusCode.SUCCESS_STATUS and
                resp.status.statusCode != TStatusCode.SUCCESS_WITH_INFO_STATUS and
                resp.status.statusCode != TStatusCode.STILL_EXECUTING_STATUS):
>           raise HiveServer2Error(resp.status.errorMessage)
E           impala.error.HiveServer2Error: Error while compiling statement: FAILED: IllegalArgumentException Can not create a Path from an empty string

It seems to be complaining that it is looking for a script file, which my code does not mention at all.

By running the same statement through the Hive CLI instead of impyla, I verified that the syntax does not require a script file: USING 'python -c "..."' works there.

So the question now seems to be how to make this work through impyla.

Any pointers are welcome! Thanks!

