I followed this tutorial to get started with PySpark on Windows. These are the steps I followed:
- Installed Spark in %SPARK_HOME%
- [...] in the directory %SPARK_HOME%\bin
- Set %HADOOP_HOME% to the same directory as %SPARK_HOME%
- Set %PYSPARK_DRIVER_PYTHON% to ipython
- Set %PYSPARK_DRIVER_PYTHON_OPTS% to notebook
- Added %SPARK_HOME%\bin to %PATH%
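One way to double-check the variables listed above is a small script like the following. It is only a sketch: the function name check_spark_env and the wording of the messages are made up for illustration, and it verifies nothing beyond "the variable is set" and "%SPARK_HOME%\bin appears on PATH".

```python
import os

# Variables named in the setup steps above.
REQUIRED_VARS = (
    "SPARK_HOME",
    "HADOOP_HOME",
    "PYSPARK_DRIVER_PYTHON",
    "PYSPARK_DRIVER_PYTHON_OPTS",
)


def check_spark_env(env):
    """Return a list of human-readable problems found in the mapping `env`."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]

    spark_home = env.get("SPARK_HOME")
    if spark_home:
        # Windows paths are case-insensitive, so compare lower-cased entries
        # with any trailing slash or backslash removed.
        spark_bin = (spark_home.rstrip("\\/") + "\\bin").lower()
        entries = [
            entry.strip().rstrip("\\/").lower()
            for entry in env.get("PATH", "").split(";")
        ]
        if spark_bin not in entries:
            problems.append("%SPARK_HOME%\\bin is not on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_spark_env(os.environ) or ["no problems found"]:
        print(problem)
```

An empty result only means the variables exist; it says nothing about whether the Python packages behind ipython are healthy.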
But when I run
> pyspark --master local[2]
I get the following error:
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
Traceback (most recent call last):
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\runpy.py", line 170, in _run_module_as_main
"__main__", mod_spec)
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "D:\mahesh\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\Scripts\ipython.exe\__main__.py", line 9, in <module>
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\IPython\__init__.py", line 125, in start_ipython
return launch_new_instance(argv=argv, **kwargs)
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 657, in launch_instance
app.initialize(argv)
File "<decorator-gen-113>", line 2, in initialize
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 87, in catch_config_error
return method(app, *args, **kwargs)
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\IPython\terminal\ipapp.py", line 308, in initialize
super(TerminalIPythonApp, self).initialize(argv)
File "<decorator-gen-7>", line 2, in initialize
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 87, in catch_config_error
return method(app, *args, **kwargs)
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\IPython\core\application.py", line 450, in initialize
self.parse_command_line(argv)
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\IPython\terminal\ipapp.py", line 303, in parse_command_line
return super(TerminalIPythonApp, self).parse_command_line(argv)
File "<decorator-gen-4>", line 2, in parse_command_line
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 87, in catch_config_error
return method(app, *args, **kwargs)
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 514, in parse_command_line
return self.initialize_subcommand(subc, subargv)
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\IPython\core\application.py", line 243, in initialize_subcommand
return super(BaseIPythonApplication, self).initialize_subcommand(subc, argv)
File "<decorator-gen-3>", line 2, in initialize_subcommand
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 87, in catch_config_error
return method(app, *args, **kwargs)
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\traitlets\config\application.py", line 445, in initialize_subcommand
subapp = import_item(subapp)
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\ipython_genutils\importstring.py", line 31, in import_item
module = __import__(package, fromlist=[obj])
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\notebook\notebookapp.py", line 31, in <module>
from zmq.eventloop import ioloop
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\zmq\eventloop\__init__.py", line 3, in <module>
from zmq.eventloop.ioloop import IOLoop
File "d:\mahesh\softwares\python\winpython-64bit-3.4.4.4qt5\python-3.4.4.amd64\lib\site-packages\zmq\eventloop\ioloop.py", line 21, in <module>
from zmq import (
ImportError: cannot import name 'Poller'
I can run the Spark Scala shell correctly with the spark-shell command.
As you can see in the stack trace, I have WinPython installed at the path D:\mahesh\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64, so my %PYTHON_HOME% is D:\mahesh\Softwares\python\WinPython-64bit-3.4.4.4Qt5.
My %SPARK_HOME%, however, is D:\mahesh\Programs\spark-2.3.0-bin-hadoop2.7.
Running the where pyspark command produces the following output:
D:\mahesh\Programs\spark-2.3.0-bin-hadoop2.7\bin\pyspark
D:\mahesh\Programs\spark-2.3.0-bin-hadoop2.7\bin\pyspark.cmd
D:\mahesh\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\Scripts\pyspark
D:\mahesh\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\Scripts\pyspark.cmd
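The output above shows pyspark launchers in two places: %SPARK_HOME%\bin and the WinPython Scripts directory. Which one actually runs generally depends on the order of the directories on PATH. The sketch below reproduces that lookup in Python so the order can be inspected; find_all is a hypothetical helper, and the extension list is an assumption, not the real PATHEXT value.

```python
import os


def find_all(name, path_dirs, extensions=("", ".cmd", ".bat", ".exe")):
    """List every file called `name` (plus an optional extension), in PATH order."""
    matches = []
    for directory in path_dirs:
        if not directory:
            # Skip empty PATH entries instead of silently searching the
            # current working directory.
            continue
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate):
                matches.append(candidate)
    return matches


if __name__ == "__main__":
    path_dirs = os.environ.get("PATH", "").split(os.pathsep)
    for match in find_all("pyspark", path_dirs):
        print(match)
```

If the Scripts copy is listed before the Spark one, the two installations may be shadowing each other, which would be worth ruling out before looking elsewhere.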
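I believe my problem is that my Spark environment on Windows is misconfigured, which is why I have provided all of the information above. So what is going wrong here?
Note that I performed these steps without Anaconda and GOW (Gnu on Windows), which the tutorial suggests.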
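Point your %PYSPARK_DRIVER_PYTHON% to a virtual environment that contains all of the dependencies that provide 'Poller', then check again. Otherwise, you can try installing 'Poller' into the ipython environment (frankly, I don't know how to install it!).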
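In the traceback, the failing statement is `from zmq import (...)` inside zmq\eventloop\ioloop.py, so 'Poller' is a name the zmq package (pyzmq) is expected to export, not a package of its own. The following sketch reports where zmq is loaded from and whether it exposes Poller. The helper name check_attribute is made up for illustration. Reading a missing Poller as a sign of a broken or mismatched pyzmq install is an assumption, not a confirmed diagnosis.

```python
import importlib


def check_attribute(module_name, attribute):
    """Report where `module_name` is loaded from and whether it has `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return {
            "imported": False,
            "location": None,
            "version": None,
            "has_attribute": False,
            "error": str(exc),
        }
    return {
        "imported": True,
        # __file__ shows which installation was picked up.
        "location": getattr(module, "__file__", None),
        "version": getattr(module, "__version__", None),
        "has_attribute": hasattr(module, attribute),
        "error": None,
    }


if __name__ == "__main__":
    print(check_attribute("zmq", "Poller"))
```

Run it with the same interpreter that %PYSPARK_DRIVER_PYTHON% resolves to. If has_attribute is False, or location points somewhere unexpected, reinstalling pyzmq in that environment may be worth trying.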