PySpark startup problem on Windows 10 with Python 3.6


After installing Python 3.x and Anaconda, I cannot start PySpark on Windows. I get the error below:

Python 3.6.0 |Anaconda 4.3.0 (64-bit)| (default, Dec 23 2016, 11:57:41) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "C:\Users\prudra\Desktop\Udemy\spark-2.1.0-bin-hadoop2.7\bin\..\python\pyspark\shell.py", line 30, in <module>
    import pyspark
  File "C:\Users\prudra\Desktop\Udemy\spark-2.1.0-bin-hadoop2.7\python\pyspark\__init__.py", line 44, in <module>
    from pyspark.context import SparkContext
  File "C:\Users\prudra\Desktop\Udemy\spark-2.1.0-bin-hadoop2.7\python\pyspark\context.py", line 36, in <module>
    from pyspark.java_gateway import launch_gateway
  File "C:\Users\prudra\Desktop\Udemy\spark-2.1.0-bin-hadoop2.7\python\pyspark\java_gateway.py", line 31, in <module>
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 646, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 616, in _load_backward_compatible
  File "C:\Users\prudra\Desktop\Udemy\spark-2.1.0-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\java_gateway.py", line 18, in <module>
  File "C:\Users\prudra\AppData\Local\Continuum\Anaconda3\lib\pydoc.py", line 62, in <module>
    import pkgutil
  File "C:\Users\prudra\AppData\Local\Continuum\Anaconda3\lib\pkgutil.py", line 22, in <module>
    ModuleInfo = namedtuple('ModuleInfo', 'module_finder name ispkg')
  File "C:\Users\prudra\Desktop\Udemy\spark-2.1.0-bin-hadoop2.7\python\pyspark\serializers.py", line 393, in namedtuple
    cls = _old_namedtuple(*args, **kwargs)
TypeError: namedtuple() missing 3 required keyword-only arguments: 'verbose', 'rename', and 'module'

Please tell me how to fix this.
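Update: as far as I can tell from the traceback, the TypeError comes from Spark 2.1.0's serializers.py, which replaces collections.namedtuple with a copy made via types.FunctionType. That kind of copy drops the function's keyword-only defaults (__kwdefaults__), and in Python 3.6 namedtuple's verbose, rename and module parameters are all keyword-only. A minimal sketch of the same mechanism, independent of Spark (greet and punctuation are made-up names for illustration):

import types

# Toy function with one keyword-only parameter that has a default
def greet(name, *, punctuation="!"):
    return name + punctuation

# Copying a function this way (as Spark 2.1.0's serializers.py does with
# collections.namedtuple) does not carry over __kwdefaults__
copied = types.FunctionType(greet.__code__, greet.__globals__,
                            greet.__name__, greet.__defaults__,
                            greet.__closure__)

print(greet("hi"))            # hi! -- the original keeps its default
print(copied.__kwdefaults__)  # None -- the default was lost in the copy

try:
    copied("hi")
except TypeError as e:
    # greet() missing 1 required keyword-only argument: 'punctuation'
    print(e)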


2 Answers

PySpark 2.1 does not currently work with Python 3.6.0. The problem has been reported here. A fix was committed on January 17, 2017, but it has not been released as of today (March 17, 2017). However, looking at the committed changes, you should be able to fix this yourself by downloading the following two Python files:

https://github.com/apache/spark/blob/master/python/pyspark/serializers.py
https://github.com/apache/spark/blob/master/python/pyspark/cloudpickle.py

and saving them to the following location, overwriting the existing files:

C:\Users\prudra\Desktop\Udemy\spark-2.1.0-bin-hadoop2.7\python\pyspark

Or, more generally: the files should be saved to the python\pyspark subfolder of your Spark installation.
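If it helps, here is a rough script for that manual patch. It is only a sketch: it assumes the raw.githubusercontent.com equivalents of the two blob URLs above serve the file contents, and SPARK_PYSPARK must be adjusted to your own installation path:

import os
import urllib.request

# Adjust this to the python\pyspark subfolder of your Spark installation
SPARK_PYSPARK = r"C:\Users\prudra\Desktop\Udemy\spark-2.1.0-bin-hadoop2.7\python\pyspark"

# Assumed raw-content equivalents of the GitHub blob URLs above
RAW_BASE = "https://raw.githubusercontent.com/apache/spark/master/python/pyspark"

for name in ("serializers.py", "cloudpickle.py"):
    url = "{}/{}".format(RAW_BASE, name)
    dest = os.path.join(SPARK_PYSPARK, name)
    print("Downloading", url, "->", dest)
    urllib.request.urlretrieve(url, dest)  # overwrites the existing file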

Spark 2.1.1 was released on May 4. It now works with Python 3.6; you can see the release notes here.
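After upgrading, a quick sanity check from inside the pyspark shell (a sketch; it just confirms the interpreter and PySpark versions line up):

import sys
import pyspark

print(sys.version)          # should report 3.6.x
print(pyspark.__version__)  # should report 2.1.1 or later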
