Apache Spark: How to use pyspark with Python 3

Asked on November 16, 2018 in Apache-spark.


  • 3 Answer(s)

    This can be done by setting the PYSPARK_PYTHON environment variable:

    export PYSPARK_PYTHON=python3
    
    

    If the change needs to be permanent, add that line to the pyspark launch script itself, as sketched below.
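
    A minimal sketch of the permanent change, assuming Spark is installed under $SPARK_HOME and you edit its launch script directly (adding the same line to ~/.bashrc works just as well):

    # $SPARK_HOME/bin/pyspark -- near the top of the script
    # Force the driver and workers to use Python 3 (assumed to be on the PATH).
    export PYSPARK_PYTHON=python3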

    Answered on November 16, 2018.

    Alternatively, set the variable inline when launching pyspark:

    PYSPARK_PYTHON=python3 ./bin/pyspark
    

    To run it in an IPython Notebook instead, use:

    PYSPARK_PYTHON=python3 PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark
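
    On newer setups where the notebook front end is Jupyter rather than the classic IPython notebook (an assumption about your installed tooling), the same pattern is usually spelled:

    PYSPARK_PYTHON=python3 PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark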
    

    If python3 is not on your PATH, pass the full path to the interpreter instead, for example:
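
    A sketch with an explicit interpreter path (the /usr/bin/python3 location is an assumption; substitute wherever your Python 3 binary actually lives):

    PYSPARK_PYTHON=/usr/bin/python3 ./bin/pyspark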

    Note: the documentation as of Spark 1.4.1 had outdated instructions on this point, but it has since been patched.

    Answered on November 16, 2018.

    To use pyspark with Python 3, follow these steps (a quick sanity check is sketched after the list):

    • Edit your profile: vim ~/.profile
    • Add this line to the file: export PYSPARK_PYTHON=python3
    • Reload the profile: source ~/.profile
    • Launch pyspark: ./bin/pyspark
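
    Once the shell is up, a quick check (plain Python, standard library only) confirms which interpreter pyspark picked up:

    >>> import sys
    >>> print(sys.version)  # should report a 3.x version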

     

    Answered on November 16, 2018.

