How to set Apache Spark Executor memory?

Asked on November 15, 2018 in Apache-spark.


  • 3 Answer(s)

    When Spark is running in local mode, setting spark.executor.memory will not have any effect. This is because the Worker “lives” inside the driver JVM process that is started when you launch spark-shell, and the default memory used for that process is 512M. You can increase it by setting spark.driver.memory to something higher, for instance 5g. This can be done by:

    setting it in the properties file (default is spark-defaults.conf),

    spark.driver.memory 5g
    

    or by supplying the configuration setting at runtime:

    $ ./bin/spark-shell --driver-memory 5g
    

    Note: This cannot be achieved by setting it in the application, because by then it is already too late; the process has already started with some amount of memory.
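
    Once the shell has been launched with --driver-memory 5g, you can confirm that the setting took effect from the Scala prompt (a quick sanity check; the exact heap value reported varies by JVM and will be somewhat below 5g):

    scala> Runtime.getRuntime.maxMemory / (1024 * 1024)   // driver heap in MB, roughly 5g minus JVM overhead
    scala> sc.getConf.get("spark.driver.memory")          // should return "5g" when passed via --driver-memory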

    The reason for 265.4 MB is that Spark dedicates spark.storage.memoryFraction * spark.storage.safetyFraction of the heap to storage memory, and by default these are 0.6 and 0.9. The result is a bit below 512 MB * 0.6 * 0.9 (≈ 276 MB) because the JVM reports a usable heap somewhat smaller than the configured 512M:

    usable heap (≈ 491 MB for a 512M setting) * 0.6 * 0.9 ~ 265.4 MB
    

    Be aware that not the whole amount of driver memory will be available for RDD storage.
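
    The same estimate can be reproduced directly in the spark-shell. This is a rough sketch of the legacy (pre-1.6) storage-memory formula; newer Spark versions use the unified memory manager (spark.memory.fraction) instead, so treat the numbers as illustrative:

    // Legacy storage-memory estimate, assuming the default 512M driver heap
    val maxMemory = Runtime.getRuntime.maxMemory             // usable heap, a bit under 512 MB
    val memoryFraction = 0.6                                  // spark.storage.memoryFraction default
    val safetyFraction = 0.9                                  // spark.storage.safetyFraction default
    val storageBytes = (maxMemory * memoryFraction * safetyFraction).toLong
    println(f"${storageBytes / (1024.0 * 1024.0)}%.1f MB")    // prints roughly 265 MB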

    When this runs on a cluster, the spark.executor.memory setting takes over when calculating the amount of memory to dedicate to Spark’s memory cache.

    Answered on November 15, 2018.

    The important note here is that, in local mode, the amount of driver memory has to be set before the JVM is started:

    bin/spark-submit --driver-memory 2g --class your.class.here app.jar
    

    This will start the JVM with 2G instead of the default 512M.

    Details here:

    In local mode we only have one executor, and that executor is the driver, so we need to set the driver’s memory instead. That said, in local mode, by the time we run spark-submit, a JVM has already been launched with the default memory settings, so setting spark.driver.memory in the application’s conf won’t actually do anything. Instead, it needs to be passed to spark-submit as shown above.
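
    To make the “too late” point concrete, here is a sketch of the approach that does not work in local mode (MyApp and the values are hypothetical; by the time this code runs, the driver JVM’s heap is already fixed, so the spark.driver.memory setting is ignored):

    import org.apache.spark.sql.SparkSession

    object MyApp {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]")
          .appName("driver-memory-demo")
          .config("spark.driver.memory", "2g")  // too late: the driver JVM is already running
          .getOrCreate()
        // ... job code ...
        spark.stop()
      }
    }

    Passing --driver-memory to spark-submit, as shown above, is what actually resizes the heap.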

     

    Answered on November 15, 2018.

    The question does not say whether this is about local mode or YARN. In any case, I could not get the spark-defaults.conf change to work; instead I tried the following, which worked very well:

    bin/spark-shell --master yarn --num-executors 6 --driver-memory 5g --executor-memory 7g
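
    The same memory settings can also be passed to spark-submit for a non-interactive job on YARN (a sketch that reuses the placeholder class and jar names from the previous answer):

    bin/spark-submit --master yarn --num-executors 6 \
      --driver-memory 5g --executor-memory 7g \
      --class your.class.here app.jar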
    
    

     

    Answered on November 15, 2018.

