Spark java.lang.OutOfMemoryError: Java heap space

Asked on November 14, 2018 in Apache-spark.

  • 3 Answer(s)

    Here are some suggestions:

        If your nodes are configured with a 6g maximum for Spark (leaving a little for other processes), then use 6g rather than 4g: spark.executor.memory=6g. Confirm that as much memory as possible is being used by checking the UI (it will show how much memory you’re using).
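As a sketch, that setting can go either on the spark-submit command line or in conf/spark-defaults.conf (the class and jar names below are placeholders, not from the question):

```shell
# Per job, on the command line (MyApp / my-app.jar are hypothetical):
spark-submit --executor-memory 6g --class MyApp my-app.jar

# Or as a cluster-wide default, in conf/spark-defaults.conf:
#   spark.executor.memory  6g
```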

        Try using more partitions; you should have 2–4 per CPU. IME, increasing the number of partitions is often the easiest way to make a program more stable (and often faster). For large amounts of data you may need far more than 4 per CPU; I’ve had to use 8,000 partitions in some cases!
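As a back-of-the-envelope sketch of the 2–4-per-CPU rule (the executor and core counts are invented for illustration):

```scala
// Hypothetical cluster: 10 executors with 4 cores each
val executors = 10
val coresPerExecutor = 4
val totalCores = executors * coresPerExecutor

// Aim for roughly 3 tasks per core, inside the 2-4 guideline
val partitions = totalCores * 3
println(partitions) // 120
```

You would then apply it with something like rdd.repartition(partitions).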

        Decrease the fraction of memory reserved for caching, using spark.storage.memoryFraction. If you don’t use cache() or persist in your code, this might as well be 0. Its default is 0.6, which means you only get 0.4 * 4g of memory for your heap. IME, reducing the memory fraction often makes OOMs go away. UPDATE: From Spark 1.6 you apparently no longer need to play with these values; Spark determines them automatically.
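For example, on Spark 1.5 and earlier this goes in conf/spark-defaults.conf (the 0.1 value is just an illustration; pick what suits your workload):

```properties
# conf/spark-defaults.conf (Spark <= 1.5)
# Shrink the cache's share of the heap; near 0 if you never cache()/persist()
spark.storage.memoryFraction  0.1
```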

        Similar to the above, but for the shuffle memory fraction. If your job doesn’t need much shuffle memory, set it to a lower value (this might cause your shuffles to spill to disk, which can have a catastrophic impact on speed). Sometimes, when it’s a shuffle operation that’s OOMing, you need to do the opposite: set it to something large, like 0.8, or make sure you allow your shuffles to spill to disk (that has been the default since 1.0.0).
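A sketch of both directions on pre-1.6 Spark (0.2 is the old default for spark.shuffle.memoryFraction; 0.8 is the "shuffle is OOMing" case from above):

```properties
# conf/spark-defaults.conf (Spark <= 1.5)
# Lower it when the job barely shuffles...
spark.shuffle.memoryFraction  0.2
# ...or raise it (e.g. 0.8) when the shuffle itself is OOMing.
# Also make sure shuffles may spill to disk (the default since 1.0.0):
spark.shuffle.spill  true
```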

        Watch out for memory leaks; these are often caused by accidentally closing over objects you don’t need in your lambdas. The way to diagnose this is to look for “task serialized as XXX bytes” in the logs; if XXX is larger than a few KB, or more than an MB, you may have a memory leak.
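A minimal pure-Scala sketch of the accidental-capture pattern (class and field names are invented): a lambda that references a field captures the whole enclosing object, so everything in it gets serialised with the task; copying the field to a local val first means only that value is captured.

```scala
class Mapper(val lookup: Map[String, Int]) {
  // Dead weight that we never want shipped with tasks
  val hugeBuffer = new Array[Byte](10 * 1024 * 1024)

  // BAD: `lookup` here really means `this.lookup`, so the closure
  // captures `this` and drags hugeBuffer into the serialised task
  def badFn: String => Int = s => lookup.getOrElse(s, -1)

  // GOOD: copy the field to a local val; only the Map is captured
  def goodFn: String => Int = {
    val local = lookup
    s => local.getOrElse(s, -1)
  }
}

val m = new Mapper(Map("a" -> 1))
println(m.goodFn("a")) // 1
```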

        Related to the above: use broadcast variables if you really do need large objects.

        If you are caching large RDDs and can sacrifice some access time, consider serialising the RDD, or even caching it on disk (which sometimes isn’t that bad if you’re using SSDs).

        Related to the above: avoid String and heavily nested structures (like Map and nested case classes). If possible, try to use only primitive types and index all non-primitives, especially if you expect a lot of duplicates. Choose WrappedArray over nested structures whenever possible. Or even roll your own serialisation: you will have the most information about how to efficiently pack your data into bytes, so use it!
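A small sketch of the indexing idea, with invented data: replace repeated Strings with Int codes plus a single lookup table, so what you cache is compact primitives.

```scala
val countries = Seq("us", "de", "us", "fr", "us")

// Build a String -> Int dictionary once...
val index: Map[String, Int] = countries.distinct.zipWithIndex.toMap
// ...and keep only the compact Int codes
val encoded: Seq[Int] = countries.map(index)
println(encoded.mkString(",")) // 0,1,0,2,0

// Decoding round-trips through the reverse map
val reverse: Map[Int, String] = index.map(_.swap)
println(encoded.map(reverse) == countries) // true
```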

        (Bit hacky) Again, when caching, consider using a Dataset to cache your structure, as it will use more efficient serialisation. This should be regarded as a hack compared to the previous bullet point: building your domain knowledge into your algorithm/serialisation can minimise memory/cache space by 100x or 1000x, whereas all a Dataset will likely give you is 2x–5x in memory and 10x compressed (Parquet) on disk.

    EDIT: (So I can google this more easily) The following is also indicative of this problem:

    java.lang.OutOfMemoryError: GC overhead limit exceeded
    Answered on November 14, 2018.

    The Java heap size is set in the start-up scripts; it looks like you’re not setting it before running the Spark worker.

    # Set SPARK_MEM if it isn't already set since we also use it for this process
    SPARK_MEM=${SPARK_MEM:-512m}
    export SPARK_MEM
    # Set JAVA_OPTS to be able to load native libraries and to set heap size
    JAVA_OPTS="$JAVA_OPTS -Djava.library.path=$SPARK_LIBRARY_PATH"
    JAVA_OPTS="$JAVA_OPTS -Xms$SPARK_MEM -Xmx$SPARK_MEM"

    The deploy scripts are documented here.

    Answered on November 14, 2018.

        You should increase the driver memory. In the $SPARK_HOME/conf folder you should find the file spark-defaults.conf; edit it and set spark.driver.memory to something like 4000m, depending on the memory on your master, I think. This fixed the issue for me and everything runs smoothly.
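For instance (4000m is the figure suggested above; size it to your master):

```properties
# $SPARK_HOME/conf/spark-defaults.conf
spark.driver.memory  4000m
```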

    Answered on November 14, 2018.
