This article looks at how to handle a YARN container running out of memory. It should be a useful reference for anyone hitting the same problem.
問(wèn)題描述
My YARN container is running out of memory:

This particular container runs an Apache Spark driver node.

The part I don't understand: I limit the driver's heap size to 512 MB (you can see this in the error message below), but the YARN container complains about memory usage exceeding 1 GB (also see the message below). You can verify that YARN launches Java with -Xmx512m. My containers are configured with 1 GB of memory, with 0.5 GB increments. Furthermore, the physical machines hosting my YARN containers have 32 GB each. I SSHed into one of them and saw that it had plenty of free memory...

Another strange thing is that Java is not throwing an OutOfMemory exception. Looking at the driver logs, I see that it eventually receives a SIGTERM from YARN and shuts down cleanly. If the Java process inside YARN exceeded 512 MB, shouldn't I get an OutOfMemory exception in Java before it ever tried to grow past the 1 GB allocated by YARN?

I also tried running with a 1024 MB heap. That time, the container was killed at 1.5 GB of usage. This happens consistently. So clearly the container has the capacity to allocate another 0.5 GB beyond the heap limit. (Quite logical, since the physical machine has 30 GB of free memory.)

Is there anything besides Java inside the YARN container that could be taking up the extra 512 MB?

I'm running CDH 5.4.1 with Apache Spark on YARN. Java on the cluster was also upgraded to Oracle Java 8. I've seen claims that the default MaxPermSize changed in Java 8, but I can hardly believe it would take up 512 MB...
紗線錯(cuò)誤信息:
Diagnostics: Container [pid=23335,containerID=container_1453125563779_0160_02_000001] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1453125563779_0160_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 23335 23333 23335 23335 (bash) 1 0 11767808 432 /bin/bash -c LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native::/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native /usr/lib/jvm/java-8-oracle/bin/java -server -Xmx512m -Djava.io.tmpdir=/var/yarn/nm/usercache/hdfs/appcache/application_1453125563779_0160/container_1453125563779_0160_02_000001/tmp '-Dspark.eventLog.enabled=true' '-Dspark.executor.memory=512m' '-Dspark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar' '-Dspark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native' '-Dspark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native' '-Dspark.shuffle.service.enabled=true' '-Dspark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/assembly/lib/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar' '-Dspark.app.name=not_telling-1453479057517' '-Dspark.shuffle.service.port=7337' '-Dspark.driver.extraClassPath=/etc/hbase/conf:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar' '-Dspark.serializer=org.apache.spark.serializer.KryoSerializer' '-Dspark.yarn.historyServer.address=http://XXXX-cdh-dev-cdh-node2:18088' '-Dspark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native' '-Dspark.eventLog.dir=hdfs://XXXX-cdh-dev-cdh-node1:8020/user/spark/applicationHistory' '-Dspark.master=yarn-cluster' -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1453125563779_0160/container_1453125563779_0160_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class 'not_telling' --jar file:/home/cloud-user/temp/not_telling.jar --arg '--conf' --arg 'spark.executor.extraClasspath=/opt/cloudera/parcels/CDH/jars/htrace-core-3.0.4.jar' --executor-memory 512m --executor-cores 4 --num-executors 10 1> 
/var/log/hadoop-yarn/container/application_1453125563779_0160/container_1453125563779_0160_02_000001/stdout 2> /var/log/hadoop-yarn/container/application_1453125563779_0160/container_1453125563779_0160_02_000001/stderr
|- 23338 23335 23335 23335 (java) 95290 10928 2786668544 261830 /usr/lib/jvm/java-8-oracle/bin/java -server -Xmx512m -Djava.io.tmpdir=/var/yarn/nm/usercache/hdfs/appcache/application_1453125563779_0160/container_1453125563779_0160_02_000001/tmp -Dspark.eventLog.enabled=true -Dspark.executor.memory=512m -Dspark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar -Dspark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native -Dspark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native -Dspark.shuffle.service.enabled=true -Dspark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/assembly/lib/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar -Dspark.app.name=not_tellin-1453479057517 -Dspark.shuffle.service.port=7337 -Dspark.driver.extraClassPath=/etc/hbase/conf:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar -Dspark.serializer=org.apache.spark.serializer.KryoSerializer -Dspark.yarn.historyServer.address=http://XXXX-cdh-dev-cdh-node2:18088 -Dspark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/lib/native -Dspark.eventLog.dir=hdfs://XXXX-cdh-dev-cdh-node1:8020/user/spark/applicationHistory -Dspark.master=yarn-cluster -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1453125563779_0160/container_1453125563779_0160_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class not_telling --jar file:not_telling.jar --arg --conf --arg spark.executor.extraClasspath=/opt/cloudera/parcels/CDH/jars/htrace-core-3.0.4.jar --executor-memory 512m --executor-cores 4 --num-executors 10
Recommended answer
Check out this article, which has a good description of the problem. You may want to pay attention to where they say: "Be aware of the max(7%, 384m) off-heap memory overhead when calculating the memory for your executors."
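The overhead rule quoted above can be checked with some quick arithmetic: YARN accounts for the whole process (heap plus off-heap overhead) and rounds the request up to its allocation increment. A minimal sketch, assuming the max(7%, 384 MB) figure from the linked article and the 512 MB increment from the question:

```shell
# Container footprint = JVM heap + off-heap overhead, where
# overhead = max(7% of heap, 384 MB), rounded up to YARN's increment.
heap_mb=512
overhead_mb=$(( heap_mb * 7 / 100 ))      # 7% of 512 MB = 35 MB
if [ "$overhead_mb" -lt 384 ]; then overhead_mb=384; fi
needed_mb=$(( heap_mb + overhead_mb ))    # memory the process actually needs
increment_mb=512
container_mb=$(( (needed_mb + increment_mb - 1) / increment_mb * increment_mb ))
echo "needed=${needed_mb}MB container=${container_mb}MB"
# → needed=896MB container=1024MB
```

With a 512 MB heap this already calls for a 1 GB container, which is exactly the limit the container was killed at; any extra off-heap use, such as Java 8's Metaspace, pushes the process over it.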
Edit (Eshalev): I am accepting this answer and elaborating on what was found. Java 8 uses a different memory scheme. Specifically, CompressedClasses reserve 1024 MB in "Metaspace". This is much larger than what previous Java versions allocated in "perm-gen" memory. You can examine this with "jmap -heap [pid]". We currently keep the application from crashing by over-allocating 1024 MB beyond our heap requirements. This is wasteful, but it keeps the application alive.
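To see where the extra memory goes, you can inspect the JVM's class-space defaults and, instead of over-allocating, cap them explicitly. A sketch under the assumption that a smaller compressed-class reservation is acceptable for your application; the flag values here are illustrative, not tuned:

```shell
# Print the JDK 8 defaults for Metaspace / compressed class space
# (CompressedClassSpaceSize defaults to 1 GB of reserved address space):
java -XX:+PrintFlagsFinal -version | grep -iE 'MetaspaceSize|CompressedClassSpaceSize'

# Inspect a running driver's heap and Metaspace usage:
jmap -heap <pid>

# Or cap the reservation and budget the overhead explicitly
# (spark.yarn.driver.memoryOverhead is in MB; values are examples):
spark-submit \
  --conf spark.driver.extraJavaOptions="-XX:CompressedClassSpaceSize=256m" \
  --conf spark.yarn.driver.memoryOverhead=512 \
  ...
```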
這篇關(guān)于紗線容器內(nèi)存不足的文章就介紹到這了,希望我們推薦的答案對(duì)大家有所幫助,