In a busy environment, your MapReduce tasks may often be killed to release cluster resources for higher-priority applications.
By default, when preemption is enabled (yarn.resourcemanager.scheduler.monitor.enable is set to true in yarn-site.xml), the Capacity Scheduler monitors resources every 3 seconds and kills selected containers if they do not terminate gracefully within 15 seconds of receiving a terminate request.
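To make the two timings concrete, here is a sketch of how these defaults would look if written out explicitly inside the <configuration> element of yarn-site.xml. The monitoring_interval property name and both values are my understanding of the stock yarn-default.xml, so verify them against the documentation for your Hadoop version before relying on them.

```xml
<!-- Turn on the scheduler monitor that drives preemption -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>

<!-- How often resources are checked, in milliseconds (assumed default: 3 seconds) -->
<property>
  <name>yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval</name>
  <value>3000</value>
</property>

<!-- Grace period between the terminate request and the kill, in milliseconds (default: 15 seconds) -->
<property>
  <name>yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill</name>
  <value>15000</value>
</property>
```

Leaving the last two properties out of yarn-site.xml should have the same effect, since these are the default values.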
These default settings may be too aggressive, and you can change them to allow MapReduce tasks to run longer before preemption. In the example above, the Map task was killed 3 times, and twice it was killed after only about 30 seconds of execution.
You can edit yarn-site.xml and set yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill to a higher value (in milliseconds; the default is 15000) to allow your MapReduce tasks to complete their work:
<property>
  <name>yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill</name>
  <value>300000</value>
</property>