I am trying to estimate the fastest performance you can achieve in a live environment using Hive on Tez. What is the overhead of launching tasks on Tez?
Environment (Live production cluster):
Let’s query dual, a table with a single row and a single column. Hive settings:
set hive.execution.engine=tez;
set hive.prewarm.enabled=true;
Hive on Tez does not automatically allocate a session and containers; you have to run a query first to warm up Tez. For this reason I did not take the first execution of the query into account. After the first execution, the best result was as follows:
select 1 from dual where 1 != 0;

Query ID = v-dtolpeko_20141009085353_1303a726-cddb-421c-bdc7-d47db1678fa4
Total jobs = 1
Launching Job 1 out of 1
Status: Running (application id: application_1412375486094_125195)
Map 1: -/-
Map 1: 0/1
Map 1: 1/1
Status: Finished successfully
OK
1
Time taken: 2.05 seconds, Fetched: 1 row(s)
An attempt in a less busy environment:
select 1 from dual where 1 != 0;

Query ID = v-dtolpeko_20141009085555_10aa1761-0ccd-422b-b147-b3391b5f512f
Total jobs = 1
Launching Job 1 out of 1
Status: Running (application id: application_1412375486094_125195)
Map 1: -/-
Map 1: 0/1
Map 1: 1/1
Status: Finished successfully
OK
1
Time taken: 1.791 seconds, Fetched: 1 row(s)
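The measurement approach used here (run the query repeatedly, ignore the warm-up execution, keep the best time) could be automated with a small script over captured Hive CLI output. The following is only a minimal sketch, not the procedure actually used for the numbers above: the helper names `parse_times` and `best_warm_time` are hypothetical, and it assumes the output contains the `Time taken: N seconds` summary line that the CLI printed in the runs above.

```python
import re

# Matches the summary line printed by the Hive CLI, for example:
#   Time taken: 2.05 seconds, Fetched: 1 row(s)
TIME_TAKEN = re.compile(r"Time taken: ([0-9]+(?:\.[0-9]+)?) seconds")


def parse_times(cli_output):
    """Return every 'Time taken' value (in seconds) found in the text, in order."""
    return [float(value) for value in TIME_TAKEN.findall(cli_output)]


def best_warm_time(times, warmup_runs=1):
    """Return the minimum time after skipping the first `warmup_runs` executions."""
    warm = times[warmup_runs:]
    if not warm:
        raise ValueError("need at least one run after the warm-up runs")
    return min(warm)


if __name__ == "__main__":
    # The two summary lines reported above. The warm-up run was already
    # excluded from them, so no further runs are skipped here.
    captured = (
        "Time taken: 2.05 seconds, Fetched: 1 row(s)\n"
        "Time taken: 1.791 seconds, Fetched: 1 row(s)\n"
    )
    print(best_warm_time(parse_times(captured), warmup_runs=0))
```

When the captured output does include the very first execution in a new session, leave `warmup_runs` at its default of 1 so that the session and container start-up time is not counted.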
Just for comparison, let’s run on MapReduce:
set hive.execution.engine=mr;
select 1 from dual where 1 != 0;

Query ID = v-dtolpeko_20141009090707_43ff36eb-7eb5-46d3-a53a-1c383ce558b1
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1412375486094_125419, Tracking URL = http://chsxedw:8088/proxy/application_1412375486094_125419/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1412375486094_125419
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2014-10-09 09:07:45,439 Stage-1 map = 0%, reduce = 0%
2014-10-09 09:08:04,474 Stage-1 map = 100%, reduce = 0%
Ended Job = job_1412375486094_125419
MapReduce Jobs Launched:
Job 0: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
1
Time taken: 31.837 seconds, Fetched: 1 row(s)
You can see that, with a warmed-up session, Tez reduces the query startup overhead from about 32 seconds on MapReduce to about 2 seconds, but that is still far from 0.01-0.1 seconds.
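To put the comparison in numbers, the short Python sketch below computes the ratios from the three timings reported above. The constant and function names are hypothetical and only serve the illustration. The figures come from single runs on a live cluster, so treat the ratios as rough indications.

```python
# Wall-clock times reported by the Hive CLI in the runs above, in seconds.
TEZ_BUSY = 2.05     # Tez, warmed-up session, busy cluster
TEZ_QUIET = 1.791   # Tez, warmed-up session, less busy cluster
MAPREDUCE = 31.837  # MapReduce engine


def ratio(slower, faster):
    """Return how many times larger `slower` is than `faster`."""
    return slower / faster


if __name__ == "__main__":
    print(f"MapReduce vs Tez (busy):  {ratio(MAPREDUCE, TEZ_BUSY):.1f}x")
    print(f"MapReduce vs Tez (quiet): {ratio(MAPREDUCE, TEZ_QUIET):.1f}x")
    # Distance between the best Tez run and the 0.01-0.1 second range.
    for target in (0.1, 0.01):
        print(f"Tez (quiet) vs {target} s: {ratio(TEZ_QUIET, target):.0f}x")
```

On these numbers, Tez is roughly 15 to 18 times faster than MapReduce for this trivial query, while the best Tez run is still roughly 18 to 180 times slower than the 0.01-0.1 second range.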