PL/HQL 0.03 Released

PL/HQL 0.03, an open-source (Apache License 2.0) procedural SQL-on-Hadoop tool released today. Below are the most interesting new features available:

Managed Temporary Tables

PL/HQL allows you to use DECLARE TEMPORARY TABLE statement even with Hive 0.13 and earlier:

SET plhql.temp.tables = managed;
 
DECLARE TEMPORARY TABLE orders_tmp
(
   id INT,
   name STRING
);
 
INSERT INTO orders_tmp SELECT id, name FROM orders;
SELECT * FROM orders_tmp;

PL/HQL automatically adds an UUID to the table name to prevent name conflicts between multiple sessions, and drops the table at the end of session.

FOR CURSOR Loop

Now you can iterate through the query results using the FOR statement:

FOR item IN (
    SELECT dname, loc as location
    FROM dept
    WHERE dname LIKE '%A%'
    AND deptno > 10
    ORDER BY location)
LOOP
  DBMS_OUTPUT.PUT_LINE('Name = ' || item.dname || ', Location = ' || item.location);
END LOOP;

PL/HQL executes the query and executes the specified block of code for each row. You can print results, add conditional statements, other loops and so on.

On-the-fly SQL Conversion Introduced

PL/HQL introduced a feature to convert SQL statements on the fly, so you need to spend less time making your SQL statements compatible with Hive and other SQL-on-Hadoop tools.

For example, if you execute the following CREATE TABLE statement in Hive:

CREATE TABLE dept 
  (
    deptno NUMBER(2,0),
    dname  NUMBER(14),
    loc    VARCHAR2(13),
    CONSTRAINT pk_dept PRIMARY KEY (deptno)
  );

It is converted to the following Hive-compatible statement before execution:

CREATE TABLE dept 
(
  deptno DECIMAL(2,0),
  dname  DECIMAL(14),
  loc    STRING
);

There are many more new features, for more details, see PL/HQL 0.03 What’s New.