Hive – Selecting columns with regular expression

Hive has a rather unique feature that lets you select columns by
regular expression instead of listing them by name.

It’s very useful when we need to select all columns except one. In most SQL databases we would have to list every column, but in Hive this feature can save us some typing.

Let’s say there is a people table with the columns name, age, city, country and created_at. To select all columns except created_at we can write:

set hive.support.quoted.identifiers=none;
 
select 
    `(created_at)?+.+`
from people
limit 10;

This is equivalent to:

select
    name, age, city, country
from people
limit 10;
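
The same pattern extends to more than one column by listing the names as alternatives inside the group, and a plain pattern can be used to pick columns rather than exclude them. Here is a minimal sketch, assuming the same people table with exactly the five columns listed above. Excluding city and selecting by the prefix c are illustrative choices, not part of the original example.

```sql
set hive.support.quoted.identifiers=none;

-- All columns except created_at and city
-- (expected to return name, age and country under the assumption above).
select
    `(created_at|city)?+.+`
from people
limit 10;

-- Only the columns whose names start with "c"
-- (expected to return city, country and created_at under the assumption above).
select
    `c.*`
from people
limit 10;
```

The pattern has to match the whole column name. That is why the possessive ?+ works as an exclusion: for a listed name the group consumes the entire name and leaves nothing for .+ to match, so the column is skipped.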

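Please note that in Hive 0.13 or later you have to set hive.support.quoted.identifiers to none, otherwise the backquoted string is treated as a literal column name rather than a regular expression.
I have never seen such functionality in other SQL databases.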

References

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select

Spark SQL

Spark SQL does not support this Hive-specific syntax by default. Since Spark 2.3 it can be enabled by setting spark.sql.parser.quotedRegexColumnNames to true.