Guide to Apache Hive

July 10, 2014


Apache Hive is data warehouse software that facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and to query it using a SQL-like language called HiveQL. HiveQL also lets traditional map/reduce programmers plug in their own custom mappers and reducers when it is inconvenient or inefficient to express the logic in HiveQL itself.

This technology is primarily used on top of Apache Hadoop.


Start Command Line Shell

$ hive
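For one-off statements it can be handy to skip the interactive shell. The sketch below wraps the Hive CLI's -e option (execute a quoted string) and -S option (silent mode) in a small helper. The name hive_exec is made up for this example, and the sketch assumes the hive binary is on your PATH.

```shell
# Hypothetical helper: run one HiveQL statement non-interactively.
# -e executes the quoted string; -S (silent mode) suppresses progress
# messages so that only the results are printed.
hive_exec() {
    if ! command -v hive >/dev/null 2>&1; then
        echo "hive CLI not found on PATH" >&2
        return 127
    fi
    hive -S -e "$1"
}

# Example: list databases, without aborting if Hive is not installed here.
hive_exec "show databases;" || echo "skipped: Hive is not available on this machine"
```

Any of the statements listed under Useful Commands can be passed the same way, for example hive_exec "show tables;".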


Useful Commands

Show Databases/Schemas

hive> show databases;

Use Database/Schema

hive> use {database_name};

Show Tables in Database/Schema

hive> show tables;

Show Table Partitions

hive> show partitions {table_name};


Describe Table

hive> desc {table_name};

Run Hive Query

hive> select * from {schema}.{table_name} where hive_entry_timestamp > "{starting_timestamp}" and hive_entry_timestamp <= "{ending_timestamp}" limit 100;
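As a rough illustration of filling in the placeholders, the sketch below builds the same time-window query from shell arguments and runs it only if the hive CLI is installed. The helper name build_window_query and the sample values (web_logs, page_views, and both timestamps) are invented for this example; only the hive_entry_timestamp column comes from the query above.

```shell
# Hypothetical helper: fill in the query template shown above.
# $1 = schema, $2 = table, $3 = start timestamp (exclusive),
# $4 = end timestamp (inclusive), $5 = row limit (defaults to 100).
# Values are inserted verbatim, so pass trusted input only.
build_window_query() {
    printf 'select * from %s.%s where hive_entry_timestamp > "%s" and hive_entry_timestamp <= "%s" limit %s;\n' \
        "$1" "$2" "$3" "$4" "${5:-100}"
}

# Sample values below are made up for illustration.
query=$(build_window_query web_logs page_views "2014-07-01 00:00:00" "2014-07-10 00:00:00")
echo "$query"

# Run the query only when the hive CLI is available.
if command -v hive >/dev/null 2>&1; then
    hive -e "$query"
fi
```

Keeping the query construction separate from the call to hive makes it easy to print and check the statement before it is sent to the cluster.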

Stop Command Line Shell

hive> quit;


hive> exit;


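Ctrl + Z (on a Unix terminal this suspends the shell process rather than ending the session; use quit; or exit; to close it cleanly)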