How to export a Hive table into a CSV file? - Big Data In Real World

How to export a Hive table into a CSV file?

How does Spark decide the number of tasks and number of tasks to execute in parallel?
August 4, 2021
What does “Stage Skipped” mean in Apache Spark web UI?
August 9, 2021
How does Spark decide the number of tasks and number of tasks to execute in parallel?
August 4, 2021
What does “Stage Skipped” mean in Apache Spark web UI?
August 9, 2021

It is a pretty common use case to export the contents of a Hive table into a CSV file. It’s pretty simple if you are using a recent version of Hive. In this post, we will see who to achieve this with both newer and older versions of Hive.

Hive version 11 or higher

If you are using Hive version 11 or higher below command will do the job. ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘,’ dictates that the columns should be delimited by a comma.

INSERT OVERWRITE LOCAL DIRECTORY '/home/hirw/sales 
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
select * from sales_table;

 

Hive versions older than 11

By default selecting the hive table and writing to a file like below will result in a tab separated file and of course this is not what you want as you want a comma separated file.

hive -e 'select * from sales_table' > /home/hirw/sales.tsv

With below, you are selecting the table and piping the results to sed command and passing a regex expression.

hive -e 'select * from sales_table' | sed 's/[\t]/,/g' > /home/hirw/sales.csv

The regex expression does a global match of all tab character [t] and replace with a ,

This approach is not preferred as you might get inconsistent results when dealing with huge datasets. Also you are combing Hive and sed (Linux) tools to get the job done which in our opinion is not a clean approach.

 

Big Data In Real World
Big Data In Real World
We are a group of Big Data engineers who are passionate about Big Data and related Big Data technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real time streaming big data platforms. We have seen a wide range of real world big data problems, implemented some innovative and complex (or simple, depending on how you look at it) solutions.

Comments are closed.

How to export a Hive table into a CSV file?
This website uses cookies to improve your experience. By using this website you agree to our Data Protection Policy.

Hadoop In Real World is now Big Data In Real World!

X