What is the difference between static partitioning and dynamic partitioning in Hive? - Big Data In Real World

What is the difference between static partitioning and dynamic partitioning in Hive?

How to move a Hive table from one database to another?
January 5, 2022
How to see the current database user is on with Hive cli or Beeline?
January 19, 2022

The difference between static and dynamic partitioning only exists when the partition is being created based on how the partitions are added to the table. Once the partitions are created, the tables won’t have any difference like static and dynamic partitions. All partitions are treated and one and the same.

Static partitioning

Partitioning is referred to as static when you are using LOAD command to create the partition. With the below 2 partitions are created in sales table one with country=”US” and one with country=”UK”. Data under /hdfs/path/US has the data from partition country=”US” and similarly data under /hdfs/path/UK has the data from partition country=”UK”

hive>LOAD DATA INPATH '/hdfs/path/US' 
INTO TABLE sales PARTITION(country="US") 


hive>LOAD DATA INPATH '/hdfs/path/UK' 
INTO TABLE sales PARTITION(country="UK")

Dynamic partitioning

With the below INSERT with a SELECT query we are creating multiple partitions in table sales dynamically. If sales_staging has records from 10 countries then 10 partitions are created in sales tables.

hive> INSERT INTO TABLE sales PARTITION(country) 
SELECT * from sales_staging;

hive.exec.dynamic.partition  control whether to allow dynamic partition or not. The default value is false prior to Hive 0.9.0 and true in Hive 0.9.0 and later.

When to use static partitioning and when to use dynamic partitioning?

  • Use static partitioning when data is already physically categorized/grouped/partitioned and ready to be added as a partition to a table.
  • Static partitioning will not result in MapReduce job execution since the data is already physically categorized/partitioned.
  • Use dynamic partitioning  when data is not already physically categorized/grouped/partitioned.
  • Dynamic partitioning will result in MapReduce job execution to group the data first and then partitions will be added to the table. 
Big Data In Real World
Big Data In Real World
We are a group of Big Data engineers who are passionate about Big Data and related Big Data technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real time streaming big data platforms. We have seen a wide range of real world big data problems, implemented some innovative and complex (or simple, depending on how you look at it) solutions.

Comments are closed.

gdpr-image
This website uses cookies to improve your experience. By using this website you agree to our Data Protection Policy.

Hadoop In Real World is now Big Data In Real World!

X