What is an efficient way to check if a Spark DataFrame is empty? - Big Data In Real World



A quick answer that might come to your mind is to call count() on the DataFrame and check whether the count is greater than 0. But count() on a DataFrame with a lot of records is very inefficient.

count() does a full count of the records in every partition of the DataFrame and then adds all the intermediate counts together to get the final total. You will find this approach very slow for big DataFrames.
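As a sketch of the slow approach (assuming `df` is an existing DataFrame):

```scala
// Assumes `df` is an already-built DataFrame.
// count() triggers a full scan of every partition just to compare the total to 0.
val isEmptySlow: Boolean = df.count() == 0
```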


Optimal way to check if a DataFrame is empty

Use the head() function in place of count().

df.head(1).isEmpty

This is efficient because, to determine whether a DataFrame is empty, all you need to know is whether it has at least one record. head(1) only needs to fetch a single row instead of scanning every partition.

Note that calling head() with no arguments on an empty DataFrame will throw a java.util.NoSuchElementException, because there is no first row to return. head(1), by contrast, returns an (empty) array, so make sure to use head(1).
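A minimal runnable sketch, assuming Spark running in local mode and a hypothetical empty DataFrame built from an empty sequence:

```scala
import org.apache.spark.sql.SparkSession

object EmptyCheck {
  def main(args: Array[String]): Unit = {
    // Local-mode session for illustration only.
    val spark = SparkSession.builder()
      .appName("empty-check")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical empty DataFrame with a single Int column.
    val df = Seq.empty[Int].toDF("value")

    // head(1) returns an Array containing at most one Row,
    // so the check touches at most one record instead of counting all of them.
    val isEmpty = df.head(1).isEmpty
    println(isEmpty)  // prints: true

    spark.stop()
  }
}
```

If you are on Spark 2.4 or later, the Dataset API also offers df.isEmpty, which performs the same limit-one check internally.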
