16

I am looking to get all table definitions in Hive. I know that for single table definition I can use something like -

  describe <<table_name>>
  describe extended <<table_name>>

But, I couldn't find a way to get all table definitions. Is there any table in megastore similar to Information_Schema in mysql or is there command to get all table definitions ?

user4157124
  • 2,809
  • 13
  • 27
  • 42
GoldenPlatinum
  • 427
  • 2
  • 4
  • 12

3 Answers3

25

You can do this by writing a simple bash script and some bash commands.

First, write all table names in a database to a text file using:

$hive -e 'show tables in <dbname>' | tee tables.txt

Then create a bash script (describe_tables.sh) to loop over each table in this list:

while read line
do
 echo "$line"
 eval "hive -e 'describe <dbname>.$line'"
done

Then execute the script:

$chmod +x describe_tables.sh
$./describe_tables.sh < tables.txt > definitions.txt

The definitions.txt file will contain all the table definitions.

mostafazh
  • 4,144
  • 1
  • 20
  • 26
leonida
  • 351
  • 3
  • 4
  • 2
    its a good solution but very slow. if the database has 100s of table, each time the bash script does a “hive -e ....” for 100+ times, the execution will take a huge time to complete. Some looping within “hive -e...” is required... something like “hive -e `for loop desc ` “.
    – aiman Apr 06 '18 at 13:24
  • Hi Apurvaa, @mostafazh or other... And about diretc SQL to show for example 3-columns `` in a CSV file? There are something in Hive similar to [this SELECT FROM INFORMATION_SCHEMA](https://stackoverflow.com/a/109337/287948)? – Peter Krauss Aug 06 '19 at 13:28
  • there is no need to put the eval command. You only need to do hive -e "describe .$line" I am also quite doubtful, the code snippelt in its given form will loop – renjith Mar 24 '22 at 23:37
4

The above processes work, however it will be slow due to the fact that the hive connection is made for each query. Instead you can do what I just did for the same need below.

Use one of the above methods to get your list of tables. Then modify the list to make it a hive query for each table as follows:

describe my_table_01;
describe my_TABLE_02;

So you will have a flat file with the all your describe statements mentioned above. For example, if you have the query in a flat file called my_table_description.hql.

Get the output in one scoop as follows:

"hive -f my_table_description.hql > my_table_description.output

It is super fast and gets the output in one shot.

Soumendra Mishra
  • 3,483
  • 1
  • 12
  • 38
K P
  • 41
  • 1
1
  1. Fetch list of hive databases hive -e 'show databases' > hive_databases.txt

  2. Echo each table's desc:

    cat hive_databases.txt | grep -v '^$' | while read LINE;
    do
      echo "## TableName:" $LINE
      eval "hive -e 'show tables in $LINE' | grep -v ^$ | grep -v Logging | grep -v tab_name | tee $LINE.tables.txt"
      cat $LINE.tables.txt | while read table
      do
        echo "### $LINE.$table" > $LINE.$table.desc.md
        eval "hive -e 'describe $LINE.$table'" >> $LINE.$table.desc.md
        sed -i 's/\t/|/g' ./$LINE.$table.desc.md
        sed -i 's/comment/comment\n|:--:|:--:|:--:|/g' ./$LINE.$table.desc.md
      done
    done
harppu
  • 384
  • 4
  • 13
diggzhang
  • 517
  • 1
  • 6
  • 11