1

I get the following in my log file, but don't know what is the implication:

query:

create table mydb.new as

select a.col1,b.col4,a.col3 from 

sampledb.table1 a
left join sampledb.table2 b
on a.col3=b.col3
;


No Stats for sampledb@table1, Columns: col_1
No Stats for sampledb@table2, Columns: col_4

Can someone help with this or guide me where to look?

Related question with no answers:Hive No Stats for 'database'@'table', Columns

leftjoin
  • 36,950
  • 8
  • 57
  • 116
DS R
  • 235
  • 2
  • 13

1 Answers1

2

No stats can result in suboptimal plan (for join columns) or table scan instead of using statistics only for simple queries:

CBO uses statistics when creating query plan. Absence of statistics or stale statistics may result in sub-optimal query plan. If there is no stats, CBO makes some estimations based on files size/avg row size (see hive.stats.avg.row.size configuration and other statistics related settings).

Also if hive.compute.query.using.stats=true and statistics exists, then optimizer is using statistics for simple query (for example select count(col1) ...) calculation instead of querying table data (this may lead to wrong query results if the stats is stale), see this answer.

Also read this design document for more details about stats: https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-StatisticsinHive

leftjoin
  • 36,950
  • 8
  • 57
  • 116
  • Thanks, this is very useful. – DS R Mar 31 '21 at 12:40
  • A followup; what can I do so that stats from original table are made available for the newly created table as well? https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-StatisticsinHive Had some info but no direct answer – DS R Apr 01 '21 at 09:09
  • 1
    @DSR Try to set this property `set hive.stats.autogather=true; `before loading table, or execute ANALYZE TABLE COMPUTE STATISTICS: https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables%E2%80%93ANALYZE after loading. See also https://stackoverflow.com/a/39914232/2700344 – leftjoin Apr 01 '21 at 09:32