0

Sorry for the vague Problem statement, maybe I could not frame it the right way hence couldn't get much help from the internet.

My basic intent is to select two columns from a table, where the table has been partitioned based on dates, and has huge number of records.

The intent is to select records for each date from jan 1 to nov 30, can't do it using date between statement, I need to input each date separately in the iteration in for loop.

Can anyone help writing a dummy sql query on this?

Thanks,

khanna
  • 718
  • 10
  • 24
  • as a pseudocode, I'd want for (date in DATES)SELECT col1,col2 from TABLE where datecol is date. Thanks! – khanna Dec 02 '14 at 07:13
  • I know this, but datatable is soo huge! I have to access just one partition for a query, it is a query optimization problem basically. – khanna Dec 02 '14 at 07:25
  • I dont think that you will win anything by using a forloop over the "GROUP BY" as the sqldatabase still needs to look throw the hole table. – Benjamin Karlog Dec 02 '14 at 07:26
  • Why do you need to run one query for every date? Do you get a spool error when you run a single query? Do you need to process all rows for a given date or just some? – dnoeth Dec 02 '14 at 17:46

3 Answers3

1

you can use any scripting language to iterate over each day to achieve. below is a sample (untested code)

import pyodbc

conn = pyodbc.connect(DSN='dsnname',autocommit=True,ansi=True)
cur = conn.cursor()
sql = "INSERT INTO tgt_table SELECT col1,col2 FROM src_table WHERE partitioned_date_column = date '2014-01-01' + INTERVAL '%d' DAY"

for i in range(1,30):
    cur.execute(sql % i)
rogue-one
  • 11,259
  • 7
  • 53
  • 75
  • 2
    Thanks man , I dont know who you are, but you have been an immense help, You made my day, Thanks a ton ! – khanna Dec 02 '14 at 11:35
0

I dont know how your tables look like, but have you tryed something like this:

"WHERE {yourRow} > X GROUP BY date"

MySQL Query GROUP BY day / month / year

Hope this can help you :-)

SELECT col1, col2 FROM {table} WHERE datecol between '2000-01-01' and DATE_ADD('2000-01-31',INTERVAL 1 DAY)

OR

SELECT col1, col2, FROM {table} GROUP BY DAY(datecol)
Community
  • 1
  • 1
Benjamin Karlog
  • 330
  • 1
  • 6
  • 22
  • I know this, but datatable is soo huge! I have to access just one partition for a query, it is a query optimization problem basically. – khanna Dec 02 '14 at 07:25
  • Ok,so this interval 1 day, does it only access one partition of the datatable, As in, it shouldnt be spanning a query over few datapartitions. – khanna Dec 02 '14 at 07:26
  • I have not read all the documentation on this subject, but I would assume that the first example is the fastest as it first look at the date, and if date is out of range then break, else it would select. I think you will need to test for yourself of what is the fastest way to do it. (WHERE vs GROUP BY) please let me know what your benchmarks are saying. – Benjamin Karlog Dec 02 '14 at 07:32
0

Assuming you use oracle, this dummy select should give you the list of dates to join

select to_date('01.01.2014','dd.mm.yyyy') + level - 1
from dual
connect by level -1 <= to_date('30.11.2014','dd.mm.yyyy') - to_date('01.01.2014','dd.mm.yyyy')    
Dmitry Grekov
  • 688
  • 3
  • 12
  • Very nice, but then for each such date i want the records eg tuple col1,col2,datecol from the table, where datecol is that date in the iterator. I have to write basically sql query on teradata – khanna Dec 02 '14 at 07:23