0

Searched a bit and didn't see anything that really fit my needs (although I don't really understand SQL beyond the basic select statement so maybe I just missed it). Took a SQL class in school many moons ago.

I have:

  • Virtual Windows Server 2008 R2 running SQL 2008 R2. I connect via MS SQL Server Management Studio
  • 90-100 databases
  • All DB's have the same table structure (each one is a different client)

I want to:

  • Search all databases and return a list of all databases which have a table (TableName) that is larger than say, 5000 records.
  • Some sort of script that I can schedule that will use that list and if it finds a DB that the TableName table is more than 5000 records, will delete anything older than x amount of days (say 30). Any kind of logging to know what happened over the last few days would be a bonus.

Any help would be appreciated. Thank you.

EDIT/UPDATE (2/24/15): Hiren Dhaduk provided a nice stored procedure that works. Thanks!

Eli
  • 153
  • 1
  • 3
  • 20

2 Answers2

0

You can use following store procedure in any database. Then run this store procedure by passing table name and NoOfRows(Minimum number of rows for table , In your case it will be 5000):

CREATE PROCEDURE usp_FindLargeTables
@TableName VARCHAR(256) , @NoOfRows int
AS
BEGIN
    DECLARE @DBName VARCHAR(256)
    DECLARE @varSQL VARCHAR(512)
    DECLARE @getDBName CURSOR

SET @getDBName = CURSOR FOR
SELECT name
FROM sys.databases
WHERE state != 6

CREATE TABLE #TmpTable (TableName VARCHAR(256),
SchemaName VARCHAR(256),
DBName VARCHAR(256))

OPEN @getDBName
FETCH NEXT
FROM @getDBName INTO @DBName
WHILE @@FETCH_STATUS = 0
BEGIN

    SET @varSQL =  'USE ' + @DBName + ';
    INSERT INTO #TmpTable
    SELECT 
    sysobjects.Name as TableName
    , sysindexes.Rows as NoOfRows , '''+ @DBName + ''' AS DBName
    FROM
    sysobjects
    INNER JOIN sysindexes
    ON sysobjects.id = sysindexes.id
    WHERE
    type = ''U''
    AND sysindexes.IndId < 2 
    and sysobjects.Name = '''+ @TableName +''' and sysindexes.Rows > ' + CONVERT(VARCHAR, @NoOfRows) + ''

EXEC (@varSQL)
FETCH NEXT
FROM @getDBName INTO @DBName
END
CLOSE @getDBName
DEALLOCATE @getDBName

SELECT *
FROM #TmpTable
WHERE DBName != 'master'

-- STEP 2
DECLARE @DYNAMICQUERY VARCHAR(MAX)

SET @DYNAMICQUERY = 
REPLACE((
SELECT 'DELETE FROM ['+ DBName +'].[dbo].['+ TableName +'] where Createdate   < DATEADD(day, -30, GETDATE());'
FROM #TmpTable
WHERE DBName != 'master'
FOR XML PATH('')
), '&lt;' , '<');

EXEC(@DYNAMICQUERY);

DROP TABLE #TmpTable
END

Example : usp_FindLargeTables 'DeltaStuds',5000

Clarification on your second point :

Unless you run a trace when the changes happen it is not possible.

So to do this i would suggest you to put one column called createdate in this table and then you will be able to delete record created before 30 days.

Hiren Dhaduk
  • 2,760
  • 2
  • 19
  • 21
  • Thanks @Hiren Dhaduk, I'll also test this out and report back. – Eli Jan 29 '15 at 21:51
  • Ok, Please share if you find something about getting insert date of rows without maintaining a column – Hiren Dhaduk Jan 29 '15 at 21:54
  • Hi @Hiren Dhaduk, the first query you provided is only for the one DB...not for all DB's. It works, but only for the DB I ran the query against. – Eli Jan 30 '15 at 18:11
  • Also, @Hiren Dhaduk, I tried to wrap your first query with a sp_MSforeachdb 'your query' but it returned blank results. – Eli Jan 30 '15 at 18:18
  • Might your tables don't have more then 5000 rows. – Hiren Dhaduk Jan 30 '15 at 18:25
  • I know at least 2 of the databases have DeltaStuds tables that are 150,000+ records. I didn't get those in the list/results of the query. @hiren dhaduk – Eli Jan 30 '15 at 18:29
  • I think you should run this query on master database so that it will give you result of all database. Otherwise it will give result based on selected database. – Hiren Dhaduk Jan 30 '15 at 18:57
  • @llya I just modified my ans. I checked this store procedure in my sql server , it just worked nicely. – Hiren Dhaduk Jan 30 '15 at 20:15
  • Thanks @hiren dhaduk, I'll give this a try on Monday when I'm back in the office. Appreciate the assistance. – Eli Jan 30 '15 at 21:59
  • Thank you for your help @Hiren Dhaduk...looks like that began to work however it failed on a database that was offline. Is there a way to skip any offline databases and only report back against online ones? lol, sorry for asking for so much help. You guys are the best and know a lot more than me. Thank you! – Eli Feb 03 '15 at 18:50
  • You can use below query to check database status. SELECT Name, state_desc FROM sys.databases – Hiren Dhaduk Feb 03 '15 at 21:09
  • @liya I modified my ans by adding filter for offline databases. You can check condition "SELECT name FROM sys.databases WHERE state != 6" this query will select only online databases. – Hiren Dhaduk Feb 03 '15 at 21:12
  • Thanks @Hiren Dhaduk. No errors this time, looks like it omitted the offline databases, which is excellent. Results were blank though...and I know I have at least two DB's with huge amounts of records. Just to clarify, I right click on the SP and do 'execute' and then type in the table name and the value...right? – Eli Feb 03 '15 at 21:48
  • Yes , Tables which have more then 5000 records but those databases might be offline. – Hiren Dhaduk Feb 04 '15 at 08:14
  • The ones I'm trying to see are online, but still resulting in a blank result... @Hiren Dhaduk – Eli Feb 04 '15 at 14:46
  • As a test, I ran the SP with a value of 1 (instead of 5000) and it still returned a blank results list. @Hiren Dhaduk – Eli Feb 04 '15 at 14:47
  • Sorry Liya , I forgot to put dynamic table name in store project. I updated my ans. Modified - http://prntscr.com/60z6hg – Hiren Dhaduk Feb 04 '15 at 15:15
  • Awesome! That worked! Now I just need to figure out how to delete the data. I would need something like this: DELETE FROM TableName WHERE ChangeDateTime > DATEADD(day, -30, GETDATE())...if I was to create a job, would it be possible to have your query above as step 1 and then someone be able to have the delete as step 2? @Hiren Dhaduk – Eli Feb 04 '15 at 16:57
  • @llya I just edited my ans by adding some lines which will do delete operation in save store procedure. http://prntscr.com/610zty – Hiren Dhaduk Feb 04 '15 at 17:33
  • Thanks @Hiren Dhaduk...would I be able to split the scripts up and have one that is the original (search for tables) and another one that is just the dynamicquery declaration/statements for the delete? Would the #TmpTable be a variable that can be accessed if I don't drop it in the original query first? – Eli Feb 04 '15 at 17:56
  • @llya actually that will not be proper approch to share temp table with two store procedure. You can do that though. – Hiren Dhaduk Feb 04 '15 at 18:39
  • Okay, I was running into errors with doing them separate so I tried to do them together and got the follwing error: Msg 4145, Level 15, State 1, Line 1 An expression of non-boolean type specified in a context where a condition is expected, near ';'. @Hiren Dhaduk – Eli Feb 04 '15 at 18:51
  • Put print(@DYNAMICQUERY) instead of EXEC(@DYNAMICQUERY). Let me know which query do you see after running store procedure. – Hiren Dhaduk Feb 04 '15 at 19:12
  • That worked correctly, I got the result I expected @Hiren Dhaduk – Eli Feb 04 '15 at 19:28
  • Do you have any more suggestions on how I can get the stored procedure to delete the contents that are older than X days of the tables returned by your (amazing) query? @Hiren Dhaduk – Eli Feb 09 '15 at 15:28
  • http://prntscr.com/631oip - this query will delete past 30 days records for you. I modified store procedure. Let me know if you need any further help. – Hiren Dhaduk Feb 09 '15 at 17:28
  • Hi @Hiren Dhaduk...I'm back to getting the Msg 4145, Level 15, State 1, Line 1 An expression of non-boolean type specified in a context where a condition is expected, near ';'. error. This was the initial issue you fixed by having my replace EXEC with PRINT...but PRINT I'm assuming isn't actually deleting anything? – Eli Feb 09 '15 at 20:13
  • http://prntscr.com/634kuu - modified it. It should work now. Your table must contain createdate column though. – Hiren Dhaduk Feb 09 '15 at 20:30
  • slightly different issue. It found the table, but didn't delete it. Got this error: Msg 208, Level 16, State 1, Line 1 Invalid object name 'DeltaStuds'. – Eli Feb 11 '15 at 20:22
  • Delete query don't know that which database to use to delete data. - http://prntscr.com/63yyeo I added use statement in delete query. it should work now. I don't have ready made databases structure so its getting difficult for me to verify query. – Hiren Dhaduk Feb 11 '15 at 22:26
  • You're getting close @Hiren Dhaduk...it didn't return an error this time. It listed the table that would be affected in the results window but when I run a basic select against that test table, the data is still there...so it didn't actually delete anything. – Eli Feb 12 '15 at 14:36
  • Createdate > DATEADD(day, -30, GETDATE()) , This query affect delete query. Are you sure that table contain data older then 30 days ? – Hiren Dhaduk Feb 12 '15 at 16:17
  • Yes, because when I do a SELECT * FROM TABLENAME WHERE CreateDate > DATEADD(day, -30, GETDATE()) it returns a list of records. It just won't delete them using the same where clause. @Hiren Dhaduk – Eli Feb 12 '15 at 17:33
  • I don't see any issue with this script. If you can provide remote access of database then i can figure it out. – Hiren Dhaduk Feb 12 '15 at 18:03
  • Okay, I don't think my manager would allow for that so I'll keep tinkering with it. Thanks for all of your help! @Hiren Dhaduk – Eli Feb 12 '15 at 20:29
  • No problem. I will try to generate same senerio in my local machine , will share with you if i find anything. – Hiren Dhaduk Feb 12 '15 at 20:30
  • Have you had a chance to test it on your local machine @Hiren Dhaduk? I haven't been able to figure out why it isn't actually deleting anything. – Eli Feb 20 '15 at 14:17
  • Just tried this in my local environment. I modified store procedure. http://prntscr.com/69a9qw - Remove old store procedure , use latest one that i modified. It worked in my environment. – Hiren Dhaduk Feb 24 '15 at 16:55
0

What you are asking is very custom to your setup...

You could use the following query to identify tables with row count greater than 5000

SELECT sc.name +'.'+ ta.name TableName
 FROM sys.tables ta
 INNER JOIN sys.partitions pa
 ON pa.OBJECT_ID = ta.OBJECT_ID
 INNER JOIN sys.schemas sc
 ON ta.schema_id = sc.schema_id
 WHERE ta.is_ms_shipped = 0 AND pa.index_id IN (1,0)
 and ta.name = 'DeltaStuds'
 GROUP BY sc.name,ta.name
 having SUM(pa.rows)>5000

But you would need something like this link to run it for all DBs. How to find column names for all tables in all databases in SQL Server. I dont know of a easier way...

For the second part, you will have to setup a delete process that takes the resultset from the above query, creates a dynamic query and deletes rows based on your date field. In that process, you could inject an audit mechanism that logs that delete action, row count, datetime, etc...

Community
  • 1
  • 1
SoulTrain
  • 1,904
  • 1
  • 12
  • 11
  • Thanks @SoulTrain. I'll test this out and report back. – Eli Jan 29 '15 at 21:51
  • Hey @SoulTrain, I tried to wrap your query with a sp_MSforeachdb 'your query' but it returned blank results. It ran, didn't error, but just returned a bunch of column headings. – Eli Jan 30 '15 at 18:22
  • I know at least two databases have a DeltaStuds table that is 150,000+ records (that's why I'm trying to figure this out, so I can run a scheduled process that deletes most if not all of those records as it's causing errors once it gets that large). @SoulTrain – Eli Jan 30 '15 at 18:30