Wednesday, April 20, 2011

Using System Tables to Count the Percent of Rows Null in Various Tables

I'm trying to determine the percent of null items for all fields of a table. I have to run this on several tables with a ton of fields, and I was looking for an automated way of doing this.

I know I can query "information_schema.columns" and get a nice clean list of field names ala:

select Column_name 
from information_schema.columns 
where table_name='TableName'

But I can't seem to come up with something quick and dirty to do the percentage count for each field, I'm guessing I'll need some dynamic sql of some sort? Anyone have a suggestion on a good way to approach this?

From stackoverflow
  • AFAIK the only way to do it is to use dynamic sql (e.g., sp_executesql). There are index statistics but nulls aren't stored in indexes...

  • Maybe too simplistic, but the basic idea can be expanded in different ways (I normally have variables for @CRLF and @TAB for code generation):

    DECLARE @sql AS varchar(MAX)
    
    SELECT @sql = COALESCE(@sql + CHAR(13) + CHAR(10) + 'UNION' + CHAR(13) + CHAR(10), '')
        + 'SELECT ''' + QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME) + '.' + QUOTENAME(COLUMN_NAME) + ''' AS COLUMN_NAME' + CHAR(13) + CHAR(10)
         + CHAR(9) + ',COUNT(*) AS TotalRows' + CHAR(13) + CHAR(10)
         + CHAR(9) + ',COUNT(' + COLUMN_NAME + ') AS NonNullCount' + CHAR(13) + CHAR(10)
        + 'FROM ' + QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS WHERE IS_NULLABLE = 'YES'
    
    PRINT @sql
    EXEC (@sql)
    

    As far as your percentages, I wasn't sure if that was over the entire table or a particular column only, so I'll leave that as an exercise for the reader.

0 comments:

Post a Comment