© 2017 by awari.com.au

  • Black Twitter Icon
  • Black LinkedIn Icon

Finding duplicate rows in a table

April 26, 2016

Hey all, how many times have you been data profiling and had the question, what data has been duplicated data in a table ?

 

It's pretty easy to find out quickly using the HAVING clause with the GROUP BY statement.

 

Below is an example query that you can use to do this, just replace the table and columns to suit your data requirements.

 

Code ...

 

-- This query will return the business key values that are duplicated
-- across the Member table in the sample fictitious Insurance database, email me if you require the db

 

select
     count(member_biz_key) as BizKeyCount                -- Select the column that is to be tested for duplicates
    ,member_biz_key                                                       -- and show the column(s) required to view                                           ,gender

from    

    dbo.Member                                                               -- Specify your table of interest

group by    

   member_biz_key                                                         -- Ensure you group by the selected columns,

  ,gender

having    

   count(member_biz_key) > 1                                       -- Use the having filter where the count is greater than 1           

 

The below shot is the query result, in all there are 225 duplicated keys out of 10,663 rows, these duplicates are going to mess with your data integrity!

 

 

                                     
 

Please reload

Featured Posts

Over 20K students enrolled across all of my course presentations

April 21, 2017

1/4
Please reload

Recent Posts

February 19, 2017

Please reload

Archive
Please reload

Search By Tags