Monday, September 9, 2013

Remove/Delete Duplicate Records From Database

This post explains how to Remove Delete Duplicate Records Or Rows From Ms Sql Server Database Table.

Different methods of deleting duplicate records.

Delete Duplicate Records In Sql Server
I am using Employees table with FirstName and Department columns.








Remove Duplicate Rows In Sql
First Method.

Delete duplicate records/rows by creating identity column.


duplicate records in table looks like shown in first image.

First of all we need to create a identity column in our table by using code mentioned below.

And table will look like image on the left.



ALTER TABLE dbo.Employees ADD ID INT IDENTITY(1,1)

Now write this query to delete duplicate rows.

DELETE FROM dbo.Employees
WHERE ID NOT IN (SELECT MIN(ID)
FROM dbo.Employees GROUP BY FirstName,Department)

This should remove all duplicate records from table.


Second Method.

Delete duplicate records using Row_Number()


If you do not want to make any changes in table design or don't want to create identity column on table then you can remove duplicate records using Row_Number in sql server 2005 onwards.

for this write below mentioned code and execute.
WITH DuplicateRecords AS
(
SELECT *,row_number() OVER(PARTITION BY FirstName,Department ORDER BY
FirstName)AS RowNumber FROM dbo.Employees
)

DELETE FROM DuplicateRecords WHERE RowNumber > 1

This should remove all duplicate records from table.


Third Method.

Remove duplicate rows/Records using temporary table


Use below mentioned code to delete duplicates by moving them to temporary table using DISTINCT.

SELECT DISTINCT * INTO TempTable FROM dbo.Employees
GROUP BY FirstName,Department
HAVING COUNT(FirstName) > 1
DELETE dbo.Employees WHERE FirstName
IN (SELECT FirstName FROM TempTable)

INSERT dbo.Employees SELECT * FROM TempTable
DROP TABLE TempTable


Remove delete Duplicate Rows In Sql
And result will be as shown.

Have fun.


0 comments:

Post a Comment

Twitter Delicious Facebook Digg Stumbleupon Favorites More