SQL Cheatsheet
Debugging Duplicates In A Dataset
Sometimes you are working on a query and you want to debug whether there are duplicates in your table.
select
id
count(*)
group by id
having count(*) > 1
The query will show all the rows (grouped by id) with counts greater than 1.
Unintentional duplicates can occur after a join. Instead of fixing this by using select distinct
(which is a code smell), try to figure out why the duplication is happening.
Once you find the rows with duplicates, you can then select *
for those records to see if there is any relationship between them that explains the duplication.