SQL for deleting duplicate rows from a table
Method 1:
Find the duplicated rows in a table: select * from employee where employeeId in (select employeeId from employee group by employeeId having count(employeeId) > 1)
Delete the extra duplicate rows, where duplicates are identified by a single column (employeeId), keeping only the row with the smallest rowid: delete from employee where employeeId in (select employeeId from employee group by employeeId having count(employeeId) > 1) and rowid not in (select min(rowid) from employee group by employeeId having count(employeeId) > 1)
Find the duplicated rows when duplicates are identified by multiple columns: select * from employee e where (e.employeeId, e.phoneNo) in (select employeeId, phoneNo from employee group by employeeId, phoneNo having count(*) > 1)
Delete the extra duplicate rows (multiple columns), keeping only the row with the smallest rowid: delete from employee e where (e.employeeId, e.phoneNo) in (select employeeId, phoneNo from employee group by employeeId, phoneNo having count(*) > 1) and rowid not in (select min(rowid) from employee group by employeeId, phoneNo having count(*) > 1)
Find the extra duplicate rows (multiple columns), excluding the row with the smallest rowid in each group: select * from employee e where (e.employeeId, e.phoneNo) in (select employeeId, phoneNo from employee group by employeeId, phoneNo having count(*) > 1) and rowid not in (select min(rowid) from employee group by employeeId, phoneNo having count(*) > 1)
General form:
delete from table_name t where (t.col1, t.col2, … , t.coln) in (select col1, col2, … , coln from table_name group by col1, col2, … , coln having count(*) > 1) and rowid not in (select min(rowid) from table_name group by col1, col2, … , coln having count(*) > 1)
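To make the general form concrete, here is a minimal sketch that runs it with two columns against an in-memory SQLite database from Python. SQLite is only a stand-in for Oracle (an assumption, chosen because it also exposes a rowid and accepts row-value IN); the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employeeId INTEGER, phoneNo TEXT)")

# Made-up sample data: rowids 1..6 are assigned in insertion order.
conn.executemany(
    "INSERT INTO employee (employeeId, phoneNo) VALUES (?, ?)",
    [(1, "111"), (1, "111"), (1, "222"), (2, "333"), (2, "333"), (2, "333")],
)

# Method 1, general form: restrict to duplicated (employeeId, phoneNo) groups,
# then delete every row except the one with the smallest rowid in each group.
conn.execute(
    """
    DELETE FROM employee
    WHERE (employeeId, phoneNo) IN (
              SELECT employeeId, phoneNo FROM employee
              GROUP BY employeeId, phoneNo HAVING COUNT(*) > 1)
      AND rowid NOT IN (
              SELECT MIN(rowid) FROM employee
              GROUP BY employeeId, phoneNo HAVING COUNT(*) > 1)
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT rowid, employeeId, phoneNo FROM employee ORDER BY rowid"
).fetchall()
print(remaining)  # [(1, 1, '111'), (3, 1, '222'), (4, 2, '333')]
```

The row (1, '222') was never duplicated, so the first condition leaves it alone; the second condition protects the earliest row of each duplicated group.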
In addition:
If you only want to hide duplicates in a query result and do not need to delete anything, select distinct is enough: select distinct col from table_name…
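A small sketch of the difference, again using Python's sqlite3 module as an assumed stand-in for Oracle with made-up rows: select distinct collapses duplicates in the result set only, and the table itself is left untouched.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employeeId INTEGER, phoneNo TEXT)")
conn.executemany(
    "INSERT INTO employee (employeeId, phoneNo) VALUES (?, ?)",
    [(1, "111"), (1, "111"), (2, "333")],
)

# DISTINCT removes duplicates from the query result only.
distinct_rows = conn.execute(
    "SELECT DISTINCT employeeId, phoneNo FROM employee ORDER BY employeeId"
).fetchall()

# The stored rows are unchanged: all three are still in the table.
total = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

print(distinct_rows)  # [(1, '111'), (2, '333')]
print(total)          # 3
```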
Method 2: DELETE FROM hr.employees t1 WHERE t1.ROWID NOT IN ( SELECT MIN(t2.ROWID) FROM hr.employees t2 GROUP BY t2.employee_id /* group by the column(s) that should be unique */ );
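This is clearly much simpler than Method 1: one subquery instead of two. The subquery groups the rows by the column(s) that should be unique and selects the smallest rowid in each group (note that this is the rowid of the subquery's table, t2); the outer NOT IN then deletes every row whose rowid is not one of those minimums.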
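The following is a rough sketch of Method 2 run from Python against an in-memory SQLite database. SQLite stands in for Oracle here (an assumption), so the hr schema prefix is dropped, and the last_name column and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, last_name TEXT)")

# Hypothetical sample data: rowids 1..6 are assigned in insertion order.
conn.executemany(
    "INSERT INTO employees (employee_id, last_name) VALUES (?, ?)",
    [
        (100, "King"),
        (101, "Kochhar"),
        (100, "King"),
        (102, "De Haan"),
        (101, "Kochhar"),
        (100, "King"),
    ],
)

# Method 2: keep the smallest rowid per employee_id, delete everything else.
cursor = conn.execute(
    """
    DELETE FROM employees
    WHERE rowid NOT IN (
        SELECT MIN(t2.rowid) FROM employees t2
        GROUP BY t2.employee_id)
    """
)
deleted = cursor.rowcount  # number of rows removed by the DELETE
conn.commit()

remaining = conn.execute(
    "SELECT rowid, employee_id, last_name FROM employees ORDER BY rowid"
).fetchall()
print(deleted)    # 3
print(remaining)  # [(1, 100, 'King'), (2, 101, 'Kochhar'), (4, 102, 'De Haan')]
```

Unlike Method 1, there is no separate filter for duplicated groups: a row that was never duplicated is simply the minimum of its own group, so it survives.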
Quick to run and easy to follow, right? But did you think that was the end of it? No, no, no.
Method 3: DELETE FROM hr.employees t1 WHERE t1.rowid > ( SELECT MIN(t2.rowid) FROM hr.employees t2 WHERE t1.employee_id = t2.employee_id /* match on the column(s) that should be unique */ );
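This looks much like Method 2, but the point is that it matches t1 against t2 with a correlated subquery (in effect a self-join on the key) instead of GROUP BY. I won't claim that this is always faster than GROUP BY, but it rarely loses to it, and in typical cases it is the fastest of the three. With an index on employee_id, the inner lookup can use that index, which is where most of the speed comes from.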
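The logic, as before: the subquery correlates t1 and t2 on the column(s) that should be unique and picks the smallest rowid among the matching rows (again, the rowid of the subquery's table); the outer ">" then deletes every row whose rowid is larger, so only the row with the smallest rowid in each group survives.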
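As a sketch of Method 3 under the same assumptions (Python's sqlite3 standing in for Oracle, no hr schema prefix, hypothetical last_name column and sample rows), the script below also checks that the survivors are exactly the per-group minimum rowids that Method 2 would keep. The index name is made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, last_name TEXT)")
conn.executemany(
    "INSERT INTO employees (employee_id, last_name) VALUES (?, ?)",
    [
        (100, "King"),
        (101, "Kochhar"),
        (100, "King"),
        (102, "De Haan"),
        (101, "Kochhar"),
        (100, "King"),
    ],
)

# Hypothetical index so the correlated lookup on employee_id can use it.
conn.execute("CREATE INDEX idx_employees_employee_id ON employees (employee_id)")

# The rowids Method 2 would keep: the smallest rowid of each employee_id group.
expected = sorted(
    row[0]
    for row in conn.execute(
        "SELECT MIN(rowid) FROM employees GROUP BY employee_id"
    )
)

# Method 3: delete each row whose rowid is larger than the smallest rowid
# among the rows sharing its employee_id. Inside the subquery the table is
# aliased as t2, so employees.employee_id refers to the outer row.
conn.execute(
    """
    DELETE FROM employees
    WHERE employees.rowid > (
        SELECT MIN(t2.rowid) FROM employees t2
        WHERE t2.employee_id = employees.employee_id)
    """
)
conn.commit()

survivors = [
    row[0] for row in conn.execute("SELECT rowid FROM employees ORDER BY rowid")
]
print(survivors)              # [1, 2, 4]
print(survivors == expected)  # True
```

Whether the index actually helps depends on the database and the data; timing both methods on a realistic copy of the table is the safer way to decide.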