Oracle: deleting a large number of duplicate rows (table without indexes)
阿新 • Published: 2020-12-09
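The table dmp_result has no primary key and no indexes. In theory the combination of DATA_RECORD_ID, ORGANIZATION_ID and DATA_TIME should be unique. To optimize the table I want to add a composite unique constraint on these three columns, which means the duplicate rows have to be deleted first.
dmp_result holds about 4 million rows, of which roughly 30,000 are duplicates.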
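Query the duplicate rows, i.e. the rows to be deleted (every row in a duplicate group except the one with the smallest rowid):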
select * from dmp_result a
where (a.DATA_RECORD_ID, a.ORGANIZATION_ID, a.DATA_TIME) in (
        select DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME from dmp_result
        group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME having count(*) > 1)
  and a.rowid not in (
        select min(rowid) from dmp_result
        group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME having count(*) > 1);
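Before deleting anything, it can help to confirm the scale of the problem. The following is a minimal sketch, not part of the original procedure, that counts the duplicate key groups and the surplus rows; the column aliases are illustrative.

```sql
-- Count duplicate key groups and the surplus rows that would be removed
-- (every row in a group beyond the first).
select count(*)     as duplicate_groups,
       sum(cnt - 1) as rows_to_delete
from (
  select count(*) as cnt
  from dmp_result
  group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME
  having count(*) > 1
);
```

rows_to_delete should equal the number of rows returned by the duplicate query above.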
Turning the SQL above directly into a delete would take far too long on a table with no indexes, so that is not a practical option.
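Instead, create two temporary work tables: load the rows to be deleted into one, and load the difference, which is the data we actually want, into the other. (The work tables have no constraints, so be careful not to load the same data twice.)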
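Create the work tables dmp_result1 and dmp_result2 as empty copies of dmp_result (structure only, no rows):
create table dmp_result1 as select * from dmp_result where 1 = 2;
create table dmp_result2 as select * from dmp_result where 1 = 2;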
Reduce redo log generation:
alter table dmp_result1 nologging;
alter table dmp_result2 nologging;
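Note that nologging only reduces redo for direct-path operations such as insert /*+ append */, and it is ignored when the database runs in FORCE LOGGING mode. A quick check, assuming the current user is allowed to query v$database:

```sql
-- YES means NOLOGGING is ignored database-wide.
select force_logging from v$database;

-- Confirm the attribute on the two work tables (LOGGING should show NO).
select table_name, logging
from user_tables
where table_name in ('DMP_RESULT1', 'DMP_RESULT2');
```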
Load the rows to be deleted into dmp_result1:
insert /*+ append */ into dmp_result1
select * from dmp_result a
where (a.DATA_RECORD_ID, a.ORGANIZATION_ID, a.DATA_TIME) in (
        select DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME from dmp_result
        group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME having count(*) > 1)
  and a.rowid not in (
        select min(rowid) from dmp_result
        group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME having count(*) > 1);
-- a direct-path insert must be committed before this session can read dmp_result1 again
commit;
Load the difference (the rows to keep) into dmp_result2:
insert /*+ append */ into dmp_result2
select * from dmp_result
minus
select * from dmp_result1;
commit;
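One caveat: minus compares entire rows and returns distinct rows. If some duplicates are identical in every column, the minus step also drops the copy that should be kept. A possible alternative, offered as a sketch rather than the author's method, is to load dmp_result2 directly with one row per key by keeping the smallest rowid; dmp_result1 is not needed for this variant.

```sql
-- Alternative to the minus step: keep exactly one row (smallest rowid) per key.
-- Assumes dmp_result2 is empty before the load; run this instead of, not after, the minus insert.
insert /*+ append */ into dmp_result2
select *
from dmp_result
where rowid in (
  select min(rowid)
  from dmp_result
  group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME
);
commit;
```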
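Afterwards, either swap dmp_result2 in for dmp_result by renaming, or truncate dmp_result and reload it from dmp_result2.
To swap by renaming, rename the tables and then rebuild the constraints on the new dmp_result: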
alter table dmp_result rename to dmp_result3;
alter table dmp_result2 rename to dmp_result;
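With the deduplicated table in place, the composite unique constraint that motivated the cleanup can be added. This is a minimal sketch: the constraint name uk_dmp_result is a placeholder, and switching the table back to logging is a suggestion, since the nologging attribute stays with the renamed table.

```sql
-- Sanity check: should return no rows if the data is now unique on the key.
select DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME, count(*)
from dmp_result
group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME
having count(*) > 1;

-- Restore normal redo logging for day-to-day use.
alter table dmp_result logging;

-- uk_dmp_result is a hypothetical name; pick one that fits your naming scheme.
alter table dmp_result
  add constraint uk_dmp_result
  unique (DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME);
```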
To truncate and reload instead (truncating a table is risky, so make a backup first):
truncate table dmp_result;
insert /*+ append */ into dmp_result
select * from dmp_result2;
commit;
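The backup mentioned above can be as simple as a copy of the table, taken before the truncate. A sketch using the dmp_result naming from the rest of the article, where dmp_result_bak is a hypothetical name; truncate is DDL and cannot be rolled back, so a copy or an export is the only way back.

```sql
-- Run BEFORE the truncate: keep a full copy of the original table.
create table dmp_result_bak as select * from dmp_result;

-- Run AFTER the reload: the difference should equal the number of removed duplicates.
select (select count(*) from dmp_result_bak) as before_rows,
       (select count(*) from dmp_result)     as after_rows
from dual;
```

Drop dmp_result_bak once the reloaded data has been verified.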