
Deleting a large number of duplicate rows in Oracle (table without indexes)


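We have a table, dmp_result, with no primary key and no indexes. The combination of DATA_RECORD_ID, ORGANIZATION_ID, and DATA_TIME should in theory be unique. To optimize the table we want to add a composite unique constraint on these three columns, which first requires deleting the duplicate rows.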

dmp_result holds about 4 million rows, of which roughly 30,000 are duplicates.

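The following query returns the duplicate rows, which are the rows to be deleted (within each group of duplicates, the row with the smallest rowid is kept):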

select *
  from dmp_result a
 where (a.DATA_RECORD_ID, a.ORGANIZATION_ID, a.DATA_TIME) in (
         select DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME
           from dmp_result
          group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME
         having count(*) > 1)
   and a.rowid not in (
         select min(rowid)
           from dmp_result
          group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME
         having count(*) > 1);

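Running the query above directly as a DELETE takes far too long on a table of this size with no indexes, so it is not a practical option.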
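For reference, the direct delete that the text advises against would look roughly like the sketch below. It is a simplified but equivalent form of the query above (a row that is not the smallest rowid in its key group is by definition a duplicate), and it assumes none of the three key columns contain NULLs.

```sql
-- Sketch only: the slow, direct approach.
-- Deletes every row except the one with the smallest ROWID
-- in each (DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME) group.
delete from dmp_result a
 where a.rowid not in (
         select min(b.rowid)
           from dmp_result b
          group by b.DATA_RECORD_ID, b.ORGANIZATION_ID, b.DATA_TIME);
```

With no index on the key columns, both the grouping and the rowid lookup scan the whole table, and the delete generates undo and redo for every removed row.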

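Instead, create two temporary tables. Load the rows to be deleted into one, then load the difference between the original table and those rows, which is the data we want to keep, into the other. (The temporary tables have no constraints, so be careful not to load the same data twice.)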

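Create the temporary tables dmp_result1 and dmp_result2 as empty copies of dmp_result. The where 1 = 2 predicate copies the column structure without copying any rows; without it, both tables would start out holding all of the original data and the later steps would produce wrong results:

create table dmp_result1 as select * from dmp_result where 1 = 2;
create table dmp_result2 as select * from dmp_result where 1 = 2;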

Reduce redo log generation:

alter table dmp_result1 nologging; 
alter table dmp_result2 nologging; 

Insert the rows to be deleted into dmp_result1. A direct-path (append) insert must be committed before the same session can query the table again:

insert /*+ append */ into dmp_result1
select *
  from dmp_result a
 where (a.DATA_RECORD_ID, a.ORGANIZATION_ID, a.DATA_TIME) in (
         select DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME from dmp_result
          group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME having count(*) > 1)
   and a.rowid not in (
         select min(rowid) from dmp_result
          group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME having count(*) > 1);
commit;
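Before going further, it may be worth checking that dmp_result1 holds what is expected. The two queries below are a sanity-check sketch and assume the insert has been committed. The first counts the staged rows, and the second computes the expected number independently as total rows minus distinct key groups. The two numbers should match, at roughly 30,000 for the table described here.

```sql
-- Rows staged for removal.
select count(*) as rows_to_remove
  from dmp_result1;

-- Independent cross-check from the source table:
-- total row count minus the number of distinct key groups.
select (select count(*) from dmp_result)
     - (select count(*)
          from (select 1
                  from dmp_result
                 group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME))
       as expected_rows_to_remove
  from dual;
```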

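Insert the difference into dmp_result2. Note that MINUS compares entire rows and returns distinct rows, so this step assumes that the duplicates of a key differ in at least one other column. A row that should be kept but is identical in every column to a row in dmp_result1 would be removed as well:

insert /*+ append */ into dmp_result2
select * from dmp_result
minus
select * from dmp_result1;
commit;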
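A quick verification of dmp_result2 before swapping it in might look like the sketch below, assuming the insert has been committed. The first query should return no rows, and the second should return 0. A nonzero missing_keys value means some key combinations present in dmp_result did not make it into dmp_result2, which is what happens when MINUS removes copies that are identical in every column.

```sql
-- Should return no rows: any row listed here is a key that is still duplicated.
select DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME, count(*) as copies
  from dmp_result2
 group by DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME
having count(*) > 1;

-- Should return 0: number of key combinations that exist in the
-- original table but are absent from the deduplicated copy.
select count(*) as missing_keys
  from (select DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME from dmp_result
        minus
        select DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME from dmp_result2);
```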

Afterwards, either swap dmp_result2 in as the new dmp_result, or truncate dmp_result and reload it from dmp_result2.

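To swap the tables, rename them; the unique constraint can then be created on the new dmp_result. The old data remains available as dmp_result3: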

alter table dmp_result rename to dmp_result3;
alter table dmp_result2 rename to dmp_result;
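Once the deduplicated data is in place under the name dmp_result, the composite unique constraint that motivated the cleanup can be added. The statement below is a minimal sketch; the constraint name uk_dmp_result is hypothetical and not from the original article. Because the swapped-in table was set to nologging earlier, the sketch also switches redo logging back on, on the assumption that normal logging is wanted for day-to-day use.

```sql
-- uk_dmp_result is a hypothetical constraint name chosen for illustration.
-- Oracle creates a supporting unique index automatically.
alter table dmp_result
  add constraint uk_dmp_result
  unique (DATA_RECORD_ID, ORGANIZATION_ID, DATA_TIME);

-- Restore normal redo logging on the table.
alter table dmp_result logging;
```

If any duplicate keys remain, the alter table statement fails with a duplicate-key error rather than silently accepting them, so it also serves as a final check.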

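Alternatively, instead of renaming, keep the original table and reload it from dmp_result2. Truncating a table is risky because it cannot be rolled back, so take a backup first:

truncate table dmp_result;
insert /*+ append */ into dmp_result
select * from dmp_result2;
commit;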