Select statement to find duplicates on certain fields


Can you help me with SQL statements to find duplicates on multiple fields?

For example, in pseudocode:

select count(field1,field2,field3) 
from table 
where the combination of field1, field2, field3 occurs multiple times

From the above statement, if there are multiple occurrences, I would like to select every record except the first one.


#1

Reference: https://stackoom.com/question/IbW2/選擇語句以查找某些字段的重複項


#2

To see duplicate values:

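with MYCTE as (
    -- Number the rows within each group of identical name values
    select row_number() over (partition by name order by name) as rown, *
    from tmptest
)
-- rown > 1 keeps every row after the first in its group, i.e. the duplicates
select * from MYCTE where rown > 1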

#3

CREATE TABLE #tmp
(
    sizeId Varchar(MAX)
)

INSERT  #tmp 
    VALUES ('44'),
        ('44,45,46'),
        ('44,45,46'),
        ('44,45,46'),
        ('44,45,46'),
        ('44,45,46'),
        ('44,45,46')


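-- Show the raw rows
SELECT * FROM #tmp

-- Concatenate every sizeId into one comma-separated string
DECLARE @SqlStr VARCHAR(MAX)

SELECT @SqlStr = STUFF((SELECT ',' + sizeId
              FROM #tmp
              ORDER BY sizeId
              FOR XML PATH('')), 1, 1, '')

-- Split the string back into items and return the most frequent repeated item.
-- dbo.Split is a user-defined table-valued function that is not shown here;
-- it is assumed to return one row per item in a column named items.
SELECT TOP 1 * FROM (
    SELECT items, COUNT(*) AS Occurrence
    FROM dbo.Split(@SqlStr, ',')
    GROUP BY items
    HAVING COUNT(*) > 1
) K
ORDER BY K.Occurrence DESC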
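
The query above depends on a dbo.Split function that the answer does not define. As a rough sketch only, and assuming SQL Server 2016 or later (where the built-in STRING_SPLIT function is available) and the same #tmp table, the same count could be obtained without building the intermediate string:

```sql
-- Assumption: SQL Server 2016+ so that STRING_SPLIT exists.
-- Split each sizeId row in place and count how often each item occurs.
SELECT TOP 1 s.value AS item, COUNT(*) AS Occurrence
FROM #tmp AS t
CROSS APPLY STRING_SPLIT(t.sizeId, ',') AS s
GROUP BY s.value
HAVING COUNT(*) > 1
ORDER BY Occurrence DESC;
```

With the sample rows inserted above, the item 44 appears in all seven rows, so it should be the one returned. This is a swapped-in technique, not the helper the original answer used.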

#4

To get the list of field values that occur in more than one record, you can use:

select field1,field2,field3, count(*)
  from table_name
  group by field1,field2,field3
  having count(*) > 1

Check this link for more information on how to delete the rows.

http://support.microsoft.com/kb/139444

Edit: As the other users mentioned, you need a criterion that defines the "first rows" before you use the approach in the link above. Based on that, you'll need an ORDER BY clause and, if necessary, a subquery. If you can post some sample data, it would really help.
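
For illustration, here is one possible way to delete everything except the first row of each group. It is not the method from the linked article; it is a sketch that assumes SQL Server 2005 or later and a hypothetical id column on table_name that defines which row counts as "first":

```sql
-- Hypothetical: table_name has a column id that orders rows within a group.
WITH numbered AS (
    -- rn = 1 marks the row to keep in each (field1, field2, field3) group
    SELECT ROW_NUMBER() OVER (PARTITION BY field1, field2, field3
                              ORDER BY id) AS rn
    FROM table_name
)
-- Deleting through the CTE removes the underlying rows of table_name
DELETE FROM numbered
WHERE rn > 1;
```

It is probably worth replacing the DELETE with a SELECT first to check which rows would be removed.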


#5

You mention "the first one", so I assume that you have some kind of ordering on your data. Let's assume that your data is ordered by some field ID.

This SQL should get you the duplicate entries except for the first one. It selects every row for which another row exists with (a) the same field values and (b) a lower ID. Performance won't be great, but it might solve your problem.

SELECT A.ID, A.field1, A.field2, A.field3
  FROM myTable A
 WHERE EXISTS (SELECT B.ID
                 FROM myTable B
                WHERE B.field1 = A.field1
                  AND B.field2 = A.field2
                  AND B.field3 = A.field3
                  AND B.ID < A.ID)
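
If performance does become a problem, an index covering the compared columns may help the correlated subquery. The following is only a sketch: the index name is made up, and whether it helps depends on the table size and the indexes that already exist, so the execution plan should be checked.

```sql
-- Hypothetical index name; column names are taken from the query above.
-- Lets the EXISTS lookup seek on the three fields and then compare ID.
CREATE INDEX IX_myTable_dup_check
    ON myTable (field1, field2, field3, ID);
```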

#6

If you're using SQL Server 2005 or later (and the tags on your question indicate SQL Server 2008), you can use ranking functions to return the duplicate records after the first one, in case joins are undesirable or impractical for some reason. The following example shows this in action; it also works with null values in the columns examined.

create table Table1 (
 Field1 int,
 Field2 int,
 Field3 int,
 Field4 int 
)

insert  Table1 
values    (1,1,1,1)
        , (1,1,1,2)
        , (1,1,1,3)
        , (2,2,2,1)
        , (3,3,3,1)
        , (3,3,3,2)
        , (null, null, 2, 1)
        , (null, null, 2, 3)

select    *
from     (select      Field1
                    , Field2
                    , Field3
                    , Field4
                    , row_number() over (partition by   Field1
                                                      , Field2
                                                      , Field3
                                         order by       Field4) as occurrence
          from      Table1) x
where     occurrence > 1

After running this example, notice that the first record of every "group" is excluded, and that records with null values are handled properly.

If you don't have a column available to order the records within a group, you can use the partition-by columns as the order-by columns.
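
As a sketch of that last point, reusing the Table1 example from above, the query might look like this when Field4 is not used for ordering. Because every row in a group ties on the order-by columns, which row is treated as the "first" is then arbitrary:

```sql
-- Same Table1 as above; the partition-by columns double as order-by columns.
-- Rows within a group tie on the ordering, so the excluded "first" row
-- is whichever one the engine happens to number 1.
select *
from  (select Field1, Field2, Field3, Field4,
              row_number() over (partition by Field1, Field2, Field3
                                 order by     Field1, Field2, Field3) as occurrence
       from   Table1) x
where occurrence > 1
```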
