Results 1 to 3 of 3

Thread: identifying duplicate records...

  1. #1
    Join Date
    Oct 2002
    Posts
    5

    identifying duplicate records...

    I want to identify every record where a particular field matches the value in the same field in another record (in the same table).

    Example:

    Say I have the following table:

    id | registrationName | registrationEmail
    ------------------------------------------------------------
    123 | Michael | myoung@lehmans.com
    456 | Michael2 | someOtherEmail@bogus.com
    789 | Michael3 | myoung@lehmans.com

    I want a query to identify the 1st and 3rd record, because those records have identical email addresses. Ideally, it would be really cool if I could select only the 3rd record because it has the higher id number.

    In reality, I am dealing with a database that has over 200,000 records, so I obviously don't know each email address before hand.

    Thanks in advance to anybody who can help me with this!

    Sincerely,
    Michael

  2. #2
    Join Date
    Oct 2002
    Posts
    1
    I've recently done something almost identical to this and found the following MSDN article extremely helpful:

    http://msdn.microsoft.com/library/de...ml/ima0056.asp

    Hope this helps!

  3. #3
    Join Date
    Oct 2002
    Posts
    6
    Michael, do I have an exact answer for your or what. Following is the stored procedure I wrote, does exact same thing. It looks for a duplicate email address in the same database and deletes it. If you provide the parmeter 'List', it will also show you the records it found duplicated that it has deleted.

    /* Bikram Mann, 10/15/2002
    Deletes duplicate email address.
    It uses a 'correlated subquery to query same table and match records.
    */

    CREATE PROCEDURE up_DELETE_DuplicateEmails (@ShowList char(4) = NULL) AS

    IF NOT @ShowList IS NULL
    IF @ShowList <> 'List'
    BEGIN
    PRINT 'Syntax: EXECUTE up_DELETE_DuplicateEmails List'
    RETURN
    END
    ELSE

    BEGIN
    SELECT Email_Address_ID AS ID,
    Email_address AS Email,
    Send,
    State,
    Sent AS Last_Sent,
    Last_Update
    FROM email_address
    WHERE EXISTS
    (SELECT email_address FROM email_address Email_Inner
    WHERE Email_Inner.email_address = email_address.email_address
    AND Email_Inner.email_address_ID < email_address.email_address_ID)
    END

    DELETE FROM email_address
    WHERE EXISTS
    (SELECT email_address FROM email_address Email_Inner
    WHERE Email_Inner.email_address = email_address.email_address
    AND Email_Inner.email_address_ID < email_address.email_address_ID)
    GO

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •