How to find duplicate rows with PostgreSQL?

How to find duplicate rows with PostgreSQL?

Asked on October 26, 2018 in Database.
Add Comment


  • 5 Answer(s)

    Try this code, it will be useful:

    select * from (
        SELECT id,
        ROW_NUMBER() OVER(PARTITION BY merchant_Id, url ORDER BY id asc) AS Row
        FROM Photos
    )   dups
    where
    dups.Row > 1
    
    Answered on October 26, 2018.
    Add Comment

    Here is an updated version of his answer using Postgres

    DELETE FROM Photos AS P1
    USING Photos AS P2
    WHERE P1.id > P2.id
        AND P1.merchant_id = P2.merchant_id
        AND P1.url = P2.url;
    
    Answered on October 26, 2018.
    Add Comment

    In spite  ID column is unique,then lowest id could be:

    DELETE FROM P1
    USING Photos P1, Photos P2
    WHERE P1.id > P2.id
        AND P1.merchant_id = P2.merchant_id
        AND P1.url = P2.url;
    

    This code to Find duplicate rows with PostgreSQL

    Answered on October 26, 2018.
    Add Comment

    I was working on a project where we needed to aggregate information on employees from 10 different tables and make the resulting table clear (no duplicate rows), containing full information on people working in the big company.

    While making this I understood that the emergence of duplicates (or duplicate rows) is inevitable when you work with a large amount of data aggregating several tables into one. Fortunately PostgreSQL has some features that are extremely useful while working with detection and elimination of duplicates.

    I want to put your attention on these features and help you to never have problems with duplicates.

    Duplicate or Duplicate row is a row in a table looking exactly or almost exactly like some another row (original row) in this table.
    So we can deal with absolutely identical rows and almost identical rows. For example theirs ids can differ but all other properties are exactly the same.

    So, what can you do with the duplicates?
    For absolutely identical rows:

    • Find them
    • Delete them

    For almost identical rows (identical except for one or more properties):

    • Combine information from duplicate rows into one row
    • Select one of the rows according to some criteria and delete the remaining ones.
    Answered on January 14, 2019.
    Add Comment

    Finding duplicate rows

    If the table has few rows, you can see which ones are duplicate immediately. However, it is not the case with the big table.

    The find the duplicate rows, you use the following statement:

    Answered on January 14, 2019.
    Add Comment


  • Your Answer

    By posting your answer, you agree to the privacy policy and terms of service.