MySQL: How to Delete Duplicate Rows But Keep Latest


You can use the following syntax in MySQL to delete all duplicate rows in a table but keep the one with the latest id value:

DELETE t1 FROM athletes t1, athletes t2 
WHERE t1.id < t2.id AND t1.team = t2.team;

This particular example deletes all rows with duplicate values in the team column of the athletes table, but keeps the one with the latest id value.

The following example shows how to use this syntax in practice.

Example: Delete Duplicate Rows But Keep Latest in MySQL

Suppose we have the following table named athletes that contains information about various basketball players:

-- create table 
CREATE TABLE athletes (
  id INT PRIMARY KEY,
  team TEXT NOT NULL,
  points INT NOT NULL
);

-- insert rows into table
INSERT INTO athletes VALUES (0001, 'Mavs', 22);
INSERT INTO athletes VALUES (0002, 'Mavs', 14);
INSERT INTO athletes VALUES (0003, 'Lakers', 37);
INSERT INTO athletes VALUES (0004, 'Knicks', 19);
INSERT INTO athletes VALUES (0005, 'Knicks', 26);
INSERT INTO athletes VALUES (0006, 'Knicks', 40);
INSERT INTO athletes VALUES (0007, 'Lakers', 21);
INSERT INTO athletes VALUES (0008, 'Celtics', 15);
INSERT INTO athletes VALUES (0009, 'Hawks', 18);
INSERT INTO athletes VALUES (0010, 'Celtics', 15);

-- view all rows in table
SELECT * FROM athletes;

Output:

+----+---------+--------+
| id | team    | points |
+----+---------+--------+
|  1 | Mavs    |     22 |
|  2 | Mavs    |     14 |
|  3 | Lakers  |     37 |
|  4 | Knicks  |     19 |
|  5 | Knicks  |     26 |
|  6 | Knicks  |     40 |
|  7 | Lakers  |     21 |
|  8 | Celtics |     15 |
|  9 | Hawks   |     18 |
| 10 | Celtics |     15 |
+----+---------+--------+

We can use the following syntax to delete all rows with duplicate values in the team column of the table, but keeps the one with the latest id value.

DELETE t1 FROM athletes t1, athletes t2 
WHERE t1.id < t2.id AND t1.team = t2.team;

Output:

+----+---------+--------+
| id | team    | points |
+----+---------+--------+
|  2 | Mavs    |     14 |
|  6 | Knicks  |     40 |
|  7 | Lakers  |     21 |
|  9 | Hawks   |     18 |
| 10 | Celtics |     15 |
+----+---------+--------+

Notice that all rows with duplicate values in the team column have been deleted and only the ones with the latest value in the id column were kept.

For example, players on the Knicks team had id values of 4, 5 and 6 in the original table but only the row with an id value of 6 was kept.

If you would instead like to delete duplicate rows but keep the one with the earliest id value, then you can simply use a greater than symbol ( > ) in the syntax instead:

DELETE t1 FROM athletes t1, athletes t2 
WHERE t1.id > t2.id AND t1.team = t2.team;

Output:

+----+---------+--------+
| id | team    | points |
+----+---------+--------+
|  1 | Mavs    |     22 |
|  3 | Lakers  |     37 |
|  4 | Knicks  |     19 |
|  8 | Celtics |     15 |
|  9 | Hawks   |     18 |
+----+---------+--------+

Now all rows with duplicate values in the team column have been deleted and only the ones with the earliest value in the id column were kept.

Additional Resources

The following tutorials explain how to perform other common tasks in MySQL:

MySQL: How to Use DELETE with INNER JOIN
MySQL: How to Delete Rows from Table Based on id
MySQL: How to Add Column with Default Value

Leave a Reply

Your email address will not be published. Required fields are marked *