The best answers to the question “Finding duplicate values in MySQL” in the category Dev.
I have a table with a varchar column, and I would like to find all the records that have duplicate values in this column. What is the best query I can use to find the duplicates?
SELECT varchar_col FROM table GROUP BY varchar_col HAVING COUNT(*) > 1;
SELECT with a
GROUP BY clause. Let’s say name is the column you want to find duplicates in:
SELECT name, COUNT(*) c FROM table GROUP BY name HAVING c > 1;
This will return a result with the name value in the first column, and a count of how many times that value appears in the second.
Building off of levik’s answer to get the IDs of the duplicate rows you can do a
GROUP_CONCAT if your server supports it (this will return a comma separated list of ids).
SELECT GROUP_CONCAT(id), name, COUNT(*) c FROM documents GROUP BY name HAVING c > 1;
SELECT * FROM mytable mto WHERE EXISTS ( SELECT 1 FROM mytable mti WHERE mti.varchar_column = mto.varchar_column LIMIT 1, 1 )
This query returns complete records, not just distinct
This query doesn’t use
COUNT(*). If there are lots of duplicates,
COUNT(*) is expensive, and you don’t need the whole
COUNT(*), you just need to know if there are two rows with same value.
This is achieved by the
LIMIT 1, 1 at the bottom of the correlated query (essentially meaning “return the second row”). EXISTS would only return true if the aforementioned second row exists (i. e. there are at least two rows with the same value of varchar_column) .
Having an index on
varchar_column will, of course, speed up this query greatly.