How to remove duplicate files in Linux?

by alyson_bogan , in category: General Help , 6 months ago

How to remove duplicate files in Linux?

2 answers

by bobbie.kris , 6 months ago

@alyson_bogan 

There are various ways to remove duplicate files in Linux. Here are three commonly used methods:

  1. FSlint: FSlint is a graphical tool for finding and removing duplicate files in Linux. It offers several file search and removal options. On Debian- or Ubuntu-based systems whose repositories still carry the package (recent releases have dropped it), install FSlint using the following command:
sudo apt-get install fslint


Once installed, launch FSlint, select the directory to scan, and choose the duplicate search criteria. After the scan, you can review the identified duplicates and delete them.

  2. fdupes: fdupes is a command-line tool that allows you to find and manage duplicate files efficiently. To install fdupes, use the following command:
sudo apt-get install fdupes


After installation, run the following command, replacing <directory> with the directory that you want to scan for duplicates:

fdupes -r -d -N <directory>


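This will recursively scan the specified directory (-r) and delete duplicates (-d). The -N option makes fdupes keep the first file in each set of duplicates and delete the rest without prompting, so review what it will match before running it.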
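
Because -d -N deletes without asking, it can help to preview the result first. The following is a minimal sketch, assuming fdupes is installed; preview_fdupes is a hypothetical helper name, not part of fdupes. It relies on the -f (omit first) option, which hides the first file of each duplicate set, so the remaining lines should correspond to the files that -d -N would remove.

```shell
#!/bin/sh
# Hypothetical helper: show which files `fdupes -r -d -N` would delete.
# -f (--omitfirst) hides the first file of each duplicate set, i.e. the
# copy that -N would keep, so what is printed is the would-be-deleted list.
preview_fdupes() {
    dir=${1:-.}
    if ! command -v fdupes >/dev/null 2>&1; then
        echo "fdupes is not installed" >&2
        return 127
    fi
    fdupes -r -f "$dir"
}

# Usage: review the list, then run the destructive command yourself.
# preview_fdupes ~/Pictures
# fdupes -r -d -N ~/Pictures
```

Nothing is deleted by the helper itself; the destructive command stays a separate, deliberate step.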

  3. find and rm command: This method involves using the find and rm commands to find and delete duplicate files. Open the terminal and navigate to the directory where you want to start the scan. Then run the following command:
find . -type f ! -empty -exec md5sum {} + | sort | uniq -w32 -dD


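This command lists every file whose MD5 checksum matches that of at least one other file, so each group of duplicates in the output includes the copy you will want to keep. Once you have reviewed the output and confirmed the duplicates you want to remove, run the following command, which keeps the first file of each group and deletes the rest:

find . -type f ! -empty -exec md5sum {} + | sort | awk 'seen[substr($0, 1, 32)]++' | cut -c 35- | xargs -r -d '\n' rm --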


Note: It is important to exercise caution when deleting files, especially if you are not completely sure they are duplicates. Make sure to have a backup of important files before performing any deletion operations.
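
A more cautious variant of the checksum approach can be sketched as a small script that only prints what it would remove unless told otherwise. This is an illustration, not a hardened tool: the function names list_extra_copies and dedupe are made up for this example, it assumes GNU find, md5sum, sort and xargs, it assumes file names contain no newlines or backslashes, and it treats files with equal MD5 checksums as identical.

```shell
#!/bin/sh
# Sketch: list (and optionally delete) all but the first file of every
# group of identical non-empty files below a directory.
# Assumptions: GNU tools; file names without newlines or backslashes.

# Print every file whose checksum was already seen earlier in sorted order.
list_extra_copies() {
    find "${1:-.}" -type f ! -empty -exec md5sum {} + |
        sort |
        awk 'seen[substr($0, 1, 32)]++' |
        cut -c 35-
}

# $1 = directory, $2 = "--delete" to actually remove the extra copies.
dedupe() {
    if [ "${2:-}" = "--delete" ]; then
        list_extra_copies "$1" | xargs -r -d '\n' rm -v --
    else
        # Default is a dry run: nothing is removed.
        list_extra_copies "$1" | sed 's/^/would remove: /'
    fi
}

# Usage:
# dedupe /path/to/directory            # preview only
# dedupe /path/to/directory --delete   # remove the extra copies
```

The awk filter skips the first line it sees for each checksum, so one copy per group always survives; which copy that is depends on the sort order of the paths.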

by lonzo , a month ago

@alyson_bogan 

In addition to the methods mentioned above, you can also use the following command-line method to remove duplicate files in Linux:

  1. Using the rdfind command: rdfind is a command-line tool that finds duplicate files in directories and can either delete them or replace them with hard links. To install rdfind, you can use the following command:
sudo apt-get install rdfind


After installing rdfind, run the following command, replacing /path/to/directory with the directory you want to scan for duplicates:

rdfind -dryrun true -makehardlinks true /path/to/directory


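This command performs a dry run: it reports what would be done with the duplicates without changing anything. With -makehardlinks true, rdfind does not simply delete a duplicate; it replaces it with a hard link to the copy it keeps, which frees the disk space while leaving every file name in place. If you are satisfied with the results of the dry run, apply the change by running the command without the -dryrun option (to delete the duplicates outright instead of linking them, use -deleteduplicates true in place of -makehardlinks true):

rdfind -makehardlinks true /path/to/directory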
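
To see what replacing a duplicate with a hard link means in practice, the effect can be reproduced by hand. This is only a sketch of the end result, not of how rdfind works internally; the file names are invented for the demonstration and stat -c assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch: what "replace a duplicate with a hard link" looks like on disk.
demo_dir=$(mktemp -d)
printf 'same content\n' > "$demo_dir/original.txt"

# A true duplicate: same bytes, but stored separately under its own inode.
cp "$demo_dir/original.txt" "$demo_dir/copy.txt"

# Replace the duplicate with a hard link to the original.
ln -f "$demo_dir/original.txt" "$demo_dir/copy.txt"

# Both names now report the same inode number and a link count of 2,
# so the content is stored only once.
stat -c '%n inode=%i links=%h' "$demo_dir/original.txt" "$demo_dir/copy.txt"
```

Keep in mind that after linking, editing the file through one name changes it under the other name as well, since both names refer to the same data.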


These are some effective ways to remove duplicate files in Linux. Make sure to choose the method that best suits your preferences and requirements.