Find an identical file with a different name [duplicate]

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP








up vote
7
down vote

favorite













This question already has an answer here:



  • How to find (and delete) duplicate files

    8 answers



Is it possible to find a without knowing its name?



I created a file with LaTex, then I copied it into another local directory and renamed the pdf. I don't know any more where the original file is located, but I have the renamed file on hand. I would like to make some modification to my latex file and recreate the pdf.



Since I know the original file is exactly the same as the renamed one except for the name, is there any way I can find my original file?










share|improve this question















marked as duplicate by muru command-line
Users with the  command-line badge can single-handedly close command-line questions as duplicates and reopen them as needed.

StackExchange.ready(function()
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function()
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function()
$hover.showInfoMessage('',
messageElement: $msg.clone().show(),
transient: false,
position: my: 'bottom left', at: 'top center', offsetTop: -7 ,
dismissable: false,
relativeToBody: true
);
,
function()
StackExchange.helpers.removeMessages();

);
);
);
Feb 28 at 13:53


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.














  • if you created your copy with cp -p as I usually do, you may have the date/time/size of the file, and I'd use that to find it. (or I'd find all files of the same date (or size..) etc..)
    – guiverc
    Feb 28 at 10:51







  • 1




    Choose an appropriate checksum method - like md5. Then calculate it for your renamed file and all potential files in your file system. The same file will have the same checksum.
    – Thorbjørn Ravn Andersen
    Feb 28 at 11:45






  • 1




    Yes, @ThorbjørnRavnAndersen but the same file will also have the same size which is by far easier to determine than its md5. See pa4080's excellent answer.
    – PerlDuck
    Feb 28 at 11:57










  • It's not so difficult to use md5sum comparison (alongside the size comparison). I've updated my answer with this idea.
    – pa4080
    Feb 28 at 13:01







  • 1




    True @Thorbjørn but theres no point in calculating MD5s of files with different sizes. Sure they will be different. First find all files with the same size as your sample file and only then you need to calculate a hash for that handful of files. But not for all files. No need to calc MD5 for all 2GB videos if you're searching a 953 bytes text file.
    – PerlDuck
    Feb 28 at 14:40















up vote
7
down vote

favorite













This question already has an answer here:



  • How to find (and delete) duplicate files

    8 answers



Is it possible to find a without knowing its name?



I created a file with LaTex, then I copied it into another local directory and renamed the pdf. I don't know any more where the original file is located, but I have the renamed file on hand. I would like to make some modification to my latex file and recreate the pdf.



Since I know the original file is exactly the same as the renamed one except for the name, is there any way I can find my original file?










share|improve this question















marked as duplicate by muru command-line
Users with the  command-line badge can single-handedly close command-line questions as duplicates and reopen them as needed.

StackExchange.ready(function()
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function()
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function()
$hover.showInfoMessage('',
messageElement: $msg.clone().show(),
transient: false,
position: my: 'bottom left', at: 'top center', offsetTop: -7 ,
dismissable: false,
relativeToBody: true
);
,
function()
StackExchange.helpers.removeMessages();

);
);
);
Feb 28 at 13:53


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.














  • if you created your copy with cp -p as I usually do, you may have the date/time/size of the file, and I'd use that to find it. (or I'd find all files of the same date (or size..) etc..)
    – guiverc
    Feb 28 at 10:51







  • 1




    Choose an appropriate checksum method - like md5. Then calculate it for your renamed file and all potential files in your file system. The same file will have the same checksum.
    – Thorbjørn Ravn Andersen
    Feb 28 at 11:45






  • 1




    Yes, @ThorbjørnRavnAndersen but the same file will also have the same size which is by far easier to determine than its md5. See pa4080's excellent answer.
    – PerlDuck
    Feb 28 at 11:57










  • It's not so difficult to use md5sum comparison (alongside the size comparison). I've updated my answer with this idea.
    – pa4080
    Feb 28 at 13:01







  • 1




    True @Thorbjørn but theres no point in calculating MD5s of files with different sizes. Sure they will be different. First find all files with the same size as your sample file and only then you need to calculate a hash for that handful of files. But not for all files. No need to calc MD5 for all 2GB videos if you're searching a 953 bytes text file.
    – PerlDuck
    Feb 28 at 14:40













up vote
7
down vote

favorite









up vote
7
down vote

favorite












This question already has an answer here:



  • How to find (and delete) duplicate files

    8 answers



Is it possible to find a without knowing its name?



I created a file with LaTex, then I copied it into another local directory and renamed the pdf. I don't know any more where the original file is located, but I have the renamed file on hand. I would like to make some modification to my latex file and recreate the pdf.



Since I know the original file is exactly the same as the renamed one except for the name, is there any way I can find my original file?










share|improve this question
















This question already has an answer here:



  • How to find (and delete) duplicate files

    8 answers



Is it possible to find a without knowing its name?



I created a file with LaTex, then I copied it into another local directory and renamed the pdf. I don't know any more where the original file is located, but I have the renamed file on hand. I would like to make some modification to my latex file and recreate the pdf.



Since I know the original file is exactly the same as the renamed one except for the name, is there any way I can find my original file?





This question already has an answer here:



  • How to find (and delete) duplicate files

    8 answers







command-line files text-processing






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Feb 28 at 10:40









Zanna

48.2k13120228




48.2k13120228










asked Feb 28 at 10:20









dmx

5331520




5331520




marked as duplicate by muru command-line
Users with the  command-line badge can single-handedly close command-line questions as duplicates and reopen them as needed.

StackExchange.ready(function()
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function()
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function()
$hover.showInfoMessage('',
messageElement: $msg.clone().show(),
transient: false,
position: my: 'bottom left', at: 'top center', offsetTop: -7 ,
dismissable: false,
relativeToBody: true
);
,
function()
StackExchange.helpers.removeMessages();

);
);
);
Feb 28 at 13:53


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.






marked as duplicate by muru command-line
Users with the  command-line badge can single-handedly close command-line questions as duplicates and reopen them as needed.

StackExchange.ready(function()
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function()
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function()
$hover.showInfoMessage('',
messageElement: $msg.clone().show(),
transient: false,
position: my: 'bottom left', at: 'top center', offsetTop: -7 ,
dismissable: false,
relativeToBody: true
);
,
function()
StackExchange.helpers.removeMessages();

);
);
);
Feb 28 at 13:53


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.













  • if you created your copy with cp -p as I usually do, you may have the date/time/size of the file, and I'd use that to find it. (or I'd find all files of the same date (or size..) etc..)
    – guiverc
    Feb 28 at 10:51







  • 1




    Choose an appropriate checksum method - like md5. Then calculate it for your renamed file and all potential files in your file system. The same file will have the same checksum.
    – Thorbjørn Ravn Andersen
    Feb 28 at 11:45






  • 1




    Yes, @ThorbjørnRavnAndersen but the same file will also have the same size which is by far easier to determine than its md5. See pa4080's excellent answer.
    – PerlDuck
    Feb 28 at 11:57










  • It's not so difficult to use md5sum comparison (alongside the size comparison). I've updated my answer with this idea.
    – pa4080
    Feb 28 at 13:01







  • 1




    True @Thorbjørn but theres no point in calculating MD5s of files with different sizes. Sure they will be different. First find all files with the same size as your sample file and only then you need to calculate a hash for that handful of files. But not for all files. No need to calc MD5 for all 2GB videos if you're searching a 953 bytes text file.
    – PerlDuck
    Feb 28 at 14:40

















  • if you created your copy with cp -p as I usually do, you may have the date/time/size of the file, and I'd use that to find it. (or I'd find all files of the same date (or size..) etc..)
    – guiverc
    Feb 28 at 10:51







  • 1




    Choose an appropriate checksum method - like md5. Then calculate it for your renamed file and all potential files in your file system. The same file will have the same checksum.
    – Thorbjørn Ravn Andersen
    Feb 28 at 11:45






  • 1




    Yes, @ThorbjørnRavnAndersen but the same file will also have the same size which is by far easier to determine than its md5. See pa4080's excellent answer.
    – PerlDuck
    Feb 28 at 11:57










  • It's not so difficult to use md5sum comparison (alongside the size comparison). I've updated my answer with this idea.
    – pa4080
    Feb 28 at 13:01







  • 1




    True @Thorbjørn but theres no point in calculating MD5s of files with different sizes. Sure they will be different. First find all files with the same size as your sample file and only then you need to calculate a hash for that handful of files. But not for all files. No need to calc MD5 for all 2GB videos if you're searching a 953 bytes text file.
    – PerlDuck
    Feb 28 at 14:40
















if you created your copy with cp -p as I usually do, you may have the date/time/size of the file, and I'd use that to find it. (or I'd find all files of the same date (or size..) etc..)
– guiverc
Feb 28 at 10:51





if you created your copy with cp -p as I usually do, you may have the date/time/size of the file, and I'd use that to find it. (or I'd find all files of the same date (or size..) etc..)
– guiverc
Feb 28 at 10:51





1




1




Choose an appropriate checksum method - like md5. Then calculate it for your renamed file and all potential files in your file system. The same file will have the same checksum.
– Thorbjørn Ravn Andersen
Feb 28 at 11:45




Choose an appropriate checksum method - like md5. Then calculate it for your renamed file and all potential files in your file system. The same file will have the same checksum.
– Thorbjørn Ravn Andersen
Feb 28 at 11:45




1




1




Yes, @ThorbjørnRavnAndersen but the same file will also have the same size which is by far easier to determine than its md5. See pa4080's excellent answer.
– PerlDuck
Feb 28 at 11:57




Yes, @ThorbjørnRavnAndersen but the same file will also have the same size which is by far easier to determine than its md5. See pa4080's excellent answer.
– PerlDuck
Feb 28 at 11:57












It's not so difficult to use md5sum comparison (alongside the size comparison). I've updated my answer with this idea.
– pa4080
Feb 28 at 13:01





It's not so difficult to use md5sum comparison (alongside the size comparison). I've updated my answer with this idea.
– pa4080
Feb 28 at 13:01





1




1




True @Thorbjørn but theres no point in calculating MD5s of files with different sizes. Sure they will be different. First find all files with the same size as your sample file and only then you need to calculate a hash for that handful of files. But not for all files. No need to calc MD5 for all 2GB videos if you're searching a 953 bytes text file.
– PerlDuck
Feb 28 at 14:40





True @Thorbjørn but theres no point in calculating MD5s of files with different sizes. Sure they will be different. First find all files with the same size as your sample file and only then you need to calculate a hash for that handful of files. But not for all files. No need to calc MD5 for all 2GB videos if you're searching a 953 bytes text file.
– PerlDuck
Feb 28 at 14:40











5 Answers
5






active

oldest

votes

















up vote
5
down vote













When the only difference is the name booth files should have the same content and size.



1. About the content. We can compare two files by the command diff file-1 file-2. Also we can use this command for a test in this way:





diff -q file-1 file-2 > /dev/null && echo 'equal' || echo 'different'


2. About the size. We can find a file with certain size by the command (where 12672 is the file size in bytes):



find /path/to/search -type f -size 12672c -printf '%pn'


Or we can use a range in this way (where 12600-12700 is the file size range in bytes):



find /path/to/search -type f -size -12700c -size +12600c -printf '%pn'


Note that, by default the command find works recursively.



3. Combine the two methods (where file-1 is our pattern file):



find /path/to/search -type f -size -12700c -size +12600c -printf '%pt' -exec sh -c 'diff -q file-1 "$1" > /dev/null && echo "equal" || echo "different"' sh ;


4. Example. Let's assume we have the following directory structure:



$ tree /tmp/test
/tmp/test
├── file-1 # this is the pattern file
├── file-2 # this is almost the same file but wit few additional characters
└── file-3 # this is exact copy of file-1


The result of the above command will be:



$ find /tmp/test -type f -size -12700c -size +12600c -printf '%pt' -exec sh -c 'diff -q file-1 "$1" > /dev/null && echo "equal" || echo "different"' sh ; 
/tmp/test/file-2 different # OK: here we have added few additional characters
/tmp/test/file-3 equal # OK: this is exact copy of file-1
/tmp/test/file-1 equal # OK: this is file-1 compared to its self


Or we can simplify the output by changing our command in this way:



$ find /tmp/test -type f -not -name "file-1" -size -12700c -size +12600c 
-exec sh -c 'diff -q file-1 "$1" > /dev/null && printf "%stis equaln" "$1"' sh ;
/tmp/test/file-3 is equal



Update from the comments. The following command finding for file with the same size as the file-1, and then the diff commas is involved with --brief and --report-identical-files options:



find /path -type f -not -name "file-1" -size $(stat -c%s file-1)c -exec diff -qs file-1 ;
Files file-1 and /tmp/test/file-3 are identical


We can compare md5sum of the files in this way:




  • Get the md5sum of the pattern file:



    $ md5sum file-1
    d18b61a77779d69e095be5942f6be7a7 file-1



  • Use it with our command:



    $ find /path -type f -not -name "file-1" -size $(stat -c%s file-1)c -exec sh -c 'echo "d18b61a77779d69e095be5942f6be7a7 $1" | md5sum -c -' sh ;
    /tmp/test/file-3: OK






share|improve this answer


















  • 2




    +1 for using the size in first place. By the way, the diff command line can be simplified: diff -qs file-1 file-2 will do the job :-)
    – sudodus
    Feb 28 at 11:42






  • 2




    @sudodus The -qs probably speeds up diffbecause it can stop comparing the files when the first mismatch is found.
    – PerlDuck
    Feb 28 at 11:54






  • 1




    cmp -s is better than diff when the goal is simply to tell if two files are equal.
    – John Kugelman
    Feb 28 at 13:54






  • 1




    You need to learn about xargs
    – Thorbjørn Ravn Andersen
    Feb 28 at 14:10











  • @JohnKugelman, I followed your advice and started to use cmp -s for similar cases: askubuntu.com/a/1014921/566421 :-)
    – pa4080
    Mar 14 at 16:06

















up vote
3
down vote













  • You can search for a particular string with grep -rl "string"
    (-r for recursive, finding the string in files, -l for showing the filename, not the string)





share|improve this answer




















  • you shouldn't use the -i as mentioned in @mrfred489 's link as files are identical and it can slow things down
    – Simon Van Machin
    Feb 28 at 10:36

















up vote
3
down vote













If you're looking for (or okay with) a GUI application, you may try the "FSlint Janitor" application. You can install it by running



sudo apt-get install fslint


How to use the application:



Once installed, follow the steps below.



  1. Launch the application.

  2. Select the "Duplicates" option (1) to search files with the same content.

  3. Click on the "+ Add" button (2) and select the directories to look for files (make sure the "recurse" option is checked to include sub-directories).

  4. Click on the "Find" button (3) and wait.

enter image description here






share|improve this answer



























    up vote
    3
    down vote













    This might take a while, but it should be effective and reliable. It assumes you are using Bash. Replace file with the name of your renamed file:



    shopt -s globstar
    for i in **; do [ -f "$i" ] && cmp --silent file "$i" && echo "$i"; done



    • shopt -s globstar turns on recursive globbing with **. You can turn it off with shopt -u globstar, but it is off by default and will be off when you open a new shell.


    • for i in ** loop over all files below this one. Run the command from the highest level directory that might contain the file or the directory with the file or the directory... (apply recursion to this sentence!)


    • [ -f "$i" ] && if the file is a regular file that exists then...


    • cmp --silent file "$i" && if there is no difference between file and the file being examined (ie if cmp exits successfully), then...


    • echo $i print the relative path of the file (this also prints the path of file itself, but I didn't see much benefit in fixing that).

    Thanks to this answer on Stack Overflow for the cmp method of comparing files.






    share|improve this answer





























      up vote
      2
      down vote













      Grep can find it quickly



      When used properly, the grep command can find the duplicate quickly. You must be careful not to search the whole file system or it will take days to complete. I recently documented this here: `grep`ing all files for a string takes a long time



      For optimum speed use:



      grep -rnw --exclude-dir=boot,dev,lib,media,mnt,proc,root,run,sys,/tmp,tmpfs,var '/' -e 'String in file'


      If your file might be on a Windows directory remove the mnt directory.



      If you know the file is within the /home directory someplace you can shorten the command:



      grep -rnw '/home' -e 'String in file'





      share|improve this answer



























        5 Answers
        5






        active

        oldest

        votes








        5 Answers
        5






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes








        up vote
        5
        down vote













        When the only difference is the name booth files should have the same content and size.



        1. About the content. We can compare two files by the command diff file-1 file-2. Also we can use this command for a test in this way:





        diff -q file-1 file-2 > /dev/null && echo 'equal' || echo 'different'


        2. About the size. We can find a file with certain size by the command (where 12672 is the file size in bytes):



        find /path/to/search -type f -size 12672c -printf '%pn'


        Or we can use a range in this way (where 12600-12700 is the file size range in bytes):



        find /path/to/search -type f -size -12700c -size +12600c -printf '%pn'


        Note that, by default the command find works recursively.



        3. Combine the two methods (where file-1 is our pattern file):



        find /path/to/search -type f -size -12700c -size +12600c -printf '%pt' -exec sh -c 'diff -q file-1 "$1" > /dev/null && echo "equal" || echo "different"' sh ;


        4. Example. Let's assume we have the following directory structure:



        $ tree /tmp/test
        /tmp/test
        ├── file-1 # this is the pattern file
        ├── file-2 # this is almost the same file but wit few additional characters
        └── file-3 # this is exact copy of file-1


        The result of the above command will be:



        $ find /tmp/test -type f -size -12700c -size +12600c -printf '%pt' -exec sh -c 'diff -q file-1 "$1" > /dev/null && echo "equal" || echo "different"' sh ; 
        /tmp/test/file-2 different # OK: here we have added few additional characters
        /tmp/test/file-3 equal # OK: this is exact copy of file-1
        /tmp/test/file-1 equal # OK: this is file-1 compared to its self


        Or we can simplify the output by changing our command in this way:



        $ find /tmp/test -type f -not -name "file-1" -size -12700c -size +12600c 
        -exec sh -c 'diff -q file-1 "$1" > /dev/null && printf "%stis equaln" "$1"' sh ;
        /tmp/test/file-3 is equal



        Update from the comments. The following command finding for file with the same size as the file-1, and then the diff commas is involved with --brief and --report-identical-files options:



        find /path -type f -not -name "file-1" -size $(stat -c%s file-1)c -exec diff -qs file-1 ;
        Files file-1 and /tmp/test/file-3 are identical


        We can compare md5sum of the files in this way:




        • Get the md5sum of the pattern file:



          $ md5sum file-1
          d18b61a77779d69e095be5942f6be7a7 file-1



        • Use it with our command:



          $ find /path -type f -not -name "file-1" -size $(stat -c%s file-1)c -exec sh -c 'echo "d18b61a77779d69e095be5942f6be7a7 $1" | md5sum -c -' sh ;
          /tmp/test/file-3: OK






        share|improve this answer


















        • 2




          +1 for using the size in first place. By the way, the diff command line can be simplified: diff -qs file-1 file-2 will do the job :-)
          – sudodus
          Feb 28 at 11:42






        • 2




          @sudodus The -qs probably speeds up diffbecause it can stop comparing the files when the first mismatch is found.
          – PerlDuck
          Feb 28 at 11:54






        • 1




          cmp -s is better than diff when the goal is simply to tell if two files are equal.
          – John Kugelman
          Feb 28 at 13:54






        • 1




          You need to learn about xargs
          – Thorbjørn Ravn Andersen
          Feb 28 at 14:10











        • @JohnKugelman, I followed your advice and started to use cmp -s for similar cases: askubuntu.com/a/1014921/566421 :-)
          – pa4080
          Mar 14 at 16:06














        up vote
        5
        down vote













        When the only difference is the name booth files should have the same content and size.



        1. About the content. We can compare two files by the command diff file-1 file-2. Also we can use this command for a test in this way:





        diff -q file-1 file-2 > /dev/null && echo 'equal' || echo 'different'


        2. About the size. We can find a file with certain size by the command (where 12672 is the file size in bytes):



        find /path/to/search -type f -size 12672c -printf '%pn'


        Or we can use a range in this way (where 12600-12700 is the file size range in bytes):



        find /path/to/search -type f -size -12700c -size +12600c -printf '%pn'


        Note that, by default the command find works recursively.



        3. Combine the two methods (where file-1 is our pattern file):



        find /path/to/search -type f -size -12700c -size +12600c -printf '%pt' -exec sh -c 'diff -q file-1 "$1" > /dev/null && echo "equal" || echo "different"' sh ;


        4. Example. Let's assume we have the following directory structure:



        $ tree /tmp/test
        /tmp/test
        ├── file-1 # this is the pattern file
        ├── file-2 # this is almost the same file but wit few additional characters
        └── file-3 # this is exact copy of file-1


        The result of the above command will be:



        $ find /tmp/test -type f -size -12700c -size +12600c -printf '%pt' -exec sh -c 'diff -q file-1 "$1" > /dev/null && echo "equal" || echo "different"' sh ; 
        /tmp/test/file-2 different # OK: here we have added few additional characters
        /tmp/test/file-3 equal # OK: this is exact copy of file-1
        /tmp/test/file-1 equal # OK: this is file-1 compared to its self


        Or we can simplify the output by changing our command in this way:



        $ find /tmp/test -type f -not -name "file-1" -size -12700c -size +12600c 
        -exec sh -c 'diff -q file-1 "$1" > /dev/null && printf "%stis equaln" "$1"' sh ;
        /tmp/test/file-3 is equal



        Update from the comments. The following command finding for file with the same size as the file-1, and then the diff commas is involved with --brief and --report-identical-files options:



        find /path -type f -not -name "file-1" -size $(stat -c%s file-1)c -exec diff -qs file-1 ;
        Files file-1 and /tmp/test/file-3 are identical


        We can compare md5sum of the files in this way:




        • Get the md5sum of the pattern file:



          $ md5sum file-1
          d18b61a77779d69e095be5942f6be7a7 file-1



        • Use it with our command:



          $ find /path -type f -not -name "file-1" -size $(stat -c%s file-1)c -exec sh -c 'echo "d18b61a77779d69e095be5942f6be7a7 $1" | md5sum -c -' sh ;
          /tmp/test/file-3: OK






        share|improve this answer


















        • 2




          +1 for using the size in first place. By the way, the diff command line can be simplified: diff -qs file-1 file-2 will do the job :-)
          – sudodus
          Feb 28 at 11:42






        • 2




          @sudodus The -qs probably speeds up diffbecause it can stop comparing the files when the first mismatch is found.
          – PerlDuck
          Feb 28 at 11:54






        • 1




          cmp -s is better than diff when the goal is simply to tell if two files are equal.
          – John Kugelman
          Feb 28 at 13:54






        • 1




          You need to learn about xargs
          – Thorbjørn Ravn Andersen
          Feb 28 at 14:10











        • @JohnKugelman, I followed your advice and started to use cmp -s for similar cases: askubuntu.com/a/1014921/566421 :-)
          – pa4080
          Mar 14 at 16:06












        up vote
        5
        down vote










        up vote
        5
        down vote









        When the only difference is the name booth files should have the same content and size.



        1. About the content. We can compare two files by the command diff file-1 file-2. Also we can use this command for a test in this way:





        diff -q file-1 file-2 > /dev/null && echo 'equal' || echo 'different'


        2. About the size. We can find a file with certain size by the command (where 12672 is the file size in bytes):



        find /path/to/search -type f -size 12672c -printf '%pn'


        Or we can use a range in this way (where 12600-12700 is the file size range in bytes):



        find /path/to/search -type f -size -12700c -size +12600c -printf '%pn'


        Note that, by default the command find works recursively.



        3. Combine the two methods (where file-1 is our pattern file):



        find /path/to/search -type f -size -12700c -size +12600c -printf '%pt' -exec sh -c 'diff -q file-1 "$1" > /dev/null && echo "equal" || echo "different"' sh ;


        4. Example. Let's assume we have the following directory structure:



        $ tree /tmp/test
        /tmp/test
        ├── file-1 # this is the pattern file
        ├── file-2 # this is almost the same file but wit few additional characters
        └── file-3 # this is exact copy of file-1


        The result of the above command will be:



        $ find /tmp/test -type f -size -12700c -size +12600c -printf '%pt' -exec sh -c 'diff -q file-1 "$1" > /dev/null && echo "equal" || echo "different"' sh ; 
        /tmp/test/file-2 different # OK: here we have added few additional characters
        /tmp/test/file-3 equal # OK: this is exact copy of file-1
        /tmp/test/file-1 equal # OK: this is file-1 compared to its self


        Or we can simplify the output by changing our command in this way:



        $ find /tmp/test -type f -not -name "file-1" -size -12700c -size +12600c 
        -exec sh -c 'diff -q file-1 "$1" > /dev/null && printf "%stis equaln" "$1"' sh ;
        /tmp/test/file-3 is equal



        Update from the comments. The following command finding for file with the same size as the file-1, and then the diff commas is involved with --brief and --report-identical-files options:



        find /path -type f -not -name "file-1" -size $(stat -c%s file-1)c -exec diff -qs file-1 ;
        Files file-1 and /tmp/test/file-3 are identical


        We can compare md5sum of the files in this way:




        • Get the md5sum of the pattern file:



          $ md5sum file-1
          d18b61a77779d69e095be5942f6be7a7 file-1



        • Use it with our command:



          $ find /path -type f -not -name "file-1" -size $(stat -c%s file-1)c -exec sh -c 'echo "d18b61a77779d69e095be5942f6be7a7 $1" | md5sum -c -' sh ;
          /tmp/test/file-3: OK






        share|improve this answer














        When the only difference is the name booth files should have the same content and size.



        1. About the content. We can compare two files by the command diff file-1 file-2. Also we can use this command for a test in this way:





        diff -q file-1 file-2 > /dev/null && echo 'equal' || echo 'different'


        2. About the size. We can find a file with certain size by the command (where 12672 is the file size in bytes):



        find /path/to/search -type f -size 12672c -printf '%pn'


        Or we can use a range in this way (where 12600-12700 is the file size range in bytes):



        find /path/to/search -type f -size -12700c -size +12600c -printf '%pn'


        Note that, by default the command find works recursively.



        3. Combine the two methods (where file-1 is our pattern file):



        find /path/to/search -type f -size -12700c -size +12600c -printf '%pt' -exec sh -c 'diff -q file-1 "$1" > /dev/null && echo "equal" || echo "different"' sh ;


        4. Example. Let's assume we have the following directory structure:



        $ tree /tmp/test
        /tmp/test
        ├── file-1 # this is the pattern file
        ├── file-2 # this is almost the same file but wit few additional characters
        └── file-3 # this is exact copy of file-1


        The result of the above command will be:



        $ find /tmp/test -type f -size -12700c -size +12600c -printf '%pt' -exec sh -c 'diff -q file-1 "$1" > /dev/null && echo "equal" || echo "different"' sh ; 
        /tmp/test/file-2 different # OK: here we have added few additional characters
        /tmp/test/file-3 equal # OK: this is exact copy of file-1
        /tmp/test/file-1 equal # OK: this is file-1 compared to its self


        Or we can simplify the output by changing our command in this way:



        $ find /tmp/test -type f -not -name "file-1" -size -12700c -size +12600c 
        -exec sh -c 'diff -q file-1 "$1" > /dev/null && printf "%stis equaln" "$1"' sh ;
        /tmp/test/file-3 is equal



        Update from the comments. The following command finding for file with the same size as the file-1, and then the diff commas is involved with --brief and --report-identical-files options:



        find /path -type f -not -name "file-1" -size $(stat -c%s file-1)c -exec diff -qs file-1 ;
        Files file-1 and /tmp/test/file-3 are identical


        We can compare md5sum of the files in this way:




        • Get the md5sum of the pattern file:



          $ md5sum file-1
          d18b61a77779d69e095be5942f6be7a7 file-1



        • Use it with our command:



          $ find /path -type f -not -name "file-1" -size $(stat -c%s file-1)c -exec sh -c 'echo "d18b61a77779d69e095be5942f6be7a7 $1" | md5sum -c -' sh ;
          /tmp/test/file-3: OK







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Feb 28 at 13:58









        ilkkachu

        981210




        981210










        answered Feb 28 at 11:20









        pa4080

        12.3k52256




        12.3k52256







        • 2




          +1 for using the size in first place. By the way, the diff command line can be simplified: diff -qs file-1 file-2 will do the job :-)
          – sudodus
          Feb 28 at 11:42






        • 2




          @sudodus The -qs probably speeds up diffbecause it can stop comparing the files when the first mismatch is found.
          – PerlDuck
          Feb 28 at 11:54






        • 1




          cmp -s is better than diff when the goal is simply to tell if two files are equal.
          – John Kugelman
          Feb 28 at 13:54






        • 1




          You need to learn about xargs
          – Thorbjørn Ravn Andersen
          Feb 28 at 14:10











        • @JohnKugelman, I followed your advice and started to use cmp -s for similar cases: askubuntu.com/a/1014921/566421 :-)
          – pa4080
          Mar 14 at 16:06












        • 2




          +1 for using the size in first place. By the way, the diff command line can be simplified: diff -qs file-1 file-2 will do the job :-)
          – sudodus
          Feb 28 at 11:42






        • 2




          @sudodus The -qs probably speeds up diffbecause it can stop comparing the files when the first mismatch is found.
          – PerlDuck
          Feb 28 at 11:54






        • 1




          cmp -s is better than diff when the goal is simply to tell if two files are equal.
          – John Kugelman
          Feb 28 at 13:54






        • 1




          You need to learn about xargs
          – Thorbjørn Ravn Andersen
          Feb 28 at 14:10











        • @JohnKugelman, I followed your advice and started to use cmp -s for similar cases: askubuntu.com/a/1014921/566421 :-)
          – pa4080
          Mar 14 at 16:06







        2




        2




        +1 for using the size in first place. By the way, the diff command line can be simplified: diff -qs file-1 file-2 will do the job :-)
        – sudodus
        Feb 28 at 11:42




        +1 for using the size in first place. By the way, the diff command line can be simplified: diff -qs file-1 file-2 will do the job :-)
        – sudodus
        Feb 28 at 11:42




        2




        2




        @sudodus The -qs probably speeds up diffbecause it can stop comparing the files when the first mismatch is found.
        – PerlDuck
        Feb 28 at 11:54




        @sudodus The -qs probably speeds up diffbecause it can stop comparing the files when the first mismatch is found.
        – PerlDuck
        Feb 28 at 11:54




        1




        1




        cmp -s is better than diff when the goal is simply to tell if two files are equal.
        – John Kugelman
        Feb 28 at 13:54




        cmp -s is better than diff when the goal is simply to tell if two files are equal.
        – John Kugelman
        Feb 28 at 13:54




        1




        1




        You need to learn about xargs
        – Thorbjørn Ravn Andersen
        Feb 28 at 14:10





        You need to learn about xargs
        – Thorbjørn Ravn Andersen
        Feb 28 at 14:10













        @JohnKugelman, I followed your advice and started to use cmp -s for similar cases: askubuntu.com/a/1014921/566421 :-)
        – pa4080
        Mar 14 at 16:06




        @JohnKugelman, I followed your advice and started to use cmp -s for similar cases: askubuntu.com/a/1014921/566421 :-)
        – pa4080
        Mar 14 at 16:06












        up vote
        3
        down vote













        • You can search for a particular string with grep -rl "string"
          (-r for recursive, finding the string in files, -l for showing the filename, not the string)





        share|improve this answer




















        • you shouldn't use the -i as mentioned in @mrfred489 's link as files are identical and it can slow things down
          – Simon Van Machin
          Feb 28 at 10:36














        up vote
        3
        down vote













        • You can search for a particular string with grep -rl "string"
          (-r for recursive, finding the string in files, -l for showing the filename, not the string)





        share|improve this answer




















        • you shouldn't use the -i as mentioned in @mrfred489 's link as files are identical and it can slow things down
          – Simon Van Machin
          Feb 28 at 10:36












        up vote
        3
        down vote










        up vote
        3
        down vote









        • You can search for a particular string with grep -rl "string"
          (-r for recursive, finding the string in files, -l for showing the filename, not the string)





        share|improve this answer












        • You can search for a particular string with grep -rl "string"
          (-r for recursive, finding the string in files, -l for showing the filename, not the string)






        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Feb 28 at 10:35









        Simon Van Machin

        23010




        23010











        • you shouldn't use the -i as mentioned in @mrfred489 's link as files are identical and it can slow things down
          – Simon Van Machin
          Feb 28 at 10:36
















        • you shouldn't use the -i as mentioned in @mrfred489 's link as files are identical and it can slow things down
          – Simon Van Machin
          Feb 28 at 10:36















        you shouldn't use the -i as mentioned in @mrfred489 's link as files are identical and it can slow things down
        – Simon Van Machin
        Feb 28 at 10:36




        you shouldn't use the -i as mentioned in @mrfred489 's link as files are identical and it can slow things down
        – Simon Van Machin
        Feb 28 at 10:36










        up vote
        3
        down vote













        If you're looking for (or okay with) a GUI application, you may try the "FSlint Janitor" application. You can install it by running



        sudo apt-get install fslint


        How to use the application:



        Once installed, follow the steps below.



        1. Launch the application.

        2. Select the "Duplicates" option (1) to search files with the same content.

        3. Click on the "+ Add" button (2) and select the directories to look for files (make sure the "recurse" option is checked to include sub-directories).

        4. Click on the "Find" button (3) and wait.

        enter image description here






        share|improve this answer
























          up vote
          3
          down vote













          If you're looking for (or okay with) a GUI application, you may try the "FSlint Janitor" application. You can install it by running



          sudo apt-get install fslint


          How to use the application:



          Once installed, follow the steps below.



          1. Launch the application.

          2. Select the "Duplicates" option (1) to search files with the same content.

          3. Click on the "+ Add" button (2) and select the directories to look for files (make sure the "recurse" option is checked to include sub-directories).

          4. Click on the "Find" button (3) and wait.

          enter image description here






          share|improve this answer






















            up vote
            3
            down vote










            up vote
            3
            down vote









            If you're looking for (or okay with) a GUI application, you may try the "FSlint Janitor" application. You can install it by running



            sudo apt-get install fslint


            How to use the application:



            Once installed, follow the steps below.



            1. Launch the application.

            2. Select the "Duplicates" option (1) to search files with the same content.

            3. Click on the "+ Add" button (2) and select the directories to look for files (make sure the "recurse" option is checked to include sub-directories).

            4. Click on the "Find" button (3) and wait.

            enter image description here






            share|improve this answer












            If you're looking for (or okay with) a GUI application, you may try the "FSlint Janitor" application. You can install it by running



            sudo apt-get install fslint


            How to use the application:



            Once installed, follow the steps below.



            1. Launch the application.

            2. Select the "Duplicates" option (1) to search files with the same content.

            3. Click on the "+ Add" button (2) and select the directories to look for files (make sure the "recurse" option is checked to include sub-directories).

            4. Click on the "Find" button (3) and wait.

            enter image description here







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Feb 28 at 11:18









            pomsky

            23.3k77299




            23.3k77299




















                up vote
                3
                down vote













                This might take a while, but it should be effective and reliable. It assumes you are using Bash. Replace file with the name of your renamed file:



                shopt -s globstar
                for i in **; do [ -f "$i" ] && cmp --silent file "$i" && echo "$i"; done



                • shopt -s globstar turns on recursive globbing with **. You can turn it off with shopt -u globstar, but it is off by default and will be off when you open a new shell.


                • for i in ** loop over all files below this one. Run the command from the highest level directory that might contain the file or the directory with the file or the directory... (apply recursion to this sentence!)


                • [ -f "$i" ] && if the file is a regular file that exists then...


                • cmp --silent file "$i" && if there is no difference between file and the file being examined (ie if cmp exits successfully), then...


                • echo $i print the relative path of the file (this also prints the path of file itself, but I didn't see much benefit in fixing that).

                Thanks to this answer on Stack Overflow for the cmp method of comparing files.






                share|improve this answer


























                  up vote
                  3
                  down vote













                  This might take a while, but it should be effective and reliable. It assumes you are using Bash. Replace file with the name of your renamed file:



                  shopt -s globstar
                  for i in **; do [ -f "$i" ] && cmp --silent file "$i" && echo "$i"; done



                  • shopt -s globstar turns on recursive globbing with **. You can turn it off with shopt -u globstar, but it is off by default and will be off when you open a new shell.


                  • for i in ** loop over all files below this one. Run the command from the highest level directory that might contain the file or the directory with the file or the directory... (apply recursion to this sentence!)


                  • [ -f "$i" ] && if the file is a regular file that exists then...


                  • cmp --silent file "$i" && if there is no difference between file and the file being examined (ie if cmp exits successfully), then...


                  • echo $i print the relative path of the file (this also prints the path of file itself, but I didn't see much benefit in fixing that).

                  Thanks to this answer on Stack Overflow for the cmp method of comparing files.






                  share|improve this answer
























                    up vote
                    3
                    down vote










                    up vote
                    3
                    down vote









                    This might take a while, but it should be effective and reliable. It assumes you are using Bash. Replace file with the name of your renamed file:



                    shopt -s globstar
                    for i in **; do [ -f "$i" ] && cmp --silent file "$i" && echo "$i"; done



                    • shopt -s globstar turns on recursive globbing with **. You can turn it off with shopt -u globstar, but it is off by default and will be off when you open a new shell.


                    • for i in ** loop over all files below this one. Run the command from the highest level directory that might contain the file or the directory with the file or the directory... (apply recursion to this sentence!)


                    • [ -f "$i" ] && if the file is a regular file that exists then...


                    • cmp --silent file "$i" && if there is no difference between file and the file being examined (ie if cmp exits successfully), then...


                    • echo $i print the relative path of the file (this also prints the path of file itself, but I didn't see much benefit in fixing that).

                    Thanks to this answer on Stack Overflow for the cmp method of comparing files.






                    share|improve this answer














                    This might take a while, but it should be effective and reliable. It assumes you are using Bash. Replace file with the name of your renamed file:



                    shopt -s globstar
                    for i in **; do [ -f "$i" ] && cmp --silent file "$i" && echo "$i"; done



                    • shopt -s globstar turns on recursive globbing with **. You can turn it off with shopt -u globstar, but it is off by default and will be off when you open a new shell.


                    • for i in ** loop over all files below this one. Run the command from the highest level directory that might contain the file or the directory with the file or the directory... (apply recursion to this sentence!)


                    • [ -f "$i" ] && if the file is a regular file that exists then...


                    • cmp --silent file "$i" && if there is no difference between file and the file being examined (ie if cmp exits successfully), then...


                    • echo $i print the relative path of the file (this also prints the path of file itself, but I didn't see much benefit in fixing that).

                    Thanks to this answer on Stack Overflow for the cmp method of comparing files.







                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited Feb 28 at 11:29

























                    answered Feb 28 at 10:58









                    Zanna

                    48.2k13120228




                    48.2k13120228




















                        up vote
                        2
                        down vote













                        Grep can find it quickly



                        When used properly, the grep command can find the duplicate quickly. You must be careful not to search the whole file system or it will take days to complete. I recently documented this here: `grep`ing all files for a string takes a long time



                        For optimum speed use:



                        grep -rnw --exclude-dir=boot,dev,lib,media,mnt,proc,root,run,sys,/tmp,tmpfs,var '/' -e 'String in file'


                        If your file might be on a Windows directory remove the mnt directory.



                        If you know the file is within the /home directory someplace you can shorten the command:



                        grep -rnw '/home' -e 'String in file'





                        share|improve this answer
























                          up vote
                          2
                          down vote













                          Grep can find it quickly



                          When used properly, the grep command can find the duplicate quickly. You must be careful not to search the whole file system or it will take days to complete. I recently documented this here: `grep`ing all files for a string takes a long time



                          For optimum speed use:



                          grep -rnw --exclude-dir=boot,dev,lib,media,mnt,proc,root,run,sys,/tmp,tmpfs,var '/' -e 'String in file'


                          If your file might be on a Windows directory remove the mnt directory.



                          If you know the file is within the /home directory someplace you can shorten the command:



                          grep -rnw '/home' -e 'String in file'





                          share|improve this answer






















                            up vote
                            2
                            down vote










                            up vote
                            2
                            down vote









                            Grep can find it quickly



                            When used properly, the grep command can find the duplicate quickly. You must be careful not to search the whole file system or it will take days to complete. I recently documented this here: `grep`ing all files for a string takes a long time



                            For optimum speed use:



                            grep -rnw --exclude-dir=boot,dev,lib,media,mnt,proc,root,run,sys,/tmp,tmpfs,var '/' -e 'String in file'


                            If your file might be on a Windows directory remove the mnt directory.



                            If you know the file is within the /home directory someplace you can shorten the command:



                            grep -rnw '/home' -e 'String in file'





                            share|improve this answer












                            Grep can find it quickly



                            When used properly, the grep command can find the duplicate quickly. You must be careful not to search the whole file system or it will take days to complete. I recently documented this here: `grep`ing all files for a string takes a long time



                            For optimum speed use:



                            grep -rnw --exclude-dir=boot,dev,lib,media,mnt,proc,root,run,sys,/tmp,tmpfs,var '/' -e 'String in file'


                            If your file might be on a Windows directory remove the mnt directory.



                            If you know the file is within the /home directory someplace you can shorten the command:



                            grep -rnw '/home' -e 'String in file'






                            share|improve this answer












                            share|improve this answer



                            share|improve this answer










                            answered Feb 28 at 11:18









                            WinEunuuchs2Unix

                            36.2k759134




                            36.2k759134












                                Popular posts from this blog

                                Trouble downloading packages list due to a “Hash sum mismatch” error

                                How do so many people here on Academia.SE, and in general, afford lavish higher education programs?

                                How do I move numbers in filenames, in a batch renaming operation?