How to count number of partial occurrences of a string in a file
I have a file in which I need to count all partial matches of an input string.
I'll show you an easy example of what I need:
In a file with this content:
Good-Black-Cat
Bad-Red-Cat
Bad-Gray-Dog
Good-Golden-Dog
Bad-White-Dog
Good-Tabby-Cat
Bad-Siamese-Cat
I need to count how many times the partial string "Good-*-Cat" (where * could be anything; it doesn't matter) appears. The expected output count is 2.
Any help will be appreciated.
command-line bash
asked May 26 at 21:47
Rodrigo Andres Nava Lara
3 Answers
Given
$ cat file
Good-Black-Cat
Bad-Red-Cat
Bad-Gray-Dog
Good-Golden-Dog
Bad-White-Dog
Good-Tabby-Cat
Bad-Siamese-Cat
then
$ grep -c 'Good-.*-Cat' file
2
Note that this is a count of matching lines - so for example it won't work for multiple occurrences per line, or for occurrences that span lines.
Alternatively, with awk:
awk '/Good-.*-Cat/ { n++ } END { print n+0 }' file
If you need to match multiple possible occurrences per line, then I'd suggest perl:
perl -lne '$c += () = /Good-.*?-Cat/g }{ print $c' file
If you also need to match occurrences that may span a line boundary, then you can do so in perl by unsetting the record separator (note: this means that the whole file is slurped into memory) and adding the s regex modifier, e.g.
perl -0777 -nE '$c += () = /Good-.*?-Cat/gs }{ say $c' file
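To see the difference between counting matching lines and counting individual occurrences without perl, here is a minimal sketch using grep's -o option, which prints each match on its own line. The function names, the sample data, and the narrower pattern Good-[^ ]*-Cat are assumptions for illustration and not part of the answer above; the pattern assumes the middle part contains no spaces, because grep's default regex syntax has no non-greedy .*? operator.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Count lines that contain at least one match (what grep -c reports).
count_matching_lines() {
    grep -c -- "$1" "$2"
}

# Count every match, even several on one line: -o prints each match
# on its own line, and wc -l counts those lines.
count_occurrences() {
    grep -o -- "$1" "$2" | wc -l | tr -d ' '
}

# Sample data (assumed) with two matches on the second line.
sample=$(mktemp)
printf '%s\n' 'Good-Black-Cat' 'Good-Tabby-Cat Good-Grey-Cat' 'Bad-Red-Cat' > "$sample"

count_matching_lines 'Good-[^ ]*-Cat' "$sample"   # prints 2
count_occurrences    'Good-[^ ]*-Cat' "$sample"   # prints 3

rm -f "$sample"
```

The two numbers differ only when a line holds more than one match, which is the case the perl commands are meant to handle.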
edited May 27 at 11:46
answered May 26 at 21:56
steeldriver
Thank you for your answer. In the case there is the need to count multiple occurrences per line, which would be your recommendation? I've tried the awk code and apparently it counts only matching lines as well.
– Rodrigo Andres Nava Lara
May 26 at 22:16
@RodrigoAndresNavaLara please see updated answer
– steeldriver
May 26 at 22:32
awk, multiple occurrences, space-separated
$ awk '{ for (i = 1; i <= NF; i++) count += ($i ~ /Good-.*-Cat/) } END { print count+0 }' input.txt
4
$ cat input.txt
Good-Black-Cat
Bad-Red-Cat
Bad-Gray-Dog
Good-Golden-Dog Good-Whatever-Cat Good-Something-Cat
Bad-White-Dog
Good-Tabby-Cat
Bad-Siamese-Cat
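The per-field loop above counts at most one match per whitespace-separated field. As a hedged alternative sketch (not taken from the answer), awk's gsub() returns the number of substitutions it made, so replacing each match with itself ("&") gives a per-line occurrence count. The function name count_with_gsub and the pattern Good-[^ ]*-Cat, which assumes the middle part contains no spaces, are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count all occurrences in a file using awk's gsub().
count_with_gsub() {
    # gsub() returns how many replacements it performed on the line;
    # "&" re-inserts the matched text, so the line itself is unchanged.
    # n + 0 forces numeric output, so a file without matches prints 0.
    awk '{ n += gsub(/Good-[^ ]*-Cat/, "&") } END { print n + 0 }' "$1"
}

# Recreate the sample input from the answer in a temporary file.
sample=$(mktemp)
printf '%s\n' \
    'Good-Black-Cat' 'Bad-Red-Cat' 'Bad-Gray-Dog' \
    'Good-Golden-Dog Good-Whatever-Cat Good-Something-Cat' \
    'Bad-White-Dog' 'Good-Tabby-Cat' 'Bad-Siamese-Cat' > "$sample"

count_with_gsub "$sample"   # prints 4

rm -f "$sample"
```

Unlike the field loop, this does not depend on matches being separated by whitespace, only on the pattern itself not spanning a space.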
sed + wc, non-multiple occurrences
This uses the negated pattern address /.../! with d (delete), leaving only the lines of interest.
$ sed '/Good-.*-Cat/!d' input.txt
Good-Black-Cat
Good-Golden-Dog Good-Whatever-Cat Good-Something-Cat
Good-Tabby-Cat
$ sed '/Good-.*-Cat/!d' input.txt | wc -l
3
Shell solution, non-multiple occurrences
Here's a shell way that combines case...esac with a file-reading loop:
$ n=0; while IFS= read -r line || [ -n "$line" ]; do case "$line" in "Good-"*"-Cat") n=$((n+1));; esac; done < input.txt; echo "$n"
3
Or with indentation:
n=0
while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
        "Good-"*"-Cat") n=$((n+1));;
    esac
done < input.txt
echo "$n"
Explanation:
- n=0 initializes the counter variable n.
- while IFS= read -r line || [ -n "$line" ]; do...done < input.txt is the standard file-reading loop used in shell scripting, with the || [ -n "$line" ] protection to account for files that don't end in a newline.
- case "$line" in "Good-"*"-Cat") n=$((n+1));; esac does the pattern matching for the desired string, with $((...)) arithmetic expansion to increment the counter variable.
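The loop above can be wrapped in a small reusable function. This is only a sketch: the name count_glob_lines and its argument order are assumptions, not part of the answer. Note that a case pattern is a shell glob that must match the whole line, so Good-*-Cat here means "starts with Good- and ends with -Cat".

```shell
#!/bin/sh
# Hypothetical helper: count lines of a file that match a shell glob.
# Usage: count_glob_lines PATTERN FILE
count_glob_lines() {
    pattern=$1
    n=0
    while IFS= read -r line || [ -n "$line" ]; do
        # $pattern is deliberately unquoted so that case treats it as a
        # glob; the glob must match the entire line, not a substring.
        case $line in
            $pattern) n=$((n+1)) ;;
        esac
    done < "$2"
    echo "$n"
}

# Recreate the file from the question in a temporary location.
sample=$(mktemp)
printf '%s\n' 'Good-Black-Cat' 'Bad-Red-Cat' 'Bad-Gray-Dog' \
    'Good-Golden-Dog' 'Bad-White-Dog' 'Good-Tabby-Cat' 'Bad-Siamese-Cat' > "$sample"

count_glob_lines 'Good-*-Cat' "$sample"   # prints 2

rm -f "$sample"
```

Because plain sh has no local variables, pattern, n, and line are set in the calling shell; run the function in a subshell if that matters.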
edited May 26 at 23:24
answered May 26 at 22:58
Sergiy Kolodyazhnyy
Non-fancy sed/grep version
sed 's/Good-[^ ]*-Cat/XXXX\n/g' input.txt | grep -c XXXX
Here XXXX can be any string that does not otherwise appear in your file. This approach replaces each match with XXXX followed by a newline (GNU sed interprets \n in the replacement), so that the matches can be counted with a basic grep expression.
By the way, if you take "Where * could be anything" literally, then at least to my understanding the output of any such program would always be 0 or 1, so I am assuming that it should at least not contain a space.
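The point about taking * literally can be illustrated with a small sketch: a greedy .* swallows everything between the first Good- and the last -Cat on a line, so at most one match per line is found, whereas [^ ]* stops at the first space. The helper name count_marked and the sample line are assumptions for illustration; grep -o is used to count the markers so that the sketch does not depend on sed understanding \n in the replacement text.

```shell
#!/bin/sh
# Hypothetical helper: replace every match of the regex in $1 (read from
# stdin) with a marker, then count the markers. The regex must not contain
# a "/" because it is pasted into the sed s/// command.
count_marked() {
    sed "s/$1/XXXX/g" | grep -o 'XXXX' | wc -l | tr -d ' '
}

line='Good-Golden-Dog Good-Whatever-Cat Good-Something-Cat'

printf '%s\n' "$line" | count_marked 'Good-.*-Cat'      # prints 1
printf '%s\n' "$line" | count_marked 'Good-[^ ]*-Cat'   # prints 2
```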
answered May 27 at 7:27
Sebastian Stark