How to delete lines where certain pattern appears on the specific position

I have a file that looks like this:

PEBP1_HUMAN Homo sapiens P30086 PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
 PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
 PDB; 2L7W; NMR; -; A=1-187.
 PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

PECA1_HUMAN Homo sapiens P16284 PDB; 2KY5; NMR; -; A=686-738.
 PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
 PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
 PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
 PDB; 5LZW; EM; 3.53 A; ii=1-385.
 PDB; 5LZX; EM; 3.67 A; ii=1-385.
 PDB; 5LZY; EM; 3.99 A; ii=1-385.
 PDB; 5LZZ; EM; 3.47 A; ii=1-385.

In this file, the field that follows PDB; and the four-character code is always one of X-ray;, NMR; or EM;. I want to remove every line where that field is EM;. Is there a bash command I can use to match these lines and delete them?

Importantly, the match should include the space before EM, so the pattern to match is " EM;" rather than just "EM;".

Expected result is:

PEBP1_HUMAN Homo sapiens P30086 PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
 PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
 PDB; 2L7W; NMR; -; A=1-187.
 PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

PECA1_HUMAN Homo sapiens P16284 PDB; 2KY5; NMR; -; A=686-738.
 PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
 PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
 PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.

edited Mar 6 at 12:26 by dessert

asked Mar 6 at 12:06 by sergio

command-line bash text-processing sed awk

2 Answers
accepted

awk can do that:

awk '{if(!($1=="PDB;"&&$3=="EM;"))print}' <yourfile

This tests whether the first field of the current line is PDB; and the third field is EM; (awk splits fields on whitespace by default) and prints the line unless both are true.

Output

$ awk '{if(!($1=="PDB;"&&$3=="EM;"))print}' <test
PEBP1_HUMAN Homo sapiens P30086 PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
 PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
 PDB; 2L7W; NMR; -; A=1-187.
 PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

PECA1_HUMAN Homo sapiens P16284 PDB; 2KY5; NMR; -; A=686-738.
 PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
 PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
 PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
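
The field test above only fires when PDB; is the first field of a line. On the first line of each record PDB; is the fifth field, so an EM entry there would not be removed. As a rough sketch of a pattern-based variant (not part of the answer above), the following assumes the PDB code is always four alphanumeric characters; the file names sample.txt and filtered.txt are made up for illustration:

```shell
# Hypothetical sample file: a shortened copy of the data from the question.
cat > sample.txt <<'EOF'
PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
 PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
 PDB; 5LZW; EM; 3.53 A; ii=1-385.
 PDB; 5LZX; EM; 3.67 A; ii=1-385.
EOF

# -E: extended regular expressions, -v: print only lines that do NOT match.
# The pattern is "PDB; <4-char code>; EM;" and includes the space before EM,
# so it matches wherever the PDB entry sits on the line.
grep -Ev 'PDB; [[:alnum:]]{4}; EM;' sample.txt > filtered.txt

cat filtered.txt
```

Because the pattern is not anchored to a field position, it treats the first line of a record and its continuation lines the same way.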

answered Mar 6 at 12:17 by dessert, edited Mar 6 at 12:47


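You could do something like this, using perl's paragraph mode (-00), which reads each blank-line-separated block as one record; -F'\n' then splits the record into lines, and grep keeps only the lines that do not match the pattern:

$ perl -F'\n' -00lane 'print join "\n", grep { !/PDB; ....; EM;/ } @F' file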
PEBP1_HUMAN Homo sapiens P30086 PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
 PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
 PDB; 2L7W; NMR; -; A=1-187.
 PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

PECA1_HUMAN Homo sapiens P16284 PDB; 2KY5; NMR; -; A=686-738.
 PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
 PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
 PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
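
Since the pattern is tested one line at a time, paragraph mode is not strictly required. A minimal line-wise sketch with sed (the question is tagged sed, but this is not taken from either answer) could look like the following; records.txt and cleaned.txt are hypothetical file names:

```shell
# Hypothetical sample file with two records separated by a blank line.
cat > records.txt <<'EOF'
PECA1_HUMAN Homo sapiens P16284 PDB; 2KY5; NMR; -; A=686-738.
 PDB; 5C14; X-ray; 2.80 A; A/B=28-229.

PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
 PDB; 5LZW; EM; 3.53 A; ii=1-385.
 PDB; 5LZZ; EM; 3.47 A; ii=1-385.
EOF

# /pattern/d deletes every line matching the pattern; "...." stands for
# any four characters, i.e. the PDB code. All other lines, including the
# blank separator line, are printed unchanged.
sed '/PDB; ....; EM;/d' records.txt > cleaned.txt

cat cleaned.txt
```

Writing to a separate file keeps the original intact while checking the result; with GNU sed, the -i option would edit the file in place instead.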

answered Mar 6 at 13:18 by steeldriver
