How to delete lines where a certain pattern appears at a specific position

I have a file that looks like this:



PEBP1_HUMAN Homo sapiens P30086 PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
PDB; 2L7W; NMR; -; A=1-187.
PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

PECA1_HUMAN Homo sapiens P16284 PDB; 2KY5; NMR; -; A=686-738.
PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
PDB; 5LZW; EM; 3.53 A; ii=1-385.
PDB; 5LZX; EM; 3.67 A; ii=1-385.
PDB; 5LZY; EM; 3.99 A; ii=1-385.
PDB; 5LZZ; EM; 3.47 A; ii=1-385.


From this file I want to match every EM; entry that appears directly after PDB; (four-letter code);. This column can contain X-ray;, NMR; or EM;. Lines that have EM; there should be removed. Is there a bash command that I can use to match these entries and remove those lines?

Importantly, the match should include the space before EM, so match it with the leading space, like " EM;".

The expected result is:



PEBP1_HUMAN Homo sapiens P30086 PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
PDB; 2L7W; NMR; -; A=1-187.
PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

PECA1_HUMAN Homo sapiens P16284 PDB; 2KY5; NMR; -; A=686-738.
PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.









      command-line bash text-processing sed awk






      asked Mar 6 at 12:06 by sergio; edited Mar 6 at 12:26 by dessert




















          2 Answers
          accepted












          awk can do that:



          awk '{if(!($1=="PDB;"&&$3=="EM;"))print}' <yourfile


          This tests whether the first field of the current line is PDB; and the third field is EM; (by default awk splits fields on whitespace) and prints the line unless both are true.



          Output



          $ awk '{if(!($1=="PDB;"&&$3=="EM;"))print}' <test
          PEBP1_HUMAN Homo sapiens P30086 PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
          PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
          PDB; 2L7W; NMR; -; A=1-187.
          PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

          PECA1_HUMAN Homo sapiens P16284 PDB; 2KY5; NMR; -; A=686-738.
          PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
          PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

          PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
          PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
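The same field test can be written more compactly, because awk prints a line by default whenever a bare condition is true. The following is only a sketch: `sample` and `drop_em` are hypothetical helper names, and the data is a shortened copy of the input from the question.

```shell
#!/bin/sh
# sample: hypothetical helper that prints a shortened copy of the question's data.
sample() {
cat <<'EOF'
PECA1_HUMAN Homo sapiens P16284 PDB; 2KY5; NMR; -; A=686-738.
PDB; 5C14; X-ray; 2.80 A; A/B=28-229.

PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
PDB; 5LZW; EM; 3.53 A; ii=1-385.
PDB; 5LZX; EM; 3.67 A; ii=1-385.
EOF
}

# drop_em: hypothetical helper that filters standard input.
# A condition without an action block prints every line for which it is true,
# so this keeps all lines except those whose field 1 is "PDB;" and field 3 is "EM;".
drop_em() {
    awk '!($1 == "PDB;" && $3 == "EM;")'
}

sample | drop_em
```

Blank separator lines pass through unchanged, since their first field is empty. Note that the first line of each record starts with the entry name rather than PDB;, so it is never removed by this test, whatever method it lists.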





            You could do something like this - using perl's paragraph mode:



            $ perl -F'\n' -00le 'print join "\n", grep { !/PDB; ....; EM;/ } @F' file
            PEBP1_HUMAN Homo sapiens P30086 PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
            PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
            PDB; 2L7W; NMR; -; A=1-187.
            PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

            PECA1_HUMAN Homo sapiens P16284 PDB; 2KY5; NMR; -; A=686-738.
            PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
            PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

            PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
            PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
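Since the question is also tagged sed, the same regular expression can be applied line by line without paragraph mode. This is a sketch of a different technique than the perl answer, not a restatement of it; `sample` and `drop_em_sed` are hypothetical helper names, and the pattern assumes the method column always follows a four-character code.

```shell
#!/bin/sh
# sample: hypothetical helper that prints one record from the question's data.
sample() {
cat <<'EOF'
PELO_HUMAN Homo sapiens Q9BRX2 PDB; 1X52; NMR; -; A=261-371.
PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
PDB; 5LZW; EM; 3.53 A; ii=1-385.
PDB; 5LZZ; EM; 3.47 A; ii=1-385.
EOF
}

# drop_em_sed: hypothetical helper that filters standard input.
# Delete lines that begin (after optional leading blanks) with
# "PDB; ", any four characters, then "; EM;". Each "." matches one character.
drop_em_sed() {
    sed '/^[[:space:]]*PDB; ....; EM;/d'
}

sample | drop_em_sed
```

Anchoring the pattern with ^ keeps a record's first line even if it lists EM, because that line starts with the entry name. Dropping the anchor would also delete such header lines, which is probably not wanted.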





                  answered Mar 6 at 12:17 by dessert; edited Mar 6 at 12:47 (awk answer)






















                          answered Mar 6 at 13:18 by steeldriver (perl answer)



























                               
