How do I make a conditional search and replace that will add a line between two lines with different match criteria?

I have a text file many thousands of lines long with ASCII and non-ASCII characters. It is supposed to follow this pattern:

First line: only non-ASCII characters
Second line: only non-ASCII characters
Third line: only ASCII characters
Fourth line: mix of ASCII and non-ASCII characters
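
One way to see how closely a file follows this pattern is to label every line by the kind of bytes it contains. The following is only a sketch: the function name `classify_lines` is made up for illustration, matching is done byte-wise under `LC_ALL=C`, and the input is assumed to be UTF-8, so every non-ASCII character consists solely of bytes above 0x7F.

```shell
# Hypothetical helper: print each line number followed by a label
# ("empty", "ascii", "non-ascii" or "mixed"). Reads standard input.
classify_lines() {
    LC_ALL=C awk '
        /^$/            { print NR, "empty";     next }
        !/[^\001-\177]/ { print NR, "ascii";     next }  # no byte above 0x7F
        !/[\001-\177]/  { print NR, "non-ascii"; next }  # no byte in the ASCII range
                        { print NR, "mixed" }
    '
}

# Small demonstration on three sample lines.
printf '%s\n' '日本語のみ' 'English words only' 'English and 日本語' | classify_lines
```

On a real file the function would be called as `classify_lines < file`, and the labels should cycle through non-ascii, non-ascii, ascii, mixed wherever the pattern holds.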

Unfortunately, the reality is that it looks something like the following example, where in the middle it is missing the line that mixes ASCII and non-ASCII characters:

日本語のみ
日本語のみ
English words only
English and 日本語
日本語のみ
日本語のみ
English words only
日本語のみ
日本語のみ
English words only
English and 日本語

Fortunately, as far as I can tell, only the line that mixes ASCII and non-ASCII characters is ever absent, so what should be groups of 4 lines are sometimes groups of only 3.
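
Before changing a file of this size, it may help to list where the groups of 3 occur. The sketch below only reports; it prints the line number of every all-ASCII line that is directly followed by a line containing no ASCII bytes. The name `report_gaps` is hypothetical, and the same byte-wise `LC_ALL=C` and UTF-8 assumptions apply.

```shell
# Hypothetical helper: print the line numbers of ASCII-only lines whose
# next line has no ASCII bytes, i.e. where the mixed line is missing.
report_gaps() {
    LC_ALL=C awk '
        # prev_ascii is 1 when the previous line was non-empty and all ASCII.
        prev_ascii && length($0) > 0 && !/[\001-\177]/ { print NR - 1 }
        { prev_ascii = (length($0) > 0 && !/[^\001-\177]/) }
    '
}

# Demonstration on the sample data, where the second group lacks its mixed line.
printf '%s\n' \
    '日本語のみ' '日本語のみ' 'English words only' 'English and 日本語' \
    '日本語のみ' '日本語のみ' 'English words only' \
    '日本語のみ' '日本語のみ' 'English words only' 'English and 日本語' |
    report_gaps
# prints: 7
```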

To fix the file, I need to:

Search for every line that contains only ASCII characters.

Test the following line to see whether it contains only non-ASCII characters.

If so, insert a placeholder line after the ASCII-only line.

The result should be:

日本語のみ
日本語のみ
English words only
English and 日本語
日本語のみ
日本語のみ
English words only
+Aあ+
日本語のみ
日本語のみ
English words only
English and 日本語

(I chose +Aあ+ as the placeholder so that, like the lines it stands in for, it mixes ASCII and non-ASCII characters.)

I've found I can use sed to insert new lines: sed -e "/this is existing text/a'this is a new line'" < file.text. I've also learned that I can search for ASCII characters with sed using LC_ALL=C and [\d0-\d127].

However, I'm unclear on how to make a conditional separate from the search. I mean, I could insert a line after every instance of ASCII only characters, but how do I make a search that inserts a line when an all ASCII line is found and the next line is only non-ASCII?

Please note that I am not particular about using sed. If an answer can be provided using Gedit, LibreOffice, or any command line operation, that would be great.

edited Apr 27 at 5:44 by muru

asked Apr 27 at 3:00 by Questioner

2 Answers

Accepted answer (score: 2)

Based on your recent questions, it sounds like you have an XY problem.

Here's a sed solution based on @Zanna's answer to your previous question, "How do I search for lines in a file that only contain ASCII characters and then act on them?"

$ LC_ALL=C sed -E '/^[\d0-\d127]+$/ {$!N; s/\n[^\d0-\d127]+$/\n+Aあ+&/;}' file
日本語のみ
日本語のみ
English words only
English and 日本語
日本語のみ
日本語のみ
English words only
+Aあ+
日本語のみ
日本語のみ
English words only
English and 日本語
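
For readers who want to follow what the sed one-liner does, here is the same logic spread over several lines with comments. This is a sketch rather than a different method: the wrapper name `insert_placeholder_sed` is made up, the script reads standard input instead of a file argument, and the `\dNNN` decimal escapes assume GNU sed.

```shell
# Hypothetical wrapper around the sed script; assumes GNU sed.
insert_placeholder_sed() {
    LC_ALL=C sed -E '
        # Act only on lines made up entirely of ASCII bytes (decimal 0 to 127).
        /^[\d0-\d127]+$/ {
            # Unless this is the last line, append the next line to the pattern space.
            $!N
            # If the appended line has no ASCII bytes, put the placeholder in front of it.
            s/\n[^\d0-\d127]+$/\n+Aあ+&/
        }
    '
}

# Demonstration on the sample data from the question.
printf '%s\n' \
    '日本語のみ' '日本語のみ' 'English words only' 'English and 日本語' \
    '日本語のみ' '日本語のみ' 'English words only' \
    '日本語のみ' '日本語のみ' 'English words only' 'English and 日本語' |
    insert_placeholder_sed
```

Because `N` consumes the line after every ASCII-only line, that following line is never tested as the start of a new match, which is fine as long as two ASCII-only lines never appear in a row.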

answered Apr 27 at 12:21 by steeldriver

Thank you for this help. Sometimes one doesn't know all the problems one will face until one takes a few steps into the task. Even after applying this, I still had to make many manual edits for exceptions and further search and replace tricks for conditions that had not been previously visible. Just the way it goes sometimes.
(comment by Questioner, May 5 at 4:41)

Answer (score: 2)

Using awk:

awk '1; ! /^[\x01-\x7F]*$/ {next} {getline} !/[\x01-\x7F]/ {print "+Aあ+"} 1'

Print the input line unconditionally: 1 is a true condition, and the default action in that case is to print.

Then, if the line isn't (!) entirely ASCII (/^[\x01-\x7F]*$/), skip the remaining rules with next (awk moves on to the next line and starts again from the first rule, 1).

If it is entirely ASCII, read the next line with getline, and if that line doesn't (!) contain any ASCII characters (/[\x01-\x7F]/), print your placeholder.

Finally, print the line that getline read (the trailing 1).

I'm assuming that your 日本語のみ lines don't have half-width spaces or punctuation (. ! vs 。　！).
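
The rules described above can also be written one per line. This is a sketch with three changes that are not in the original command. It forces byte-wise matching with `LC_ALL=C`. It writes the range with octal escapes (`\001-\177`) instead of hex. It checks the return value of getline, because when the input ends on an all-ASCII line getline leaves `$0` unchanged and the final rule would print that line a second time. The function name `insert_placeholder_awk` is hypothetical.

```shell
# Hypothetical wrapper; reads standard input and writes the repaired text.
insert_placeholder_awk() {
    LC_ALL=C awk '
        # Rule 1: print every line as it is read.
        { print }
        # Rule 2: a line that is not entirely ASCII needs no lookahead.
        !/^[\001-\177]*$/ { next }
        # Rule 3: the line is all ASCII, so read the following line into $0.
        # Added assumption: stop if there is no following line.
        { if ((getline) <= 0) exit }
        # Rule 4: no ASCII bytes in the following line means the mixed line is missing.
        !/[\001-\177]/ { print "+Aあ+" }
        # Rule 5: print the line that getline read.
        { print }
    '
}

# Demonstration on the sample data from the question.
printf '%s\n' \
    '日本語のみ' '日本語のみ' 'English words only' 'English and 日本語' \
    '日本語のみ' '日本語のみ' 'English words only' \
    '日本語のみ' '日本語のみ' 'English words only' 'English and 日本語' |
    insert_placeholder_awk
```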

answered Apr 27 at 6:01 by muru
