9

I'm trying to write part of an add-on for Google Docs that eliminates newlines within selected text using replaceText. The obvious text.replaceText("\n",""); gives the error Invalid argument: searchPattern. I get the same error with text.replaceText("\r","");. The following attempts do nothing: text.replaceText("/\n/","");, text.replaceText("/\r/","");. I don't know why Google Apps Script does not recognise newlines in a regex.

I am aware that there is an add-on that does this already, but I want to incorporate this function into my add-on.

This error occurs even with the basic

DocumentApp.getActiveDocument().getBody().replaceText("\n","");

My full function:

function removeLineBreaks() {
    var selection = DocumentApp.getActiveDocument().getSelection();
    if (selection) {
        var elements = selection.getRangeElements();
        for (var i = 0; i < elements.length; i++) {
            var element = elements[i];

            // Only deal with text elements
            if (element.getElement().editAsText) {
                var text = element.getElement().editAsText();

                if (element.isPartial()) {
                    text.replaceText("\n","");
                }
                // Deal with fully selected text
                else {
                    text.replaceText("\n","");
                }
            }
        }
    }
    // No text selected
    else {
        DocumentApp.getUi().alert('No text selected. Please select some text and try again.');
    }
}

Wiktor Stribiżew
DavidR
  • 1
    Aren't those newlines carriage returns? Try `text.replaceText("\r","");` or `text.replaceText("[\r\n]+","");` – Wiktor Stribiżew Jun 12 '16 at 06:59
  • 1
    See above "I get the same error with text.replaceText("\r","")". [\n] or [\r] just do nothing. – DavidR Jun 12 '16 at 07:06
  • Do you have something like `DocumentApp.getActiveDocument().getBody().replaceText("\n","");`? – Wiktor Stribiżew Jun 12 '16 at 07:30
  • @WiktorStribiżew I am operating with selected text, but your example yields the same error. So annoying! – DavidR Jun 12 '16 at 07:38
  • I did not provide examples, I only ask about the code you have. You should provide the whole relevant code that yields the issue. – Wiktor Stribiżew Jun 12 '16 at 07:39
  • I mean running the replaceText function on the document body, as you enquired about, yields the same error. I'll post full code though – DavidR Jun 12 '16 at 07:43
  • Well, I could come up with JS code: `doc.getBody().setText(doc.getBody().getText().replace(/[\r\n]+/g,""));`. No idea if that can help here. – Wiktor Stribiżew Jun 12 '16 at 07:56
  • @WiktorStribiżew Thanks Wiktor. It doesn't work because JS can't be used in GS. Your code converted into GS (i.e. in replaceText) does not work either. I'm starting to think that this is just a Google Apps Script bug. – DavidR Jun 12 '16 at 08:20
  • Do you want to remove line breaks (inserted with Shift-Enter), or paragraph breaks (inserted with Enter), or both? –  Jun 12 '16 at 08:58
  • @DavidRowthorn: What about just removing all "other" control characters with `.replaceText("\\p{Cc}+", "")`? I think this can be enough. – Wiktor Stribiżew Jun 12 '16 at 20:41
  • Also, what about `.replaceText("\\v", "")`? – Wiktor Stribiżew Jun 12 '16 at 21:24
  • any thoughts on this one https://stackoverflow.com/questions/71553675/how-to-detect-new-line-withn-a-google-sheets-cell-using-appsscript ? – Sergino Mar 21 '22 at 07:05

5 Answers

6

It seems that in replaceText, to remove soft returns entered with Shift-ENTER, you can use \v:

.replaceText("\\v+", "")

If you want to remove all "other" control characters (C0, DEL and C1 control codes), you may use

.replaceText("\\p{Cc}+", "")

Note that \v is a construct supported by the JavaScript regex engine, and the RE2 regex library used in most Google products treats it as matching a vertical tab character (≡ \013).
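As a small sketch of how the pattern is passed: replaceText takes its search pattern as a string, so the backslash has to be doubled in the source. The helper name removeSoftBreaks and its optional replacement parameter are assumptions made for illustration and are not part of Apps Script; the helper accepts any object that exposes replaceText(searchPattern, replacement), such as a Body or Text element.

```javascript
// Regex source handed to replaceText as a string: "\\v+" in the source
// becomes the three characters \v+ at run time.
var SOFT_BREAK_PATTERN = "\\v+";

// Hypothetical helper (not part of Apps Script): replaces every run of soft
// returns in `element` with `replacement` (empty string by default).
function removeSoftBreaks(element, replacement) {
  var fill = (replacement === undefined) ? "" : replacement;
  return element.replaceText(SOFT_BREAK_PATTERN, fill);
}

// In a script bound to a document this could be called as:
//   removeSoftBreaks(DocumentApp.getActiveDocument().getBody(), " ");
```

Passing " " instead of the default keeps the words on either side of a removed break from running together.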

Wiktor Stribiżew
  • Thank you for your help! There's not much guidance for Google Scripting (compared with, say, JavaScript). – DavidR Jun 13 '16 at 17:35
  • 1
    But this does not address the issue of how to match a "hard" ENTER, i.e. a new paragraph mark. `.replaceText("\\n+", "")` does NOT work. Any clues? – Giuseppe Apr 27 '18 at 20:41
  • @Giuseppe Those are outside of the `.replaceText()` reach. You can only handle them using some code logic. – Wiktor Stribiżew Apr 27 '18 at 20:47
  • @Wiktor, but `.replaceText()` can be even called on the document's **body**, which of course will contain paragraph breaks. How can those be "invisible"? I am baffled. – Giuseppe Apr 29 '18 at 15:13
  • I am as baffled as Giuseppe. "\n" is my important objective. I (would, if I could) love to automate removing double lines (using "\n+\s*(?=\n)" or "\n+\s*\n+"). Baffled! – Vladimir Brasil Jan 29 '22 at 22:30
1

The Google Apps Script function replaceText() still doesn't accept escape characters, but I was able to get around this by using getText(), then the generic JavaScript replace(), then setText():

var doc = DocumentApp.getActiveDocument();
var body = doc.getBody();

var bodyText = body.getText();

//DocumentApp.getUi().alert( "Does document contain \\t? " + /\t/.test( bodyText ) ); // \n true, \r false, \t true

bodyText = bodyText.replace( /\n/g, "" );
bodyText = bodyText.replace( /\t/g, "" );

body.setText( bodyText );

This worked within a Doc. I'm not sure whether the same is possible within a Sheet (and, even if it were, you'd probably have to run this one cell at a time).
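One possible variation, sketched here under assumptions: replacing each break with an empty string fuses the words on either side, so it may be preferable to collapse every run of break characters into a single space. The helper name collapseLineBreaks is hypothetical, and the vertical tab (U+000B) is included only on the assumption that soft returns may show up that way. Writing the result back with setText still replaces the whole body, exactly as in the snippet above.

```javascript
// Hypothetical helper: collapse each run of line-break characters
// (\r, \n, or vertical tab U+000B) into a single space.
function collapseLineBreaks(text) {
  return text.replace(/[\r\n\u000B]+/g, " ");
}

// Possible use inside Apps Script, mirroring the snippet above:
//   var body = DocumentApp.getActiveDocument().getBody();
//   body.setText(collapseLineBreaks(body.getText()));
```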

Ally Ex
  • Very clever approach but, in my case specifically, it removes all formatting! Losing the document's formatting is a huge issue. But thanks for the clever approach. – Vladimir Brasil Jan 29 '22 at 22:32
1

Here is my pragmatic solution for eliminating newlines in Google Docs or, more exactly, for eliminating newlines from Gmail's message.getPlainBody(). It looks as though Google uses '\r\n\r\n' as a plain EOL and '\r\n' as a manual line feed (Shift-Enter). The code should be self-explanatory. It might help you get along with the newline problem in Docs. The solution may not be very elegant, but it works like a charm :-)

function GetEmails2Doc() {
  var doc = DocumentApp.getActiveDocument();
  var body = doc.getBody();
  var pc = 0;  // Paragraph counter

  var label = GmailApp.getUserLabelByName("_Send2Sheet");
  var threads = label.getThreads();
  // Loop over the threads in reverse, then over the messages within each thread
  for (var i = threads.length - 1; i >= 0; i--) {
    var messages = threads[i].getMessages();  // was missing in the original listing
    for (var j = 0; j < messages.length; j++) {
      var message = messages[j];
      /* Here I do some ...
      body.insertParagraph(pc++, Utilities.formatDate(message.getDate(), "GMT",
      "dd.MM.yyyy (HH:mm)")).setHeading(DocumentApp.ParagraphHeading.HEADING4)
      str = message.getFrom() + ' to: ' + message.getTo();
      if (message.getCc().length >0) str = str + ", Cc: " + message.getCc();
      if (message.getBcc().length >0) str = str + ", Bcc: " + message.getBcc();
      body.insertParagraph(pc++,str);
      */
      // Body !!
      var str = processBody(message.getPlainBody()).split("pEOL");
      Logger.log(str.length + " EOLs");
      for (var k = 0; k < str.length; k++) body.insertParagraph(pc++, str[k]);
    }
  }
}

function processBody(tx) {
  var k;  // declared locally so the loops below do not create a global

  var s = tx.split(/\r\n\r\n/g);
  // it looks like message.getPlainBody() [of mail] uses \r\n\r\n as EOL
  // so, I first substitute the 'EOL's with the string pattern "pEOL"
  // to be replaced with body.insertParagraph in the main function
  tx = '';
  for (k = 0; k < s.length; k++) tx = tx + s[k] + "pEOL";

  // then replace all remaining simple \r\n with a blank
  s = tx.split(/\r\n/g);
  tx = '';
  for (k = 0; k < s.length; k++) tx = tx + s[k] + " ";

  return tx;
}
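
A more compact variant of processBody, offered as a sketch rather than as the author's method: it returns the paragraphs as an array instead of joining them with the "pEOL" marker, which avoids a wrong split when a message happens to contain that string. The name splitPlainBody is hypothetical. It keeps the same assumption as the code above, namely that '\r\n\r\n' separates paragraphs and a single '\r\n' is a soft line break.

```javascript
// Hypothetical helper: split a plain-text mail body into paragraphs and
// replace the remaining single \r\n breaks inside each paragraph with a space.
function splitPlainBody(plainBody) {
  return plainBody.split(/\r\n\r\n/).map(function (paragraph) {
    return paragraph.replace(/\r\n/g, " ");
  });
}

// Possible use in the main function, in place of the "pEOL" round trip:
//   splitPlainBody(message.getPlainBody()).forEach(function (p) {
//     body.insertParagraph(pc++, p);
//   });
```

Unlike processBody, this variant does not append a trailing marker or space, so it does not produce an extra empty paragraph at the end.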
Siggi
0

I have now found out through much trial and error, and some much-needed help from Wiktor Stribiżew (see other answer), that there is a solution to this, but it has to work around the fact that Google Apps Script does not recognise \n or \r in replaceText search patterns. The solution is as follows:

function removeLineBreaks() {
  var selection = DocumentApp.getActiveDocument().getSelection();
  if (selection) {
    var elements = selection.getRangeElements();
    for (var i = 0; i < elements.length; i++) {
      var element = elements[i];
      // Only deal with text elements
      if (element.getElement().editAsText) {
        var text = element.getElement().editAsText();
        if (element.isPartial()) {
          var start = element.getStartOffset();
          var finish = element.getEndOffsetInclusive();
          // finish is inclusive but slice's end is exclusive, hence finish + 1
          var oldText = text.getText().slice(start, finish + 1);
          if (oldText.match(/\r/)) {
            var number = oldText.match(/\r/g).length;
            for (var j = 0; j < number; j++) {
              var location = oldText.search(/\r/);
              // Swap the line break for a space in the document...
              text.deleteText(start + location, start + location);
              text.insertText(start + location, ' ');
              // ...and in the local copy, so the next search finds the next break
              oldText = oldText.replace(/\r/, ' ');
            }
          }
        }
        // Deal with fully selected text
        else {
          text.replaceText("\\v+", " ");
        }
      }
    }
  }
  // No text selected
  else {
    DocumentApp.getUi().alert('No text selected. Please select some text and try again.');
  }
}

Explanation

replaceText accepts the vertical-tab escape (\v, written "\\v" in a string pattern), which matches the line breaks in the selected text.

Partial text is a whole other problem. The solution to dealing with partially selected text above finds the location of newlines by extracting a text string from the text element and searching in that string. It then uses these locations to delete the relevant characters. This is repeated until the number of newlines in the selected text has been reached.
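
The offset bookkeeping for a partial selection can be sketched as a pure function. The name findBreakOffsets is hypothetical, and the sketch assumes, as the code above does, that line breaks appear as \r in the string returned by getText(). The end offset is treated as inclusive, matching getEndOffsetInclusive(), so the slice has to extend one character past it.

```javascript
// Hypothetical helper: return the absolute offsets of every \r that lies
// between start and endInclusive (both inclusive) in fullText.
function findBreakOffsets(fullText, start, endInclusive) {
  // slice's end index is exclusive, so add 1 to keep the last selected character
  var selected = fullText.slice(start, endInclusive + 1);
  var offsets = [];
  for (var i = 0; i < selected.length; i++) {
    if (selected.charAt(i) === "\r") {
      offsets.push(start + i);
    }
  }
  return offsets;
}

// Possible use with a Text element `text`:
//   findBreakOffsets(text.getText(), start, finish).forEach(function (o) {
//     text.deleteText(o, o);
//     text.insertText(o, ' ');
//   });
```

Because each break is replaced by exactly one character, the offsets collected up front stay valid while the replacements are applied.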

DavidR
  • The regex is not correct: lookarounds cannot be used inside a bracket expression, nor anywhere else in the RE2 library. If it worked, it was because of the text you have, not because the regex worked correctly. – Wiktor Stribiżew Jun 12 '16 at 12:39
  • @WiktorStribiżew I was sure the original solution worked when I first tried it, but you are right, it does not. – DavidR Jun 12 '16 at 14:27
-2

This Stack Overflow answer specifically removes "\n". It may help; it certainly helped me.

Vladimir Brasil