I want to create a script that capitalizes sentences in a Google Doc without changing the existing attributes of any words. For example, a Google Doc may contain several paragraphs, each with several sentences, along with hyperlinks and words in boldface, italics, underline, etc. I want all of these attributes to stay intact; the script should only capitalize the first letter of each sentence, without removing the existing attributes of these words.
I wrote the following Google Docs script. It does capitalize the sentences, but it removes all attributes from the other words (hyperlinks, boldface, italics, underline), as mentioned above.
function cap12() {
  // define function "replacement" to change the matched pattern to uppercase
  function replacement(match) { return match.toUpperCase(); }
  // define regex "period, followed by zero or any number
  // of blank spaces, followed by any lowercase character"
  var regex1 = "(^|\.)(\s*)[a-z]";
  var regex2 = /(^|\.)(\s*)[a-z]/;
  // Logger.log(regex1, regex2);
  // get text matching pattern "regex"
  var body = DocumentApp.getActiveDocument().getBody();
  var foundElement = body.findText(regex1);
  // Logger.log(foundElement);
  while (foundElement != null) {
    // Get the text object from the element
    var foundText = foundElement.getElement().asText();
    // capitalize the character after the period
    var str1 = foundText.getText();
    var str2 = str1.replace(regex2, replacement);
    foundText.setText(str2);
    // Find the next match
    foundElement = body.findText(regex1, foundElement);
  }
}
I would appreciate any help in pointing out my errors. Thank you.
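Two details in cap12 may be worth checking independently of the attribute problem. The snippet below is only a sketch in plain JavaScript (the names `unescaped`, `escaped`, and `upper` are made up for illustration): it shows that the string assigned to regex1 loses its backslashes before it ever reaches findText, and that a regex without the `g` flag, like regex2, replaces only the first match in a string.

```javascript
// In a JavaScript string literal, "\." and "\s" are not recognized escapes,
// so the backslash is silently dropped.
var unescaped = "(^|\.)(\s*)[a-z]";
var escaped = "(^|\\.)(\\s*)[a-z]";

console.log(unescaped); // (^|.)(s*)[a-z]
console.log(escaped);   // (^|\.)(\s*)[a-z]

// Without the g flag, replace() stops after the first match.
function upper(match) { return match.toUpperCase(); }
var once = "a. b. c".replace(/(^|\.)(\s*)[a-z]/, upper);
var all = "a. b. c".replace(/(^|\.)(\s*)[a-z]/g, upper);

console.log(once); // A. b. c
console.log(all);  // A. B. C
```

If this is what happens in cap12, doubling the backslashes in regex1 and adding the `g` flag to regex2 would be the first things to try.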
NOTE: The script cap12 above continues my project to develop a Google script that capitalizes sentences, as documented in the post "google doc script to capitalize sentences". The final script cap7 in that post worked only locally on the selected text (i.e., not over the entire document), and it also removed all attributes such as hyperlinks, boldface, italics, and underline.
While reading the related posts listed in the right column, in a more or less random fashion and guided perhaps by instinct, I stumbled on a nice script, likely written by a pro, from which newbies (like me) could learn a lot. I describe what I did below in case someone is interested.
From the post "Google Apps Script/Javascript search and replace with regex not working", which is related to my present post, I noticed another related post, "Find and change unknown strings to UPPERCASE in Google Apps Script Document using JS", in which Mogsdad posted some very nice code. At first, reading through the discussion, I thought that I would have to work through the code in Mogsdad's answer to understand it.
Then I noticed the sentence "The following script is part of a document add-on, source available in this gist, in changeCase.js." (I still needed to find out what a "gist" is; it is something related to GitHub, and it is explained in the post "What is the difference between github and gist? [closed]".)
So I followed the link "this gist", and indeed I found the script changeCase.js, which contained ALMOST what I was trying to develop (i.e., "Sentence case"; some problems are described below):
changeCase.js - Document add-in, provides case-change operations in the add-in Menu.
onOpen - installs "Change Case" menu
_changeCase - worker function to locate selected text and change text case. Case conversion is managed via callback to a function that accepts a string as a parameter and returns the converted string.
helper functions for five cases
UPPER CASE
lower case
Title Case
Sentence case
camelCase
Fountain-lite, screenplay formatting - see http://fountain.io/
It is amazing code; it would take me a long time to reach the level needed to write something like it.
I installed the script changeCase.js in my Google Doc, then closed and reopened the document to activate the add-on. Then I tested the "Sentence case" option. The script worked nicely in the sense that I could select the text within which I wanted to capitalize the sentences, avoiding words with special formatting such as boldface, italics, underline, etc.
But when the selected text included words in boldface (and/or italics, and/or underline), the boldface attribute was removed, which is exactly the problem I wanted to solve. So the script changeCase.js did not solve my problem, but it did provide a workaround.
The script changeCase.js worked only locally on the selected text, whereas the script cap7 in my post "google doc script to capitalize sentences" worked on the whole paragraph, even when I selected only a portion of the paragraph.
In other words, I could modify my script cap7 to work like the "Sentence case" option of the script changeCase.js. I believe the problem was that I did a "global" search and replace instead of a "local" search and replace within the selected text.
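To make the "global" versus "local" distinction concrete, here is a minimal sketch in plain JavaScript. The helper name `capitalizeWithin` and its offset parameters are hypothetical; in Apps Script the offsets would presumably come from a partially selected range element (`getStartOffset()` and `getEndOffsetInclusive()`), but that wiring is left out here.

```javascript
// Hypothetical helper: capitalize sentence starts only between two offsets
// of a string, leaving the text outside the selection untouched.
function capitalizeWithin(str, start, endInclusive) {
  var rg = /(^\s*\w|\.\s*\w)/g;
  var selected = str.substring(start, endInclusive + 1)
    .replace(rg, function (match) { return match.toUpperCase(); });
  return str.substring(0, start) + selected + str.substring(endInclusive + 1);
}

// Only "two. three." (offsets 5 to 15) is treated as selected.
console.log(capitalizeWithin("one. two. three. four.", 5, 15));
// one. Two. Three. four.
```

Note that the first selected character is capitalized even when the selection starts in the middle of a sentence, because `^` matches the start of the selected substring.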
One problem with the "Sentence case" option of the script changeCase.js was that it converted all characters within the selected text to lowercase, which was not what I wanted, since there were characters I wanted to keep in uppercase (e.g., names of people). I only wanted to capitalize the first letter of each sentence within the selected text, without modifying anything else in those sentences.
To do what I described above, simply remove the call to the method "toLowerCase" in the code:
// https://stackoverflow.com/a/19089667/1677912
function _toSentenceCase (str) {
  var rg = /(^\s*\w{1}|\.\s*\w{1})/gi;
  return str.toLowerCase().replace(rg, function(toReplace) {
    return toReplace.toUpperCase();
  });
}
That is, use the following modified code:
function _toSentenceCase (str) {
  var rg = /(^\s*\w{1}|\.\s*\w{1})/gi;
  return str.replace(rg, function(toReplace) {
    return toReplace.toUpperCase();
  });
}
It worked.
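For comparison, the two versions can be run side by side in plain JavaScript. The function names below are renamed copies used only for this illustration (they are not in changeCase.js), and the sample strings are made up.

```javascript
// Copy of the original helper: lowercases everything first.
function toSentenceCaseLower(str) {
  var rg = /(^\s*\w{1}|\.\s*\w{1})/gi;
  return str.toLowerCase().replace(rg, function (toReplace) {
    return toReplace.toUpperCase();
  });
}

// Copy of the modified helper: leaves existing capitals alone.
function toSentenceCaseKeep(str) {
  var rg = /(^\s*\w{1}|\.\s*\w{1})/gi;
  return str.replace(rg, function (toReplace) {
    return toReplace.toUpperCase();
  });
}

var sample = "hello John. how are you";
console.log(toSentenceCaseLower(sample)); // Hello john. How are you
console.log(toSentenceCaseKeep(sample));  // Hello John. How are you

// Caveat: any letter directly after a period is capitalized,
// including the one in a file extension.
console.log(toSentenceCaseKeep("open changeCase.js first"));
// Open changeCase.Js first
```

The last line shows a limitation shared by both versions: the pattern cannot tell a sentence-ending period from a period inside a file name or an abbreviation.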
Another problem with the script changeCase.js was that it worked on only one paragraph at a time, which is not very efficient.
I wanted to develop a script that would work on the entire document without removing existing attributes (hyperlinks, boldface, italics, underline).
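One possible direction, sketched below, is to stop calling setText on the whole text element and instead change one character at a time at a known offset. This is only a sketch under assumptions and has not been tested against a real document: the function names are hypothetical, and it assumes that re-applying the result of `getAttributes(offset)` with `setAttributes` restores the formatting of the replaced character. The Text methods used (`getText`, `getAttributes`, `insertText`, `deleteText`, `setAttributes`) are from the Apps Script Document service.

```javascript
// Returns the offsets of lowercase letters that start a sentence in str,
// using the same pattern as cap12 (start of text or a period, then spaces).
function sentenceStartOffsets(str) {
  var rg = /(^|\.)(\s*)[a-z]/g;
  var offsets = [];
  var m;
  while ((m = rg.exec(str)) !== null) {
    offsets.push(m.index + m[0].length - 1); // the letter is the last matched char
  }
  return offsets;
}

// Capitalizes sentence starts in one Text element, one character at a time,
// so the rest of the element (and its formatting) is never rewritten.
function capitalizeTextInPlace(text) {
  sentenceStartOffsets(text.getText()).forEach(function (i) {
    var upper = text.getText().charAt(i).toUpperCase();
    var attrs = text.getAttributes(i); // formatting of the old character
    text.insertText(i, upper);         // insert first so the element is never empty
    text.deleteText(i + 1, i + 1);     // remove the old lowercase character
    text.setAttributes(i, i, attrs);   // assumption: this restores the formatting
  });
}

// Walks every paragraph of the active document (Apps Script only).
function capitalizeDocument() {
  var paragraphs = DocumentApp.getActiveDocument().getBody().getParagraphs();
  paragraphs.forEach(function (paragraph) {
    for (var c = 0; c < paragraph.getNumChildren(); c++) {
      var child = paragraph.getChild(c);
      if (child.getType() === DocumentApp.ElementType.TEXT) {
        capitalizeTextInPlace(child.asText());
      }
    }
  });
}
```

Because each replacement swaps one character for one character, the offsets computed at the start stay valid throughout the loop. The sketch also inherits the limitation of the pattern itself: a letter after any period, such as in an abbreviation, is capitalized too.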
I would appreciate it if someone could point out the errors or problems in my script cap12b in the post "google doc script, attributes (bold, italics, underline) not shown in log".