
It's 2023, and I think it's silly that the delimiter for a string is a character found so commonly in text, as well as in other languages such as markup.

C++ uses the " symbol to denote the beginning and end of a string.

Is there a way to have this symbol be something else?

For example \xac, which is also a 1-byte character.

For example, if the string delimiter were something other than a double quote, the system() function would be a lot easier to use, since double quotes come in quite handy in shell commands.

This goes for something like PHP as well. I use echo '<div class="adiv"></div>' with single-quotes so that I can write something inside with double-quotes, but sometimes I need to be able to use both double and single quotes.

Are we really limited to single and double quotes as our delimiters? Can't we use, or set, a custom character that doesn't appear elsewhere?

If it's doable, please tell me how. If it's a pipe dream, let me know as well.

//Example: (I know the character I'm about to use is 3 bytes (\xe2\x81\x91), but just for demonstration...)

#include <cstdlib>
#include <iostream>

int main() {

  std::cout << ⁑Hello world!\n⁑;
  system(⁑echo -e '"hello\nworld"' | grep 'hello' | sed s/'"hello'/'"Hello!"'/⁑);

  return 0;
}

I tried searching many pages (paginations?) of Google, and I can't even find anyone who asked the question.

Remy Lebeau
ED818
  • Instead of this, maybe you want to use a raw string literal: [https://stackoverflow.com/questions/56710024/what-is-a-raw-string](https://stackoverflow.com/questions/56710024/what-is-a-raw-string) – drescherjm Feb 18 '23 at 23:51
  • If the problem you are describing is "[quote] is something found so commonly in text", the solution to that is the raw string literal. `R"(echo -e '"hello\nworld"')"` – Drew Dormann Feb 18 '23 at 23:56
  • Speaking from the compiler construction perspective, lexing string literals happens _really_ early. Putting something in the code itself to change this would be rather annoying considering the code is character soup at this point. However, even a compiler option could potentially cause a DFA built at compile-time (in the compiler, not your program) to become runtime, or to keep most of the DFA prebuilt and have another special case for string literals that has to be dealt with separately. (While high-quality compilers probably lex by hand anyway, the DFA option is still viable for the 90%+.) – chris Feb 19 '23 at 00:08
  • Are you looking for the escape character \ ? e.g. `char const * three = "\"three\""; ` – QuentinUK Feb 19 '23 at 00:14
  • "using the system() function would be a lot easier" - making `system()` easier to use should *not* be a goal - quite the opposite in fact. That function is a security *nightmare* and should *never* be used in new code (and old code should be refactored to get rid of it). – Jesper Juhl Feb 19 '23 at 01:58
  • @JesperJuhl what would you recommend in its place? I want to learn to manipulate files on my system using bash to glue my c++ code. Is bash not just c glue anyway? Also you know that even if this is a bad example, my point still stands...... – ED818 Feb 19 '23 at 02:11
  • @ED818 On Windows `CreateProcess()`, on Unix based systems `fork()` + `exec()` (or one of its variants). In both cases you gain control of what environment the new process runs in, what shell (if any) will be used, you can interact with the process's `stdin` stream, you can access the process's `stdout` and `stderr` streams individually, and more. With `system()` you give up all that control and give an attacker lots of options to monkey around with what you try to execute. – Jesper Juhl Feb 19 '23 at 08:46

1 Answer


If I understand your problem correctly, it can easily be solved by using raw string literals, for example:

std::string myStr = R"('Using single quotes', "double quotes")";

Anything between the opening R"( and the closing )" will be treated as part of the string, and you can write anything you want inside that string, even a multi-line string without \. Note that sequences like \n and \r will not be treated as 'newline' or 'carriage return' either; they are kept exactly as written.

Bolderaysky
  • "*you can write anything you want inside that string*" - well, *almost* anything. In your example, the string can't contain the character sequence `)"`, otherwise the string will break prematurely. If you needed those characters, you would have to specify a custom delimiter to differentiate, eg: `std::string myStr = R"delim('Using single quotes', "double quotes", and even )" too)delim";` – Remy Lebeau Feb 19 '23 at 03:03