
When I use the code below:

#include <stdio.h>

int main(void)
{
    printf("%s","Hello world\nHello world");
    return 0;
}

it prints as:

 Hello world
 Hello world

How can I prevent this and print it as a raw string literal in C? I mean it should be displayed as is in the terminal window, like below:

Hello world\nHello world

I know I can achieve this by escaping the backslash in the printf argument, but is there any other C function or way to do this without adding backslashes? It would be helpful when reading files.

Jessie
  • If you read in files that contain a backslash, you can print them with `printf` without doing anything different. Have you tried it? In your example, the compiler is interpreting the `\n` and replacing it with a newline. If you fill in your strings some other way, say by reading a line of a file into a string, this doesn't happen. – yellowantphil Apr 06 '15 at 18:38
  • Thanks. I need the \n character to be displayed raw. By default it is not visible and creates a new line in the terminal window. – Jessie Apr 06 '15 at 18:42
  • Oh, you want a newline in a string to be displayed as a `\n`? So if you read a file that contains a newline, it will replace it with a `\n`? You could write a function to do that. – yellowantphil Apr 06 '15 at 18:44
  • Oh, isn't there any standard function for that in C? Should I write a function that replaces every escape character with its backslashed form? It sounds a bit awkward. – Jessie Apr 06 '15 at 18:48
  • Well, once the text is in a string, there are no escape characters left to replace. You just have a literal newline character in your string. I'll write a short function to show what I'm thinking of. – yellowantphil Apr 06 '15 at 18:52
  • `if(ch == '\n') fputs("\\n", stdout); else putchar(ch);` – BLUEPIXY Apr 06 '15 at 18:57
  • Consider http://stackoverflow.com/a/22152332/2410359 – chux - Reinstate Monica Apr 06 '15 at 19:02
  • After this string is printed, what will read it? Just humans or other C code? – chux - Reinstate Monica Apr 06 '15 at 21:31
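
To illustrate the point made in the comments above (the compiler turns `\n` in a string literal into a single newline byte before the program ever runs), here is a minimal sketch. The helper names `count_newlines` and `describe` are made up for this illustration and are not part of any library.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: counts the bytes in s that are newline characters.
static size_t count_newlines(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (*s == '\n')
            ++n;
    return n;
}

// Hypothetical helper: reports what is actually stored in the string.
static void describe(const char *label, const char *s)
{
    printf("%s: %zu bytes, %zu newline(s)\n",
           label, strlen(s), count_newlines(s));
}
```

Calling `describe("single backslash", "a\nb")` reports 3 bytes and 1 newline, while `describe("double backslash", "a\\nb")` reports 4 bytes and 0 newlines. Text read from a file at run time is never processed this way, so a backslash followed by `n` in a file stays as two separate characters.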

7 Answers


There is no built-in mechanism to do this. You have to do it manually, character-by-character. However, the functions in ctype.h may help. Specifically, in the "C" locale, the function isprint is guaranteed to be true for all of the graphic characters in the basic execution character set, which is effectively the same as all the graphic characters in 7-bit ASCII, plus space; and it is guaranteed not to be true for all the control characters in 7-bit ASCII, which includes tab, carriage return, etc.

Here is a sketch:

#include <stdio.h>
#include <ctype.h>
#include <locale.h>

int main(void)
{
    int x;
    setlocale(LC_ALL, "C"); // (1)

    while ((x = getchar()) != EOF)
    {
        unsigned int c = (unsigned int)(unsigned char)x; // (2)

        if (isprint(c) && c != '\\')
            putchar(c);
        else
            printf("\\x%02x", c);
    }
    return 0;
}

This does not escape ' nor ", but it does escape \, and it is straightforward to extend that if you need it to.

Printing \n for U+000A, \r for U+000D, etc. is left as an exercise. Dealing with characters outside the basic execution character set (e.g. UTF-8 encoding of U+0080 through U+10FFFF) is also left as an exercise.

This program contains two things which are not necessary with a fully standards-compliant C library, but in my experience have been necessary on real operating systems. They are marked with (1) and (2).

1) This explicitly sets the 'locale' configuration the way it is supposed to be set by default.

2) The value returned from getchar is an int. It is supposed to be either a number in the range representable by unsigned char (normally 0-255 inclusive), or the special value EOF (which is not in the range representable by unsigned char). However, buggy C libraries have been known to return negative numbers for characters with their highest bit set. If that happens, the printf will print (for instance) \xffffffa1 when it should've printed \xa1. Casting x to unsigned char and then back to unsigned int corrects this.
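
Returning to the exercise mentioned earlier (printing \n for newline, \r for carriage return, and so on), one possible approach is sketched below. It is only a sketch under stated assumptions: `simple_escape_letter` and `put_escaped` are hypothetical names, the "C" locale is assumed as in the program above, and bytes outside 7-bit ASCII are still shown in hex rather than decoded.

```c
#include <ctype.h>
#include <stdio.h>

// Hypothetical helper: returns the letter of the simple C escape for c
// (for example 'n' for newline), or 0 if c has no such escape.
static int simple_escape_letter(int c)
{
    switch (c) {
    case '\a': return 'a';
    case '\b': return 'b';
    case '\f': return 'f';
    case '\n': return 'n';
    case '\r': return 'r';
    case '\t': return 't';
    case '\v': return 'v';
    case '\\': return '\\';
    default:   return 0;
    }
}

// Hypothetical helper: writes one byte to out in escaped form.
// c must be in the range of unsigned char.
static void put_escaped(int c, FILE *out)
{
    int letter = simple_escape_letter(c);

    if (letter != 0)
        fprintf(out, "\\%c", letter);              // named escape such as \n
    else if (isprint(c))
        fputc(c, out);                             // printable: copy as is
    else
        fprintf(out, "\\x%02x", (unsigned int)c);  // anything else: hex
}
```

In the loop of the program above, the `if`/`else` pair could then be replaced by a single call such as `put_escaped((int)c, stdout);`.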

zwol
  • Note: No need for `unsigned int c = (unsigned int)(unsigned char)x;`, just use `x`. `getchar()` returns a value in the `unsigned char` range or `EOF`. `x` is zero extended. – chux - Reinstate Monica Apr 06 '15 at 21:34
  • @chux You are correct as far as the standard goes, but I have personally tripped over at least two C libraries that didn't get that right. It _was_ a long time ago; perhaps the extra defensiveness is no longer necessary. – zwol Apr 06 '15 at 21:41
  • Scars from earlier battles cover my fingertips too. – chux - Reinstate Monica Apr 06 '15 at 21:47
  • Thanks @zwol for the explanation. But I did not understand the purpose of the zero extension. Why is x cast to a char first and then to an int, when x is already a char? Simpler code would help me, please, because I am new to C. – Jessie Apr 07 '15 at 08:12
  • @Jessie Unfortunately, that line is necessary in practice (although, as pointed out above, not in principle). I have added some explanation. – zwol Apr 07 '15 at 13:07

Something like this might be what you want. Call myprint(c) to print the character c or a printable representation of it:

#include <ctype.h>
#include <stdio.h>

// c must be in the range of unsigned char (as returned by getchar)
void myprint(int c)
{
    if (isprint(c))
        putchar(c); // just print printable characters
    else if (c == '\n')
        printf("\\n"); // display newline as \n
    else
        printf("%02x", (unsigned int)c); // print everything else as a hex number
}

If you're using Windows, I think all your newlines will be CRLF (carriage return, linefeed) so they'll print as 0d\n the way I wrote that function.
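
Since the question mentions reading files, here is a rough sketch of how the same idea could be applied to a whole file. The names `escape_stream` and `escape_file` are hypothetical. The file is opened in binary mode ("rb") so that a C library which translates CRLF in text mode, as on Windows, does not hide the carriage return; this variant also shows \r by name instead of as a number.

```c
#include <ctype.h>
#include <stdio.h>

// Hypothetical helper: copies in to out, showing newline as \n, carriage
// return as \r and any other non-printable byte as \xNN.
// Returns the number of bytes read.
static long escape_stream(FILE *in, FILE *out)
{
    long count = 0;
    int c;

    while ((c = fgetc(in)) != EOF) {
        ++count;
        if (c == '\n')
            fputs("\\n", out);
        else if (c == '\r')
            fputs("\\r", out);
        else if (isprint(c))
            fputc(c, out);
        else
            fprintf(out, "\\x%02x", (unsigned int)c);
    }
    return count;
}

// Hypothetical helper: prints the file at path to stdout in escaped form.
// Returns 0 on success, -1 if the file cannot be opened.
static int escape_file(const char *path)
{
    FILE *in = fopen(path, "rb"); // binary mode: no CRLF translation
    if (in == NULL)
        return -1;
    escape_stream(in, stdout);
    fclose(in);
    return 0;
}
```

A file containing the four bytes `a`, CR, LF, `b` would be shown as `a\r\nb` on one line.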

yellowantphil

Thanks to the user @chux for contributing to the improvement of this answer.


Why not write a general-purpose solution? It would save you from many problems in the future.

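#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *
str_escape(char str[])
{
    char chr[3];
    char *buffer = malloc(sizeof(char));
    unsigned int len = 1, blk_size;

    if (buffer == NULL)
        return NULL;
    buffer[0] = '\0'; // start with an empty string so strcat() is well defined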

    while (*str != '\0') {
        blk_size = 2;
        switch (*str) {
            case '\n':
                strcpy(chr, "\\n");
                break;
            case '\t':
                strcpy(chr, "\\t");
                break;
            case '\v':
                strcpy(chr, "\\v");
                break;
            case '\f':
                strcpy(chr, "\\f");
                break;
            case '\a':
                strcpy(chr, "\\a");
                break;
            case '\b':
                strcpy(chr, "\\b");
                break;
            case '\r':
                strcpy(chr, "\\r");
                break;
            default:
                sprintf(chr, "%c", *str);
                blk_size = 1;
                break;
        }
        len += blk_size;
        buffer = realloc(buffer, len * sizeof(char));
        strcat(buffer, chr);
        ++str;
    }
    return buffer;
}

How it works:

int
main(const int argc, const char *argv[])
{
    puts(str_escape("\tAnbms\n"));
    puts(str_escape("\tA\v\fZ\a"));
    puts(str_escape("txt \t\n\r\f\a\v 1 \t\n\r\f\a\v tt"));
    puts(str_escape("dhsjdsdjhs hjd hjds "));
    puts(str_escape(""));
    puts(str_escape("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!\"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\f\a\v"));
    puts(str_escape("\x0b\x0c\t\n\r\f\a\v"));
    puts(str_escape("\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14"));
}

Output

\tAnbms\n
\tA\v\fZ\a
txt \t\n\r\f\a\v 1 \t\n\r\f\a\v tt
dhsjdsdjhs hjd hjds 

0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ \t\n\r\f\a\v
\v\f\t\n\r\f\a\v
\a\b\t\n\v\f\r

This solution is based on information from Wikipedia (https://en.wikipedia.org/wiki/Escape_sequences_in_C#Table_of_escape_sequences) and on the answers of other stackoverflow.com users.
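
As a possible variant of the function above, the output length can be computed first so that the buffer is allocated once instead of growing with realloc on every character. This is only a sketch, not the answer's own code: `escape_letter` and `str_escape_alloc` are hypothetical names, and it handles the same seven escapes as above (it does not escape the backslash itself).

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns the letter of the simple escape for c,
// or 0 if c is copied unchanged.
static char escape_letter(char c)
{
    switch (c) {
    case '\n': return 'n';
    case '\t': return 't';
    case '\v': return 'v';
    case '\f': return 'f';
    case '\a': return 'a';
    case '\b': return 'b';
    case '\r': return 'r';
    default:   return 0;
    }
}

// Hypothetical variant: returns a newly allocated escaped copy of str,
// or NULL if allocation fails. The caller must free() the result.
char *str_escape_alloc(const char *str)
{
    size_t len = 0;

    // First pass: escaped characters take two bytes, the rest take one.
    for (const char *p = str; *p != '\0'; ++p)
        len += (escape_letter(*p) != 0) ? 2 : 1;

    char *buffer = malloc(len + 1); // +1 for the terminating '\0'
    if (buffer == NULL)
        return NULL;

    // Second pass: write the escaped text.
    char *out = buffer;
    for (const char *p = str; *p != '\0'; ++p) {
        char letter = escape_letter(*p);
        if (letter != 0) {
            *out++ = '\\';
            *out++ = letter;
        } else {
            *out++ = *p;
        }
    }
    *out = '\0';
    return buffer;
}
```

Because the result is heap-allocated, a caller would typically store the pointer, print it with `puts`, and then `free` it, rather than passing the call directly to `puts`.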


Testing environment

$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 8.6 (jessie)
Release:    8.6
Codename:   jessie
$ uname -a
Linux localhost 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux
$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
PADYMKO
  • 1) For a _general_ solution, the answer should also show how it handles printing a `char` outside the 0-127 range. 2) `strcat(buffer, chr);` is UB as `buffer[]` is not initialized. – chux - Reinstate Monica Feb 22 '17 at 15:23
  • Dear @chux, why do you need an answer to "how it handles printing a char outside the 0-127 range", and why is "buffer[] not initialized"? I get no errors and no warnings about it. I compile with GCC on GNU/Linux (I updated the answer). – PADYMKO Feb 23 '17 at 09:16
  • 1) What is the value of `buffer[0]` the first time `strcat(buffer, chr);` is called? Code does not set it anywhere, so it is uninitialized. 2) The space allocated is 1 too small. – chux - Reinstate Monica Feb 23 '17 at 13:47
  • @chux, maybe you are right. Each string always contains the last character '\0'. I updated the answer and added you as a contributor at the top of the content. – PADYMKO Feb 23 '17 at 14:25
  • "Each string always contains the last character '\0'", but `buffer` is not necessarily a _string_. Code never puts a `'\0'` in `buffer[]` before calling `strcat(buffer, chr)`, so that leads to undefined behavior (UB). Simply add `buffer[0] = '\0';` after `char *buffer = malloc(sizeof(char));`. – chux - Reinstate Monica Feb 23 '17 at 16:17
  • @chux, I think your remark is very advanced, but I have never used this technique before and have never run into UB or other undesirable side effects. I have read many resources on the web, but never heard of "buffer[0] = '\0'". Where did you find it? A link to an article, a book, a page, etc. would help. – PADYMKO Feb 23 '17 at 16:53

Just use putchar(specialCharName). It displays the entered special character.

Vidya

What you're looking for is this:

#include <stdio.h>
int main(void)
{
    printf("%s","Hello world\\nHello world");
    return 0;
}

This would produce the following output: Hello world\nHello world

Ivo Valchev

If I understand the question, if you have a string containing control characters like newline, tab, backspace, etc., you want to print a text representation of those characters, rather than interpret them as control characters.

Unfortunately, there's no built-in printf conversion specifier that will do that for you. You'll have to walk through the string character by character, test each one to see if it's a control character, and write some text equivalent for it.

Here's a quick, lightly tested example:

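#include <stdio.h>
#include <limits.h>
#include <ctype.h>

int main(void)
{
  const char *src = "This\nis\ta\btest";

  // look up table for printable equivalents of non-printable characters;
  // UCHAR_MAX + 1 entries so that every unsigned char value is a valid index
  const char *lut[UCHAR_MAX + 1] = {0};
  lut['\n'] = "\\n";
  lut['\t'] = "\\t";
  lut['\b'] = "\\b";
  // ...

  for ( const char *p = src; *p != 0; p++ )
  {
    unsigned char c = (unsigned char) *p; // never negative, even if char is signed
    if ( isprint( c ) )
      putchar( c );
    else if ( lut[c] != NULL )
      fputs( lut[c], stdout ); // puts adds a newline at the end,
                               // fputs does not.
    else
      printf( "\\x%02x", (unsigned int) c ); // no table entry: print as hex
  }
  putchar( '\n' );
  return 0;
}
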
John Bode
  • If `char` is signed by default, the above code will fail if `src` contains any non ASCII characters. The cast `(int) *p` does not address this issue! You should use `UCHAR_MAX` for `lut` size and cast as `(unsigned char)` in `isprint((unsigned char)*p)` and `lut[(unsigned char)*p]`. – chqrlie Apr 06 '15 at 19:29
  • In typical implementations, `isprint(CHAR_MAX)` --> 0 causing `lut[ (int) CHAR_MAX]` to access out of bounds. `char *lut[CHAR_MAX]` is certainly off by 1. I'd expect `char *lut[CHAR_MAX+1]` (and somehow handle negative values) or even better `char *lut[UCHAR_MAX+1]` and use `unsigned char`. – chux - Reinstate Monica Feb 23 '17 at 13:53
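
The two comments above come down to converting each `char` to `unsigned char` before using it as a table index or passing it to `isprint`. A minimal sketch of that conversion follows; the helper names `char_index` and `is_printable` are hypothetical.

```c
#include <ctype.h>
#include <limits.h>

// Hypothetical helper: converts a char to an index in 0..UCHAR_MAX,
// whether or not plain char is signed on this platform.
static unsigned int char_index(char ch)
{
    return (unsigned char)ch;
}

// Hypothetical wrapper: isprint() requires a value in the unsigned char
// range (or EOF), so convert before calling it.
static int is_printable(char ch)
{
    return isprint((unsigned char)ch) != 0;
}
```

With a table declared as `char *lut[UCHAR_MAX + 1]`, an access such as `lut[char_index(*p)]` stays in bounds for every possible byte, including those above 127.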
/// My experience on Windows 10 with Code::Blocks and GCC (MinGW).
/// Note: R"(...)" raw string literals are not part of standard C. As far as
/// I know, GCC accepts them in C only as an extension in its GNU dialects
/// (for example -std=gnu11).

#include <stdio.h>

int main(void)
{
  /// This will give your desired result, turn string into Raw string :
  printf(R"(Hello world\nHello world)");
  /// This text contains a '%', so it is passed as an argument to "%s"
  /// instead of being used as the format string itself.
  printf("%s", R"(Raw string support printing  *&^%$#@!~()_+-=,<.>/?:;"' )");
  printf("\n");
  printf(R"(.C with a Capital C file format does not support raw string )");
  printf("\n");
  printf(R"(.c with a small c file format does support raw string )");
  printf("\n");
  printf(R"( Raw string did not support \n new line )");
  printf("\n");

  printf(
      R"(More reading material at - https://en.wikipedia.org/wiki/String_literal#Raw_strings;)");
  printf("\n");
  printf(
      R"(More reading material at - https://en.wikipedia.org/wiki/String_literal;)");
  printf("\n");
  printf(
      R"(More reading material at - https://stackoverflow.com/questions/24850244/does-c-support-raw-string-literals;)");
  printf("\n");
  printf(
      R"(More reading material at - https://learn.microsoft.com/en-us/cpp/c-language/c-string-literals?view=vs-2019)");
  printf("\n");
  printf(
      R"(More reading material at - https://learn.microsoft.com/en-us/cpp/c-language/string-literal-concatenation?view=vs-2019)");
  printf("\n");
  /// Raw string.

  printf(R"(More reading material at - https://www.geeksforgeeks.org/const-qualifier-in-c/;)");
  printf("\n");

  return 0;
}

shyed2001