58

I've got a C Python extension, and I would like to print out some diagnostics.

I'm receiving a string as a PyObject*.

What's the canonical way to obtain a string representation of this object, such that it is usable as a const char *?

William Miller
Mark Harrison

7 Answers

59

Use PyObject_Repr (to mimic Python's repr function) or PyObject_Str (to mimic str), and then call PyString_AsString to get a char * (you can, and usually should, treat it as a const char *). Note that PyString_AsString exists only in Python 2. For example:

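PyObject* objectsRepresentation = PyObject_Repr(yourObject);  /* new reference */
const char* s = PyString_AsString(objectsRepresentation);
/* ... use s here ... */
Py_DECREF(objectsRepresentation);  /* s must not be used after this */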

This method is OK for any PyObject. If you are absolutely sure yourObject is a Python string and not something else, like for instance a number, you can skip the first line and just do:

const char* s = PyString_AsString(yourObject);
piokuc
  • 3
    I am trying PyBytes_AsString(yourObject) for Python 3 and I am getting TypeError: expected bytes, str found – brita_ Feb 06 '15 at 18:08
  • I didn't even mention PyBytes_AsString in my answer. Have you tried what I suggested in my answer? – piokuc Feb 09 '15 at 15:27
  • 24
    I tried, in Py3.x PyString got replaced with PyBytes but with not quite the same functionality. I ended up using: PyUnicode_AsUTF8(objectsRepresentation) – brita_ Feb 09 '15 at 21:22
  • 14
    Don't forget to `Py_DECREF(objectsRepresentation)` since `PyObject_Repr()` returns a new reference! – Steve Feb 19 '16 at 23:47
41

Here is the correct answer if you are using Python 3:

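static void reprint(PyObject *obj) {
    /* Both calls return a new reference, or NULL with an exception set. */
    PyObject *repr = PyObject_Repr(obj);
    PyObject *str = repr
        ? PyUnicode_AsEncodedString(repr, "utf-8", "backslashreplace")
        : NULL;

    if (str != NULL) {
        /* The buffer belongs to str and is valid only until str is released. */
        printf("REPR: %s\n", PyBytes_AS_STRING(str));
    } else {
        PyErr_Clear();  /* diagnostics only: discard the failure */
    }

    Py_XDECREF(repr);
    Py_XDECREF(str);
}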
Romuald Brunet
12

If you just need to print the object in Python 3, you can use one of these functions:

static void print_str(PyObject *o)
{
    PyObject_Print(o, stdout, Py_PRINT_RAW);
}

static void print_repr(PyObject *o)
{
    PyObject_Print(o, stdout, 0);
}
4

Try PyObject_Repr (to mimic Python's repr) or PyObject_Str (to mimic Python's str).

Docs:

Compute a string representation of object o. Returns the string representation on success, NULL on failure. This is the equivalent of the Python expression repr(o). Called by the repr() built-in function.

Alexander Gessler
  • this looks like what I need... Once I've got the PyObject returned by one of these functions, how do I access that in a C-friendly way (eg. to call printf, etc) – Mark Harrison Mar 18 '11 at 19:26
2

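For Python >= 3.3, if the object is a str stored in the 1-byte (Latin-1) representation:

char* str = (char*)PyUnicode_1BYTE_DATA(py_object);

The macro returns a Py_UCS1* (unsigned char*), hence the cast. The data is Latin-1, not UTF-8, and the pointer is only meaningful when PyUnicode_KIND(py_object) == PyUnicode_1BYTE_KIND. Yes, this is a non-const pointer, so you can potentially modify the (immutable) string via it.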

Mikhail
  • python 3.10: error: invalid conversion from ‘Py_UCS1*’ {aka ‘unsigned char*’} to ‘char*’ – sea-kg Jun 27 '23 at 07:44
1

PyObject *module_name;  /* must point to a Python str object */
const char *name = PyUnicode_AsUTF8(module_name);

Pulller
1

For an arbitrary PyObject*, first call PyObject_Repr() or PyObject_Str() to get a Python str object (returned as a new PyObject* reference).

In Python 3.3 and up, call PyUnicode_AsUTF8AndSize. In addition to the Python string you want a const char * for, this function takes an optional address to store the length in.

Python strings are objects with explicit length fields that may contain null bytes, while a const char* by itself is typically a pointer to a null-terminated C string. Converting a Python string to a C string is a potentially lossy operation. For that reason, all the other Python C-API functions that could return a const char* from a string are deprecated.
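
To make the lossiness concrete without involving the Python API at all, here is a minimal plain-C sketch. The helper c_string_length is a hypothetical name; it reports how many bytes of a length-delimited buffer remain visible once the buffer is treated as a null-terminated C string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes visible when a buffer of
 * `len` bytes is read as a null-terminated C string. */
static size_t c_string_length(const char *data, size_t len)
{
    /* Find the first null byte inside the buffer, if there is one. */
    const char *nul = memchr(data, '\0', len);
    return nul != NULL ? (size_t)(nul - data) : len;
}
```

For the 7-byte buffer "foo\0bar" this yields 3, so anything that relies on null termination (such as printf with %s) sees only "foo".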

If you do not care about losing a bunch of the string if it happens to contain an embedded null byte, you can pass NULL for the size argument. For example,

PyObject* foo = PyUnicode_FromStringAndSize("foo\0bar", 7);

printf("As const char*, ignoring length: %s\n",
    PyUnicode_AsUTF8AndSize(foo, NULL));

prints

As const char*, ignoring length: foo

But you can also pass in the address of a size variable, to use with the const char*, to make sure that you’re getting the entire string.

PyObject* foo = PyUnicode_FromStringAndSize("foo\0bar", 7);

printf("Including size: ");
Py_ssize_t size;  /* the function takes a Py_ssize_t*, not a size_t* */
const char* data = PyUnicode_AsUTF8AndSize(foo, &size);
fwrite(data, sizeof(data[0]), (size_t)size, stdout);
putchar('\n');

On my terminal, that outputs

$ ./main | cat -v
Including size: foo^@bar
andrewdotn