I've got a C Python extension, and I would like to print out some diagnostics.
I'm receiving a string as a PyObject*.
What's the canonical way to obtain a string representation of this object, such that it is usable as a const char*?
In Python 2, use PyObject_Repr (to mimic Python's repr function) or PyObject_Str (to mimic str), and then call PyString_AsString to get a char* (you can, and usually should, treat it as a const char*). For example:
PyObject* objectsRepresentation = PyObject_Repr(yourObject);
const char* s = PyString_AsString(objectsRepresentation);
This method works for any PyObject. If you are absolutely sure yourObject is a Python string and not something else, such as a number, you can skip the first line and just do:
const char* s = PyString_AsString(yourObject);
Here is the correct answer if you are using Python 3:
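static void reprint(PyObject *obj) {
    PyObject *repr = PyObject_Repr(obj);  /* new reference, NULL on failure */
    if (repr == NULL) {
        PyErr_Print();
        return;
    }
    /* "replace" is a standard error handler: unencodable characters become '?'. */
    PyObject *str = PyUnicode_AsEncodedString(repr, "utf-8", "replace");
    Py_DECREF(repr);
    if (str == NULL) {
        PyErr_Print();
        return;
    }
    printf("REPR: %s\n", PyBytes_AS_STRING(str));
    Py_DECREF(str);
}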
If you just need to print the object in Python 3, you can use one of these functions:
static void print_str(PyObject *o)
{
PyObject_Print(o, stdout, Py_PRINT_RAW);
}
static void print_repr(PyObject *o)
{
PyObject_Print(o, stdout, 0);
}
Try PyObject_Repr (to mimic Python's repr) or PyObject_Str (to mimic Python's str).
Docs:
Compute a string representation of object o. Returns the string representation on success, NULL on failure. This is the equivalent of the Python expression repr(o). Called by the repr() built-in function.
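For Python >= 3.3, when the object is a str that uses the 1-byte (Latin-1) representation:
char* str = (char*)PyUnicode_1BYTE_DATA(py_object);
Yes, this is a non-const pointer, so you could potentially modify the (immutable) string through it. Note that the bytes are Latin-1, not UTF-8.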
For an arbitrary PyObject*, first call PyObject_Repr() or PyObject_Str() to get a Python str (PyUnicode) object.
In Python 3.3 and up, call PyUnicode_AsUTF8AndSize. In addition to the Python string you want a const char* for, this function takes an optional address to store the length in.
Python strings are objects with explicit length fields that may contain null bytes, while a const char* by itself is typically a pointer to a null-terminated C string. Converting a Python string to a C string is therefore a potentially lossy operation. For that reason, many of the older Python C-API functions that return a character pointer from a string are deprecated.
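The lossy part can be seen without the Python API at all. The following plain C sketch (visible_length is a hypothetical helper, not part of any library) computes how many bytes a consumer that stops at the first null byte, such as printf with %s, would actually see from a buffer whose real length is known:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given a buffer of `size` bytes, return the number
 * of bytes a null-terminated-string consumer would process, i.e. the
 * offset of the first '\0', or `size` if the buffer contains none. */
static size_t
visible_length(const char *data, size_t size)
{
    assert(data != NULL);

    const char *nul = memchr(data, '\0', size);
    return nul != NULL ? (size_t)(nul - data) : size;
}
```

For the 7-byte buffer "foo\0bar" this returns 3, so everything after "foo" is silently dropped unless the length travels alongside the pointer.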
If you do not care about losing the rest of the string when it happens to contain an embedded null byte, you can pass NULL for the size argument. For example,
PyObject* foo = PyUnicode_FromStringAndSize("foo\0bar", 7);
printf("As const char*, ignoring length: %s\n",
PyUnicode_AsUTF8AndSize(foo, NULL));
prints
As const char*, ignoring length: foo
But you can also pass in the address of a size variable, to use with the const char*, to make sure that you’re getting the entire string.
PyObject* foo = PyUnicode_FromStringAndSize("foo\0bar", 7);
printf("Including size: ");
Py_ssize_t size;  /* PyUnicode_AsUTF8AndSize expects a Py_ssize_t* */
const char* data = PyUnicode_AsUTF8AndSize(foo, &size);
fwrite(data, sizeof(data[0]), (size_t)size, stdout);
putchar('\n');
On my terminal, that outputs
$ ./main | cat -v
Including size: foo^@bar