I would rather not keep converting every Python string object between PyObject* and std::string or char* with PyUnicode_DecodeUTF8 and PyUnicode_AsUTF8, because that is an expensive operation.
In my last question, How to extend/reuse Python C Extensions/API implementation?, I managed to use the Python open function to give me a PyObject* string directly. Once I have that, it is very simple to send the string back to the Python program, because I can just pass its PyObject* pointer back instead of doing a full char-by-char copy as PyUnicode_DecodeUTF8 or PyUnicode_AsUTF8 do.
In CPython's regex implementation, I can find a function like this:
static void* getstring(PyObject* string, Py_ssize_t* p_length,
                       int* p_isbytes, int* p_charsize,
                       Py_buffer *view)
{
    /* given a python object, return a data pointer, a length (in
       characters), and a character size.  return NULL if the object
       is not a string (or not compatible) */

    /* Unicode objects do not support the buffer API. So, get the data directly. */
    if (PyUnicode_Check(string)) {
        if (PyUnicode_READY(string) == -1)
            return NULL;
        *p_length = PyUnicode_GET_LENGTH(string);
        *p_charsize = PyUnicode_KIND(string);
        *p_isbytes = 0;
        return PyUnicode_DATA(string);
    }

    /* get pointer to byte string buffer */
    if (PyObject_GetBuffer(string, view, PyBUF_SIMPLE) != 0) {
        PyErr_SetString(PyExc_TypeError, "expected string or bytes-like object");
        return NULL;
    }

    *p_length = view->len;
    *p_charsize = 1;
    *p_isbytes = 1;

    if (view->buf == NULL) {
        PyErr_SetString(PyExc_ValueError, "Buffer is NULL");
        PyBuffer_Release(view);
        view->buf = NULL;
        return NULL;
    }
    return view->buf;
}
It does not seem to use PyUnicode_DecodeUTF8 or PyUnicode_AsUTF8 to work with the PyObject* coming from the Python interpreter.
How can I use basic string operations on PyObject* strings without converting them to std::string or char*?
By basic operations I mean things like the following examples. (Just for illustration, I am using Py_BuildValue to build a PyObject* string from a char* or std::string.)
// Pseudocode: the `->value` member and the `+` operator are guesses, not real API
static PyObject* PyFastFile_do_concatenation(PyFastFile* self)
{
    PyObject* hello = Py_BuildValue( "s", "Hello " );
    PyObject* word = Py_BuildValue( "s", "word" );

    // I am just guessing the `->value` property
    PyObject* hello_world = hello->value + word->value;
    return hello_world; // return the `PyObject*` string `Hello word`
}
static PyObject* PyFastFile_do_substring(PyFastFile* self)
{
    PyObject* hello = Py_BuildValue( "s", "Hello word" );
    PyObject* hello_world = hello->value[6:]; // pseudocode: Python-style slice
    return hello_world; // return the `PyObject*` string `word`
}
static PyObject* PyFastFile_do_contains(PyFastFile* self)
{
    PyObject* hello = Py_BuildValue( "s", "Hello word" );
    if( "word" in hello->value ) { // pseudocode: Python-style membership test
        Py_RETURN_TRUE; // return the `PyObject*` boolean `true`
    }
    Py_RETURN_FALSE; // return the `PyObject*` boolean `false`
}