%%cython
import cython
cdef int k = 65
cdef unsigned char kk = cython.cast("char", k)
print kk

The result is 65, not 'A'. I also tried a <char> cast to convert 65 to 'A'.

Does anyone have any ideas? I am currently working in an IPython notebook.

Thank you in advance!!

[Edited] I added my original motivation for this question.

In C,

int i = 65;
char c = i;
printf("%c", c); //prints 'A'

because the char 'A' is already an integer, if I understand correctly.

But in Cython,

%%cython
import cython
cdef int k = 65
cdef char kk = cython.cast("char", k) 
print <char>kk, <int>kk

I get the same result.

  • I also tried "from libc.stdlib cimport atoi" to check the type strictly, it didn't work. – Sunghyun Kim Jul 17 '17 at 10:48
  • Possible duplicate of [Convert int to ASCII and back in Python](https://stackoverflow.com/questions/3673428/convert-int-to-ascii-and-back-in-python) – Tom Wyllie Jul 17 '17 at 10:49
  • Try `kk = k`? – cs95 Jul 17 '17 at 10:50
  • @TomWyllie I don't think so. That has nothing to do with Cython? – cs95 Jul 17 '17 at 10:50
  • Well, um, 65 is the integer value of 'A'. Characters are (usually) signed, 8-bit integers. I'm not seeing what the issue is here: just a lack of understanding of what a char is. – Alex Huszagh Jul 17 '17 at 10:52
  • @AlexanderHuszagh Could you kindly explain something more? I expected that `%%cython cdef int k = 65; print k;` is 'A', thinking about chr(97) is 'a' in python. – Sunghyun Kim Jul 17 '17 at 11:03
  • @KimSungHyun, I wrote a trivial answer to explain how this works, and why I believe Cython's behavior is correct, even if counter-intuitive. Remember, everything is just bytes under the hood. – Alex Huszagh Jul 17 '17 at 11:15

1 Answer

Python doesn't have a true character type; it has strings. The ord() function takes a 1-character string as its argument and raises an error if the string's length is anything other than 1.

Under the hood, all ord() really does is just cast the char to int. In C, I could write a naive function like this:

#include <string.h>         // for strlen

int c_ord(const char* string)
{
    // declare our variables
    size_t length;
    int ret;

    // check the length
    // note that Python actually stores the length,
    // so this wouldn't be done in real code.
    // This is here for example
    length = strlen(string);
    if (length != 1) {
        // invalid length, set a dummy placeholder
        ret = -1; 
    } else {
        // just grab our value
        ret = string[0];
    }

    return ret;
}
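
One caveat with the naive version, offered as a hedged aside: on platforms where plain char is signed, string[0] can be negative for bytes above 127, and the byte 0xFF would collide with the -1 placeholder. A minimal sketch of a variant that avoids this by reading the byte as unsigned char is below; the name c_ord_unsigned is hypothetical and is not part of the original answer.

```c
#include <string.h>   /* for strlen */

/* Hypothetical variant of c_ord: returns the byte value in 0..255,
 * or -1 if the string is not exactly one character long. */
int c_ord_unsigned(const char *string)
{
    if (strlen(string) != 1) {
        /* invalid length, same dummy placeholder as c_ord */
        return -1;
    }
    /* Read the byte as unsigned so the result is never negative. */
    return (unsigned char)string[0];
}
```

With this variant, "A" gives 65 and a single 0xFF byte gives 255, while an empty or longer string still gives -1.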

Notice that all ord() does is return the character's exact numeric value, not its string representation. What Cython is doing is the true behavior: treating char like an integer and therefore printing out its integer value. To treat a character like a string, we could create an array of characters and let Python know it's a string. The builtin function chr does all of this for us under the hood.

%%cython
import cython
cdef int k = 65
print chr(k)
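
For comparison, the same distinction can be seen in plain C, where the value never changes and only the format specifier decides whether it is shown as a number or as a character. The sketch below assumes a hypothetical helper named format_views that writes both views of one value into a caller-supplied buffer.

```c
#include <stdio.h>    /* for snprintf */

/* Hypothetical helper: writes the integer view and the character view
 * of the same value into out, separated by a space.
 * Returns whatever snprintf returns. */
int format_views(int value, char *out, size_t out_size)
{
    char c = (char)value;   /* same conversion as `char c = i;` in the question */

    /* %d prints the stored number; %c prints the character it encodes. */
    return snprintf(out, out_size, "%d %c", c, c);
}
```

Calling format_views(65, buf, sizeof buf) on an ASCII system fills buf with "65 A": one stored value, two ways of displaying it. Cython's print on a char picks the numeric view.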

To write a trivial method in Cython to create a null-terminated C-string, and optionally convert it to a Python string, we can do the following:

%%cython
# imports
import cython
from libc.stdlib cimport malloc, free

# create null-terminated string, or a C-string
cdef char* c_string = <char*>malloc(2)      # only need 2 bytes
if c_string == NULL:
    raise MemoryError()
c_string[0] = 65                            # 'A'
c_string[1] = 0                             # '\0', null-termination
# ... do what we want with the C-string

# convert to a Python bytes object (this copies the data)
cdef bytes py_bytes = c_string

# remember to free the allocated memory
free(c_string)

# use Python object
print(py_bytes)
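
If the allocation is not needed, the same idea can be sketched in plain C with a caller-supplied buffer, which avoids malloc and free entirely. The helper name c_chr and its error convention are assumptions made for illustration. Because it touches no Python objects, a function like this could in principle be declared to Cython as nogil, though that wiring is not shown here.

```c
#include <stddef.h>   /* for size_t and NULL */

/* Hypothetical helper: writes a one-character C-string for `value`
 * into `out`, which must have room for at least 2 bytes.
 * Returns 0 on success, or -1 if the value does not fit in one byte
 * or the buffer is missing or too small. */
int c_chr(int value, char *out, size_t out_size)
{
    if (out == NULL || out_size < 2 || value < 0 || value > 255) {
        return -1;
    }
    out[0] = (char)(unsigned char)value;   /* e.g. 65 becomes 'A' */
    out[1] = '\0';                         /* null-termination */
    return 0;
}
```

A caller would declare something like char buf[2], pass it in together with sizeof buf, and check the return value before using buf as a C-string.
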
Alex Huszagh
  • Grab some coffee. Good answer overall. – erip Jul 17 '17 at 11:17
  • @AlexanderHuszagh Thank you! Meanwhile, I am writing a hash function in Cython using OpenMP. That's why Python's `chr` should be avoided in a `nogil` OpenMP implementation. According to your answer, it seems better to write another helper function to work this way... – Sunghyun Kim Jul 17 '17 at 11:24
  • @KimSungHyun, that would be pretty trivial, since you just need a simple extension. I'll edit my answer. – Alex Huszagh Jul 17 '17 at 11:30
  • @danny, because we don't have a Python byte string, apparently? – Alex Huszagh Jul 17 '17 at 17:57
  • Yup, shouldn't post before coffee apparently :) – danny Jul 19 '17 at 10:13