
I am trying to learn string manipulation in PyOpenCL. I found an example program that copies a string into an empty string in the question "How to pass a list of strings to an opencl kernel using pyopencl?". The code itself had some errors, and I'm not sure I fixed them correctly. This is the modified code I am using:

import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
#The kernel uses one work-item per char transferred
prog_str = """__kernel void foo(__global char *in, __global char *out, const int size){
                  int idx = get_global_id(0);
                  if (idx < size){
                      out[idx] = in[idx];
                  }
           }"""

#Note that the type of the array of strings is '|S40' for the length
#of third element is 40, the shape is 3 and the nbytes is 120 (3 * 40)
original_str = np.array(("this is an average string", 
                         "and another one", 
                         "let's push even more with a third string"))
str_size = len(original_str)   
copied_str = np.empty_like(original_str)                      
mf = cl.mem_flags
#length = (str_size+1) * 200
in_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=original_str)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, size=copied_str.nbytes)

#here launch the kernel with str_size number of work-items, in this case 120
#this means that some of the work-items won't process any meaningful char
#(not all strings have a length of 40) but it's no biggie
prog = cl.Program(ctx, prog_str).build()
event = prog.foo(queue, original_str.shape , None, in_buf, out_buf, np.int32(120))
event.wait()
cl.enqueue_copy(queue, copied_str, out_buf)
print(original_str) 
print(copied_str)

However, I now get a UnicodeDecodeError that I am unable to solve. Searching for it only turns up topics where the problem was caused by escape characters.

Here is the error:

Traceback (most recent call last):
  File "clStringTest.py", line 34, in <module>
    print(copied_str)
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 1504, in array_str
    return array2string(a, max_line_width, precision, suppress_small, ' ', "")
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 668, in array2string
    return _array2string(a, options, separator, prefix)
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 460, in wrapper
    return f(self, *args, **kwargs)
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 495, in _array2string
    summary_insert, options['legacy'])
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 796, in _formatArray
    curr_width=line_width)
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 750, in recurser
    word = recurser(index + (-i,), next_hanging_indent, next_width)
  File "/home/user/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py", line 704, in recurser
    return format_function(a[index])
UnicodeDecodeError: 'utf-32-le' codec can't decode bytes in position 8-11: code point not in range(0x110000)

I've managed to find example programs for integer and float operations, and those work, but I can't find working examples for string manipulation.

I'd be grateful if someone could help me.


update 1: On my desktop I also got the Unicode error, at least at first:

 In [1]: %run clStringTest.py                                                    
    Choose platform:
    [0] <pyopencl.Platform 'NVIDIA CUDA' at 0x5597858ab040>
    [1] <pyopencl.Platform 'Portable Computing Language' at 0x7fb273e39020>
    Choice [0]:0
    Set the environment variable PYOPENCL_CTX='0' to avoid being asked again.
    ['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
     ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
     'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
     ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
     't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']
    84
    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    ~/Documents/clStringTest.py in <module>
         30 print(original_str)
         31 print(len(original_str))
    ---> 32 print(copied_str)

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in array_str(a, max_line_width, precision, suppress_small)
       1502         return _guarded_str(np.ndarray.__getitem__(a, ()))
       1503 
    -> 1504     return array2string(a, max_line_width, precision, suppress_small, ' ', "")
       1505 
       1506 def set_string_function(f, repr=True):

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in array2string(a, max_line_width, precision, suppress_small, separator, prefix, style, formatter, threshold, edgeitems, sign, floatmode, suffix, **kwarg)
        666         return "[]"
        667 
    --> 668     return _array2string(a, options, separator, prefix)
        669 
        670 

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in wrapper(self, *args, **kwargs)
        458             repr_running.add(key)
        459             try:
    --> 460                 return f(self, *args, **kwargs)
        461             finally:
        462                 repr_running.discard(key)

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in _array2string(a, options, separator, prefix)
        493     lst = _formatArray(a, format_function, options['linewidth'],
        494                        next_line_prefix, separator, options['edgeitems'],
    --> 495                        summary_insert, options['legacy'])
        496     return lst
        497 

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in _formatArray(a, format_function, line_width, next_line_prefix, separator, edge_items, summary_insert, legacy)
        794         return recurser(index=(),
        795                         hanging_indent=next_line_prefix,
    --> 796                         curr_width=line_width)
        797     finally:
        798         # recursive closures have a cyclic reference to themselves, which

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in recurser(index, hanging_indent, curr_width)
        748 
        749             for i in range(trailing_items, 1, -1):
    --> 750                 word = recurser(index + (-i,), next_hanging_indent, next_width)
        751                 s, line = _extendLine(
        752                     s, line, word, elem_width, hanging_indent, legacy)

    ~/miniconda3/envs/pyopencl-env/lib/python3.7/site-packages/numpy/core/arrayprint.py in recurser(index, hanging_indent, curr_width)
        702 
        703         if axes_left == 0:
    --> 704             return format_function(a[index])
        705 
        706         # when recursing, add a space to align with the [ added, and reduce the

    UnicodeDecodeError: 'utf-32-le' codec can't decode bytes in position 0-3: code point not in range(0x110000)

However, I then installed POCL via miniconda. Now, if I run the program on the GPU, it works... halfway. At least I no longer get the Unicode error.

$ python3 clStringTest.py 
Choose platform:
[0] <pyopencl.Platform 'NVIDIA CUDA' at 0x5561780b9f20>
[1] <pyopencl.Platform 'Portable Computing Language' at 0x7f30edb41020>
Choice [0]:0
Set the environment variable PYOPENCL_CTX='0' to avoid being asked again.
(84,)
['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
 'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
 ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
 't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']
84
['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' ''
 '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' ''
 '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '']

Strangely, executing on the CPU still gives me the same error.

At this point I am at a loss and have to believe that this is a bug. @doqtor what do you think?


update 2: I tried increasing the number of work items and the size argument of the kernel. After some trial and error, I finally got the output shown by @doqtor, using (400,) work items and a size of 400. I don't know why this works.

$ python3 clStringTest.py
Choose platform:
[0] <pyopencl.Platform 'NVIDIA CUDA' at 0x55f0f357ef20>
[1] <pyopencl.Platform 'Portable Computing Language' at 0x7fb8c82f6020>
Choice [0]:
Set the environment variable PYOPENCL_CTX='' to avoid being asked again.
['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
 'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
 ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
 't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']
84
['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
 'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
 ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
 't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']

Now it also works on the CPU, but I get this after the output array has been printed:

corrupted size vs. prev_size
Aborted (core dumped)

If I reduce the number of work items (300 and below) or the size, I get the dreaded Unicode error again on the CPU. On the GPU I get missing characters, as shown above.

doqtor
nova
  • `original_str` is a multi-dimensional array which cannot be passed directly to the `OpenCL` kernel; it has to be flattened into one dimension. Currently, passing `original_str.shape`, which is (3,), as the `global_size` launches the kernel with only 3 work items. – doqtor Feb 26 '19 at 10:53
  • So a string also counts as an array... of course. That's what strings are: character arrays. But then how am I supposed to flatten this array? By concatenating all the strings? – nova Feb 26 '19 at 11:38
  • Here you have an array of strings, in other words an array of arrays of characters. Yes, you can concatenate; anything that turns it into a single-dimensional array of characters will do in order to pass it to the OpenCL kernel. – doqtor Feb 26 '19 at 11:46
  • Ok. If I make original_str `original_str = np.array(("this is an average string, and another one, let's push even more with a third string"))` then I get an invalid kernel arg error. If I replace original_str.shape in the kernel call with (1,) I get the unicode error again. – nova Feb 26 '19 at 11:53
  • You haven't changed much - still that is multi-dimensional array - array of array of characters - but there is one string instead of 3. See my answer how to solve the problem. – doqtor Feb 26 '19 at 12:45
  • I have updated my post as you advised. Please check it and let me know. – nova Feb 26 '19 at 17:14
  • Can you check what prints `original_str.nbytes`? I suspect you have the case with 4 bytes wide characters. Anyway the Unicode error seems to point to that too: `UnicodeDecodeError: 'utf-32-le' codec can't decode bytes in position 8-11: code point not in range(0x110000)` – doqtor Feb 26 '19 at 18:05
  • Indeed, original_str.nbytes is 336, which is 84 * 4. Which is also why anything below 300 gave me errors. And of course this changes with the size of the string. How should I change the kernel arguments to incorporate this? size = np.int32(len(string) * 4)? What about the number of work items? – nova Feb 26 '19 at 18:24
  • I think the next step should be to find a way to convert the string into utf-8, as there is no need to process it as utf-32, where three of every four bytes are empty. – doqtor Feb 26 '19 at 19:38
  • The kernel in my original post(the one with the array of strings) is also working on my end now. I just printed the nbytes and set the work items and size accordingly. Could you try it on your end? – nova Feb 27 '19 at 07:34
  • Indeed, passing (original_str.nbytes,) as global_size also works for me with the array of strings. That is because numpy in this case keeps all the strings in a contiguous memory layout internally: `print(original_str.flags)` outputs `C_CONTIGUOUS : True`. Note that this is not always the case - see [here](https://stackoverflow.com/questions/29947639/cheapest-way-to-get-a-numpy-array-into-c-contiguous-order/29948246). If that was a C-style multidimensional array it wouldn't work from the start. – doqtor Feb 27 '19 at 08:57
  • I understand. Now could you just tell me how to(or link me to where I can learn how to) change the encoding of the strings in my environment to utf-8? Then I can pick your answer as accepted and close this. Thanks for all of your help! – nova Feb 27 '19 at 09:03
  • [this](https://stackoverflow.com/questions/16957226/encode-python-list-to-utf-8) may be of help regarding converting to utf-8 – doqtor Feb 27 '19 at 09:38
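
The 4-byte finding from the comments above can be reproduced without OpenCL. The following is a minimal sketch, assuming NumPy's Unicode string dtype stores each character as 4 bytes of little-endian UTF-32 (which is what the error message suggests). `simulate_copy` is a hypothetical stand-in for the kernel: work-item `i` copies byte `i`, and the `garbage` parameter is an assumption standing in for whatever an uninitialised output buffer happens to contain.

```python
TEXT = ("this is an average string, and another one, "
        "let's push even more with a third string")


def simulate_copy(src: bytes, n_work_items: int, garbage: int = 0x00) -> bytes:
    """Mimic the kernel: work-item i copies byte i, the rest stays untouched."""
    out = bytearray([garbage]) * len(src)   # "uninitialised" output buffer
    n = min(n_work_items, len(src))
    out[:n] = src[:n]
    return bytes(out)


raw = TEXT.encode("utf-32-le")   # 4 bytes per character
print(len(TEXT), len(raw))       # 84 characters, 336 bytes

# 84 work-items copy only the first 84 bytes, i.e. 84 // 4 = 21 characters.
partial = simulate_copy(raw, n_work_items=len(TEXT))
print(repr(partial.decode("utf-32-le").rstrip("\x00")))

# If the untouched bytes hold garbage instead of zeros, decoding fails
# with the same kind of error as in the traceback.
try:
    simulate_copy(raw, len(TEXT), garbage=0xFF).decode("utf-32-le")
except UnicodeDecodeError as exc:
    print(exc)

# One work-item per byte copies everything.
full = simulate_copy(raw, n_work_items=len(raw))
print(full.decode("utf-32-le") == TEXT)
```

This would explain both symptoms: zero-filled leftovers show up as empty characters after the 21st one, while arbitrary leftovers trigger the decode error. Asking for more work-items than there are bytes (400 against a 336-byte buffer) lets the real kernel write past the end of the output buffer, which is a plausible cause of the `corrupted size vs. prev_size` abort.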

1 Answer


Running the code raises the following problem in my environment (I don't get the Unicode error):

Input string (original_str):

['this is an average string' 'and another one'
 "let's push even more with a third string"]

Output string (copied_str):

[ 'thiM\x1b\x7f\x00\x00 c\x0fM\x1b\x7f\x00\x00\xd0i\x0fM\x1b\x7f\x00\x00\xe0b\x0fM\x1b\x7f\x00\x00ph\x0fM\x1b\x7f'
 'pa\x0fM\x1b\x7f\x00\x00\xf0f\x0fM\x1b\x7f\x00\x00\x90\xce\x0eM\x1b\x7f\x00\x00\x80l\x0fM\x1b\x7f\x00\x00\x80k\x0fM\x1b\x7f'
 '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\xb8\x0eM\x1b\x7f\x00\x00\xc0\xbd\x0eM\x1b\x7f']

Only the first 3 characters of the output are correct and the rest is garbage. That's because global_size is set to 3, since original_str has shape (3,). To fix this, it should be enough to define original_str as a one-dimensional numpy array of characters:

original_str = np.array(list("this is an average string, and another one, let's push even more with a third string"))

and then the global size is (84,) and everything should work as expected:

Input string (original_str):

['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
 'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
 ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
 't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']

Output string (copied_str):

['t' 'h' 'i' 's' ' ' 'i' 's' ' ' 'a' 'n' ' ' 'a' 'v' 'e' 'r' 'a' 'g' 'e'
 ' ' 's' 't' 'r' 'i' 'n' 'g' ',' ' ' 'a' 'n' 'd' ' ' 'a' 'n' 'o' 't' 'h'
 'e' 'r' ' ' 'o' 'n' 'e' ',' ' ' 'l' 'e' 't' "'" 's' ' ' 'p' 'u' 's' 'h'
 ' ' 'e' 'v' 'e' 'n' ' ' 'm' 'o' 'r' 'e' ' ' 'w' 'i' 't' 'h' ' ' 'a' ' '
 't' 'h' 'i' 'r' 'd' ' ' 's' 't' 'r' 'i' 'n' 'g']

As stated earlier in the comments, passing a multidimensional array into an OpenCL kernel won't work. Only a one-dimensional C-style array can be processed correctly by the kernel.


As @nova found out, a numpy array of strings will also work if the array has the flag C_CONTIGUOUS : True, which can be verified with print(original_str.flags). Then it's enough to pass (original_str.nbytes,) as global_size, with the kernel's size argument set to the same number of bytes, without any other modification to the original source code.
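
One way to sidestep the 4-byte characters, following the UTF-8 suggestion from the comments, is to encode the strings to bytes before they reach the buffer. The following is a minimal sketch in plain Python. `pack_strings`, `unpack_strings`, and the record width of 40 are hypothetical names and values chosen for illustration, and the byte-for-byte copy stands in for the kernel. In a real program the packed bytes could presumably be wrapped in a one-dimensional `np.uint8` array via `np.frombuffer` before being passed as `hostbuf`, which is an assumption not tested here.

```python
def pack_strings(strings, width):
    """Encode each string as UTF-8 and NUL-pad it to a fixed record width."""
    records = []
    for s in strings:
        data = s.encode("utf-8")
        if len(data) > width:
            raise ValueError(f"{s!r} needs {len(data)} bytes, width is {width}")
        records.append(data.ljust(width, b"\x00"))
    return b"".join(records)


def unpack_strings(buf, width):
    """Split a flat buffer back into strings, dropping the NUL padding."""
    return [
        buf[i:i + width].rstrip(b"\x00").decode("utf-8")
        for i in range(0, len(buf), width)
    ]


strings = ["this is an average string",
           "and another one",
           "let's push even more with a third string"]

flat = pack_strings(strings, width=40)
global_size = (len(flat),)     # one work-item per byte

copied = bytes(flat)           # stand-in for the kernel's byte-for-byte copy
print(global_size)
print(unpack_strings(copied, width=40))
```

With one byte per character, the number of work-items and the kernel's size argument are both simply the length of the flat buffer, so the 3 × 40 = 120 figure from the original code comments holds again.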

doqtor
  • Sorry, but I'm still getting the same unicode decode error. I also tried changing the size argument in the kernel call to `np.int32(len(original_str))` to see if that was the problem, but still the same error. Please note that the script is encountering an error only when trying to print the output string, so maybe the issue is incompatibility with how Python and C handle strings? Although I'm sure passing multi-dimensional array would also cause an error, maybe they are not related? – nova Feb 26 '19 at 13:17
  • I don't experience the unicode error. If I just replace one line with the one in my answer everything works as expected. Note that I'm using pocl in a VM, which runs on the CPU. You likely use a different OpenCL implementation than I do, and that is probably why you see the unicode error and I don't. I have no possibility to check it on another implementation, sorry. It turned out that the problem I fixed was masked by the unicode error you are still experiencing. – doqtor Feb 26 '19 at 13:27
  • Unicode error may have something to do with `locale` in your env. Have a look at [this thread](https://github.com/PyTables/PyTables/issues/231) – doqtor Feb 26 '19 at 13:30
  • I opened IDLE and called the getpreferredencoding() function and it said utf8. So no problem there. What error was the original program I posted giving you? Was it the unicode error or something else? Could you send me the program that is working on your end? I am running Ubuntu 16.04, python version is 3.7, I installed pyopencl and python through miniconda. I am using Intel's opencl SDK and runtime. – nova Feb 26 '19 at 14:01
  • Actually there was no error, just the data in output copied_str was incorrect. First 3 characters were correct and the rest characters were garbage - that's because global_size was set to 3 due to defining original_str as (3,). It's not necessary to provide the whole code to you. Just change the definition of original_str to the one provided in my answer and the global size will be (84,) and the output copied_str will be exactly the same as input - as expected. I updated my answer also. I've installed pyopencl, python and pocl through apt on Ubuntu 18.04. – doqtor Feb 26 '19 at 14:21
  • If I got an output, even if it was garbage, I wouldn't have had a problem(and probably wouldn't have needed to make a post here even). As it stands currently, no matter what I do, if I try to print the output string, I get the unicode error. This leads me to believe there must be something wrong with my system. I will try to run the program on my desktop and let you know. – nova Feb 26 '19 at 15:52
  • Well, in the OpenCL world errors may look totally different, or there may be no error at all but just wrong output, depending on the implementation. That has happened to me many times in the past, and therefore I tend not to focus too much on the error itself but on finding the fix for the problem that appears in my current environment. Many times this approach also fixed other problems that did not show up in my env, but that's not the case this time, as this (unicode problem) turned out to be unrelated to OpenCL. – doqtor Feb 26 '19 at 16:01
  • Are you using your CPU or your GPU? I'm asking because I have discovered something. – nova Feb 26 '19 at 16:22
  • CPU via POCL under VM – doqtor Feb 26 '19 at 16:23