How to read number of bytes in the ibd file (subprocess.check_output return code)

Question

I want to know why I get an error when running my command to read the number of bytes in the ibd file. What might be wrong in my code? Thanks a lot in advance. My dataset is in the imzML format, which comes with a complementary ibd file. More info can be obtained from http://psi.hupo.org/ms/mzml .

python
import subprocess
nbytes_str = subprocess.check_output(['wc -c < \"' + fname + '.ibd\"'], shell=True)
nbytes = int(nbytes_str)
nbytes # number of bytes in the ibd file

My error is:

python
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
<ipython-input-7-381047b77c3f> in <module>
----> 1 nbytes_str = subprocess.check_output(['wc -c < \"' + fname + '.ibd\"'], shell=True)
      2 nbytes = int(nbytes_str)
      3 nbytes # number of bytes in the ibd file

~\.conda\envs\MSI\lib\subprocess.py in check_output(timeout, *popenargs, **kwargs)
    354 
    355     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
--> 356                **kwargs).stdout
    357 
    358 

~\.conda\envs\MSI\lib\subprocess.py in run(input, timeout, check, *popenargs, **kwargs)
    436         if check and retcode:
    437             raise CalledProcessError(retcode, process.args,
--> 438                                      output=stdout, stderr=stderr)
    439     return CompletedProcess(process.args, retcode, stdout, stderr)
    440 

CalledProcessError: Command '['wc -c < "P1 lipids pos mode Processed Norm.ibd"']' returned non-zero exit status 1.

milanbalazs · Answer 1 · 2021-02-27T14:08:20.977

First of all, as the exception says, your command returned non-zero exit status 1. This means the called command failed, so you need to fix your wc -c < "P1 lipids pos mode Processed Norm.ibd" command to make your code work.
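
To see why a command failed, it can help to capture what it printed along with its exit status. A minimal sketch, assuming a hypothetical helper named `run_or_explain` and a deliberately failing Python one-liner standing in for the real wc call:

```python
import subprocess
import sys


def run_or_explain(cmd):
    """Return (exit_status, text) for cmd, where text is stdout and stderr combined."""
    try:
        # stderr=subprocess.STDOUT folds error messages into the captured output
        output = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
        return 0, output.decode(errors="replace")
    except subprocess.CalledProcessError as exc:
        # exc.output holds whatever the command printed before it failed
        return exc.returncode, exc.output.decode(errors="replace")


# Hypothetical stand-in for a failing command: writes to stderr, exits with status 1
failing_cmd = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]
status, message = run_or_explain(failing_cmd)
print(status, message)
```

The text returned for a failing command is usually the quickest hint about what went wrong, for example a missing program or a bad path.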

Separately, this is how you can get the number of bytes in a string:

my_str = "hello world"  # In your case : my_str = subprocess.check_output([...
my_str_as_bytes = str.encode(my_str)  # Convert string to byte type
type(my_str_as_bytes)  # ensure it is byte representation
len(my_str_as_bytes)  # Length (number) of bytes

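However, in Python 3 subprocess.check_output returns bytes by default, so no conversion is needed and you can take the len of the returned value directly. Note that this len is the length of the text that wc printed, not the size of the file; the file size is the number itself, which you get with int on the returned value.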

For example:

import subprocess
nbytes_byte = subprocess.check_output(['wc -c < test.txt'], shell=True)
print(type(nbytes_byte))
print(len(nbytes_byte))

Content of test.txt:

Hello World

Output:

>>> python3 test.py
<class 'bytes'>
3
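
A small sketch of the difference between the length of the wc output and the number it reports, using an assumed output of b"12\n" (the exact bytes depend on the file and on the wc implementation):

```python
# Assumed example of what `wc -c < test.txt` might return through check_output
wc_output = b"12\n"

output_length = len(wc_output)               # bytes in the printed text: '1', '2', newline
file_size = int(wc_output.decode().strip())  # the number that wc actually reported

print(output_length)
print(file_size)
```

For the size of the file, the value to use is the parsed number, which is what the int call in the question's code does.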

Furthermore here is a similar question: Python : Get size of string in bytes

EDIT:

I recommend defining the path of the ibd file relative to the path of your Python file.

For example:

Your Python file path: /home/user/script/my_script.py

Your ibd file path: /home/user/idb_files/P1 lipids pos mode Processed Norm.ibd

In that case, you should define the ibd file path like this:

import os
idb_file_path = os.path.join(os.path.dirname(__file__), "..", "idb_files", "P1 lipids pos mode Processed Norm.ibd")

Here is the complete example:

import os
import subprocess

#  The "os.path.join" joins the path from the inputs
#  The "os.path.dirname(__file__)" returns the path of the directory of the current script
idb_file_path = os.path.join(os.path.dirname(__file__), "..", "idb_files", "P1 lipids pos mode Processed Norm.ibd")
nbytes_byte = subprocess.check_output(['wc -c < "{}"'.format(idb_file_path)], shell=True)
print(type(nbytes_byte))
print(len(nbytes_byte))
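
If the goal is only the file size, the shell can be skipped entirely, which also avoids depending on wc being installed (the error text quoted in the comments looks like a Windows shell message, and wc is usually not available there). A sketch using only the standard library; `file_size_bytes` is a hypothetical helper name, and the temporary file stands in for the real ibd file:

```python
import os
import tempfile


def file_size_bytes(path):
    """Return the size of the file at path in bytes, without calling a shell."""
    if not os.path.isfile(path):
        # Show the absolute path so a wrong working directory is easy to spot
        raise FileNotFoundError("No such file: " + os.path.abspath(path))
    return os.path.getsize(path)


# Demo with a temporary stand-in file; replace demo_path with the real ibd path
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo file with spaces.ibd")
    with open(demo_path, "wb") as handle:
        handle.write(b"Hello World\n")
    demo_size = file_size_bytes(demo_path)
    print(demo_size)
```

Because no command line is built, spaces in the file name need no quoting, and a wrong path produces an error message that names the path that was actually checked.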

Many thanks for your answer. You said "So you should fix your `wc -c < "P1 lipids pos mode Processed Norm.ibd"` command to make your code working.". But how? I am sorry, could you please elaborate on that if you have time? @milianbalazs — NoneNone, Feb 27 '21 at 12:36
You should get the STDOUT/STDERR to solve the problem. Please change your current line to this: `nbytes_str = subprocess.run(['wc -c < \"' + fname + '.ibd\"'], shell=True, stderr=subprocess.PIPE, stdout=subprocess.PIPE)` and print the STDOUT/STDERR with this: `print(nbytes_byte.stderr, nbytes_byte.stdout)`. Then share with me the result and I will try to help you about solving! :) — milanbalazs, Feb 27 '21 at 12:50
First of all, thank you very much for trying to help me. It said nbytes_byte is not defined, so I changed it to nbytes_str, and the output is `b'The filename, directory name, or volume label syntax is incorrect.\r\n' b''`@milanbalazs — NoneNone, Feb 27 '21 at 13:57
I had better post everything I did. `nbytes_str = subprocess.run(['wc -c < \"' + fname + '.ibd\"'], shell=True, stderr=subprocess.PIPE, stdout=subprocess.PIPE)` `print(nbytes_str.stderr, nbytes_str.stdout)` `nbytes = nbytes_str` `nbytes # number of bytes in the ibd file` and the output is `b'The filename, directory name, or volume label syntax is incorrect.\r\n' b'' CompletedProcess(args=['wc -c < "P1 lipids pos mode Processed Norm.ibd"'], returncode=1, stdout=b'', stderr=b'The filename, directory name, or volume label syntax is incorrect.\r\n')` — NoneNone, Feb 27 '21 at 13:59
Okay, this output is very helpful for debugging. I am 99% sure that the path of the ibd file is not correct. I have updated my answer to show how you should define the correct (full) path of the ibd file. Could you try what I describe in the "EDIT:" section of my answer? — milanbalazs, Feb 27 '21 at 14:10
