
I have a bash script that connects to multiple compute nodes and pulls data from each one, depending on the arguments passed when the script is called. For simplicity's sake, I'm essentially doing this:

for h in node{0..7}; do
    ssh $h 'fold -w 80 /program.headers | grep "RA" | head -600 | tr -d "RA =" > '$h'filename'
done

I'm trying to take the 8 files that come out of this (each has 600 pieces of information) and save each one as a list in Python. I then need to manipulate them in Python (split and convert to float) so I can plot the data with Matplotlib.

For a bash script that only outputs one file, I can easily assign the result of check_output to a variable and manipulate it from there:

import subprocess

test = subprocess.check_output("./bashscript")
test_list = test.decode().split()  # check_output returns bytes on Python 3
test = [float(a) for a in test_list]

I am also able to read a saved file from my bash script by using:

test = subprocess.check_output(['cat', '/path/filename'])
test_list = test.decode().split()
test = [float(a) for a in test_list]

The problem is that I'm working with over 80 files once I have everything I need. Is there some way in Python to say, "for every file made, store its contents as a list"?

  • Can't you get the files all at once and store them in a local folder; then after they are locally present, iterate over them using Python? Or are they dynamically created or changing over time which would require rereading them over and over again? – ImportanceOfBeingErnest Jul 26 '17 at 17:21

2 Answers


Define a simple interface between your bash script and your Python script.

It looks like the simple interface used to be printing out the contents of a single file, but that didn't scale to multiple files. Instead, I recommend making the interface print out the names of the files created. It would look something like this:

import subprocess

filenames = subprocess.check_output("./bashscript").decode().split()
all_data = []  # one list of floats per file
for filename in filenames:
    with open(filename) as file_obj:
        all_data.append([float(a) for a in file_obj])
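
From there, plotting is straightforward. A minimal sketch, assuming all_data comes from the snippet above and you simply want each node's values on one figure (the labels are illustrative):

import matplotlib.pyplot as plt

for node_index, data in enumerate(all_data):
    plt.plot(data, label='node{}'.format(node_index))
plt.legend()
plt.show()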

It looks like you are unfamiliar with Python but familiar with bash. As a result, you are programming hobbled on bash crutches; instead, you should embrace Python and use it throughout your application. You probably do not need the bash script at all.
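
For example, since ssh itself can be run through subprocess, you could skip the intermediate files entirely. A rough sketch of that idea, reusing the remote pipeline from your question (treat it as a starting point, not a drop-in replacement):

import subprocess

remote_cmd = 'fold -w 80 /program.headers | grep "RA" | head -600 | tr -d "RA ="'
node_data = {}
for h in ['node{}'.format(i) for i in range(8)]:
    # run the pipeline on the remote node and capture its stdout locally
    output = subprocess.check_output(['ssh', h, remote_cmd])
    node_data[h] = [float(a) for a in output.decode().split()]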

  • This is fantastic and I totally understand it. Thank you kindly, and I will indeed look into trying to use Python entirely. I suppose the issue I have with python is that I have to SSH into other machines which I have yet to find out how to do within a Python script. – Jesse Kyle Jul 26 '17 at 17:41
  • See if this helps you: https://stackoverflow.com/questions/3586106/perform-commands-over-ssh-with-python – Harrichael Jul 26 '17 at 17:43
  • To be fair, using bash isn't entirely undesirable here, because filtering on the remote side reduces the amount of data sent over ssh. – Harrichael Jul 26 '17 at 17:59

Instead of capturing data with subprocess, you can use os.popen() to execute scripts. The benefit is that you can read the output of a command/script the same way you read a file, so you can use read(), readline(), or readlines() as needed (readlines() returns the output as a list of lines). Using that, you can execute the script and capture its output like this:

import os

# output now holds the bash script's output, one line per list item
output = os.popen("./bashscript").readlines()
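
From there, converting to floats works just like with check_output. A small sketch, assuming the output is whitespace-separated numbers:

values = [float(x) for line in output for x in line.split()]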

Check this for more info on how to use os.popen(). Check this to learn the difference between read(), readline(), readlines(), and xreadlines().

  • This doesn't solve the original problem: breaking up the different machines' outputs. It only provides a more convenient way to call the bash script. – Harrichael Jul 26 '17 at 18:38