I have a large CSV file with multiple columns of integer values:

1122  2222 3333 6664
4588  2122 5555 7747
1155  8844 1147 8895
....  .... .... ....

For each column, I want to generate a separate file in the following format. Here is what the file for column 1 should look like:

sudo google-chrome -c -a tt1
sudo google-chrome -a tt1 -d 1122 -u 1122

sudo google-chrome -c -a tt1
sudo google-chrome -a tt1 -d 4588 -u 4588

sudo google-chrome -c -a tt1
sudo google-chrome -a tt1 -d 1155 -u 1155

This continues until the values of column 1 are exhausted, and the result is stored in its own file.

The same process should be repeated for every column, so that each column ends up with its own file named 'columnx.sh'.

How can I do this in Python?

martineau
AbouSDN

1 Answer

I'll explain one possible approach (there are many, of course) rather than write the code for you; I think that is more valuable than a bare code snippet. This approach assumes the whole input file fits in memory, since it is read completely before any output file is written.

Assuming you already know the basics of the language, start with the documentation of the csv module.

The csv module is the standard way to read and write CSV files, and it ships with the standard Python distribution from http://python.org.

Once its usage is clear to you, use a context manager to open and handle the CSV file, as in this example adapted for Python 3 from the one on the documentation site:

>>> import csv
>>> rows = []
>>> with open('largefile.csv', newline='') as csvfile:
...     spamreader = csv.reader(csvfile, delimiter=' ', skipinitialspace=True)
...     for row in spamreader:
...         rows.append(row)  # store the row in your data structure

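You'll need to set the delimiter parameter of csv.reader to whatever your CSV file actually uses. It must be a single character, such as a comma, a tab, or a space. If the columns are separated by runs of spaces, as in your sample, either pass delimiter=' ' together with skipinitialspace=True, or skip the csv module and split each line with str.split().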
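As a minimal sketch of the delimiter handling, the snippet below parses two of the sample rows from the question both ways. The io.StringIO object only stands in for an open file, and the variable names are illustrative.

```python
import csv
import io

# Two sample rows from the question; columns are separated by one or two spaces.
sample = "1122  2222 3333 6664\n4588  2122 5555 7747\n"

# skipinitialspace=True makes the reader ignore spaces that directly follow
# a delimiter, so a run of spaces does not produce empty fields.
reader = csv.reader(io.StringIO(sample), delimiter=" ", skipinitialspace=True)
rows = [row for row in reader]

# Alternative without the csv module: str.split() with no argument
# splits on any run of whitespace.
rows_split = [line.split() for line in sample.splitlines()]
```

Both variants give the same list of rows for this input; the values stay strings, which is all the generated commands need.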

While spamreader iterates over the rows of the input file, store the values of each row in a data structure you created before the loop. A list of lists, essentially a matrix, is a good choice here, because later you will need to walk through the data in order.

Use the list.append method to add each row to the end of the outer list. The reader works row by row, whereas you need the values column by column; with the table held as a matrix, column j is simply item j of every row.
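One way to turn the row-oriented matrix into columns is the built-in zip. This is a small sketch using the sample values from the question; the names rows and columns are just illustrative.

```python
# Rows as they would come out of the reader (sample values from the question).
rows = [
    ["1122", "2222", "3333", "6664"],
    ["4588", "2122", "5555", "7747"],
    ["1155", "8844", "1147", "8895"],
]

# zip(*rows) groups the i-th item of every row together, which yields the columns.
columns = [list(column) for column in zip(*rows)]
```

Note that zip stops at the shortest row, so a ragged input file would silently lose trailing values.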

Once the matrix is ready, write your files, again using a context manager:

>>> with open('columnx.sh', 'a') as the_file:
...     the_file.write('Hello\n')

This file-writing example comes from another question on Stack Overflow: https://stackoverflow.com/a/6160082/3789324

In your case, put the write call inside nested for loops: the outer loop goes over the columns (one output file per column) and the inner loop goes over the elements of the current column.

For each element, the write call should first write the string

s1 = "sudo google-chrome -c -a tt1"

and then the string

s2 = "sudo google-chrome -a tt1 -d item -u item"

where item is the value you just took from the list.

To substitute the item into the string, concatenate it with the surrounding substrings like this:

s2 = "sudo google-chrome -a tt1 -d " + str(item) + " -u " + str(item)

Now you can pass s1 and s2 to the write method on each iteration, once per element of the column:

the_file.write(s1 + "\n" + s2 + "\n")

And you have your code snippet.
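Putting the steps together, here is one possible sketch rather than a definitive implementation. The helper names build_script and write_column_scripts are made up for this example, the tag "tt1" is kept fixed for every column (the question only shows column 1), and the files are opened with mode 'w' instead of 'a' so that rerunning the script does not append duplicate commands. The demo at the end writes the sample rows to a temporary directory.

```python
import os
import tempfile


def build_script(values, tag="tt1"):
    """Return the script text for one column: one command pair per value."""
    blocks = []
    for item in values:
        blocks.append(
            f"sudo google-chrome -c -a {tag}\n"
            f"sudo google-chrome -a {tag} -d {item} -u {item}\n"
        )
    # Joining with a newline leaves a blank line between command pairs.
    return "\n".join(blocks)


def write_column_scripts(input_path, output_dir="."):
    """Read a whitespace-separated file and write one columnN.sh per column."""
    with open(input_path) as infile:
        rows = [line.split() for line in infile if line.strip()]
    paths = []
    # zip(*rows) yields the columns; numbering starts at 1 for column1.sh.
    for number, values in enumerate(zip(*rows), start=1):
        path = os.path.join(output_dir, f"column{number}.sh")
        with open(path, "w") as the_file:
            the_file.write(build_script(values))
        paths.append(path)
    return paths


# Demo on the sample rows from the question, inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "largefile.csv")
    with open(source, "w") as f:
        f.write("1122  2222 3333 6664\n4588  2122 5555 7747\n1155  8844 1147 8895\n")
    created = write_column_scripts(source, tmp)
    with open(created[0]) as f:
        first_script = f.read()
```

For real use you would call write_column_scripts('largefile.csv') directly. Like the explanation above, this reads the whole file into memory before writing anything.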

turbopapero