
I have 3 columns, one for each of levels 1-3. I read a file in which each line contains various data, including the level it belongs to, which appears at the end of the line.

Sample lines from file being read:

thing_1 - level 1
thing_17 - level 3
thing_22 - level 2

I want to assign each "thing" to its corresponding column. I have looked into pandas, but it seems that DataFrame columns won't work: each row passed in would need one value per column, whereas I need 3 columns and each piece of data has only 1 data point.

How could I approach this problem?

Desired output:

level 1     level 2    level 3

thing_1     thing_22   thing_17

Edit:

Looking at the suggestions, I can refine my question further. I have up to 3 columns, and each line from the file needs to be assigned to exactly one of them. Most solutions seem to need something like:

data = [['Mary', 20], ['John', 57]]
columns = ['Name', 'Age']

This does not work for me, since there are 3 columns, and each piece of data goes into only one.

  • Is it possible that one or more levels might each have more than one thing? – CrazyChucky Feb 15 '21 at 19:45
  • The columns will contain more than one thing each, however each line only has one level. –  Feb 15 '21 at 19:48
  • You could use Pandas, but I think you might be better off looking at a lighter-weight output package, like one of the ones mentioned here: https://stackoverflow.com/a/26937531/12975140 – CrazyChucky Feb 15 '21 at 19:48
  • Thanks, I'll try and come up with a solution based on those. I will have to think outside the box as those solutions still require each piece of data to have a relation to the number of columns, where in my case I have a set number of columns, with data that needs to be sorted into appropriate column. –  Feb 15 '21 at 19:59
  • I haven't looked through all of them, but PrettyTable and Tabulate (and probably others) don't require you to supply data row by row; you can supply it column by column instead. I'd recommend parsing the text file into a [dictionary](https://www.w3schools.com/python/python_dictionaries.asp), where each key is a level, and its value is a list of things. Then you can pass that to your preferred output method. (Pandas lets you do this as well, but it's probably overkill for something like this where you're just doing output formatting.) – CrazyChucky Feb 15 '21 at 20:02
  • Alright, I'll look into this now, thanks. –  Feb 15 '21 at 20:05

1 Answer


There's an additional wrinkle here that I didn't notice at first. If each of your levels has the same number of things, then you can build a dictionary and then use it to supply the table's columns to PrettyTable:

from prettytable import PrettyTable

# Create an empty dictionary.
levels = {}
with open('data.txt') as f:
    for line in f:
        # Remove trailing \n and split into the parts we want.
        thing, level = line.rstrip('\n').split(' - ')
        
        # If this is a new level, set it to a list containing its thing.
        if level not in levels:
            levels[level] = [thing]
        # Otherwise, add the new thing to the level's list.
        else:
            levels[level].append(thing)

# Create the table, and add each level as a column
table = PrettyTable()
for level, things in levels.items():
    table.add_column(level, things)

print(table)

For the example data you showed, this prints:

+---------+----------+----------+
| level 1 | level 3  | level 2  |
+---------+----------+----------+
| thing_1 | thing_17 | thing_22 |
+---------+----------+----------+
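
The columns come out in the order the levels first appear in the file, because a dictionary keeps insertion order. If you want them in numeric order, as in your desired output, here is a minimal sketch that reorders the dictionary before building the table. The helper names `level_number` and `sort_levels` are made up for this example, and it assumes every label ends in a number, like "level 3".

```python
def level_number(label):
    """Return the trailing integer of a label such as 'level 3'."""
    return int(label.rsplit(' ', 1)[-1])


def sort_levels(levels):
    """Return a new dict whose keys are ordered by level number."""
    return dict(sorted(levels.items(), key=lambda item: level_number(item[0])))


# Same shape as the dictionary built from the sample file.
levels = {
    'level 1': ['thing_1'],
    'level 3': ['thing_17'],
    'level 2': ['thing_22'],
}

ordered = sort_levels(levels)
print(list(ordered))  # ['level 1', 'level 2', 'level 3']
```

Sorting on the number rather than the whole string means a hypothetical "level 10" would still land after "level 2". You can then loop over `ordered.items()` instead of `levels.items()` when adding columns.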

The Complication

I probably wouldn't have posted an answer (believing it was covered sufficiently in this answer), except that I realized there's an unintuitive hurdle here. If your levels contain different numbers of things each, you get an error like this:

Exception: Column length 2 does not match number of rows 1!

Because none of the readily available packages offers an obvious, "automatic" fix for this, here is a simple way to do it. Build the dictionary as before, then:

# Find the length of the longest list of things.
longest = max(len(things) for things in levels.values())

table = PrettyTable()
for level, things in levels.items():
    # Pad out the list if it's shorter than the longest.
    things += ['-'] * (longest - len(things))
    table.add_column(level, things)

print(table)

This will print something like this:

+---------+----------+----------+
| level 1 | level 3  | level 2  |
+---------+----------+----------+
| thing_1 | thing_17 | thing_22 |
|    -    |    -     | thing_5  |
+---------+----------+----------+
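
If you would rather hand your output package rows instead of columns, the same padding can be done with `itertools.zip_longest` from the standard library. This is only a sketch of that alternative, not something the packages require; `columns_to_rows` is a name invented for the example.

```python
from itertools import zip_longest


def columns_to_rows(levels, fill='-'):
    """Turn {header: [values]} into (headers, rows), padding short columns."""
    headers = list(levels)
    # zip_longest walks all the lists in parallel and substitutes
    # `fill` once a shorter list runs out.
    rows = [list(row) for row in zip_longest(*levels.values(), fillvalue=fill)]
    return headers, rows


levels = {
    'level 1': ['thing_1'],
    'level 3': ['thing_17'],
    'level 2': ['thing_22', 'thing_5'],
}

headers, rows = columns_to_rows(levels)
print(headers)  # ['level 1', 'level 3', 'level 2']
for row in rows:
    print(row)
# ['thing_1', 'thing_17', 'thing_22']
# ['-', '-', 'thing_5']
```

Unlike the `+=` padding above, this leaves the lists in the dictionary untouched, which may matter if you use them again afterwards.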

Extra

If all of that made sense and you'd like to streamline part of it a little, take a look at Python's defaultdict. It takes care of the "check if this key already exists" step, supplying a default (in this case a new empty list) when the key isn't there yet.

from collections import defaultdict

levels = defaultdict(list)
with open('data.txt') as f:
    for line in f:
        # Remove trailing \n and split into the parts we want.
        thing, level = line.rstrip('\n').split(' - ')
        
        # Automatically handles adding a new key if needed:
        levels[level].append(thing)
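
Putting the pieces together, here is a rough sketch of the parsing and padding steps as two small functions that use only the standard library. The function names are made up for this example. It also makes two assumptions that go beyond the code above: blank lines are skipped, and the split happens on the last " - " in case a thing's name contains one.

```python
from collections import defaultdict


def parse_levels(lines):
    """Group things by level from lines like 'thing_1 - level 1'."""
    levels = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Assumption: blank lines carry no data.
        # Split on the last ' - ' so the level is always the final part.
        thing, level = line.rsplit(' - ', 1)
        levels[level].append(thing)
    return levels


def pad_columns(levels, fill='-'):
    """Return a new dict in which every list has the same length."""
    longest = max((len(things) for things in levels.values()), default=0)
    return {
        level: things + [fill] * (longest - len(things))
        for level, things in levels.items()
    }


sample = [
    'thing_1 - level 1\n',
    'thing_17 - level 3\n',
    'thing_22 - level 2\n',
    'thing_5 - level 2\n',
]

padded = pad_columns(parse_levels(sample))
for level, things in padded.items():
    print(level, things)
# level 1 ['thing_1', '-']
# level 3 ['thing_17', '-']
# level 2 ['thing_22', 'thing_5']
```

Because `parse_levels` accepts any iterable of strings, you can pass it an open file object directly, and each entry of `padded` can go straight to `table.add_column(level, things)`.
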
CrazyChucky