0

I'm working on a text file, "creatures.txt". An example of its contents can be seen below:

Special Type A Sunflower
2016-10-12 18:10:40
Asteraceae
Ingredient in Sunflower Oil
Brought to North America by Europeans
Requires fertile and moist soil
Full sun

Pine Tree
2018-12-15 13:30:45
Pinaceae
Evergreen
Tall and long-lived
Temperate climate

Tropical Sealion
2019-01-20 12:10:05
Otariidae
Found in zoos
Likes fish
Likes balls
Likes zookeepers

Big Honey Badger
2020-06-06 10:10:25
Mustelidae
Eats anything
King of the desert

When I convert its contents into a list of dictionaries with the function below, it runs fine.
Input

def TextFileToDictionary():
    dataset = [] 
    with open(FinalFilePath, "r") as textfile:  
        sections = textfile.read().split("\n\n")
        for section in sections:                 
            lines = section.split("\n")      
            dataset.append({                
              "Name": lines[0],                 
              "Date": lines[1],              
              "Information": lines[2:]          
            })
        return dataset                          
TextFileToDictionary()


Output

[{'Name': 'Special Type A Sunflower',
  'Date': '2016-10-12 18:10:40',
  'Information': ['Asteraceae',
   'Ingredient in Sunflower Oil',
   'Brought to North America by Europeans',
   'Requires fertile and moist soil',
   'Full sun']},
 {'Name': 'Pine Tree',
  'Date': '2018-12-15 13:30:45',
  'Information': ['Pinaceae',
   'Evergreen',
   'Tall and long-lived',
   'Temperate climate']},
 {'Name': 'Tropical Sealion',
  'Date': '2019-01-20 12:10:05',
  'Information': ['Otariidae',
   'Found in zoos',
   'Likes fish',
   'Likes balls',
   'Likes zookeepers']},
 {'Name': 'Big Honey Badger',
  'Date': '2020-06-06 10:10:25',
  'Information': ['Mustelidae', 'Eats anything', 'King of the desert']}]

As shown, the output is a list of several dictionaries, none of which has a name of its own.

I'm now trying to write functions that sort these dictionaries by
1) the first key's value (Name), in alphabetical order, and
2) the second key's value (Date), latest first.

My progress is at:

import itertools
import os

MyFilePath = os.getcwd() 
ActualFile = "creatures.txt"
FinalFilePath = os.path.join(MyFilePath, ActualFile) 

def TextFileToDictionaryName():
    dataset = [] 
    with open(FinalFilePath, "r") as textfile:  
        sections = textfile.read().split("\n\n")
        for section in sections:                 
            lines = section.split("\n")      
            dataset.append({                
              "Name": lines[0],                 
              "Date": lines[1],              
              "Information": lines[2:]          
            })
            dataset.sort(key=lambda x: x[0]['Name'], reverse=False)
        return dataset                          
TextFileToDictionaryName()

def TextFileToDictionaryDate():
    dataset = [] 
    with open(FinalFilePath, "r") as textfile:  
        sections = textfile.read().split("\n\n")
        for section in sections:                 
            lines = section.split("\n")      
            dataset.append({                
              "Name": lines[0],                 
              "Date": lines[1],              
              "Information": lines[2:]          
            })
            dataset.sort(key=lambda x: x[1]['Date'], reverse=True)
        return dataset                          
TextFileToDictionaryDate()

However, I get the error "KeyError: 0" and I am unsure how to resolve it.
I am also unsure how to convert the dictionary output back into string format, like the contents of the "creatures.txt" file shown earlier.

Does anyone know how to fix the code?

Many thanks!

TropicalMagic
  • You don't need `x[0][...]` and `x[1][...]`. `Name` and `Date` are the first and second keys of each dictionary, but a dictionary is looked up by key name, not by positional index. Use `x['Name']` and `x['Date']` instead. – Mario Ishac Aug 15 '20 at 06:04
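
The point made in the comment above can be reproduced with a single record. This is only a minimal sketch: the `record` variable and its shortened "Information" list are made up for illustration, shaped like the entries in the question's dataset.

```python
# One record shaped like the entries in the question's dataset (illustrative values).
record = {
    "Name": "Pine Tree",
    "Date": "2018-12-15 13:30:45",
    "Information": ["Pinaceae"],
}

# Dictionaries are looked up by key, so the integer 0 is treated as a
# (missing) key rather than as a position.
try:
    record[0]
    error_name = None
except KeyError as exc:
    error_name = f"KeyError: {exc}"

print(error_name)        # KeyError: 0
print(record["Name"])    # Pine Tree
```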

3 Answers

4

Don't use a dict. Your data appears to have a corresponding model to it.

Instead, create a proper Python class, a Creature:

class Creature:
    def __init__(self, name, date, habitat):
        self.name = name
        self.date = date
        self.habitat = habitat
        # etc.

As you're reading the input file, create a new Creature instance for each grouping of data. Add each Creature to a collection of some kind:

creatures = list()
with open(FinalFilePath, "r") as textfile:  
    sections = textfile.read().split("\n\n")
    for section in sections:                 
        lines = section.split("\n")      
        creatures.append(Creature(lines[0], lines[1], lines[2:]))  # third argument: the remaining lines; add more params?

Next, add some boiler-plate methods (__lt__, etc.) to your Creature class, so that it will be sortable.

Lastly, just use sorted(creatures), and then your collection of creatures will be sorted according to your __lt__ logic.

The implementation of __lt__ will look like this:

def __lt__(self, other):
    if self.name < other.name:
        return True
    elif self.name > other.name:
        return False
    elif self.date < other.date:
        return True
    elif self.date > other.date:
        return False
    else:
        # Name and date are both equal, so neither creature is less than the other.
        return False
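
Putting the pieces of this answer together, a runnable version might look like the sketch below. It is not the only way to write it: the if/elif chain is swapped for an equivalent tuple comparison, the third attribute is given the hypothetical name `information`, and the sample records are shortened. Comparing the dates as strings relies on the zero-padded "YYYY-MM-DD HH:MM:SS" format used in the file.

```python
class Creature:
    def __init__(self, name, date, information):
        self.name = name
        self.date = date                # "YYYY-MM-DD HH:MM:SS" string
        self.information = information  # hypothetical name for the remaining lines

    def __lt__(self, other):
        # Compare by name first, then by date. If both are equal the
        # tuples are equal, so the result is False.
        return (self.name, self.date) < (other.name, other.date)

    def __repr__(self):
        return f"Creature({self.name!r}, {self.date!r})"


creatures = [
    Creature("Pine Tree", "2018-12-15 13:30:45", ["Pinaceae"]),
    Creature("Big Honey Badger", "2020-06-06 10:10:25", ["Mustelidae"]),
    Creature("Tropical Sealion", "2019-01-20 12:10:05", ["Otariidae"]),
]

# sorted() only needs __lt__ to order the instances.
sorted_names = [c.name for c in sorted(creatures)]
print(sorted_names)
```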

** Alternatively, you could use creatures = SortedList() (from the third-party sortedcontainers package), and then each item would be inserted in the correct position when you call creatures.add(Creature(...)). You wouldn't need the sorted(creatures) call at the end.
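
If installing a third-party package is not an option, a similar "keep it sorted while inserting" effect can be sketched with the standard library's `bisect.insort`. Note this is a swapped-in technique rather than SortedList itself, and the two-argument `Creature` below is a cut-down stand-in for illustration.

```python
import bisect


class Creature:
    def __init__(self, name, date):
        self.name = name
        self.date = date

    def __lt__(self, other):
        # insort only needs "<" to find the insertion point.
        return (self.name, self.date) < (other.name, other.date)


creatures = []
for name, date in [
    ("Tropical Sealion", "2019-01-20 12:10:05"),
    ("Big Honey Badger", "2020-06-06 10:10:25"),
    ("Pine Tree", "2018-12-15 13:30:45"),
]:
    # Each new Creature is placed at its sorted position in the plain list.
    bisect.insort(creatures, Creature(name, date))

ordered_names = [c.name for c in creatures]
print(ordered_names)
```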

Jameson
  • You should maybe show how to use a key function as well, although I agree the OP should use a custom type – juanpa.arrivillaga Aug 15 '20 at 06:27
  • Many thanks for the interesting perspective! I hadn't used a class before, so this is completely new to me. I will try to use it to sort and edit the contents. – TropicalMagic Aug 15 '20 at 13:59
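
The key-function approach suggested in the first comment could look roughly like this. It is a sketch only: the two-argument `Creature` is a simplified stand-in, and `operator.attrgetter` is one of several ways to build the key.

```python
from operator import attrgetter


class Creature:
    def __init__(self, name, date):
        self.name = name
        self.date = date  # zero-padded "YYYY-MM-DD HH:MM:SS" string


creatures = [
    Creature("Pine Tree", "2018-12-15 13:30:45"),
    Creature("Big Honey Badger", "2020-06-06 10:10:25"),
    Creature("Tropical Sealion", "2019-01-20 12:10:05"),
]

# No __lt__ is needed here: the key function tells sorted() what to compare.
by_name = sorted(creatures, key=attrgetter("name"))
latest_first = sorted(creatures, key=attrgetter("date"), reverse=True)

print([c.name for c in by_name])
print([c.name for c in latest_first])
```
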
2

You were almost there, just don't do x[0] or x[1]. Also, I think you shouldn't sort the list at every iteration in the loop but only at the end.

def TextFileToDictionaryName():
    dataset = [] 
    with open(FinalFilePath, "r") as textfile:  
        sections = textfile.read().split("\n\n")
        for section in sections:                 
            lines = section.split("\n")      
            dataset.append({                
              "Name": lines[0],                 
              "Date": lines[1],              
              "Information": lines[2:]          
            })
        dataset.sort(key=lambda x: x['Name'], reverse=False)
        return dataset                          
TextFileToDictionaryName()

def TextFileToDictionaryDate():
    dataset = [] 
    with open(FinalFilePath, "r") as textfile:  
        sections = textfile.read().split("\n\n")
        for section in sections:                 
            lines = section.split("\n")      
            dataset.append({                
              "Name": lines[0],                 
              "Date": lines[1],              
              "Information": lines[2:]          
            })
        dataset.sort(key=lambda x: x['Date'], reverse=True)
        return dataset                          
TextFileToDictionaryDate()
Valentin Vignal
  • Thanks! It was what I was looking for! However, is there a way for the dict output to be converted to a string format much like the original .txt file? – TropicalMagic Aug 15 '20 at 13:39
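
For the follow-up question in the comment above, one possible sketch is to reverse the two splits used when reading the file: join each record's lines with a newline, then join the records with a blank line. The function name `dataset_to_text` and the shortened sample records are made up for illustration.

```python
def dataset_to_text(dataset):
    # Rebuild each section as name, date, then the information lines,
    # and separate the sections with one blank line.
    sections = []
    for record in dataset:
        lines = [record["Name"], record["Date"], *record["Information"]]
        sections.append("\n".join(lines))
    return "\n\n".join(sections)


dataset = [
    {"Name": "Pine Tree", "Date": "2018-12-15 13:30:45",
     "Information": ["Pinaceae", "Evergreen"]},
    {"Name": "Big Honey Badger", "Date": "2020-06-06 10:10:25",
     "Information": ["Mustelidae", "Eats anything"]},
]

text = dataset_to_text(dataset)
print(text)
```

The resulting string could then be written out with an ordinary `open(path, "w")` and `write(text)`, assuming the same layout as the original file is wanted.
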
1

You don't need to sort the list by name and then by date separately. You can do both at the same time.

The reason for the KeyError: the key parameter specifies a function that is called on each list element before comparisons are made. Here each element x is a dictionary, not a list, so x[0] looks up the key 0, which does not exist. I assume you used x[0] because you expected x to be a list, but it is not.

from datetime import datetime

sample = [
    {
        "Name": "Special Type A Sunflower",
        "Date": "2016-10-12 18:10:40",
        "Information": [...],
    },
    {
        "Name": "Pine Tree",
        "Date": "2018-12-15 13:30:45",
        "Information": [...],
    },
    {
        "Name": "Tropical Sealion",
        "Date": "2019-01-20 12:10:05",
        "Information": [...],
    },
    {
        "Name": "Big Honey Badger",
        "Date": "2020-06-06 10:10:25",
        "Information": [...],
    },
]

sample.sort(
    key=lambda x: (x["Name"], datetime.strptime(x["Date"], "%Y-%m-%d %H:%M:%S"))
)
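
Note that the combined key above sorts by name first and only uses the date to break ties between identical names. If the two orderings from the question are wanted separately (names A to Z, and dates latest first), a sketch along these lines may be closer. The helper name `parse_date` and the `DATE_FORMAT` constant are made up for illustration, and the "Information" lists are omitted.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

sample = [
    {"Name": "Special Type A Sunflower", "Date": "2016-10-12 18:10:40"},
    {"Name": "Pine Tree", "Date": "2018-12-15 13:30:45"},
    {"Name": "Tropical Sealion", "Date": "2019-01-20 12:10:05"},
    {"Name": "Big Honey Badger", "Date": "2020-06-06 10:10:25"},
]


def parse_date(record):
    # Convert the "Date" string into a datetime so it compares chronologically.
    return datetime.strptime(record["Date"], DATE_FORMAT)


by_name = sorted(sample, key=lambda x: x["Name"])
latest_first = sorted(sample, key=parse_date, reverse=True)

print([x["Name"] for x in by_name])
print([x["Name"] for x in latest_first])
```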
Vishal Singh