0
unsorted = [{"#":"1", "files": [("test.txt",1),("example.out",1)]},
             {"#":"2", "files": [("other.txt",0),("test.txt",3)]},
             {"#":"3", "files": [("example.out",0)]},
             {"#":"4", "files": [("test.txt",2)]}]

files : a list of tuples that refer to a file and its rev. For instance, ("test.txt",1) is the file test.txt with rev 1.

Need to sort object unsorted so that its elements are in an order that:

  1. sorted by book names
  2. If multiple elements have the same books names, minor rev comes first

Following is the wanted result:

sorted = [{"#":"3", "files": [("example.out",0)]},
             {"#":"1", "files": [("test.txt",1),("example.out",1)]},
             {"#":"4", "files": [("test.txt",2)]},
              {"#":"2", "files": [("other.txt",0),("test.txt",3)]}]

PS, possible to do with a lambda?

Yang Liu
  • 346
  • 1
  • 6

2 Answers2

0

Your outer dictionary of your sample code seems redundant. It contains a single list with no key. If you remove the list brackets you have your dictionary of dicts, but still with no keys, so it is invalid.

But if you remove the outer dictionary braces then you are left with a list of dictionaries, which you can absolutely sort as you desire with lambda:

unsorted_list = [{"#": "1", "files": [("test.txt", 1), ("example.out", 1)]},
                 {"#": "2", "files": [("other.txt", 0), ("test.txt", 3)]},
                 {"#": "3", "files": [("example.out", 0)]},
                 {"#": "4", "files": [("test.txt", 2)]}]


unsorted_list.sort(key=lambda x: x['files'][-1][1])

Using lambda x where x references each of the dictionaries within the list. Then accesses the 'files' element of the dict x['files'], and then the last element in the internal list of files [-1], and finally the second element of the internal tuple [1]. This is then treated as the key for the purposes of sorting the outer-most list.

Tested with the given code and it seems to work fine. Of course any more entries in the list will have to be consistent with the existing structure or it will not work.

If you do require an outer dictionary, and get it structured appropriately, this should help.

Mark Tolonen
  • 166,664
  • 26
  • 169
  • 251
Demosius
  • 24
  • 5
  • 1
    This sorts by the last revision in the list, but the requirement was files with same name needed to appear in order. What if in #1 it had `("test.txt",4)` in it instead? FWIW some combinations of entries would be impossible to sort by the OP's rule. – Mark Tolonen Jan 15 '21 at 01:34
  • @MarkTolonen True. My answer was in response to the initial unedited question. I made some (incorrect) assumptions here before OP clarified. Thanks. – Demosius Jan 15 '21 at 01:49
0

As has been pointed out, given the nature of your data it is possible that you will get combinations of entries that cannot hold to your rule. It is also not a simple task to sort based on your requirements as each node needs to know the data in other nodes to make a decision.

That being said, I have given it a crack. While not simple, I have likely overthought the process. It also is not an elegant solution, and definitely will not be an efficient process. I imagine it will take quite a while to execute on large data sets.

I have tested it with your sample data, and it seemed to work out just fine. I added in some extra data, including data that cannot hold to your requirements, and I feel the resultant order would suffice given the circumstances. It would be simple to change it to exit and/or result in an error instead if it can't be sorted properly. Assuming that there will always be a correct order, this should work. Definitely requires more testing:

class Node:
    """
    A node representing the inner dictionary of files. Keeps rack of
    lesser and greater nodes so that they can be placed into a list in appropriate order.
    """

    def __init__(self, file_dict: dict):
        self.file_dict = file_dict
        self.files = {i[0]: i[1] for i in file_dict['files']}
        self.greater_nodes = []
        self.lesser_nodes = []
        self.on_stack = False
        self.in_list = False

    def compare_nodes(self, other_node, file_name):
        """Compares the nodes to each other and lists each other in either their greater or lesser groups"""
        if self.files[file_name] < other_node.files[file_name]:
            self.lesser_nodes.append(other_node)
            other_node.greater_nodes.append(self)
        else:
            self.greater_nodes.append(other_node)
            other_node.lesser_nodes.append(self)

    def place_node(self, node_list, lesser_list):
        """Places node in given list, but first checks for any greater node in the list and places those first"""
        self.on_stack = True  # If on stack and referenced as greater, place into list.
        lesser_list += self.lesser_nodes
        for node in self.greater_nodes:
            if node not in lesser_list:  # Don't go to nodes that are in any way lower than those on the stack.
                if node.on_stack and not node.in_list:
                    node_list.append(node.file_dict)
                    node.in_list = True
                    node.on_stack = False
                elif not node.in_list:
                    node.place_node(node_list, lesser_list)
        if not self.in_list:
            node_list.append(self.file_dict)
            self.in_list = True
        self.on_stack = False

    def __repr__(self):
        return f'{self.file_dict["#"]}'


class FileObj:
    """
    Representation for each different file name.
    Tracks nodes with common file name so that they can be appropriately compared.
    """

    def __init__(self, name: str):
        self.name = name
        self.nodes = []

    def add_node(self, new_node: Node):
        """Add another node to the list for the file object, making sure to
        compare any new nodes to old based on file name."""
        for old_node in self.nodes:
            new_node.compare_nodes(old_node, self.name)
        self.nodes.append(new_node)

    def place_nodes(self, node_list):
        """Place all nodes into a given list."""
        # Sorts nodes first (shouldn't be necessary in most cases, as they should all have valid relational data).
        self.nodes.sort(key=lambda x: x.files[self.name])
        for node in self.nodes:
            if not node.in_list:
                node.place_node(node_list, [])

    def __repr__(self):
        return f"{self.name} : {self.nodes}"


def sort_function(unsorted_list):
    # Get dict of file objects.
    file_dict = {str(y[0]): FileObj(y[0]) for x in unsorted_list for y in x['files']}
    file_order = sorted([x for x in file_dict.keys()])  # Reference of file names in sorted order.

    # Get list of dicts from the unsorted list as node-objects.
    node_list = [Node(x) for x in unsorted_list]

    # Get nodes to assign themselves to file_objects
    for node in node_list:
        for file in node.files.keys():
            file_dict[file].add_node(node)

    # Go through file_order to start placing nodes in lexicographical order.
    final_list = []
    for file_name in file_order:
        file_obj = file_dict[file_name]
        file_obj.place_nodes(final_list)

    # Print results to check.
    print(final_list, "\n")
    for d in final_list:
        print(d)


def main():
    unsorted_list = [{"#": "1", "files": [("test.txt", 1), ("example.out", 1)]},
                     {"#": "2", "files": [("other.txt", 0), ("test.txt", 3)]},
                     {"#": "3", "files": [("example.out", 0)]},
                     {"#": "4", "files": [("test.txt", 2)]}]

    sort_function(unsorted_list)


if __name__ == '__main__':
    main()
Demosius
  • 24
  • 5