As a general answer to this sort of testing problem: run the function once on a given, fixed input and inspect the result. If it 'looks right', write a test that provides exactly the same input every time and blindly compares the output with the expected output, that is, the output you 'manually' verified as 'looking right'. This will catch any regression in the function's implementation.
You can go further and pick a number of test cases (empty list, one entry in the list, several entries, etc.) and write a similar test for each. These are 'expected result' tests: the test does not know what 'looks right' means; it simply verifies that, for a given input, the output is the 'expected' one and does not deviate from it. The person writing the test verifies that each output 'looks right' and adds it, manually, as the 'expected output' when the test is written and accepted.
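As a rough sketch of such 'expected result' tests, here is how they might look with Python's built-in `unittest`. The function `format_items` and its bracketed output format are hypothetical stand-ins for your own function; the expected strings are the ones you would have checked by hand once.

```python
import unittest


def format_items(items):
    """Hypothetical function under test: renders a list as '[a, b, c]'."""
    return "[" + ", ".join(str(item) for item in items) + "]"


class FormatItemsExpectedResultTest(unittest.TestCase):
    # Each expected string was obtained by running format_items once,
    # inspecting the result by hand, and pasting it in here.
    CASES = [
        ("empty list", [], "[]"),
        ("one entry", [1], "[1]"),
        ("several entries", [1, 2, 3], "[1, 2, 3]"),
    ]

    def test_expected_results(self):
        for name, given, expected in self.CASES:
            # subTest reports each failing case separately.
            with self.subTest(case=name):
                self.assertEqual(format_items(given), expected)
```

Saved in a test module, this can be run with `python -m unittest`. The test itself has no idea what a 'right' output is; it only flags any deviation from the recorded strings.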
The alternative is to write logic that parses the output, perhaps a full-blown lexer and grammar, and truly validates it. For this you need to define very precisely what output is 'right' and write an appropriate test parser for it. It is a worthy goal if the function is important enough, but that does not appear to be your case.
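To give a feel for what the parsing approach involves, here is a deliberately tiny sketch. It assumes a hypothetical output format of integers rendered as `[1, 2, 3]`; a real output format would need a much more careful definition of 'right', and likely a proper lexer and grammar rather than string splitting.

```python
import re

# Assumed token shape: an optionally negative decimal integer.
_INTEGER = re.compile(r"-?\d+")


def parse_items(text):
    """Parse the assumed '[a, b, c]' format back into a list of ints.

    Raises ValueError if the text does not match the format exactly.
    """
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("output must be wrapped in square brackets")
    body = text[1:-1]
    if body == "":
        return []
    items = []
    for token in body.split(", "):
        if not _INTEGER.fullmatch(token):
            raise ValueError(f"malformed entry: {token!r}")
        items.append(int(token))
    return items


def is_valid_output(items, text):
    """True if text is well-formed and encodes exactly the given items."""
    try:
        return parse_items(text) == list(items)
    except ValueError:
        return False
```

A test built on this can feed arbitrary inputs to the function under test and check `is_valid_output(items, output)` instead of comparing against hand-verified strings, at the cost of maintaining the parser itself.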
BTW, do not write a function that 'returns 1 if the function has bugs'; instead, use a test framework and run the tests as a build phase of your project.