
I am plotting data from multiple files. I do not want to use the glob module since I need to plot the data from each file separately. The data plots, but stray 'traceback' lines appear on the figure when it is drawn with Matplotlib. An image of the plots is below:

enter image description here

Here is some sample data to help reproduce the problem; I'm sorry about the lack of formatting. The data comes from unformatted text files. Splitting the two data sets into two separate files should recreate the issue.

Start-Mi, End-Mi,   IRI LWP, IRI R e
194.449,    194.549,    75.1,   92.3
194.549,    194.649,    85.2,   82.8
194.649,    194.749,    90.8,   91.8
194.749,    194.849,    79.3,   73.7
194.849,    194.949,    76.9,   80.1
194.949,    195.049,    82.7,   86.9
195.049,    195.149,    103,    116.7
195.149,    195.249,    81.5,   96.1
195.249,    195.349,    96.7,   92.7
195.349,    195.449,        59.5,   72.2

and

Start-Mi, End-Mi,   IRI LWP, IRI R e
194.449,    194.549,    79.9,   95.7
194.549,    194.649,    87.4,   96.5
194.649,    194.749,    86.5,   105.3
194.749,    194.849,    77, 76
194.849,    194.949,    73.6,   85.2
194.949,    195.049,    81.7,   94.3
195.049,    195.149,    104.6,  128.2
195.149,    195.249,    84.2,   98.6
195.249,    195.349,    94.2,   91.3
195.349,    195.449,    57.5,   72.1

The traceback lines appear whenever the code starts plotting data from a new file. I'm trying to get rid of the horizontal lines drawn from the end of the plot back to the beginning. I need to clean up the plot because the code is designed to iterate over an indefinite number of data files. The code is shown below:

def graphWriterIRIandRut():
    n = 100
    m = 0
    startList = []
    endList = []
    iriRList = []
    iriLList = []
    fileList = []
    for file in os.listdir(os.getcwd()):
        fileList.append(file)
    while m < len(fileList):
        for col in csv.DictReader(open(fileList[m],'rU')):
            startList.append(float(col['Start-Mi']))
            endList.append(float(col['  End-Mi']))
            iriRList.append(float(col[' IRI R e']))
            iriLList.append(float(col['IRI LWP ']))

        plt.subplot(2, 1, 1)
        plt.grid(True)
        colors = np.random.rand(n)
        plt.ylabel('IRI value',fontsize=12)
        plt.title('Right IRI data per mile for 2016 calibrations: ')
        plt.plot(startList,iriRList,c=colors)
        plt.tick_params(axis='both', which='major', labelsize=8)

        plt.subplot(2, 1, 2)
        plt.grid(True)
        colors = np.random.rand(n)
        plt.ylabel('IRI value',fontsize=12)
        plt.title('Left IRI data per mile for 2016 calibrations: ')
        plt.plot(startList,iriLList,c=colors)
        plt.tick_params(axis='both', which='major', labelsize=8)

        m = m + 1
        continue

    plt.show()
    plt.gcf().clear()
    plt.close('all')
Mad Physicist

1 Answer


Your code is currently doing the following:

  1. Read data from a file, appending it to a list
  2. Plot the list

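The list is not cleared at any point, so you keep plotting the list with more and more data appended to it, most of which is being plotted over and over. Because every file covers the same mileposts, the x values jump back to the start at each file boundary, and the line drawn across that jump is the horizontal "traceback" you are seeing. This is also why all your lines have the same color: it is the color of the last plot you made, which exactly covers all the previous plots and adds one more line.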
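To make the accumulation concrete, here is a minimal sketch that mimics the shared list without any plotting. The helper names `accumulate` and `backward_jumps` are made up for illustration, and the numbers are the first three rows of each sample file from the question.

```python
# Two "files" covering the same mileposts: (Start-Mi, IRI R e) pairs
# taken from the first three rows of each sample data set.
file_a = [(194.449, 92.3), (194.549, 82.8), (194.649, 91.8)]
file_b = [(194.449, 95.7), (194.549, 96.5), (194.649, 105.3)]


def accumulate(files):
    """Mimic the original loop: one shared list that is never cleared."""
    start_list = []
    snapshots = []
    for rows in files:
        start_list.extend(start for start, _ in rows)
        # Copy of what plt.plot() would receive on this pass.
        snapshots.append(list(start_list))
    return snapshots


def backward_jumps(xs):
    """Indices where x decreases, i.e. where the line is drawn back to the left."""
    return [i for i in range(1, len(xs)) if xs[i] < xs[i - 1]]


snapshots = accumulate([file_a, file_b])
print(len(snapshots[0]), len(snapshots[1]))  # 3 6
print(backward_jumps(snapshots[1]))          # [3]
```

The second pass hands six x values to the plot call, and the jump at index 3 is where the line runs from the end of the first file back to the start of the second.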

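As it happens, pyplot keeps the lines already drawn on an axes when you plot again, so any additional plots you make on a figure won't overwrite the old ones. Older releases exposed this through a hold function; it was deprecated in Matplotlib 2.0 and removed in 3.0, where holding is always on. You don't even need to generate your own color sequence. pyplot will do that for you too.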

While your program is functionally sound, there are also a few "stylistic" issues in your code that can be easily corrected. They are un-Pythonic at best and actually problematic at worst:

  1. Files should be closed after being opened. A context manager, used via the with statement, is the standard approach for this.
  2. There are better ways to copy the result of os.listdir than a for loop. In fact, you don't need to copy the list at all.
  3. If you are writing a while loop that increments an index on every iteration, it should be a for loop.
  4. You never need a continue at the end of a loop. It is implied.
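The four points above can be sketched with the standard library alone. This is only an illustration under a few assumptions: the function name `load_columns`, the `.csv` filter, and the simplified `IRI` column are all hypothetical and not taken from your files. It opens files with `newline=''`, which is what the csv documentation recommends on Python 3; the `'rU'` mode used in the question dates from Python 2 and was removed in Python 3.11.

```python
import csv
import os
import tempfile


def load_columns(directory):
    """Return {filename: (starts, values)} for every .csv file in directory."""
    data = {}
    # Iterate over os.listdir() directly: no copy, no index counter.
    for filename in sorted(os.listdir(directory)):
        if not filename.endswith('.csv'):  # hypothetical filter
            continue
        path = os.path.join(directory, filename)
        # The with statement closes the file even if parsing fails.
        with open(path, newline='') as handle:
            rows = list(csv.DictReader(handle))
        starts = [float(row['Start-Mi']) for row in rows]
        values = [float(row['IRI']) for row in rows]
        data[filename] = (starts, values)
    return data


# Small demonstration with two throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    for name, value in (('a.csv', '75.1'), ('b.csv', '79.9')):
        with open(os.path.join(tmp, name), 'w', newline='') as handle:
            handle.write('Start-Mi,IRI\n194.449,' + value + '\n')
    result = load_columns(tmp)

print(result)
```

Each file gets its own fresh pair of lists, so nothing carries over from one file to the next.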

So here is a solution that combines all of the above. This version assumes that you do not need to keep the contents of a given file around after you plot it:

import csv
import os

import matplotlib.pyplot as plt


def graphWriterIRIandRut():
    # Set up the plots
    plt.subplot(2, 1, 1)
    plt.grid(True)
    plt.ylabel('IRI value', fontsize=12)
    plt.title('Right IRI data per mile for 2016 calibrations:')
    plt.tick_params(axis='both', which='major', labelsize=8)
    # No plt.hold(True) call: Matplotlib 2.0+ always keeps existing lines (hold was removed in 3.0)

    plt.subplot(2, 1, 2)
    plt.grid(True)
    plt.ylabel('IRI value', fontsize=12)
    plt.title('Left IRI data per mile for 2016 calibrations:')
    plt.tick_params(axis='both', which='major', labelsize=8)

    # Iterate over the files in the current directory
    for filename in os.listdir(os.getcwd()):
        # Initialize a new set of lists for each file
        startList = []
        endList = []
        iriRList = []
        iriLList = []

        # Load the file
        with open(filename, 'r') as file:
            for row in csv.DictReader(file):
                startList.append(float(row['Start-Mi']))
                endList.append(float(row[' End-Mi']))
                iriRList.append(float(row[' IRI R e']))
                iriLList.append(float(row['   IRI LWP']))

        # Add new data to the plots
        plt.subplot(2, 1, 1)
        plt.plot(startList, iriRList)
        plt.subplot(2, 1, 2)
        plt.plot(startList, iriLList)

    plt.show()
    plt.close('all')

Running this function on the inputs you provided yields the following figure:

enter image description here
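The column keys used in the function above have to match the header row exactly, including any stray spaces, which is fragile. One possible way around that, sketched here as an assumption rather than something your files require, is to let the csv module skip the whitespace after each delimiter and then strip the header names. The helper name `open_reader` is made up; `SAMPLE` is the first two rows of your first data set.

```python
import csv
import io

SAMPLE = """Start-Mi, End-Mi,   IRI LWP, IRI R e
194.449,    194.549,    75.1,   92.3
194.549,    194.649,    85.2,   82.8
"""


def open_reader(handle):
    """Return a DictReader whose keys have no leading or trailing spaces."""
    # skipinitialspace drops whitespace that directly follows a delimiter.
    reader = csv.DictReader(handle, skipinitialspace=True)
    # Stripping also removes any trailing spaces left in the header names.
    reader.fieldnames = [name.strip() for name in (reader.fieldnames or [])]
    return reader


rows = list(open_reader(io.StringIO(SAMPLE)))
print(sorted(rows[0]))
print(float(rows[0]['IRI LWP']), float(rows[1]['IRI R e']))
```

With the keys normalized, the lookups become `row['End-Mi']`, `row['IRI LWP']`, and so on, regardless of how the header was padded.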

For a more efficient way to work with CSVs and tabular data in general, you may want to check out the pandas library. It is a really powerful tool for analysis which includes plotting and IO routines for most of the use-cases you can probably imagine.
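As a rough idea of what that could look like, here is a minimal sketch that loads the first two rows of your sample data with pandas. It assumes pandas is installed; the function name `load_iri` is hypothetical, and the header cleanup is my own assumption about how you would want the columns named.

```python
import io

import pandas as pd

SAMPLE = """Start-Mi, End-Mi,   IRI LWP, IRI R e
194.449,    194.549,    75.1,   92.3
194.549,    194.649,    85.2,   82.8
"""


def load_iri(source):
    """Read one data file (path or file-like object) into a DataFrame."""
    # skipinitialspace drops the padding after each comma.
    df = pd.read_csv(source, skipinitialspace=True)
    # Strip any remaining whitespace from the column names.
    df.columns = df.columns.str.strip()
    return df


df = load_iri(io.StringIO(SAMPLE))
print(df.columns.tolist())
print(df)
```

From there, something like `df.plot(x='Start-Mi', y='IRI R e')` should draw one file's data, with Matplotlib doing the rendering underneath.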

Mad Physicist
  • You may want to change the column headings in my answer. They may or may not be accurate for your case, but I had to change a couple of them to get it to run with your inputs. – Mad Physicist Sep 08 '16 at 15:12
  • I'm getting a strange error message: Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode? –  Sep 08 '16 at 15:21
  • I changed `with open(filename, 'r') as file:` to `with open(filename, 'rU') as file:` and that did the trick. –  Sep 08 '16 at 15:27
  • Otherwise, thank you for your help and information. I'm always willing to learn more and become better at this skill. –  Sep 08 '16 at 15:28
  • Yeah, sounds like you need the `U`. Are you running on Windows perhaps? – Mad Physicist Sep 08 '16 at 16:25
  • Yes, I'm running on Windows 7 –  Sep 08 '16 at 16:27