
I didn't really find an example related to my question, and I don't know Pandas, so I'm posting it here. Let me know if this isn't clear or has already been answered.

I have a CSV file which I import like this:

import csv

def import_csv(csvfilename):
    data = []
    with open(csvfilename, "r", encoding="utf-8", errors="ignore") as scraped:
        reader = csv.reader(scraped, delimiter=',')
        row_index = 0
        for row in reader:
            if row:  # avoid blank lines
                row_index += 1
                columns = [str(row_index), row[0], row[1], row[2]]
                data.append(columns)

    return data

I index rows with row_index (there is probably a better way to do this?).

Input example :

[['1',
  '[FirstValue]',
  'FirstText',
  'AB'],

 [...]

 ['12',
  "['LastValue']",
  "LastText",
  'YZ']]

I'm looking to get the last row of this input list. Is there a simple way to do that without iterating over all the rows?

Thank you !

Charles Landau
SidGabriel

4 Answers


You could actually get the last line within your with statement. Since Python supports negative indexing, you could just use:

with open(csvfilename, "r", encoding="utf-8", errors="ignore") as scraped:
    final_line = scraped.readlines()[-1]
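Note that readlines() reads the whole file into memory, and final_line is the raw text of the line, not a parsed row. A small sketch (with a made-up sample line) showing how the csv module can split it for you, handling any quoted fields:

```python
import csv

# Hypothetical raw line, as scraped.readlines()[-1] would return it
final_line = "12,\"['LastValue']\",LastText,YZ\n"

# Feed the single line back through csv.reader so quoting is handled
final_row = next(csv.reader([final_line]))
print(final_row)  # ['12', "['LastValue']", 'LastText', 'YZ']
```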
        
Pavel Halko

You can get the last element of a list like so:

some_list[-1]

In fact, you can do much more with this syntax: some_list[-n] gets the nth-to-last element. So some_list[-1] gets the last element, some_list[-2] gets the second-to-last, and so on.
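For instance:

```python
some_list = ['a', 'b', 'c', 'd']

print(some_list[-1])  # last element: 'd'
print(some_list[-2])  # second-to-last: 'c'
print(some_list[-4])  # fourth-to-last, i.e. the first element here: 'a'
```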

So in your case, it would be:

import csv    

def import_csv(csvfilename):
    data = []
    with open(csvfilename, "r", encoding="utf-8", errors="ignore") as scraped:
        reader = csv.reader(scraped, delimiter=',')
        row_index = 0
        for row in reader:
            if row:  # avoid blank lines
                row_index += 1
                columns = [str(row_index), row[0], row[1], row[2]]
                data.append(columns)
    return data

data = import_csv(file_name)
last_row = data[-1]
Raoslaw Szamszur
    `csv.reader` is an iterator and cannot be subscripted like that. – r.ook Nov 26 '18 at 14:41
  • @Idlehands You're right, good catch. I've corrected the answer. – Raoslaw Szamszur Nov 26 '18 at 14:47
  • Odd that it locked my vote in even after your edit... If you would edit the answer again I would gladly retract my downvote. – r.ook Nov 26 '18 at 14:50
  • 1
    Thank you very much for your answer ! I had a mistake while doing reader[-1] but not its ok! :) – SidGabriel Nov 26 '18 at 15:00
  • 1
    Doesn't this mean you have to load the entire file? This probably isn't a good approach for huge csv files as that data array eventually holds the entire csv file. A 30mb csv file will use 30mb of memory when loaded in the data list. The csv reader returns an iterator object for memory optimization. I'm guessing a more manual approach like this https://stackoverflow.com/a/2138894/3011082 would preserve the memory advantages of an iterator. – www139 Aug 15 '20 at 22:48
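Following up on that comment, here is a rough sketch of the seek-from-the-end technique from the linked answer: jump near the end of the file and walk backwards until the previous newline, so only the final line is ever read. The function name is made up for illustration, and a file ending in a blank line would return that blank line as-is.

```python
import os

def last_line(path):
    """Return the last line of a file without loading the whole file."""
    with open(path, "rb") as f:
        try:
            f.seek(-2, os.SEEK_END)        # start just before the final byte
            while f.read(1) != b"\n":      # walk backwards to the previous newline
                f.seek(-2, os.SEEK_CUR)
        except OSError:                    # fewer than two bytes, or no newline at all
            f.seek(0)
        return f.readline().decode("utf-8", errors="ignore")
```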

Python supports negative indexing.

your_list[-1] # Fetch the last value in your list.

Without knowing your use case, your data, and how you typically access it, it's hard to say which data structure would suit you better.

Charles Landau

It's worth noting that csv.reader is an iterator and doesn't contain your data until it is iterated through. The same is true for the scraped file object, which is also an iterator. So the answer to "Can I get the last row without iterating through the file/data?" is unfortunately no: unless there is a specific stream position you know you can jump to (using scraped.seek()), you will not be able to get the very last row until all the data has been iterated through.

Once you have consumed all the data, however, you can retrieve the last row in your code with data[-1], i.e. by negative indexing into the list.

Here is a related question that might be of interest to you, but again, the answers there all consume the data (reading it in its entirety into a list) before allowing the reverse() operation, hence all the data must be read through at least once.
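For completeness, here is a memory-friendly sketch of that consumption step. It still iterates the whole file, as described above, but uses the stdlib collections.deque with maxlen=1 so only the most recent row is kept in memory while the reader is drained. The function name is made up for illustration, and blank lines would need the same "if row" filter as the question's code.

```python
import csv
from collections import deque

def last_csv_row(csvfilename):
    with open(csvfilename, "r", encoding="utf-8", errors="ignore") as f:
        reader = csv.reader(f, delimiter=',')
        # deque with maxlen=1 drains the iterator but keeps only the final row
        tail = deque(reader, maxlen=1)
    return tail[0] if tail else None
```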

r.ook