
I have a while loop that works well (it does what it's supposed to). When I run the loop and have it write its generated dataframe to a .csv file, it works with no problem and keeps looping (albeit overwriting the .csv file each time).

But I'm trying to figure out how to write the df to a new file (with a variable generated name) each time the loop runs. I can't seem to get this figured out.

Does anyone have a suggestion?

 f = open('ActionLogLinks.csv')
    csv_f = csv.reader(f)
    action_log_links = []
    for column in csv_f:
      action_log_links.append(column[1])
    cand_ref = []
    for column in csv_f:
      cand_ref.append(column[0])
    position = 0
    while position <len(action_log_links):
        browser.get(action_log_links[position])        
        for cand in cand_ref:
            filename="Loop_Test"+"_"+str(cand_ref)+".csv"
            df.to_csv(filename, index=False)
        position = position + 1
TardisPilot
  • The code you've posted should work, assuming that the values in `cand_ref` are not all the same string. Please [edit] your question to include what that list looks like at the point the `to_csv()` loop happens, and describe how the output is different from what you expect – G. Anderson Nov 13 '19 at 17:29
  • Can you reduce that to a [mcve] with only code relevant to your question : `trying to figure out how to write the df to a new file (with a variable generated name)` – wwii Nov 13 '19 at 17:33
  • Possible duplicate of [Creating a new file, filename contains loop variable, python](https://stackoverflow.com/questions/12560600/creating-a-new-file-filename-contains-loop-variable-python) – wwii Nov 13 '19 at 17:35
  • The problem is it's not currently generating anything. It's not outputting the dataframe to a .csv file at all. Which is part of my confusion here. I will try to get mre code put in. – TardisPilot Nov 13 '19 at 17:45

2 Answers


A common solution I use is to add a timestamp to the filename when creating the output. Since you are pretty happy with the rest of the code, you will only need to update the section where you write the file, although you can put the import at the top of your script.

However, after a close read, I think the issue may be with how you loop over the iterable: inside the loop you use `cand_ref` (the whole list) instead of `cand`, the variable that takes each value in turn.

Notice I changed your string concatenation to use the newer "f-string" format:

    from time import strftime

    for cand in cand_ref:
        filename = f"Loop_Test_{cand}_{strftime('%Y%m%d-%H%M%S')}.csv"
        df.to_csv(filename, index=False)
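One more thing that may explain why no file is written at all, assuming the question's code runs as posted: `csv.reader` is a one-pass iterator, so a second `for column in csv_f` loop has nothing left to read and `cand_ref` ends up empty. Here is a minimal sketch that fills both lists in a single pass. The sample rows and the `io.StringIO` stand-in for `ActionLogLinks.csv` are made up for illustration.

```python
import csv
import io

# Hypothetical stand-in for ActionLogLinks.csv:
# column 0 = candidate ref, column 1 = action log link.
sample = io.StringIO("1001,http://example.com/a\n1002,http://example.com/b\n")

reader = csv.reader(sample)
cand_ref = []
action_log_links = []
for row in reader:
    # Collect both columns while the reader still has rows.
    cand_ref.append(row[0])
    action_log_links.append(row[1])

# A second pass over the same reader yields nothing, because it is exhausted.
leftover = [row for row in reader]

print(cand_ref)
print(action_log_links)
print(leftover)
```

If `cand_ref` is empty, the `for cand in cand_ref` loop body never runs, which would match the "no file at all" symptom.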
mgrollins
  • Thank you for the comment! I replaced my code with what you provided and ran it, but the problem is it's not actually writing out a file at all. Could I be missing something? – TardisPilot Nov 13 '19 at 17:47
  • Are you sure there are still rows/data in `df` when you write the file? Maybe print the first few rows to ensure you have data? – mgrollins Nov 13 '19 at 17:56
  • Yes, when I just write the df to a .csv file on its own it works fine (it just gets overwritten each iteration). The issue I'm running into is trying to write a new file each iteration of the loop (with a dynamic filename from the cand_ref list). – TardisPilot Nov 13 '19 at 17:58
  • So I ran the code without trying to add the cand_ref and it works fine. Though I would really like to figure out how to get the cand_ref dynamically added to the filename in the loop. I'm not sure if my indents are off or what the problem might be. I appreciate all the help! – TardisPilot Nov 13 '19 at 18:11
  • Ah, I see the issue now! You are iterating over `cand_ref`, but you don't want to use the whole list IN the loop; use the loop variable `cand` instead. – mgrollins Nov 13 '19 at 23:48
  • I've updated my answer to fix that issue in my example code, and to mention the issue in the body of my reply. Also, thank you for editing original to give easier to read example! – mgrollins Nov 13 '19 at 23:51
  • Ok thanks! I will test it out. Quick question. I might have the indents wrong where does this get placed in relation to the while loop? – TardisPilot Nov 14 '19 at 14:43
  • I think you should be able to just copy/paste it over the existing code from the sample above. I did notice that you have an indent in the above after the 1st line which doesn't look right. Opening a file isn't indented unless you are using a context manager, which you did not. – mgrollins Nov 14 '19 at 22:39
  • So I copied and pasted and ran my code. The problem is it will output the dataframe until all of the cand values are used up. I just want it to pull one cand value for each while iteration. Is there something I need to add to my for loop to just take 1 value per while pass? – TardisPilot Nov 15 '19 at 20:07
  • It sounds like you don't need the for loop then? Since you have the position variable, you could use it to index into the `cand_ref` list instead of looping. Are you trying to do something like `cand_ref[position]` on each pass through the while loop? – mgrollins Nov 15 '19 at 21:13
  • Yes I am, actually. For each while loop pass I need the cand_ref element to get added to the dataframe and the filename. So the first url is pulled and the first cand# is appended/added, then the 2nd url is pulled and the 2nd cand# is added, and so on. I'm just not sure how to incorporate this into my while loop – TardisPilot Nov 15 '19 at 21:15
  • @mrgrollins, based on your comment I was able to figure it out and now it works perfectly!! Thank you so much!! I just had to add cand_ref[position]: `df['Candidate Ref Number'] = (cand_ref[position])` and `filename=f"Action_Log_{cand_ref[position]}_{strftime('%Y%m%d-%H%M%S')}.csv"` – TardisPilot Nov 15 '19 at 21:51
  • That's great! Hope you enjoy working with Python. Once you get the hang of how to use lists and indexing, you'll start appreciating how the language really helps. For example, instead of your "position" variable, you can use the `enumerate` function - read about it when you have some time. Should we edit the q & a to better reflect your need, or is it good as is? – mgrollins Nov 15 '19 at 22:32
  • It's answered now for sure. I just didn't need to use the `for` statement, rather just used position underneath the `while` statement. – TardisPilot Nov 15 '19 at 22:55
  • Great! Would you mind please marking this as the answer? – mgrollins Nov 15 '19 at 23:39
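
The resolution reached in the comments above (one candidate value per pass, no inner `for` loop) could be sketched roughly as follows. This is only an illustration: `build_filename` is a hypothetical helper, the sample lists are made up, and the browser and dataframe steps are left as comments because they depend on the asker's setup. `zip` is used here in place of the `position` counter; `enumerate` would work in a similar way.

```python
from time import strftime


def build_filename(cand, prefix="Action_Log"):
    """Return a name like Action_Log_<cand>_<timestamp>.csv (hypothetical helper)."""
    return f"{prefix}_{cand}_{strftime('%Y%m%d-%H%M%S')}.csv"


# Made-up sample data standing in for the two columns read from the CSV.
cand_ref = ["1001", "1002"]
action_log_links = ["http://example.com/a", "http://example.com/b"]

filenames = []
# zip pairs each link with its candidate ref, so one value is used per pass.
for link, cand in zip(action_log_links, cand_ref):
    # browser.get(link)                    # as in the question
    # df["Candidate Ref Number"] = cand    # as in the asker's fix
    filename = build_filename(cand)
    # df.to_csv(filename, index=False)     # one file per pass
    filenames.append(filename)

print(filenames)
```

Because the candidate ref is part of the name, each pass gets its own file even when two passes fall within the same second.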

You can use the following code to create and open a new file with a random name for writing.

You can also go the timestamp route; in that case, make sure the precision of your timestamp is fine enough that no two adjacent iterations of your loop generate the same value.

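import random
import string

# Build a random 10-letter name. In Python 3 the constant is
# string.ascii_lowercase (string.lowercase existed only in Python 2).
name = "".join(random.choice(string.ascii_lowercase) for _ in range(10)) + ".csv"
f = open(name, "w+")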
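
For the timestamp route, one way to keep adjacent iterations from colliding is to include microseconds and the loop index in the name. This is a sketch under that assumption, with `unique_name` as a hypothetical helper rather than anything from the question:

```python
from datetime import datetime


def unique_name(index, prefix="Loop_Test"):
    """Combine the loop index with a microsecond timestamp (hypothetical helper)."""
    # %f (microseconds) is supported by datetime.strftime.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    return f"{prefix}_{index:04d}_{stamp}.csv"


# Generate names for five consecutive iterations.
names = [unique_name(i) for i in range(5)]
print(names)
```

The index alone already makes the names distinct within one run; the timestamp keeps separate runs from overwriting each other.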
D. Tastet