
I'm about to pull my hair out on this. I'm not sure why the index in my array is not being applied to the second column.

I created this array, project_information, by appending one row per project:

    project_information.append([proj_id,project_text])

When I print this out, I get the rows and columns. It contains about 40 rows.

When I iterate through it to print out the contents, everything comes out fine. I am using this:

    for i in range(0, len(project_information)):
        project_id = project_information[i][0]
        project_text = project_information[i][1]
        print(project_id)
        print(project_text)

The project_text column contains text, while the project_id column contains integers. Both print out perfectly, and the index changes for both project_id and project_text.

However, I need to use the project_text in a different way, and I am really struggling with this. I need to slice the text to a shorter text for reuse. To do this, I tried:

    for i in range(0, len(project_information)):
        project_id = project_information[i][0]
        project_text = project_information[i][1]
        print(project_id)
        print(project_text)

        # Shorten long texts; shorter ones are kept whole
        if len(project_text) > 5000:
            trunc_proj_text = project_text[:1000]
        else:
            trunc_proj_text = project_text

        print(project_id)
        print(trunc_proj_text)

The problem is that although the project_id column is iterated through properly, the project_text column is not. What I get is the text from the first row, sliced, and repeated once for every row in the array.

I have tried different approaches, including a while loop, but it still does not work.

I've also looked at these answers for reference: "Slicing, indexing and iterating over 2D Numpy arrays", "Efficient iteration over slice in Python", and "iteration over list slices". I can't see how they apply to my problem.

I'm not well-versed in Numpy, so is this something it could help with? I'm well aware this might be simple and that I'm missing it because I've been working on various aspects of this project for the past few weeks, so I would appreciate a bit of consideration.

Thanks in advance.

Rose_Trojan
  • Can you provide sample input, the expected output, and the output you are getting? – BhusalC_Bipin Jun 08 '22 at 15:07
  • I tried your code with this sample input, ```project_information = [[1, 'abcdefgh'], [2, 'ijklmnop'], [3, 'qrtsvuwxyz']]```, then checked ```len(project_text) > 4``` and, if true, took ```project_text[:4]```. It worked fine for me, so I would like to see a sample of your input and output. – BhusalC_Bipin Jun 08 '22 at 15:20
  • This needs a [mcve]. The previous commenter created his own sample, and it worked. We should not have to guess anything about the inputs, regardless of whether we get it right or wrong. – hpaulj Jun 08 '22 at 16:15
  • @hpaulj you are not asked to guess. I will reply to his comment in a bit. I'm not sure at what point I asked you to guess. I explained the structure of the array. – Rose_Trojan Jun 09 '22 at 09:03
  • @BhusalC_Bipin the file I'm using is extremely large, so I can't share a true sample here. I will print it out again to a text file and have another look at the structure itself. If it works in your case, it might be down to how the list is configured. I will post a trimmed-down, representative section of the data as soon as I've looked at the structure again. Thanks! – Rose_Trojan Jun 09 '22 at 09:08
  • @BhusalC_Bipin I had to use a different, larger system to view the file, and I found that the problem was with the input text. The code is indeed correct, but with each list entry the same text was being rewritten and the additional text concatenated onto the end. So the slicing was in fact correct, but because I could only see the end of the text when viewing it in a console, each entry looked different, and I didn't notice this. I've now fixed the code that creates the input file. Thanks! – Rose_Trojan Jun 09 '22 at 13:31

1 Answer


The problem was with the input list, not the loop, so the slicing code above does in fact work. The original code that built the input list was concatenating each new string onto the previous ones, so every project_text started with the same text and differed only at the end. Viewing the entries in a console, where only the end of each long text was visible, made this hard to see. The code that creates the input list has now been fixed.
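As a rough illustration of the failure mode described here, the sketch below rebuilds a small list both ways. The sample rows and the helper names (`build_buggy`, `build_fixed`, `truncate`, `shared_prefix_count`) are hypothetical, since the real input-building code was not posted, and the truncation is simplified to a plain slice with a small limit. It assumes the bug was a text accumulator that was never reset between rows.

```python
# Hypothetical sample rows; the real data was too large to share.
raw_rows = [(1, "alpha text"), (2, "beta text"), (3, "gamma text")]


def build_buggy(rows):
    """Assumed form of the bug: the text variable is never reset per row."""
    project_information = []
    project_text = ""
    for proj_id, text in rows:
        project_text += text  # keeps growing, so every row starts the same
        project_information.append([proj_id, project_text])
    return project_information


def build_fixed(rows):
    """Each row stores only its own text."""
    return [[proj_id, text] for proj_id, text in rows]


def truncate(project_information, limit=5):
    """Keep at most `limit` leading characters of each text."""
    return [[proj_id, text[:limit]] for proj_id, text in project_information]


def shared_prefix_count(project_information, n=5):
    """Count rows whose first n characters match those of the first row."""
    first = project_information[0][1][:n]
    return sum(1 for _, text in project_information if text[:n] == first)


buggy = build_buggy(raw_rows)
fixed = build_fixed(raw_rows)

print(truncate(buggy))  # every row shows the same leading slice
print(truncate(fixed))  # each row shows its own leading slice
print(shared_prefix_count(buggy), shared_prefix_count(fixed))
```

A check like `shared_prefix_count` compares the beginnings of the texts rather than the ends, which is the part a console view of very long strings tends to hide.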

Rose_Trojan