-2

I have a file with a lot of images. Each image is named something like:

  • 100304.jpg
  • 100305.jpg
  • 100306.jpg
  • etc...

I also have a spreadsheet, Each image is a row, the first value in the row is the name, the values after the name are various decimals and 0's to describe features of each image.

The issue is that when I pull the name from the sheet, something is adding a decimal which then results in the file not being able to be transferred via the shutil.move()

import xlrd
import shutil

dataLocation = "C:/Users/User/Documents/Python/Project/sort_solutions_rev1.xlsx"
imageLocBase = "C:/Users/User/Documents/Python/Project/unsorted"
print("Specify which folder to put images in. Type the number only.")
print("1")
print("2")
print("3")
int(typeOfSet) = input("")

#Sorting for folder 1
if int(typeOfSet) == 1:
#Identifying what to move
        name = str(sheet.cell(int(nameRow), 0).value)
        sortDataStorage = (sheet.cell(int(nameRow), 8).value) #float
        sortDataStorageNoFloat = str(sortDataStorage) #non-float
        print("Proccessing: " + name)
        print(name + " has a correlation of " + (sortDataStorageNoFloat))

#sorting for this folder utilizes the information in column 8)
        if sortDataStorage >= sortAc:
                print("test success")
                folderPath = "C:/Users/User/Documents/Python/Project/Image Folder/Folder1"
                shutil.move(imageLocBase + "/" + name, folderPath)
                print(name + " has been sorted.")
        else:
                print(name + " does not meet correlation requirement. Moving to next image.")

The issue I'm having occurs with the shutil.move(imageLocBase + "/" +name, folderPath) For some reason my code takes the name from the spreadsheet (ex: 100304) and then adds a ".0" So when trying to move a file, it is trying to move 100304.0 (which doesn't exist) instead of 100304.

jonrsharpe
  • 115,751
  • 26
  • 228
  • 437
  • Welcome to StackOverflow. See [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). We cannot effectively help you until you post your MRE code and accurately specify the problem. We should be able to paste your posted code into a text file and reproduce the problem you specified. – Prune Feb 12 '20 at 22:50
  • Strings are immutable, but you can [create new strings](https://stackoverflow.com/a/3559600) with characters removed: `name_cleaned = name.replace(".", "")`. You could also split on the decimal: `name_cleaned = name.split(str=".")[0]`. – Reticulated Spline Feb 12 '20 at 22:51
  • 1
    As an aside - it might be helpful to use the `pandas.read_excel()` function - [docs here](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html). This will store your Excel data in a `DataFrame` which might help with your data typing issue, and be easier to iterate. – S3DEV Feb 12 '20 at 22:55
  • 1
    @S3DEV I think that is actually the right move here, consider adding an answer. – Reticulated Spline Feb 12 '20 at 22:58
  • Thanks, I'll definitely look into pandas. – EnigmaticFractal Feb 12 '20 at 23:00

5 Answers5

1

Using pandas to read your Excel file.

As suggested in a comment on the original question, here is a quick example of how to use pandas to read your Excel file, along with an example of the data structure.

Any questions, feel free to shout, or have a look into the docs.

import pandas as pd

# My path looks a little different as I'm on Linux.
path = '~/Desktop/so/MyImages.xlsx'

df = pd.read_excel(path)

Data Structure

This is completely contrived as I don't have an example of your actual file.

    IMAGE_NAME  FEATURE_1   FEATURE_2   FEATURE_3
0   100304.jpg  0.0111      0.111       1.111
1   100305.jpg  0.0222      0.222       2.222
2   100306.jpg  0.0333      0.333       3.333

Hope this helps get you started.

Suggestion:

Excel likes to think it's clever and does 'unexpected' things, as you're experiencing with the decimal (data type) issue. Perhaps consider storing your image data in a database (SQLite) or as plain old CSV file. Pandas can read from either of these as well! :-)

S3DEV
  • 8,768
  • 3
  • 31
  • 42
0
splitOn = '.'
nameOfFile = text.split(splitOn, 1)[0]

Should work

if we take your file name eg 12345.0 and create a var

name = "12345.0"

Now we need to split this var. In this case we wish to split on . So we save this condition as a second var

splitOn = '.'  

Using the .split for python. Here we offer the text (variable name) and the python split command. so to make it literal

12345.0
split at .
only make one split and save as two vars in a list 
(so we have 12345 at position 0 (1st value) 
and 0 at position 1 (2nd value) in a list)

save 1st var
(as all lists are 0 based we ask for [0] 
(if you ever get confused with list, arrays etc just start counting
from 0 instead of one on your hands and then you know 
ie position 0 1 2 3 4 = 1st value, 2nd value, 3rd value, 4th value, 5th value)

nameOfFile = name.split(splitOn, 1)[0] 
             12345.0 split ( split on . , only one split ) save position 0 ie first value

So.....

name = 12345.0
splitOn = '.'
nameOfFile = name.split(splitOn, 1)[0]   
yield(nameOfFile)

output will be

12345

I hope that helps https://www.geeksforgeeks.org/python-string-split/

OR

as highlighted below, convert to float to in https://www.geeksforgeeks.org/type-conversion-python/

if saved as float

name 12345.0
newName = round(int(name))

this will round the float (as its 0 will round down)

OR if float is saved as a string

print(int(float(name)))
BobMonk
  • 178
  • 1
  • 10
  • As the OP states, they are a new programmer. Please expand this answer to include context to help our new user. Additionally, it might be better to solve the data typing issue at its source rather than patching a funny string later in the code. Just a thought. – S3DEV Feb 12 '20 at 22:57
0

Apparently the value you retrieve from the spreadsheet comes parsed as a float, so when you cast it to string it retains the decimal part.
You can trim the “.0” from the string value, or cast it to integer before casting to string.

You could also check the spreadsheet’s cell format and ensure it is set to normal (idk the setting, but something that is not a number). With that fixed, your data probably wont come with the .0 anymore.

  • @EnigmaticFractal: Have a look at my answer below regarding storing as a CSV file rather than Excel - for this exact reason. – S3DEV Feb 12 '20 at 23:19
0

If always add ".0" to the end of the variable, You need to read the var_string "name" in this way:

shutil.move(imageLocBase + "/" + name[:-2], folderPath)

A string is like a list that we can choose the elements to read. Slicing is colled this method

Sorry for my English. Bye

0

All these people have taken time to reply, please out of politeness rate the replies.

BobMonk
  • 178
  • 1
  • 10