I'm getting WindowsError: [Error 3] The system cannot find the path specified. I'm trying to assign the dataset directory to basepath so the rest of the code can run.

Also, let me know whether there should be a : at the end of this line of code:
for file in sorted(os.listdir(path))
I think it should be:
for file in sorted(os.listdir(path)):
The book doesn't have the : at the end.

import pyprind #INSTALLED IN ANACONDA TERMINAL 
import pandas as pd
import os

# change the 'basepath' to the directory of unzipped movie dataset

#tried:
#basepath = 'C:\\Users\\zacka\\Downloads\\aclImdb_v1.tar.gz'
#basepath = 'C://Users//zacka//Downloads//aclImdb_v1.tar.gz'
#basepath = 'C:/Users/zacka/Downloads/aclImdb_v1.tar.gz'
#basepath = 'C:\Users\zacka\Downloads\aclImdb_v1.tar.gz'
# not sure whether I'm using backslashes or forward slashes incorrectly, or whether I need to double them up


labels = {'pos': 1, 'neg': 0}
pbar = pyprind.ProgBar(50000)
df = pd.DataFrame()
for s in ('test', 'train'):
    for l in ('pos', 'neg'):
        path = os.path.join(basepath, s, l)
        for file in sorted(os.listdir(path))
            with open(os.path.join(path, file), 
                      'r', encoding='utf-8') as infile:
                txt = infile.read()
            df = df.append([[txt, labels[l]]],
                           ignore_index=True)
            pbar.update()
df.columns = ['review', 'sentiment']
Mayank Patel
manicsurfing

1 Answer

basepath = 'C:\\Users\\zacka\\Downloads\\aclImdb'

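The double backslash is necessary, especially in '...Downloads\\aclImdb'. When I tried print(basepath) without the double backslash, the \a at the start of \aclImdb was read as an escape sequence and came out as the 0x07 (bell) character instead of a backslash followed by the letter a.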
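To make the escape problem visible, here is a minimal sketch. It only compares strings and does not touch the file system; the path is the one from the question, and ntpath is used just to show that the forward-slash spelling normalizes to the same Windows path.

```python
import ntpath

# '\a' in an ordinary string literal is the BEL control character (0x07),
# so the backslash and the letter 'a' are replaced by a single byte.
broken = 'Downloads\aclImdb'
print(repr(broken))

# Three spellings that keep the path intact:
doubled = 'C:\\Users\\zacka\\Downloads\\aclImdb'   # escaped backslashes
raw = r'C:\Users\zacka\Downloads\aclImdb'          # raw string literal
forward = 'C:/Users/zacka/Downloads/aclImdb'       # forward slashes

print(doubled == raw)
print(ntpath.normpath(forward) == raw)
```

A raw string is probably the least error-prone choice here, since it reads exactly like the path shown in Explorer.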

Also, I was setting basepath to the zipped archive (aclImdb_v1.tar.gz) rather than to the unzipped folder.
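If the archive has not been unpacked yet, it can be extracted from Python with the standard tarfile module. This is only a sketch: extract_dataset is a hypothetical helper name, and the paths in the commented example are the ones from the question, assuming the archive's top-level folder is aclImdb.

```python
import os
import tarfile


def extract_dataset(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return dest_dir."""
    with tarfile.open(archive_path, 'r:gz') as archive:
        archive.extractall(dest_dir)
    return dest_dir


# Example call (paths taken from the question, so they are assumptions here):
# extract_dataset(r'C:\Users\zacka\Downloads\aclImdb_v1.tar.gz',
#                 r'C:\Users\zacka\Downloads')
# basepath = r'C:\Users\zacka\Downloads\aclImdb'
```

After extraction, basepath should point at the folder that contains the test and train subfolders, not at the .tar.gz file.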

Now I need to figure out this error, raised for encoding='utf-8': TypeError: 'encoding' is an invalid keyword argument for this function
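For what it's worth, that TypeError is what the built-in open() raises on Python 2, where it has no encoding parameter (WindowsError is also a Python 2 name, so the script is most likely running under Python 2). One way around it is io.open, which accepts encoding= on both Python 2 and Python 3. A sketch of the reading loop under that assumption follows; read_reviews is a hypothetical helper, and it collects rows in a plain list instead of calling DataFrame.append on every file. Note that the for line does need its trailing colon.

```python
import io
import os

labels = {'pos': 1, 'neg': 0}


def read_reviews(basepath):
    """Return a list of [text, label] rows read from basepath/<split>/<pos|neg>/."""
    rows = []
    for s in ('test', 'train'):
        for l in ('pos', 'neg'):
            path = os.path.join(basepath, s, l)
            for file in sorted(os.listdir(path)):  # the trailing colon is required
                # io.open accepts encoding= on both Python 2 and Python 3
                with io.open(os.path.join(path, file),
                             'r', encoding='utf-8') as infile:
                    rows.append([infile.read(), labels[l]])
    return rows
```

The result can then be turned into a frame in one step with pd.DataFrame(rows, columns=['review', 'sentiment']).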

manicsurfing