
I have referred to this, but it did not help, so please don't mark this as a duplicate. I am trying to count the syllables per word for each text file in my output directory. I want to insert a list of syllable counts for each file into a data frame.

My approach:

directory = r"..\output" 
result = []
i = 0
for filename in os.listdir(directory):
    if filename.endswith('.txt'):
        filepath = os.path.join(directory, filename)
    with open(filepath, 'rb') as f:
            encoding = chardet.detect(f.read())['encoding']
    with open(filepath, 'r', encoding=encoding) as f:
            text = f.read()
            words = text.split()
            for word in words:
                result.append(count_syllables(word))
    results.at[i,'SYLLABLE PER WORD'] = result
    i += 1

I am getting the following error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~\AppData\Roaming\Python\Python39\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3801             try:
-> 3802                 return self._engine.get_loc(casted_key)
   3803             except KeyError as err:

~\AppData\Roaming\Python\Python39\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

~\AppData\Roaming\Python\Python39\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'SYLLABLE PER WORD'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
~\AppData\Roaming\Python\Python39\site-packages\pandas\core\frame.py in _set_value(self, index, col, value, takeable)
   4209             else:
-> 4210                 icol = self.columns.get_loc(col)
   4211                 iindex = self.index.get_loc(index)

~\AppData\Roaming\Python\Python39\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3803             except KeyError as err:
-> 3804                 raise KeyError(key) from err
   3805             except TypeError:

KeyError: 'SYLLABLE PER WORD'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_19000\1445037766.py in <module>
     12             for word in words:
     13                 result.append(count_syllables(word))
---> 14     results.at[i,'SYLLABLE PER WORD'] = result
     15     i += 1

~\AppData\Roaming\Python\Python39\site-packages\pandas\core\indexing.py in __setitem__(self, key, value)
   2440             return
   2441 
-> 2442         return super().__setitem__(key, value)
   2443 
   2444 

~\AppData\Roaming\Python\Python39\site-packages\pandas\core\indexing.py in __setitem__(self, key, value)
   2395             raise ValueError("Not enough indexers for scalar access (setting)!")
   2396 
-> 2397         self.obj._set_value(*key, value=value, takeable=self._takeable)
   2398 
   2399 

~\AppData\Roaming\Python\Python39\site-packages\pandas\core\frame.py in _set_value(self, index, col, value, takeable)
   4222                 self.iloc[index, col] = value
   4223             else:
-> 4224                 self.loc[index, col] = value
   4225             self._item_cache.pop(col, None)
   4226 

~\AppData\Roaming\Python\Python39\site-packages\pandas\core\indexing.py in __setitem__(self, key, value)
    816 
    817         iloc = self if self.name == "iloc" else self.obj.iloc
--> 818         iloc._setitem_with_indexer(indexer, value, self.name)
    819 
    820     def _validate_key(self, key, axis: int):

~\AppData\Roaming\Python\Python39\site-packages\pandas\core\indexing.py in _setitem_with_indexer(self, indexer, value, name)
   1748                             indexer, self.obj.axes
   1749                         )
-> 1750                         self._setitem_with_indexer(new_indexer, value, name)
   1751 
   1752                         return

~\AppData\Roaming\Python\Python39\site-packages\pandas\core\indexing.py in _setitem_with_indexer(self, indexer, value, name)
   1793         if take_split_path:
   1794             # We have to operate column-wise
-> 1795             self._setitem_with_indexer_split_path(indexer, value, name)
   1796         else:
   1797             self._setitem_single_block(indexer, value, name)

~\AppData\Roaming\Python\Python39\site-packages\pandas\core\indexing.py in _setitem_with_indexer_split_path(self, indexer, value, name)
   1848                     return self._setitem_with_indexer((pi, info_axis[0]), value[0])
   1849 
-> 1850                 raise ValueError(
   1851                     "Must have equal len keys and value "
   1852                     "when setting with an iterable"

ValueError: Must have equal len keys and value when setting with an iterable

I want to iteratively insert the list of syllables per word into a data frame as you can see from my approach.

Here is the link to the code and the dataset I am using: https://github.com/karthikbhandary2/Data-Extraction-and-NLP

Karthik Bhandary
  • You haven't even shown `results` so it's impossible to answer. It also has nothing to do with machine learning or NLP. That might be the context of what you're trying to solve, but it doesn't factor in here – roganjosh Apr 10 '23 at 16:34

2 Answers


You need to set the data type of the column to 'object' first. Otherwise pandas treats the list as several values to assign rather than as a single cell value.

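import pandas as pd

results = pd.DataFrame()
# Create the column with a placeholder value
results.at[0, 'Syllables per Word'] = 0
# Cast to 'object' so that a cell can hold a Python list
results['Syllables per Word'] = results['Syllables per Word'].astype('object')

# Now a list can be stored in a single cell
results.at[0, 'Syllables per Word'] = [1, 1, 1, 2]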
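
As a variation on the same idea, the column can be created with 'object' dtype from the start, with one row per file, so no placeholder value or cast is needed. This is only a minimal sketch: the row count of 3 and the sample list are made up for illustration and are not from the question.

```python
import pandas as pd

# Assumption: three files, so three rows; replace with the real file count.
n_files = 3

# An object-dtype column of None placeholders, one row per file.
results = pd.DataFrame(
    {"SYLLABLE PER WORD": pd.Series([None] * n_files, dtype="object")}
)

# Because the column is object dtype and the row already exists,
# .at stores the whole list in a single cell.
results.at[1, "SYLLABLE PER WORD"] = [1, 2]

print(results)
```

Rows that have not been assigned yet simply keep their None placeholder.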
Michael Cao

You have not defined your dataframe 'results'. I think defining it with the column up front should help you solve your problem:

import pandas as pd

# Define the data frame with the target column before assigning to it
results = pd.DataFrame(columns=["SYLLABLE PER WORD"])
result = [1, 2, 3, 4]
results.loc[0, "SYLLABLE PER WORD"] = result

print(results)

And you'll obtain:

  SYLLABLE PER WORD
0      [1, 2, 3, 4]
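
Another option that avoids cell-by-cell assignment altogether is to collect one row per file in a plain Python list and build the data frame once at the end. The sketch below is only an illustration under stated assumptions: `count_syllables` here is a hypothetical stand-in that counts vowel groups (the question does not show the real one), `syllable_table` and the "FILE" column are invented names, and the texts are passed in as a dict rather than read from disk.

```python
import pandas as pd


def count_syllables(word):
    """Hypothetical stand-in: counts groups of consecutive vowels."""
    vowels = "aeiouy"
    count = 0
    prev_vowel = False
    for ch in word.lower():
        is_vowel = ch in vowels
        if is_vowel and not prev_vowel:
            count += 1
        prev_vowel = is_vowel
    return count


def syllable_table(texts):
    """Build one row per file; each row holds that file's own list."""
    rows = []
    for name, text in texts.items():
        # A fresh list per file, so counts do not accumulate across files.
        counts = [count_syllables(w) for w in text.split()]
        rows.append({"FILE": name, "SYLLABLE PER WORD": counts})
    return pd.DataFrame(rows)


# Made-up sample input standing in for the contents of the .txt files.
texts = {"a.txt": "hello world", "b.txt": "pandas data frame"}
results = syllable_table(texts)
print(results)
```

Building the frame from a list of dicts sidesteps the dtype question, because pandas stores each list as a single object in its row.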
Xenepix