4

The code below raises the error shown. I've read that KeyError: 0 means a dictionary lacks an entry, but I still don't really know what dictionary is involved or how my code is accessing it: I am simply trying to access data in a dataframe. Apparently the problem is that VolValues, a subset of the dataframe, keeps index labels beginning around 23000, whereas I am indexing it with 0, which I thought was Python's "first element" syntax.

Can you tell me what is wrong with the code and how to fix it?

runfile('/Users/daniel/Documents/programming/RectumD2Metrics.py', wdir='/Users/daniel/Documents/programming')
Traceback (most recent call last):

  File "<ipython-input-2-d170dca123d7>", line 1, in <module>
    runfile('/Users/daniel/Documents/programming/RectumD2Metrics.py', wdir='/Users/daniel/Documents/programming')

  File "/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/Users/daniel/Documents/programming/RectumD2Metrics.py", line 37, in <module>
    D2Planned = interpD2('planned',df)

  File "/Users/daniel/Documents/programming/RectumD2Metrics.py", line 30, in interpD2
    if (VolValues[loop] > 2) and (VolValues[loop+1] < 2):

  File "/anaconda3/lib/python3.6/site-packages/pandas/core/series.py", line 766, in __getitem__
    result = self.index.get_value(self, key)

  File "/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3103, in get_value
    tz=getattr(series.dtype, 'tz', None))

  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value

  File "pandas/_libs/index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value

  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item

  File "pandas/_libs/hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item

KeyError: 0

The code:

# Import Rectum DVH
import pandas
import numpy
df = pandas.read_csv('/Users/daniel/Documents/data/DVH/RectumData.csv',
                 delimiter=',',header=0)
# Calculate D_2%, defined by ICRU 78 as "the greatest dose which all but
# 2 percent of a [volume of interest] receives." aka D_{near-max}

def interpD2(disttype,df):
# Loop through all patients' plans.
    Dose2Results = numpy.zeros(40)
    for num in range(0,40): 
# We know a priori that there is no DVH data with Volume = 2. Hence we look for
# the two columns less than and greater than Volume = 2.
        if disttype == 'planned':
            DoseValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == 'planned')].Dose
            VolValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == 'planned')].Volume
        else:
            DoseValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == 'blurred')].Dose
            VolValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == 'blurred')].Volume
        for loop in range(0,len(VolValues)):
            if (VolValues[loop] > 2) and (VolValues[loop+1] < 2):
                LowerVolumeIndex,UpperVolumeIndex = loop,loop+1
                x0,x1,x2 = 2,VolValues[LowerVolumeIndex],VolValues[UpperVolumeIndex]
                y1,y2 = DoseValues[LowerVolumeIndex],DoseValues[UpperVolumeIndex]
                Dose2Results[num] = y1 - ((x1-x0)/(x2 - x1))*(y2 - y1)
    return Dose2Results
D2Planned,D2Blurred = numpy.zeros(40),numpy.zeros(40)
D2Planned = interpD2('planned',df)
D2Blurred = interpD2('blurred',df)

An excerpt of the imported CSV file is at the end of this post.

Attempts to resolve:

  1. Removing the df being passed into the function results in the same error message. (It was originally absent as I thought functions had access to 'global' variables.)

  2. I tried to 'initialize' the variables with zeros.

  3. I explicitly parsed the string using the if blocks trying to resolve the error message, still no change.

  4. Checking the pandas page, I find an update is available. I've installed it via conda install pandas, but still no change in the error message. (Details of this update follow.)

    The following packages will be downloaded:
    
    package                    |            build
    ---------------------------|-----------------
    certifi-2018.4.16          |           py36_0         142 KB
    conda-4.5.8                |           py36_0         1.0 MB
    ------------------------------------------------------------
                                           Total:         1.2 MB
    
    The following packages will be UPDATED:
    
    certifi: 2018.4.16-py36_0 conda-forge --> 2018.4.16-py36_0
    conda:   4.5.6-py36_0     conda-forge --> 4.5.8-py36_0
    

Thank you for your help.

CSV data excerpt including first header line; I have skipped rows to stay within body post character limit, but please be aware DistributionType is listed first, then StudyID. So the numbers go 1,10,11,...,19,2,20,21,... with 'blurred' data preceding 'planned' data.

StudyID,DistributionType,Organ,Dose,Volume,DoseUnit,VolumeUnit
1,blurred,Rectum,0,100,Gy(RBE),%
1,blurred,Rectum,0.1,78.13818,Gy(RBE),%
1,blurred,Rectum,0.2,75.901,Gy(RBE),%
1,blurred,Rectum,0.3,75.01312,Gy(RBE),%
1,blurred,Rectum,0.4,73.38642,Gy(RBE),%
1,blurred,Rectum,0.5,72.36015,Gy(RBE),%
1,blurred,Rectum,0.6,70.81651,Gy(RBE),%
1,blurred,Rectum,7.3,22.60766,Gy(RBE),%
1,blurred,Rectum,7.4,22.4557,Gy(RBE),%
1,blurred,Rectum,7.5,22.31794,Gy(RBE),%
1,blurred,Rectum,7.6,22.19247,Gy(RBE),%
1,blurred,Rectum,7.7,22.09406,Gy(RBE),%
1,blurred,Rectum,32.2,6.99686,Gy(RBE),%
1,blurred,Rectum,32.3,6.96634,Gy(RBE),%
1,blurred,Rectum,32.4,6.94046,Gy(RBE),%
1,blurred,Rectum,32.5,6.89926,Gy(RBE),%
1,blurred,Rectum,32.6,6.85925,Gy(RBE),%
1,blurred,Rectum,32.7,6.83843,Gy(RBE),%
1,blurred,Rectum,32.8,6.8082,Gy(RBE),%
1,blurred,Rectum,32.9,6.76663,Gy(RBE),%
1,blurred,Rectum,33,6.72788,Gy(RBE),%
1,blurred,Rectum,33.1,6.6771,Gy(RBE),%
1,blurred,Rectum,33.2,6.62313,Gy(RBE),%
1,blurred,Rectum,33.3,6.57601,Gy(RBE),%
1,blurred,Rectum,42.5,2.96622,Gy(RBE),%
1,blurred,Rectum,42.6,2.9242,Gy(RBE),%
1,blurred,Rectum,42.7,2.87604,Gy(RBE),%
1,blurred,Rectum,42.8,2.83046,Gy(RBE),%
1,blurred,Rectum,42.9,2.78527,Gy(RBE),%
1,blurred,Rectum,43,2.73564,Gy(RBE),%
1,blurred,Rectum,43.1,2.7077,Gy(RBE),%
1,blurred,Rectum,43.2,2.69686,Gy(RBE),%
1,blurred,Rectum,43.3,2.6505,Gy(RBE),%
1,blurred,Rectum,43.4,2.62119,Gy(RBE),%
1,blurred,Rectum,43.5,2.59528,Gy(RBE),%
1,blurred,Rectum,43.6,2.55359,Gy(RBE),%
1,blurred,Rectum,43.7,2.50786,Gy(RBE),%
1,blurred,Rectum,43.8,2.46692,Gy(RBE),%
1,blurred,Rectum,43.9,2.40788,Gy(RBE),%
1,blurred,Rectum,44,2.37622,Gy(RBE),%
1,blurred,Rectum,44.1,2.34098,Gy(RBE),%
1,blurred,Rectum,44.2,2.30527,Gy(RBE),%
1,blurred,Rectum,44.3,2.26972,Gy(RBE),%
1,blurred,Rectum,44.4,2.2384,Gy(RBE),%
1,blurred,Rectum,44.5,2.20512,Gy(RBE),%
1,blurred,Rectum,44.6,2.14891,Gy(RBE),%
1,blurred,Rectum,44.7,2.12178,Gy(RBE),%
1,blurred,Rectum,44.8,2.06922,Gy(RBE),%
1,blurred,Rectum,44.9,2.02836,Gy(RBE),%
1,blurred,Rectum,45,1.99259,Gy(RBE),%
1,blurred,Rectum,45.1,1.98118,Gy(RBE),%
1,blurred,Rectum,45.2,1.92938,Gy(RBE),%
1,blurred,Rectum,45.3,1.88315,Gy(RBE),%
1,blurred,Rectum,45.4,1.85419,Gy(RBE),%
1,blurred,Rectum,45.5,1.81149,Gy(RBE),%
1,blurred,Rectum,45.6,1.77154,Gy(RBE),%
1,blurred,Rectum,45.7,1.73287,Gy(RBE),%
1,blurred,Rectum,45.8,1.68749,Gy(RBE),%
1,blurred,Rectum,45.9,1.65961,Gy(RBE),%
1,blurred,Rectum,46,1.62265,Gy(RBE),%
1,blurred,Rectum,46.1,1.61065,Gy(RBE),%
1,blurred,Rectum,46.2,1.56712,Gy(RBE),%
1,blurred,Rectum,46.3,1.50282,Gy(RBE),%
1,blurred,Rectum,46.4,1.45122,Gy(RBE),%
1,blurred,Rectum,46.5,1.42696,Gy(RBE),%
1,blurred,Rectum,46.6,1.38877,Gy(RBE),%
1,blurred,Rectum,46.7,1.35886,Gy(RBE),%
1,blurred,Rectum,46.8,1.34022,Gy(RBE),%
1,blurred,Rectum,46.9,1.29308,Gy(RBE),%
1,blurred,Rectum,56.5,NaN,Gy(RBE),%
1,blurred,Rectum,56.6,NaN,Gy(RBE),%
1,blurred,Rectum,56.7,NaN,Gy(RBE),%
1,blurred,Rectum,56.8,NaN,Gy(RBE),%
1,blurred,Rectum,56.9,NaN,Gy(RBE),%
1,blurred,Rectum,57,NaN,Gy(RBE),%
1,blurred,Rectum,57.1,NaN,Gy(RBE),%
1,blurred,Rectum,57.2,NaN,Gy(RBE),%
1,blurred,Rectum,57.3,NaN,Gy(RBE),%
1,blurred,Rectum,57.4,NaN,Gy(RBE),%
1,blurred,Rectum,57.5,NaN,Gy(RBE),%
1,blurred,Rectum,57.6,NaN,Gy(RBE),%
1,blurred,Rectum,57.7,NaN,Gy(RBE),%
1,blurred,Rectum,57.8,NaN,Gy(RBE),%
1,blurred,Rectum,57.9,NaN,Gy(RBE),%
1,blurred,Rectum,58,NaN,Gy(RBE),%
1,blurred,Rectum,58.1,NaN,Gy(RBE),%
1,blurred,Rectum,58.2,NaN,Gy(RBE),%
9,blurred,Rectum,58.2,NaN,Gy(RBE),%
1,planned,Rectum,0,100,Gy(RBE),%
1,planned,Rectum,0.1,78.01999,Gy(RBE),%
1,planned,Rectum,0.2,76.2245,Gy(RBE),%
1,planned,Rectum,14,19.50103,Gy(RBE),%
1,planned,Rectum,14.1,19.4464,Gy(RBE),%
1,planned,Rectum,14.2,19.39261,Gy(RBE),%
1,planned,Rectum,14.3,19.32695,Gy(RBE),%
1,planned,Rectum,14.4,19.25388,Gy(RBE),%
1,planned,Rectum,14.5,19.17049,Gy(RBE),%
1,planned,Rectum,14.6,19.09786,Gy(RBE),%
1,planned,Rectum,14.7,19.04909,Gy(RBE),%
1,planned,Rectum,14.8,18.98888,Gy(RBE),%
1,planned,Rectum,34,9.50553,Gy(RBE),%
1,planned,Rectum,34.1,9.45993,Gy(RBE),%
1,planned,Rectum,34.2,9.42654,Gy(RBE),%
1,planned,Rectum,34.3,9.39345,Gy(RBE),%
1,planned,Rectum,34.4,9.35196,Gy(RBE),%
1,planned,Rectum,34.5,9.30604,Gy(RBE),%
1,planned,Rectum,34.6,9.27235,Gy(RBE),%
1,planned,Rectum,34.7,9.22334,Gy(RBE),%
1,planned,Rectum,34.8,9.18734,Gy(RBE),%
1,planned,Rectum,34.9,9.14867,Gy(RBE),%
1,planned,Rectum,35,9.11402,Gy(RBE),%
1,planned,Rectum,35.1,9.07618,Gy(RBE),%
1,planned,Rectum,35.2,9.04251,Gy(RBE),%
1,planned,Rectum,35.3,9.00141,Gy(RBE),%
1,planned,Rectum,35.4,8.96289,Gy(RBE),%
1,planned,Rectum,35.5,8.92638,Gy(RBE),%
1,planned,Rectum,35.6,8.89506,Gy(RBE),%
1,planned,Rectum,35.7,8.85644,Gy(RBE),%
1,planned,Rectum,35.8,8.81237,Gy(RBE),%
1,planned,Rectum,35.9,8.76545,Gy(RBE),%
1,planned,Rectum,36,8.73692,Gy(RBE),%
1,planned,Rectum,36.1,8.70149,Gy(RBE),%
1,planned,Rectum,36.2,8.66073,Gy(RBE),%
1,planned,Rectum,36.3,8.61303,Gy(RBE),%
1,planned,Rectum,36.4,8.56549,Gy(RBE),%
1,planned,Rectum,36.5,8.51527,Gy(RBE),%
1,planned,Rectum,36.6,8.47214,Gy(RBE),%
1,planned,Rectum,36.7,8.41663,Gy(RBE),%
1,planned,Rectum,36.8,8.37863,Gy(RBE),%
1,planned,Rectum,36.9,8.35041,Gy(RBE),%
1,planned,Rectum,37,8.31595,Gy(RBE),%
1,planned,Rectum,37.1,8.288,Gy(RBE),%
1,planned,Rectum,37.2,8.26272,Gy(RBE),%
1,planned,Rectum,37.3,8.23171,Gy(RBE),%
1,planned,Rectum,37.4,8.19804,Gy(RBE),%
1,planned,Rectum,37.5,8.1594,Gy(RBE),%
1,planned,Rectum,37.6,8.11729,Gy(RBE),%
1,planned,Rectum,37.7,8.06844,Gy(RBE),%
1,planned,Rectum,37.8,8.02818,Gy(RBE),%
1,planned,Rectum,37.9,7.96257,Gy(RBE),%
1,planned,Rectum,38,7.90243,Gy(RBE),%
1,planned,Rectum,38.1,7.84717,Gy(RBE),%
1,planned,Rectum,38.2,7.80889,Gy(RBE),%
1,planned,Rectum,38.3,7.77623,Gy(RBE),%
1,planned,Rectum,38.4,7.74385,Gy(RBE),%
1,planned,Rectum,38.5,7.71867,Gy(RBE),%
1,planned,Rectum,38.6,7.70076,Gy(RBE),%
1,planned,Rectum,38.7,7.6754,Gy(RBE),%
1,planned,Rectum,38.8,7.64753,Gy(RBE),%
1,planned,Rectum,38.9,7.59392,Gy(RBE),%
1,planned,Rectum,39,7.53856,Gy(RBE),%
1,planned,Rectum,39.1,7.4879,Gy(RBE),%
1,planned,Rectum,39.2,7.4423,Gy(RBE),%
1,planned,Rectum,39.3,7.40429,Gy(RBE),%
1,planned,Rectum,39.4,7.35858,Gy(RBE),%
1,planned,Rectum,39.5,7.30843,Gy(RBE),%
1,planned,Rectum,39.6,7.25325,Gy(RBE),%
1,planned,Rectum,39.7,7.22353,Gy(RBE),%
1,planned,Rectum,39.8,7.19164,Gy(RBE),%
1,planned,Rectum,39.9,7.16789,Gy(RBE),%
1,planned,Rectum,40,7.13184,Gy(RBE),%
1,planned,Rectum,40.1,7.09953,Gy(RBE),%
1,planned,Rectum,40.2,7.04322,Gy(RBE),%
1,planned,Rectum,40.3,6.98051,Gy(RBE),%
1,planned,Rectum,40.4,6.93635,Gy(RBE),%
1,planned,Rectum,40.5,6.90025,Gy(RBE),%
1,planned,Rectum,40.6,6.87001,Gy(RBE),%
1,planned,Rectum,40.7,6.83943,Gy(RBE),%
1,planned,Rectum,40.8,6.81393,Gy(RBE),%
1,planned,Rectum,40.9,6.7731,Gy(RBE),%
1,planned,Rectum,41,6.74696,Gy(RBE),%
1,planned,Rectum,41.1,6.71209,Gy(RBE),%
1,planned,Rectum,41.2,6.64682,Gy(RBE),%
1,planned,Rectum,41.3,6.5857,Gy(RBE),%
1,planned,Rectum,41.4,6.53214,Gy(RBE),%
1,planned,Rectum,41.5,6.48609,Gy(RBE),%
1,planned,Rectum,41.6,6.44336,Gy(RBE),%
1,planned,Rectum,41.7,6.3864,Gy(RBE),%
1,planned,Rectum,41.8,6.33488,Gy(RBE),%
1,planned,Rectum,41.9,6.30537,Gy(RBE),%
1,planned,Rectum,42,6.28613,Gy(RBE),%
1,planned,Rectum,42.1,6.27749,Gy(RBE),%
1,planned,Rectum,42.2,6.26234,Gy(RBE),%
1,planned,Rectum,42.3,6.23083,Gy(RBE),%
1,planned,Rectum,42.4,6.18859,Gy(RBE),%
1,planned,Rectum,42.5,6.12637,Gy(RBE),%
1,planned,Rectum,50,2.63461,Gy(RBE),%
1,planned,Rectum,50.1,2.61684,Gy(RBE),%
1,planned,Rectum,50.2,2.55227,Gy(RBE),%
1,planned,Rectum,50.3,2.48541,Gy(RBE),%
1,planned,Rectum,50.4,2.46586,Gy(RBE),%
1,planned,Rectum,50.5,2.39354,Gy(RBE),%
1,planned,Rectum,50.6,2.33448,Gy(RBE),%
1,planned,Rectum,50.7,2.28168,Gy(RBE),%
1,planned,Rectum,50.8,2.25787,Gy(RBE),%
1,planned,Rectum,50.9,2.19108,Gy(RBE),%
1,planned,Rectum,51,2.12473,Gy(RBE),%
1,planned,Rectum,51.1,2.11024,Gy(RBE),%
1,planned,Rectum,51.2,2.03551,Gy(RBE),%
1,planned,Rectum,51.3,1.98004,Gy(RBE),%
1,planned,Rectum,51.4,1.92951,Gy(RBE),%
1,planned,Rectum,51.5,1.89144,Gy(RBE),%
1,planned,Rectum,51.6,1.82465,Gy(RBE),%
1,planned,Rectum,51.7,1.77709,Gy(RBE),%
1,planned,Rectum,51.8,1.71624,Gy(RBE),%
1,planned,Rectum,51.9,1.65075,Gy(RBE),%
1,planned,Rectum,52,1.61509,Gy(RBE),%
1,planned,Rectum,52.1,1.58169,Gy(RBE),%
1,planned,Rectum,52.2,1.52462,Gy(RBE),%
1,planned,Rectum,52.3,1.44352,Gy(RBE),%
1,planned,Rectum,52.4,1.39243,Gy(RBE),%
1,planned,Rectum,52.5,1.34659,Gy(RBE),%
1,planned,Rectum,52.6,1.33099,Gy(RBE),%
1,planned,Rectum,52.7,1.27496,Gy(RBE),%
1,planned,Rectum,52.8,1.23031,Gy(RBE),%
1,planned,Rectum,52.9,1.15298,Gy(RBE),%
1,planned,Rectum,53,1.0894,Gy(RBE),%
1,planned,Rectum,53.1,1.05667,Gy(RBE),%
1,planned,Rectum,53.2,1.03679,Gy(RBE),%
1,planned,Rectum,53.3,1.00334,Gy(RBE),%
1,planned,Rectum,53.4,0.92593,Gy(RBE),%
1,planned,Rectum,53.5,0.85545,Gy(RBE),%
1,planned,Rectum,53.6,0.81901,Gy(RBE),%
1,planned,Rectum,53.7,0.77809,Gy(RBE),%
1,planned,Rectum,53.8,0.75188,Gy(RBE),%
1,planned,Rectum,57.6,NaN,Gy(RBE),%
1,planned,Rectum,57.7,NaN,Gy(RBE),%
1,planned,Rectum,57.8,NaN,Gy(RBE),%
1,planned,Rectum,57.9,NaN,Gy(RBE),%
1,planned,Rectum,58,NaN,Gy(RBE),%
1,planned,Rectum,58.1,NaN,Gy(RBE),%
1,planned,Rectum,58.2,NaN,Gy(RBE),%
1,planned,Rectum,58.3,NaN,Gy(RBE),%
eyllanesc
DBinJP
  • This is probably what's causing the KeyError:- `(VolValues[loop] > 2) and (VolValues[loop+1] < 2)`. In the first iteration, the value of `loop` is `0`, and then in the next line you're accessing `VolValues[loop]`, which becomes `VolValues[0]` and that is what the last line in the traceback says, `KeyError: 0`. Because `VolValues` doesn't have a key called `0`. – xyres Jul 19 '18 at 04:42
  • It worked before I defined it into a function... You appear to be correct; I've just replicated the error using less code. VolValues[23332] accesses the first value, not [0]. How do I reset the indices rather than have pandas preserve them from the larger array? Or, how am I supposed to work with data subsets in pandas? (In this case, given a large 'Tidy Table' of many patients and data from two methods, accessing only the data from a given patient resulting from a given method.) – DBinJP Jul 19 '18 at 04:47
  • Is `VolValues` a `list` or a `dictionary`? – xyres Jul 19 '18 at 04:51
  • `VolValues` is a "Series object of pandas.core.series module". – DBinJP Jul 19 '18 at 05:10
  • I haven't any experience with pandas. I've upvoted your question, hopefully it will get more attention. – xyres Jul 19 '18 at 05:44
  • i think what you're trying to achieve the code can be simplified a lot. If you provide us with some sample data and the expected output - also take a look at this [SO](https://stackoverflow.com/questions/31593201/pandas-iloc-vs-ix-vs-loc-explanation-how-are-they-different) regards to accessing data in a Series - i cant help there without seeing how your data looks like. – gyx-hh Jul 19 '18 at 11:05
  • @gyx-hh, thank you. I have added sample CSV data to the end of the post. (I didn't see how to attach a CSV file here, but perhaps you can copy-paste into e.g. TextEdit and save it as CSV.) I would very much appreciate learning better code. (I am essentially out of time and trying both to learn and code as quickly as possible.) I will try to resume working on this -- and come back here to check for new responses -- in about 12 hours. – DBinJP Jul 19 '18 at 11:29

3 Answers

1

KeyError: 0 occurs when you look up the label 0 in a Series whose index doesn't contain 0, or in a DataFrame that doesn't have a column named 0.

In your case it's the series. Consider the example

import pandas as pd

df = pd.DataFrame({'vl':[1,2,3,4],'bh':[5,6,4,7]},index=[10,11,12,13])

df['bh'][0]  # <-- raises KeyError: 0 because the index (10 to 13) doesn't contain the label 0

So you can change it to df['bh'].iloc[0] which will return 5 or you can change it to df['bh'].values[0] which returns the same.

In your case it would be VolValues.iloc[loop] or VolValues.values[loop]
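To connect this to the question, here is a minimal sketch of a filtered subset that keeps the row labels of the larger table. The tiny DataFrame and the label 23332 are made up for illustration (the label echoes the one mentioned in the comments), and `.to_numpy()` is used as the newer spelling of `.values`.

```python
import pandas as pd

# Hypothetical miniature of the DVH table; the index labels start at 23332
# to mimic rows taken from the middle of a larger DataFrame.
df = pd.DataFrame(
    {
        "StudyID": [1, 1, 1, 2],
        "Volume": [2.11024, 2.03551, 1.98004, 50.0],
    },
    index=[23332, 23333, 23334, 23335],
)

# Boolean filtering keeps the original row labels.
vol = df.loc[df["StudyID"] == 1, "Volume"]

print(list(vol.index))   # the labels inherited from df
print(0 in vol.index)    # False, which is why vol[0] raises KeyError: 0

first_by_position = vol.iloc[0]        # positional access on the Series
first_from_array = vol.to_numpy()[0]   # plain NumPy array, no labels at all
first_by_label = vol.loc[23332]        # label access still works
```

All three lookups return the same first value; only plain `vol[0]` fails, because with an integer index it is treated as a label lookup.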

Bharath M Shetty
0

VolValues[loop] probably does not want to take a zero as an index, most likely you need to start from 1 (one). Something along the lines of:

for loop in range(1,len(VolValues)):
lenik
  • As I commented above, VolValues[23332] accesses the first value (in the case of patient 1's planned data), so your suggestion will not work, I think. Apparently when I grab the relevant slice of a large CSV-imported dataframe, it's preserving the index relative to that dataframe rather than renumbering the rows. I think I tried appending .copy() to try to create a properly-indexed new array and it did not change the error message, but I shall check again in a few hours. – DBinJP Jul 19 '18 at 20:36
  • Did you try to reset the index in place? – Mad Physicist Jul 20 '18 at 03:59
  • Effectively yes, technically no: Redditor 'Nikota' recommended using 'values' which effectively does so by putting the data values in a new array ignoring the original array's indices. Another Redditor suggested a method to preserve the index column while resetting indices, but I have no need for the original table's indices so I didn't use this method. – DBinJP Jul 21 '18 at 11:04
  • @DBinJP as one person in JP to another person in JP, I'd recommend you to load up interactive python prompt and play with your data manually a bit, trying different indices and looking for the correct ranges. I bet you could solve your problem that way in just a few minutes, definitely less than an hour =) – lenik Jul 21 '18 at 11:07
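For reference, the two index-resetting variants mentioned in these comments might look roughly like the sketch below. The Series and its labels are invented for illustration. Note that `.copy()` alone would not help here, since a copy keeps the same index.

```python
import pandas as pd

# Hypothetical subset whose labels were inherited from a larger table.
vol = pd.Series(
    [2.11024, 2.03551, 1.98004],
    index=[23332, 23333, 23334],
    name="Volume",
)

# drop=True discards the old labels and renumbers from 0.
renumbered = vol.reset_index(drop=True)

# Without drop=True the old labels are kept as a column named "index",
# and the result is a DataFrame rather than a Series.
kept = vol.reset_index()

print(renumbered[0])        # label 0 now exists
print(list(kept.columns))
```

`reset_index` returns a new object here, so the result has to be assigned back if the renumbered Series is what the loop should use.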
0

Reddit user Nikota commented that to merely extract the values without retaining index information, as I was trying to do, one simply needs to use the .values suffix. This has appeared to solve my problem.

Hence, the following code appears to work as intended:

# Import Rectum DVH
import pandas
import numpy
df = pandas.read_csv('/Users/daniel/Documents/data/DVH/RectumData.csv',
                 delimiter=',',header=0)
# Calculate D_2%, defined by ICRU 78 as "the greatest dose which all but
# 2 percent of a [volume of interest] receives." aka D_{near-max}    
def interpD2(disttype):
# Loop through all patients' plans.
    Dose2Results = numpy.zeros(40)
    for num in range(0,40): 
# We know a priori that there is no DVH data with Volume = 2. Hence we look for
# the two columns less than and greater than Volume = 2.
        DoseValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == disttype)].Dose.values
        VolValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == disttype)].Volume.values
        for loop in range(len(VolValues) - 1):  # stop one short so loop+1 stays in bounds
            if (VolValues[loop] > 2) and (VolValues[loop+1] < 2):
                LowerVolumeIndex,UpperVolumeIndex = loop,loop+1
                x0,x1,x2 = 2,VolValues[LowerVolumeIndex],VolValues[UpperVolumeIndex]
                y1,y2 = DoseValues[LowerVolumeIndex],DoseValues[UpperVolumeIndex]
                Dose2Results[num] = y1 - ((x1-x0)/(x2 - x1))*(y2 - y1)
    return Dose2Results
D2Planned = interpD2('planned')
D2Blurred = interpD2('blurred')
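As a possible simplification (a sketch, not the code used above), the same linear interpolation can be written with `numpy.interp` and a `groupby`, which avoids positional indexing altogether. The helper name `d2_by_interp` is hypothetical, the toy rows are copied from the CSV excerpt in the question, and the sketch assumes Volume never increases as Dose increases.

```python
import numpy as np
import pandas as pd

def d2_by_interp(group, level=2.0):
    """Dose at which the Volume curve crosses `level` (hypothetical helper)."""
    g = group.dropna(subset=["Volume"]).sort_values("Dose")
    # np.interp needs increasing x values, so reverse the falling volume curve.
    vol = g["Volume"].to_numpy()[::-1]
    dose = g["Dose"].to_numpy()[::-1]
    return float(np.interp(level, vol, dose))

# Toy rows copied from the CSV excerpt; the index is deliberately not 0-based.
df = pd.DataFrame(
    {
        "StudyID": [1, 1, 1, 1, 1, 1],
        "DistributionType": ["blurred", "blurred", "blurred",
                             "planned", "planned", "planned"],
        "Dose": [44.9, 45.0, 58.2, 51.2, 51.3, 58.3],
        "Volume": [2.02836, 1.99259, np.nan, 2.03551, 1.98004, np.nan],
    },
    index=range(23000, 23006),
)

# One interpolated value per (DistributionType, StudyID) pair.
d2 = {
    key: d2_by_interp(group)
    for key, group in df.groupby(["DistributionType", "StudyID"])
}
print(d2)
```

One caveat: if a curve never crosses the requested level, `numpy.interp` returns the dose at the nearest end of the curve instead of raising an error, so such cases would need a separate check.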
DBinJP
  • Or you can change it to `VolValues.iloc[loop]`, basically what you are accessing is a row which ain't present in the series. If you want regular indices to work as in array then you have to go for `.iloc`. – Bharath M Shetty Jul 20 '18 at 03:45