2

I have a small dataframe - six rows (not counting the header) and 53 columns (a store name, and the rest weekly sales for the past year). Each row contains a particular store and each column the store's name and sales for each week. I need to transpose the data so that the weeks appear as rows, the stores appear as columns, and their sales appear as the rows.

To generate the input data:

df_store = pd.read_excel(SourcePath+SourceFile, sheet_name='StoreSales', header=0, usecols=['StoreName'])
# Number rows of all irrelevant stores.
row_numbers = [x+1 for x in df_stores[(df_store['StoreName'] != 'Store1') & (df_store['StoreName'] != 'Store2')
 & (df_store['StoreName'] !='Store3')].index]

# Read in entire Excel file, skipping the rows of irrelevant stores.
df_store = pd.read_excel(SourcePath+SourceFile, sheet_name='StoreSales', header=0, usecols = "A:BE", 
    skiprows = row_numbers, converters = {'StoreName' : str})

# Transpose dataframe
df_store_t = df_store.transpose()

My output puts index numbers above each store name ( 0 to 5), and then each column starts out as StoreName (above the week), then each store name. Yet, I cannot manipulate them by their names.

Is there a way to clear those index numbers so that I can work directly with the resulting column names (e.g., rename "StoreName" to "WeekEnding" and make reference to each store columns ("Store1", "Store2", etc.?)

AJCaffarini
  • 87
  • 1
  • 7

1 Answers1

2

IIUC, you need to set_index first, then transpose, T:

See this example:

df  = pd.DataFrame({'Store':[*'ABCDE'],
                    'Week 1':np.random.randint(50,200, 5), 
                    'Week 2':np.random.randint(50,200, 5),
                    'Week 3':np.random.randint(50,200, 5)})

Input Dataframe:

  Store  Week 1  Week 2  Week 3
0     A      99     163     148
1     B     119      86      92
2     C     145      98     162
3     D     144     143     199
4     E      50     181     177

Now, set_index and transpose:

df_out = df.set_index('Store').T
df_out

Output:

Store     A    B    C    D    E
Week 1   99  119  145  144   50
Week 2  163   86   98  143  181
Week 3  148   92  162  199  177
Scott Boston
  • 147,308
  • 15
  • 139
  • 187
  • Thanks Scott. That certainly took care of the Store name portion. What I was hoping for was for the Week-Ending dates columns to become a column I could call WeekEnding. So in my original dataframe I have a column called Sales2018/01/06, another Sales2018/01/13, and so forth, for 52 weeks. How can I turn those into a single column called "WeekEnding," because I'll also want to strip "Sales" from that and turn them into date values for manipulation. Thanks again! – AJCaffarini Oct 01 '19 at 15:21
  • @AJCaffarini Add `.reset_index()` after .T.. Could you add code to your question generating a small part of you input dataframe and also show your expected outputs. – Scott Boston Oct 01 '19 at 15:51
  • Scott, I added the code I used to import from Excel. I wasn't sure how to depict my expected output. Do I use CTRL-K for that too? – AJCaffarini Oct 01 '19 at 16:18
  • @AJCaffarini Not quite what i was talking about. You should create code that will create a small test sample of your input data so, we can duplicate your input, then also, provide the expected output from this test sample. Note, how in my answer, I created this dummy test set, you could do the same that way you can include your problem such as the labeling with Sales and weekending. – Scott Boston Oct 01 '19 at 16:47