You can use NumPy broadcasting to make the computation easier:
import numpy as np
import pandas as pd
# Set Seed for reproducibility
np.random.seed(0)
# Set your n
n = 9
# Generate a 100 x n NumPy array of uniform random numbers in [0, 1)
rand_nos = np.random.rand(100, n)
Now, to get the sum per row,
rand_nos.sum(axis=1)
Now you have a 100 x n array and a vector of 100 row sums. NumPy broadcasting aligns shapes starting from the last dimension, so the 100 x n array cannot be divided by the length-100 vector directly (the last dimensions, n and 100, do not match). To make them match, we use the .transpose() method of a numpy.ndarray. That results in an n x 100 array, whose last dimension lines up with the 100 row sums. To bring the result back to a 100 x n array, we just apply the .transpose() method to the resulting array. The code looks like this:
n_by_onehundred = rand_nos.transpose() / rand_nos.sum(axis=1) # Shape (n, 100): each column is one original row divided by its sum
onehundred_by_n = n_by_onehundred.transpose() # Back to shape (100, n)
Once you have onehundred_by_n, it's easy to read it into a dataframe (a pandas.DataFrame can be built directly from a 2-D numpy.ndarray). Add a columns keyword argument and you're set with your randomized dataframe.
df = pd.DataFrame(
onehundred_by_n,
columns=[
'AAPL weight', 'MSFT weight', 'XOM weight',
'JNJ weight', 'JPM weight', 'AMZN weight',
'GE weight', 'FB weight', 'T weight',
]
)
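As an aside, the two transposes can be avoided by keeping the summed axis around with keepdims=True, so that broadcasting works on the original 100 x n layout. This is only a sketch of an alternative, not the approach used above; the names row_sums, weights and via_transpose are made up for illustration.

```python
import numpy as np

np.random.seed(0)
n = 9
rand_nos = np.random.rand(100, n)

# keepdims=True keeps the summed axis as length 1, so the sums have shape (100, 1)
row_sums = rand_nos.sum(axis=1, keepdims=True)

# (100, n) / (100, 1): the length-1 axis is stretched across the n columns
weights = rand_nos / row_sums

# Same values as the transpose, divide, transpose approach
via_transpose = (rand_nos.T / rand_nos.sum(axis=1)).T
print(weights.shape)                        # (100, 9)
print(np.allclose(weights, via_transpose))  # True
```

Either array can be passed to pd.DataFrame in the same way; which form reads better is largely a matter of taste.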
Just for completeness, if you want to check that each row in df sums up to 1, just run:
df.sum(axis=1) # This sums each row
You should get 100 values equal to 1 (up to floating-point rounding), one for each row.
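Because the row sums are floating-point numbers, some may differ from 1 in the last digit, so an exact == 1 comparison can be misleading. A minimal sketch of a tolerance-based check, assuming the same normalization as above (the names weights, row_totals and all_rows_sum_to_one are illustrative, and the column labels are left at their defaults for brevity):

```python
import numpy as np
import pandas as pd

np.random.seed(0)
weights = np.random.rand(100, 9)
weights = (weights.T / weights.sum(axis=1)).T  # normalize each row to sum to 1
df = pd.DataFrame(weights)  # default integer column labels

row_totals = df.sum(axis=1)  # one total per row
# Compare with a tolerance rather than ==, since float sums carry rounding error
all_rows_sum_to_one = bool(np.allclose(row_totals, 1.0))
print(all_rows_sum_to_one)  # True
```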