I am creating a TensorFlow constant from an ndarray. My understanding was that the tensor itself wouldn't make a memory copy of the underlying data, but would create a Python object backed by the same underlying ndarray buffer. However, after running a little test, it seems like it does copy the data:
import numpy as np
import psutil
import tensorflow as tf

def mem_test():
    printMemUsed("before r list")
    r = ['Supporter'] * 100000000      # 100M list slots, all pointing at one string object
    printMemUsed("after r list")
    r_arr = np.array(r)                # materializes a 100M-element string ndarray
    printMemUsed("after nd_array")
    tf.convert_to_tensor(r_arr)
    printMemUsed("after tensor conversion")

def printMemUsed(description):
    print("{}:\t{}".format(description, psutil.virtual_memory().used))
Here's the output:
before r list:            727310336  -> ~727 MB
after r list:            1528782848  -> ~1.5 GB
after nd_array:          2430574592  -> ~2.4 GB
after tensor conversion: 8925667328  -> ~8.9 GB
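For comparison, here is a sketch that repeats the measurement with a numeric dtype (same helper as above; mem_test_numeric is just an illustrative name, and I haven't verified whether the buffer ends up shared in this case):

import numpy as np
import psutil
import tensorflow as tf

def printMemUsed(description):
    print("{}:\t{}".format(description, psutil.virtual_memory().used))

def mem_test_numeric():
    # Same idea as mem_test above, but with float64 instead of strings.
    printMemUsed("before ndarray")
    a = np.ones(100000000, dtype=np.float64)   # ~800 MB of float64 data
    printMemUsed("after ndarray")
    t = tf.convert_to_tensor(a)                # does this duplicate the 800 MB buffer?
    printMemUsed("after tensor conversion")
    return t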
Edit: r_arr had a dtype of 'S9' (null-terminated byte strings). After changing the input array elements to unicode ('U9'), the virtual memory consumption after nd_array climbed to about 5 GB.
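For reference, the per-element sizes behind that jump (the itemsize values below are what numpy reports; the totals are just that arithmetic scaled up to 100M elements):

import numpy as np

s = np.array(['Supporter'])    # defaults to dtype '<U9' under Python 3
b = s.astype('S9')             # null-terminated byte strings

print(s.dtype, s.itemsize)     # <U9  36 bytes per element (4 bytes per character)
print(b.dtype, b.itemsize)     # |S9   9 bytes per element

# For 100,000,000 elements that is roughly 0.9 GB of ndarray storage for 'S9'
# versus roughly 3.6 GB for 'U9', which lines up with "after nd_array" growing
# from ~2.4 GB to ~5 GB.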