What is the most efficient way to get this kind of matrix from a 1D numpy array?

Question

I have a file with a total of 4950 values like:

0.012345678912345678

I read the file using:

a = numpy.genfromtxt(file_name, dtype=str, delimiter=',')  # a.shape = (4950L, 1L); dtype=str so that no accuracy is lost
# say a == ['0.000000000000000001', '-0.000000000000000002', ...., '0.000000000004950']
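As a rough illustration of why the values are kept as strings, the sketch below compares a float64 round trip with `decimal.Decimal`. The value in `text` is a made-up 22-digit example, not taken from the file; whether a particular 18-decimal value survives a float64 round trip depends on its digits.

```python
from decimal import Decimal

# Hypothetical value with 22 significant digits (not from the real file).
text = '0.1234567890123456789012'

as_float = float(text)      # float64 holds roughly 15-17 significant digits
as_decimal = Decimal(text)  # a Decimal built from a string keeps every digit

print(repr(as_float))           # shorter than the input: trailing digits are lost
print(format(as_decimal, 'f'))  # fixed-point formatting reproduces the input text
```

Note that `format(..., 'f')` is used because `str()` on a very small `Decimal` can switch to scientific notation.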

What I am trying to achieve is to obtain a matrix b of size (100L, 100L) whose:

Upper triangular values are filled with values in numpy array 'a'.
Lower triangular values are filled with values in numpy array 'a' but multiplied by -1.
The diagonal consists of zeros only.

Example (the accuracy matters):

array = ['1','2','-3','-5','6','-7'] # In reality the data is up to 18 decimal places.

final_matrix = [
               ['0','1','2','-3'],
               ['-1','0','-5','6'],
               ['-2','5','0','-7'],
               ['3','-6','7','0']
               ]

What is the most efficient way to achieve this?

What's the format in your file? One row, one column, multiple columns? — hpaulj, Jan 18 '16 at 21:54
http://stackoverflow.com/questions/34234965/copy-flat-list-of-upper-triangle-entries-to-full-matrix - a similar recent question `Copy flat list of upper triangle entries to full matrix`. — hpaulj, Jan 19 '16 at 05:49

Niemerds · Accepted Answer · 2016-01-23T19:47:00.860

5

Not sure if it is the most efficient way, but this seems pretty efficient.

import numpy

# create some random data for testing
sz = 100
# sz*(sz-1)//2 upper-triangle entries; integer division keeps the size an int
a = numpy.random.random(sz * (sz - 1) // 2).astype('U50')

# convert back to float to test the sign,
# as would be done if a were read as string values;
# zero and negative values get no extra "-"
amins = numpy.where(a.astype(float) <= 0, "", "-")

# get the values without minus signs
aplus = numpy.char.lstrip(a, "-")

# prepend the sign to build the negated string values
aminus = numpy.char.add(amins, aplus)

# create an empty matrix
m = numpy.zeros(shape=(sz, sz), dtype='U51')
# indices of the upper triangle (excluding the diagonal)
u_ids = numpy.triu_indices(sz, 1)
# set upper values
m[u_ids] = a
# swap coordinates to set lower values
m[u_ids[1], u_ids[0]] = aminus
# fill the diagonal with zeros
numpy.fill_diagonal(m, numpy.zeros(sz).astype('U51'))


print(m)
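To check the approach against the small example from the question, the same steps can be wrapped in a function. This is only a sketch: the name `antisymmetric_from_upper` and the string widths (`U64`, `U65`) are assumptions made for illustration, and the diagonal is filled with `'0'` to match the question's expected output.

```python
import numpy

def antisymmetric_from_upper(values):
    """Hypothetical helper: build the string matrix from flat upper-triangle entries."""
    a = numpy.asarray(values, dtype='U64')
    # Solve n*(n-1)/2 == len(a) for the matrix size n.
    n = int(round((1 + numpy.sqrt(1 + 8 * a.size)) / 2))
    if n * (n - 1) // 2 != a.size:
        raise ValueError("number of values does not fill an upper triangle")
    as_float = a.astype(float)  # used only to read the sign, never stored
    # Flip the sign textually: drop a leading '-', prepend one to positives,
    # and leave zeros unchanged.
    negated = numpy.where(as_float < 0, numpy.char.lstrip(a, '-'),
                          numpy.where(as_float > 0, numpy.char.add('-', a), a))
    m = numpy.full((n, n), '0', dtype='U65')  # diagonal stays '0'
    rows, cols = numpy.triu_indices(n, 1)
    m[rows, cols] = a        # upper triangle: the values as given
    m[cols, rows] = negated  # lower triangle: sign-flipped strings
    return m

example = antisymmetric_from_upper(['1', '2', '-3', '-5', '6', '-7'])
print(example)
```

Because the digits are never rewritten, only the sign character changes, so the precision of the input text is preserved.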

edited Jan 23 '16 at 19:47

answered Jan 18 '16 at 20:00

Niemerds


There are various ways of using the `np.tri...` functions to fill upper and lower triangular arrays. But it sounds like the OP is worried more about the long integers (whether as a numeric type or string). – hpaulj Jan 18 '16 at 20:14
Great. The last line can simply be `m - m.T`. The data seems to be float64? – B. M. Jan 18 '16 at 20:29
Yes, according to the docs, the default dtype of numpy.zeros is float64. – Niemerds Jan 18 '16 at 20:32
Question about trying to go larger than `np.float64`: http://stackoverflow.com/questions/29820829/cannot-use-128bit-float-in-python-on-64bit-architecture – hpaulj Jan 18 '16 at 21:53
Updated the example to use strings – Niemerds Jan 19 '16 at 12:11
@Niemerds **numpy.char.add** will just add a '-' sign. If the value is '-0.001020001111114444', it becomes '--0.001020001111114444'. – Black Dragon Jan 19 '16 at 17:19
You're right. I edited the example again to account for the double "-". – Niemerds Jan 19 '16 at 19:11
@Niemerds Sir, you are amazing. Just one thing left: it is putting a '-' sign on '0.0'. Thanks – Black Dragon Jan 23 '16 at 18:36
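For comparison, the `m - m.T` shortcut mentioned in the comments applies when float64 precision is acceptable, which is not the case for the string data in the question. A minimal sketch of that numeric variant, using the question's small example converted to floats:

```python
import numpy

# The question's example as floats (assumes float64 precision is sufficient).
values = numpy.array([1.0, 2.0, -3.0, -5.0, 6.0, -7.0])
n = 4

upper = numpy.zeros((n, n))
upper[numpy.triu_indices(n, 1)] = values  # fill the strict upper triangle
m = upper - upper.T                       # lower triangle becomes the negation

print(m)
```

The subtraction leaves the diagonal at zero and mirrors each upper entry with the opposite sign, so no separate sign handling is needed in the numeric case.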
