-4

I have an Excel table (200x780). The lines are options available on an aircraft, and the columns are serial numbers. The content is either 1 or 0, depending on if the aircraft has or not the option.

Matrix:

enter image description here

I want a table with the options as line and column, containing the probability of having two options together. Something like this, matrix probability:

enter image description here

I used pandas.read_excel to read the file, and thought about multiplying each line with the others, but this will take a lot of time. Is there a panda module I didn't find that can do this easier?

Ivan
  • 34,531
  • 8
  • 55
  • 100
R.MARTIN
  • 3
  • 1
  • 8
  • Please include a proper dataset(or a part) and what output you desire. – Shubham R Aug 29 '17 at 09:03
  • I don't understand what more I can give about the dataset, over the picture? – R.MARTIN Aug 29 '17 at 09:09
  • 1
    include atleast 10rowsX10 columns as input and also the desired output for the same. – Shubham R Aug 29 '17 at 09:10
  • 2
    Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael Aug 29 '17 at 09:14
  • [Don't post images of code (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) – jezrael Aug 29 '17 at 09:15
  • @jezrael there is no code! – Ivan Aug 29 '17 at 09:23
  • @Ivan - You are right, but I think in links is writtten it - code is necessary. – jezrael Aug 29 '17 at 09:24

1 Answers1

0

You can do this in Excel very easily.

If Data represents the 200 rows by 780 columns of input data then the array formula

=MMULT(Data,TRANSPOSE(Data))/780

gives you a 200 row by 200 column matrix of values in which the (i,j)'th element (row i and column j) is obtained from the i'th and j'th rows of the input Data.

Select the 200*200 range of cells where you want the output to appear, enter the formula into the formula bar and commit using the Ctrl, Shift and Enter keys pressed simultaneously. You can substitute a range such as B2:ADA201 for Data in the formula.

DMM
  • 1,090
  • 7
  • 8
  • The problem is I need to use Python, it is the requirement. But I'm working on reformulating the question better. – R.MARTIN Aug 30 '17 at 11:52
  • If you have a matrix multiplication library available then you could use that: multiply your data matrix by its transpose to get a 200*200 matrix then divide every element by 780. – DMM Aug 30 '17 at 12:26
  • I tried using 'df.as_matrix' , then 'df2 = df1.transpose' , but when I try multiplying df1 and df2, there is the error : '_with block values unsupported operand type(s) for *: 'long' and 'instancemethod'_ – R.MARTIN Aug 30 '17 at 13:05
  • Sorry, can't help further as I don't know anything about Python or the features you are trying to use. However, from a mathematical perspective matrix multiplication fits with what you are trying to do. Matrix multiplication is different from normal multiplication so perhaps you need to seek out the appropriate method for matrix multiplication. – DMM Aug 30 '17 at 13:26