
I have a 2D dictionary as follows:

myDict = {
    "AIK":{"dfa":1,"bff":2,"mam":1},
    "BQQ":{"dic":5,"mam":8,"nzz":3},
    "ZZD":{"gox":6,"xee":2,"dic":9}
}

The length of myDict.keys() is 5000, and there are 10000 distinct inner keys in total. How can I convert it to a scipy sparse matrix of shape (5000, 10000) as fast as possible?

yaoC
  • There is a `dok` format that is a dictionary subclass. Its keys are tuples `(i,j)` of row and column index. But to generate that you'd have to map all main keys to `range(5000)` and all inner keys to `range(10000)`. And given the need to iterate over 2 dictionary levels, the conversion will not be fast or pretty. – hpaulj Apr 21 '17 at 18:52
  • If you go the SparseDataFrame route, look at http://stackoverflow.com/questions/31084942/pandas-sparse-dataframe-to-sparse-matrix-without-generating-a-dense-matrix-in-m for conversion information. – hpaulj Apr 21 '17 at 20:01
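
The key-to-index mapping described in the first comment can be sketched roughly as below. This is only a minimal sketch, not a benchmarked solution: the helper name `dict_to_sparse` is made up for illustration, the sorted ordering of rows and columns is an arbitrary choice, and it builds a COO matrix from three flat lists (then converts to CSR) rather than filling a `dok` matrix entry by entry.

```python
from scipy.sparse import coo_matrix


def dict_to_sparse(nested):
    """Hypothetical helper: convert a dict of dicts to a CSR matrix.

    Returns (matrix, row_keys, col_keys) so indices can be mapped back to keys.
    """
    # Assumption: sorted order is used only to make the layout deterministic.
    row_keys = sorted(nested)
    col_keys = sorted({k for inner in nested.values() for k in inner})

    # Map outer keys to row indices and inner keys to column indices.
    row_index = {k: i for i, k in enumerate(row_keys)}
    col_index = {k: j for j, k in enumerate(col_keys)}

    # One pass over both dictionary levels, collecting COO triplets.
    rows, cols, vals = [], [], []
    for outer, inner in nested.items():
        i = row_index[outer]
        for key, value in inner.items():
            rows.append(i)
            cols.append(col_index[key])
            vals.append(value)

    shape = (len(row_keys), len(col_keys))
    mat = coo_matrix((vals, (rows, cols)), shape=shape)
    return mat.tocsr(), row_keys, col_keys


myDict = {
    "AIK": {"dfa": 1, "bff": 2, "mam": 1},
    "BQQ": {"dic": 5, "mam": 8, "nzz": 3},
    "ZZD": {"gox": 6, "xee": 2, "dic": 9},
}

mat, row_keys, col_keys = dict_to_sparse(myDict)
```

For the small example in the question this gives a 3 x 7 matrix with 9 stored values; `row_keys` and `col_keys` record which key each row and column corresponds to. The Python-level loop over both dictionary levels is still the likely bottleneck, as the comment points out.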

0 Answers