Questions tagged [sparse-dataframe]

7 questions
14 votes, 2 answers

Changing the fill_values in a SparseDataFrame - replace throws TypeError

Current pandas version: 0.22. I have a SparseDataFrame.

    A = pd.SparseDataFrame(
        [['a',0,0,'b'],
         [0,0,0,'c'],
         [0,0,0,0],
         [0,0,0,'a']])
    A

       0  1  2  3
    0  a  0  0  b
    1  0  0  0  c
    2  0  0  0  0
    3  0  0  0  a

Right now, the fill…
cs95
1 vote, 3 answers

Grouping large sparse pandas dataframe with groupby.sum() is very slow

I have a pandas DataFrame of size (607875, 12294). The data is sparse and looks like:

        ID  BB  CC  DD  ...
    0  abc   0   0   1  ...
    1  bcd   0   0   0  ...
    2  abc   0   0   1  ...
    ...

I converted it to the sparse form by calling dataframe =…
Maria
1 vote, 0 answers

Converting SciPy CSR matrix to Pandas SparseDataFrame is too slow

I have a vocabulary of about 50,000 terms and a corpus of about 20,000 documents in a Pandas DataFrame like this:

    import pandas as pd
    vocab = {"movie", "good", "very"}
    corpus = pd.DataFrame({ "ID": [100, 200, 300], "Text": ["It's a good…
farmer
1 vote, 0 answers

Assigning a column to a SparseDataFrame

Consider:

    df = pd.DataFrame({"a":[1,2,3]})
    df

       a
    0  1
    1  2
    2  3

I'd like to do two things:

1. Convert the dataframe to sparse with a default fill value of False
2. Assign a column of all False values to this sparse dataframe

Here are two seemingly…
cs95
0 votes, 0 answers

Pandas sparse dataframe multiplication

I have two pandas sparse dataframes, big_sdf and bigger_sdf. When I try to multiply them: result = big_sdf @ bigger_sdf I get an error: "numpy.core._exceptions.MemoryError: Unable to allocate 3.6 TiB for an array with shape (160815, 3078149) and…
AlonBA
0 votes, 1 answer

How to drop rows from a Sparse Dataframe without changing the format

I am trying to drop some empty rows in my dataframe. The following code shows that the datatypes are indeed sparse.

    items_users_sparse_top_tags_df = items_users_sparse_pd.loc[tracks_tags_df.index]
    items_users_sparse_top_tags_df.rename_axis('tracks',…
Filion
-1 votes, 2 answers

Grep and append column in R

I have an HR dataset of 2000 lines and need to append a column after grepping for a string pattern. I want to match the edu column from df2 to df1 (sometimes the matches are not exact) and print the respective Dep rows. Also, when there is no match of edu…