9

I am trying to use the Scanpy Python package to analyze some single-cell data. I read a count matrix (a .tsv file) into a pandas DataFrame with cells as rows and genes as columns, so each row contains the gene counts for a single cell. I would like to create an AnnData object from this DataFrame. Does anyone know how I can do this? Unfortunately, I cannot provide the dataset.

AyeTown
    This GitHub issue might be worth following up: https://github.com/theislab/anndata/issues/67 – Code42 Jun 28 '21 at 15:36

3 Answers

6

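You can convert your DataFrame df into an AnnData object adata this way, assuming the cell names are in df.index and the gene names are in df.columns:

import anndata
import pandas as pd

adata = anndata.AnnData(X=df.to_numpy(),
                        obs=pd.DataFrame(index=df.index.astype(str)),
                        var=pd.DataFrame(index=df.columns.astype(str)))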

But you don't really need to do that. Instead, directly read the tsv file into an AnnData object:

import anndata

# read_csv accepts a file path directly; rows are read as cells, columns as genes
adata = anndata.read_csv("your_tsv_file.tsv", delimiter="\t")
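
The layout this relies on (one row per cell, one column per gene, cell names in the first column, gene names in the header row) can be checked with the standard library alone before loading anything. Below is a minimal sketch under those assumptions; the toy data and the helper name split_count_table are hypothetical and only serve as illustration.

```python
import csv
import io

# Hypothetical toy data: first column holds cell names, header row holds gene names.
toy_tsv = "\tGeneA\tGeneB\tGeneC\ncell1\t0\t3\t1\ncell2\t2\t0\t5\n"


def split_count_table(handle):
    """Split a cells-by-genes TSV into (cell_names, gene_names, counts)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    gene_names = header[1:]  # skip the corner cell above the cell-name column
    cell_names, counts = [], []
    for row in reader:
        if not row:  # ignore blank lines
            continue
        cell_names.append(row[0])
        counts.append([float(value) for value in row[1:]])
    return cell_names, gene_names, counts


cell_names, gene_names, counts = split_count_table(io.StringIO(toy_tsv))
print(len(cell_names), "cells x", len(gene_names), "genes")
```

The three pieces correspond to what an AnnData object stores as the observation names, the variable names, and the matrix X. If the cell and gene counts come out swapped, the table probably needs to be transposed before it is loaded.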
Jafar Isbarov
5

Straightforward solution (the DataFrame's index and columns become the cell and gene names):

import scanpy as sc
adata = sc.AnnData(counts_df)
1

Here's my answer, which works with scanpy 1.9.1:

adata = sc.AnnData(df,
    obs=df.index.to_frame(),
    var=df.columns.to_frame())

The second argument (obs) holds the cell names and the third (var) holds the gene names.

felixm