-1

I have a table that looks like this (image: table).

I want to load it into an array so that the first column holds the city names and the first row holds the parameters. Is that possible, given the different types? I need an array like this:

[[city name total   00-04   05-09]
[j 882806   110386  98268]
[a 221560   21982   19317]
[h 279591   21069   18200]]

When I use

csv = np.genfromtxt('populationData2016.csv',delimiter=",")

I get this:

[[             nan              nan              nan ...,              nan
           nan              nan]
 [             nan   8.82806000e+05   1.10386000e+05 ...,   2.57210000e+04
1.77910000e+04   3.56080000e+04]
tt600
  • 1
    You do not need "file upload" (please read the tag descriptions before using them). You need to read your file into your Python program using numpy. There are _tons_ of posts on how to read, e.g., CSV into numpy arrays; start with these search results for something you can use: https://stackoverflow.com/search?q=csv+to+numpy+array - e.g. https://stackoverflow.com/questions/3518778/how-to-read-csv-into-record-array-in-numpy - or any other Python "how to csv to list" question on SO (tons more), and then "how to list of list to numpy array" – Patrick Artner Jan 06 '18 at 13:46
  • 1
    Possible duplicate of [How to read csv into record array in numpy?](https://stackoverflow.com/questions/3518778/how-to-read-csv-into-record-array-in-numpy) – Patrick Artner Jan 06 '18 at 13:47
  • It depends on how you stored your data. Numpy can load text files: https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html#numpy.loadtxt – Mr. T Jan 06 '18 at 13:49
  • None of the above solves my problem. It still makes an array with e-notation numbers and nan entries – tt600 Jan 06 '18 at 14:10
  • `genfromtxt` tries to load your data as floats; `nan` are the strings that aren't valid floats - for example the label strings on the first row and first column. Your desired array is not a valid `numpy` array, with a mix of strings and integers. It could be loaded as a structured array, but I'm not sure you know enough `numpy` to use that. – hpaulj Jan 06 '18 at 20:30
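
The structured-array route mentioned in the last comment can be sketched as follows. This is only an illustration: the sample rows are copied from the question, the field names in `names` are hypothetical (chosen to be valid identifiers), and the file is simulated with `io.StringIO` instead of reading 'populationData2016.csv'.

```python
import io
import numpy as np

# Sample rows copied from the question; the real data would come
# from 'populationData2016.csv'.
csv_text = """city name,total,00-04,05-09
j,882806,110386,98268
a,221560,21982,19317
h,279591,21069,18200
"""

# Hypothetical field names, supplied by hand so that the header row
# can simply be skipped.
names = ["city", "total", "age_00_04", "age_05_09"]

table = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    skip_header=1,     # skip the label row
    names=names,       # one field per column
    dtype=None,        # let genfromtxt infer a type for each column
    encoding="utf-8",  # read text columns as str
)

# Each column is addressed by field name and keeps its own type.
cities = table["city"]
totals = table["total"]
```

Here `table` is a one-dimensional structured array with one record per city, so the string column and the integer columns live side by side without turning into `nan`.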

3 Answers

1

You can use pandas to simplify this task:

import pandas as pd
import numpy as np

df = pd.read_csv('path/to/file.csv')
headers = np.array(df.columns)  # get headers
values = df.values  # numpy array of values
matrix = np.concatenate([[headers], values])  # append to the final matrix
aiven
  • `np.concat`? What version of numpy are you using? What's the dtype of `values` and `matrix`? – hpaulj Jan 06 '18 at 21:00
  • @hpaulj fixed, it is `np.concatenate`. In this case `values` and `matrix` will have dtype `object` because the table contains both strings and numbers – aiven Jan 06 '18 at 21:03
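
To make the `object` dtype point concrete, here is a minimal numpy-only sketch. It assumes the rows from the question and builds `headers` and `values` by hand rather than through `pd.read_csv`, so it only mimics what the answer's code would produce.

```python
import numpy as np

# Hand-built stand-ins for df.columns and df.values (rows from the question).
headers = np.array(["city name", "total", "00-04", "05-09"], dtype=object)
values = np.array(
    [
        ["j", 882806, 110386, 98268],
        ["a", 221560, 21982, 19317],
        ["h", 279591, 21069, 18200],
    ],
    dtype=object,
)

# Same stacking step as in the answer: header row on top of the data rows.
matrix = np.concatenate([[headers], values])

# The mixed array has dtype object, so the numeric block has to be
# sliced out and converted before doing vectorized arithmetic on it.
numeric = matrix[1:, 1:].astype(np.int64)
column_sums = numeric.sum(axis=0)
```

The combined `matrix` matches the layout asked for in the question, but because every element is a generic Python object, it is usually more practical to keep the labels and the numeric block in separate arrays.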
1

To convert a pandas DataFrame to a NumPy array:

# import the required modules

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# split into training and test sets
# (`input` and `output` are assumed to be existing pandas objects)

x_train, x_test, y_train, y_test = train_test_split(input, output, test_size=.2)
x_train

x_train output

# convert the DataFrame into a numpy array

x = x_train.to_numpy()
x

Array output

Joooeey
0
import numpy as np

def load_table_as_array(filename):
    with open(filename, 'r') as f:
        # Parse the first line, which contains the column headers
        header_line = f.readline()
        header_line = header_line.strip(',\n')
        column_names = header_line.split(',')
        # Parse the rest of the file
        mat = []
        row_names = []
        for line in f:
            tokens = line.rstrip().split(',')
            row_names.append(tokens[0])  # add the first token to the row header list
            values = [float(n) for n in tokens[1:]]  # convert to float
            mat.append(values)  # append the current row to the matrix
    row_names = np.array(row_names)
    column_names = np.array(column_names)
    data = np.array(mat)
    return data, column_names, row_names

This works. Thanks to everyone who tried to help.

tt600
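
For comparison, the same idea of splitting labels from numbers can be sketched with the standard `csv` module. The function name `load_labeled_csv` is hypothetical, the sample rows are taken from the question, and unlike the function above this variant drops the top-left corner label from the column names.

```python
import csv
import io
import numpy as np

def load_labeled_csv(fileobj):
    """Split a labeled CSV into (data, column_names, row_names)."""
    reader = csv.reader(fileobj)
    header = next(reader)  # first row holds the column labels
    row_names, rows = [], []
    for record in reader:
        if not record:  # skip blank lines
            continue
        row_names.append(record[0])                  # first cell is the row label
        rows.append([float(x) for x in record[1:]])  # the rest are numbers
    # header[1:] drops the corner cell ("city name" in the sample)
    return np.array(rows), np.array(header[1:]), np.array(row_names)

# Simulated file contents; a real call would pass open(filename, newline="").
sample = io.StringIO(
    "city name,total,00-04,05-09\n"
    "j,882806,110386,98268\n"
    "a,221560,21982,19317\n"
    "h,279591,21069,18200\n"
)
data, column_names, row_names = load_labeled_csv(sample)
```

Using `csv.reader` instead of `str.split(',')` also copes with quoted fields that contain commas, which may or may not matter for this particular file.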