0

I am trying to read csv file separated by spaces. Spacing in columns is not uniform and one column STN_NAME has characters which are separated by space. The data format is given below:

STN CODE      LAT        LON     Un_1      STN_NAME                     S_YR E_YR  Un_2
ACM00078861  17.1170  -61.7830   10.0    COOLIDGE FIELD (UA)            1947 1993  13896
AEM00041217  24.4333   54.6500   16.0    ABU DHABI INTERNATIONAL AIRPOR 1983 2018  36109
AEXUAE05467  25.2500   55.3700    4.0    SHARJAH                        1935 1942   2477
AFM00040911  36.7000   67.2000  378.0    MAZAR-I-SHARIF                 2010 2014   2179
AFM00040913  36.6667   68.9167  433.0    KUNDUZ                         2010 2013   4540

I used pd.read_fwf to read that csv file. File has 2788 rows but pd.read_fwf is reading only first 30 and last 30 rows of file.

import os
import csv
import pandas as pd

with open('radiosonde_stations.csv') as csvfile:
    
    data = pd.read_fwf(csvfile, infer_nrows=2790)
    query = data[(data['LAT'] >= -18) & (data['LAT'] <= 45) & (data['LON'] >= 48) & (data['LON'] <= 117)]
    print(data)
    print(query)
    file = 'southeast_asia_stnlist.csv'
    with open(os.path.join('/home/DATA/RADIOSONDE/', file), 'a') as output:
        writer = csv.writer(output)
        writer.writerow([query])

It gives the output in following form(It printed first 30 and last 30 rows only)

25    AGM00060571  31.5000   -2.2500   811.0                          BECHAR   
26    AGM00060580  31.9333    5.4000   152.0                         OUARGLA   
27    AGM00060590  30.5670    2.8670   397.0                        EL-GOLEA   
28    AGM00060602  30.1330   -2.1670   505.0                      BENI-ABBES   
29    AGM00060607  29.2500    0.2830   312.0                        TIMIMOUN   
          ...      ...       ...     ...                             ...   
2758  ZZV00ASEU02 -98.8888 -998.8888  -998.8                          ASEU02   
2759  ZZV00ASEU03 -98.8888 -998.8888  -998.8                          ASEU03   
2760  ZZV00ASEU04 -98.8888 -998.8888  -998.8                          ASEU04   

I want to apply conditions on lat & lon columns and write the output in another file. But while writing to another csv file it is writing in same way, as it's displayed above.(first 30 lines then ---- then last 30 lines). How can I write all the data completely in csv file.

Thanks!

Ajay
  • 320
  • 2
  • 11
  • `Spacing in columns is not uniform` - is there a specification for the file format? Is each column defined as starting at a specific character position? If you don't have control over the file content the *makers* probably do have specs and maybe they documented it or will provide it. – wwii Jan 01 '21 at 16:00
  • What do you get with `print(len(open('southeast_asia_stnlist.csv').readlines()))` - is it just how the data is displayed? what happens with `with open('southeast_asia_stnlist.csv') as f: for line in f: print(line.strip())`? How about `print(data.shape)` or `print(query.shape)`? – wwii Jan 01 '21 at 16:06
  • 1
    Try `print(query.to_string())`. – wwii Jan 01 '21 at 16:11
  • Does [Pandas: Setting no. of max rows](https://stackoverflow.com/questions/16424493/pandas-setting-no-of-max-rows) answer your question? Or [Pretty-print an entire Pandas Series / DataFrame](https://stackoverflow.com/questions/19124601/pretty-print-an-entire-pandas-series-dataframe)?? – wwii Jan 01 '21 at 16:28

1 Answers1

1

To write in csv, you can use a direct .to_csv() function of the pandas library.

import os
import csv
import pandas as pd

with open('radiosonde_stations.csv') as csvfile:
    
    data = pd.read_fwf(csvfile, infer_nrows=2790)
    query = data[(data['LAT'] >= -18) & (data['LAT'] <= 45) & (data['LON'] >= 48) & (data['LON'] <= 117)]
    print(data)
    print(query)
    file = 'southeast_asia_stnlist.csv'
    query.to_csv(file)

This can make your task easier.

Thanks.

srishtigarg
  • 1,106
  • 10
  • 24