
So far I have loaded all the parcel tables (with geometry information) for Alaska into PostgreSQL. The tables were originally stored in dump format. Now I want to convert each table in Postgres to a shapefile through the Windows cmd interface using ogr2ogr.

My command looks something like this:

ogr2ogr -f "ESRI Shapefile" "G:\...\Projects\Dataset\Parcel\test.shp" PG:"dbname=parceldb host=localhost port=5432 user=postgres password=postgres" -sql "SELECT * FROM ak_fairbanks"

However, the system keeps returning this error:

Unable to open datasource
PG:dbname='parceldb' host='localhost' port='5432' user='postgres' password='postgres'
with the following drivers.
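
A quick way to check whether GDAL can reach the database at all is to point ogrinfo (which ships with GDAL alongside ogr2ogr) at the same connection string:

ogrinfo PG:"dbname=parceldb host=localhost port=5432 user=postgres password=postgres"

If this lists the tables, the connection string itself is fine and the problem is likely in how the shell passes the arguments.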

user7590842

2 Answers


There is also the pgsql2shp option. For this you need to have the utility installed on your system.

The command to follow for this conversion is:

pgsql2shp -u <username> -h <hostname> -P <password> -p <port> -f <file path to save shapefile> <database> [<schema>.]<table_name>
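
Using the connection details from the question (and assuming the table lives in the public schema), the call would look like:

pgsql2shp -u postgres -h localhost -P postgres -p 5432 -f ak_fairbanks.shp parceldb public.ak_fairbanks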

This command has other options as well, which are described in the pgsql2shp documentation.

Bharti Mohane
  • But what if I want to convert multiple tables to multiple shapefiles? The current code can successfully convert only one table to one shapefile at a time. – user7590842 Jun 13 '18 at 17:17
  • For exporting multiple tables to shapefiles, you can use pgsql2shp-gui, which is available with PostgreSQL. – Bharti Mohane Jun 14 '18 at 05:02
  • How did you manage to merge a large amount of the data? I am getting this error: Failed to write shape object. File size cannot reach 4294967068 + 248. Error writing shape 31754843 – KFC Jan 06 '21 at 08:28

Exploring this case based on the comments on the other answer, I decided to share my Bash scripts and ideas.

Exporting multiple tables

To export many tables from a specific schema, I use the following script.

#!/bin/bash
source ./pgconfig
export PGPASSWORD=$password

# To filter by table name, set the table names in the FILTER variable below
# and uncomment the line by removing the # character.
# FILTER=("table_name_a" "table_name_b" "table_name_c")
#
# Set the output directory
OUTPUT_DATA="/tmp/pgsql2shp/$database"
#
#
# Remove the Shapefiles after ZIP
RM_SHP="yes"

# Define where pgsql2shp is and format the base command
PG_BIN="/usr/bin"
PG_CON="-d $database -U $user -h $host -p $port"

# creating output directory to put files
mkdir -p "$OUTPUT_DATA"

SQL_TABLES="select table_name from information_schema.tables where table_schema = '$schema'"
SQL_TABLES="$SQL_TABLES and table_type = 'BASE TABLE' and table_name != 'spatial_ref_sys';"

TABLES=($($PG_BIN/psql $PG_CON -t -c "$SQL_TABLES"))

# export the result of a query to a shapefile, then pack the shapefile parts into one ZIP
export_shp(){
    SQL="$1"
    TB="$2"
    pgsql2shp -f "$OUTPUT_DATA/$TB" -h "$host" -p "$port" -u "$user" "$database" "$SQL"
    zip -j "$OUTPUT_DATA/$TB.zip" "$OUTPUT_DATA/$TB".{shp,shx,prj,dbf,cpg}
}

for TABLE in "${TABLES[@]}"
do
    DATA_QUERY="SELECT * FROM $schema.$TABLE"
    SHP_NAME="$TABLE"

    if [[ ${#FILTER[@]} -gt 0 ]]; then
        echo "Has filter by table name"
        if [[ " ${FILTER[@]} " =~ " ${TABLE} " ]]; then
            export_shp "$DATA_QUERY" "$SHP_NAME"
        fi
    else
        export_shp "$DATA_QUERY" "$SHP_NAME"
    fi
    # remove intermediate files
    if [[ "$RM_SHP" = "yes" ]]; then
        rm -f $OUTPUT_DATA/$SHP_NAME.{shp,shx,prj,dbf,cpg}
    fi

done
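
Assuming the script is saved as export_tables.sh next to the pgconfig file (both file names are illustrative), it runs without arguments and leaves one ZIP per table in the output directory:

chmod +x export_tables.sh
./export_tables.sh
ls /tmp/pgsql2shp/my_database
# table_name_a.zip  table_name_b.zip  ...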

Splitting data into multiple files

To avoid the problem of pgsql2shp failing to write data for large tables (the shapefile format cannot grow past roughly 4 GB, which is exactly the limit reported in the comments on the other answer), we can split the data into multiple files using a paging strategy. In Postgres we can use LIMIT, OFFSET and ORDER BY for paging.
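
For example, with a page size of 10,000 the first two pages would be fetched like this (my_schema.my_table and the id column stand in for a real table and its primary key):

SELECT * FROM my_schema.my_table ORDER BY id OFFSET 0 LIMIT 10000;
SELECT * FROM my_schema.my_table ORDER BY id OFFSET 10000 LIMIT 10000;

The ORDER BY on a unique column is what keeps the pages stable between queries.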

The example script below applies this method; it assumes that each table has a primary key, which is used to sort the data.

#!/bin/bash
source ./pgconfig
export PGPASSWORD=$password

# To filter by table name, set the table names in the FILTER variable below
# and uncomment the line by removing the # character.
# FILTER=("table_name_a" "table_name_b" "table_name_c")
#
# Set the output directory
OUTPUT_DATA="/tmp/pgsql2shp/$database"
#
#
# Remove the Shapefiles after ZIP
RM_SHP="yes"

# Define where pgsql2shp is and format the base command
PG_BIN="/usr/bin"
PG_CON="-d $database -U $user -h $host -p $port"

# creating output directory to put files
mkdir -p "$OUTPUT_DATA"

SQL_TABLES="select table_name from information_schema.tables where table_schema = '$schema'"
SQL_TABLES="$SQL_TABLES and table_type = 'BASE TABLE' and table_name != 'spatial_ref_sys';"

TABLES=($($PG_BIN/psql $PG_CON -t -c "$SQL_TABLES"))

# export the result of a query to a shapefile, then pack the shapefile parts into one ZIP
export_shp(){
    SQL="$1"
    TB="$2"
    pgsql2shp -f "$OUTPUT_DATA/$TB" -h "$host" -p "$port" -u "$user" "$database" "$SQL"
    zip -j "$OUTPUT_DATA/$TB.zip" "$OUTPUT_DATA/$TB".{shp,shx,prj,dbf,cpg}
}

for TABLE in "${TABLES[@]}"
do
    GET_PK="SELECT a.attname "
    GET_PK="${GET_PK}FROM pg_index i "
    GET_PK="${GET_PK}JOIN pg_attribute a ON a.attrelid = i.indrelid AND a.attnum = ANY(i.indkey) "
    GET_PK="${GET_PK}WHERE i.indrelid = 'test'::regclass AND i.indisprimary"
    PK=($($PG_BIN/psql $PG_CON -t -c "$GET_PK"))
    MAX_ROWS=$($PG_BIN/psql $PG_CON -t -c "SELECT COUNT(*) FROM $schema.$TABLE")
    LIMIT=10000   # page size: rows per output shapefile
    OFFSET=0
    # base query
    DATA_QUERY="SELECT * FROM $schema.$TABLE"
    # continue until all rows have been fetched
    while [ $OFFSET -lt $MAX_ROWS ]
    do
        DATA_QUERY_P="$DATA_QUERY ORDER BY $PK OFFSET $OFFSET LIMIT $LIMIT"
        OFFSET=$(( OFFSET+LIMIT ))
        SHP_NAME="${TABLE}_${OFFSET}"
        if [[ ${#FILTER[@]} -gt 0 ]]; then
            echo "Has filter by table name"
            if [[ " ${FILTER[@]} " =~ " ${TABLE} " ]]; then
                export_shp "$DATA_QUERY_P" "$SHP_NAME"
            fi
        else
            export_shp "$DATA_QUERY_P" "$SHP_NAME"
        fi
        # remove intermediate files
        if [[ "$RM_SHP" = "yes" ]]; then
            rm -f $OUTPUT_DATA/$SHP_NAME.{shp,shx,prj,dbf,cpg}
        fi
    done
done
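
Each page is written to its own ZIP, named after the offset at which the page ends: my_table_10000.zip, my_table_20000.zip, and so on. Note that OFFSET-based paging re-scans all skipped rows on every page; for very large tables, keyset pagination (WHERE pk > last_seen ORDER BY pk LIMIT n) would be faster, but the script keeps the simpler OFFSET form.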

Common config file

Configuration file for PostgreSQL connection used in both examples (pgconfig):

user="postgres"
host="my_ip_or_hostname"
port="5432"
database="my_database"
schema="my_schema"
password="secret"

Another strategy is to choose GeoPackage as the output format: it supports much larger files than a shapefile, stays portable across operating systems, and is well supported by GIS software.

ogr2ogr -f GPKG output_file.gpkg PG:"host=my_ip_or_hostname user=postgres dbname=my_database password=secret" -sql "SELECT * FROM my_schema.my_table"
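
Because a GeoPackage can hold many layers, a whole schema can be exported in a single call: omit -sql so that ogr2ogr copies every layer it finds, and use the schemas option of GDAL's PG driver to restrict the export to one schema:

ogr2ogr -f GPKG output_file.gpkg PG:"host=my_ip_or_hostname user=postgres dbname=my_database password=secret schemas=my_schema"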


André Carvalho