0

this is more of a SQL question than a PostGIS question, but I'm getting stuck again :(

I have a table called referred with id numbers in the "from" and "to" columns. I want to calculate the distance between ALL these id numbers based on their zip code.

There is a separate reference table called doc which contains the id number in column "NPI" and zip code in column "Provider Business Mailing Address Postal Code" and a separate geo table called zctas which has zip code column as zcta and geom column.

For example, this query works fine:

SELECT z.zcta As zip1, 
       z2.zcta As zip2, 
       ST_Distance(z.geom,z2.geom) As thedistance 
  FROM zctas z, 
       zctas z2 
 WHERE z2.zcta = '60611' 
       AND z.zcta='19611';

One catch is that the "Provider Business Mailing Address Postal Code" should = left("Provider Business Mailing Address Postal Code", 5).

I'm getting stuck on JOIN-ing the 2 zip codes from the reference table in this one query.

Sample table:

referred table:

    from    |     to     | count 
------------+------------+-------
 1174589766 | 1538109665 |   108
 1285653204 | 1982604013 |    31

desired output:

    from    |     to     | count | distance
------------+------------+----------------
 1174589766 | 1538109665 |   108 | 53434
 1285653204 | 1982604013 |    31 | 34234

\d+

Table "public.zctas"
          Column      |          Type          | Modifiers | Storage  | Stats target | Description 
    ------------------+------------------------+-----------+----------+--------------+-------------
     state            | character(2)           |           | extended |              | 
     zcta             | character(5)           |           | extended |              | 
     junk             | character varying(100) |           | extended |              | 
     population_tot   | bigint                 |           | plain    |              | 
     housing_tot      | bigint                 |           | plain    |              | 
     water_area_meter | double precision       |           | plain    |              | 
     land_area_meter  | double precision       |           | plain    |              | 
     water_area_mile  | double precision       |           | plain    |              | 
     land_area_mile   | double precision       |           | plain    |              | 
     latitude         | double precision       |           | plain    |              | 
     longitude        | double precision       |           | plain    |              | 
     thepoint_lonlat  | geometry(Point,4269)   |           | main     |              | 
     thepoint_meter   | geometry(Point,32661)  | not null  | main     |              | 
     geom             | geometry(Point,32661)  |           | main     |              | 
    Indexes:
        "idx_zctas_thepoint_lonlat" gist (thepoint_lonlat)
        "idx_zctas_thepoint_meter" gist (thepoint_meter) CLUSTER

                              Table "public.referred"
 Column |         Type          | Modifiers | Storage  | Stats target | Description 
--------+-----------------------+-----------+----------+--------------+-------------
 from   | character varying(25) |           | extended |              | 
 to     | character varying(25) |           | extended |              | 
 count  | integer               |           | plain    |              | 
Has OIDs: no

        Table "public.doc"
                            Column                            |          Type          | Modifiers | Storage  | Stats target | Description 
--------------------------------------------------------------+------------------------+-----------+----------+--------------+-------------
 NPI                                                          | character varying(255) |           | extended |              | 
 Entity Type Code                                             | character varying(255) |           | extended |              | 
 Replacement NPI                                              | character varying(255) |           | extended |              | 
 Employer Identification Number (EIN)                         | character varying(255) |           | extended |              | 
 Provider Organization Name (Legal Business Name)             | character varying(255) |           | extended |              | 
 Provider Last Name (Legal Name)                              | character varying(255) |           | extended |              | 
 Provider First Name                                          | character varying(255) |           | extended |              | 
 Provider Middle Name                                         | character varying(255) |           | extended |              | 
 Provider Name Prefix Text                                    | character varying(255) |           | extended |              | 
 Provider Name Suffix Text                                    | character varying(255) |           | extended |              | 
 Provider Credential Text                                     | character varying(255) |           | extended |              | 
 Provider Other Organization Name                             | character varying(255) |           | extended |              | 
 Provider Other Organization Name Type Code                   | character varying(255) |           | extended |              | 
 Provider Other Last Name                                     | character varying(255) |           | extended |              | 
 Provider Other First Name                                    | character varying(255) |           | extended |              | 
 Provider Other Middle Name                                   | character varying(255) |           | extended |              | 
 Provider Other Name Prefix Text                              | character varying(255) |           | extended |              | 
 Provider Other Name Suffix Text                              | character varying(255) |           | extended |              | 
 Provider Other Credential Text                               | character varying(255) |           | extended |              | 
 Provider Other Last Name Type Code                           | character varying(255) |           | extended |              | 
g(255) |           | extended |              | 
 Provider Second Line Business Mailing Address                | character varying(255) |           | extended |              | 
 Provider Business Mailing Address City Name                  | character varying(255) |           | extended |              | 
 Provider Business Mailing Address State Name                 | character varying(255) |           | extended |              | 
 Provider Business Mailing Address Postal Code                | character varying(255) |           | extended |         . . . . other columns not really needed.

Thanks!!!!

user2923767
  • 567
  • 1
  • 4
  • 13

1 Answers1

1

This should be relatively straightforward.

Assuming the NPIs are actually all the same length in doc and referred, you can join those tables quite easily:

SELECT ad."Provider Business Mailing Address Postal Code" as a_zip,
       bd."Provider Business Mailing Address Postal Code" as b_zip,
       r."count"
  FROM referred r
           LEFT JOIN doc ad ON r."from" = ad."NPI"
           LEFT JOIN doc bd ON r."from" = bd."NPI";

Obviously, adjust this join based on careful analysis of the NPI and from/to fields in your data. Add trim or left method calls within the join if necessary -- the most important thing is that the JOIN condition be on comparable data.

Now, going from this to your original query to find a distance is trivial:

SELECT ad."Provider Business Mailing Address Postal Code" as a_zip,
       bd."Provider Business Mailing Address Postal Code" as b_zip,
       r."count",
       ST_Distance(az.geom,bz.geom) As thedistance
  FROM referred r
           LEFT JOIN doc ad ON r."from" = ad."NPI"
           LEFT JOIN doc bd ON r."from" = bd."NPI"
           LEFT JOIN zctas az 
               ON az.zcta = left(ad."Provider Business Mailing Address Postal Code",5)
           LEFT JOIN zctas bz
               ON bz.zcta = left(bd."Provider Business Mailing Address Postal Code",5)

This is just one construction that should work, many others are possible. This particular construction will ensure that every entry in referred is represented, even if the NPI doesn't match to an entry in the doc table, or a zipcode can't be matched against the zctas table.

On the flip side, if there exists more than one entry for an NPI in the doc table, any referred entry that mentions this duplicated NPI will also be duplicated.

Similarly, if there is more than one entry in zctas for a particular zip code (zcta), you would see duplicates of referred rows.

That's how LEFT JOIN works, but I figured it was worth putting in the warning, as Provider data is typically full of duplicates against NPI, and there are often duplicate zip codes in zip code lookup lists as some zip codes cross state lines.

ProgrammerDan
  • 871
  • 7
  • 17