1

I want to convert an excel file into a geojson file. The dictionary is like:

[
{"NoAdresse": 42537006584, "NoUsager": 42537000086, "LatEffective": 45.83675, "LongDebut": 4.91956, "LatDebut": 45.75529, "LongEffective": 4.84574, "IdVehicule": "246Veh", "LatArrivee": 45.83492, "NoDemande": 42537000003, "LongArrivee": 4.84762}, 
{"NoAdresse": 42537007718, "NoUsager": 42537002720, "LatEffective": 45.83955, "LongDebut": 4.84574, "LatDebut": 45.83675, "LongEffective": 4.83098, "IdVehicule": "246Veh", "LatArrivee": 45.83935, "NoDemande": 42537000004, "LongArrivee": 4.83084}, 
{"NoAdresse": 42537005803, "NoUsager": 42537002424, "LatEffective": 45.98730, "LongDebut": 4.83098, "LatDebut": 45.83955, "LongEffective": 4.72695, "IdVehicule": "246Veh", "LatArrivee": 45.98174, "NoDemande": 42537000006, "LongArrivee": 4.73942}, 
{"NoAdresse": 42537005803, "NoUsager": 42537003576, "LatEffective": 45.98730, "LongDebut": 4.83098, "LatDebut": 45.83955, "LongEffective": 4.72695, "IdVehicule": "246Veh", "LatArrivee": 45.98174, "NoDemande": 42537000005, "LongArrivee": 4.73942}, 
{"NoAdresse": 42537004215, "NoUsager": 42537003576, "LatEffective": 45.93778, "LongDebut": 4.72695, "LatDebut": 45.9873, "LongEffective": 4.62676, "IdVehicule": "246Veh", "LatArrivee": 45.93784, "NoDemande": 42537000005, "LongArrivee": 4.62625}, 
{"NoAdresse": 42537004215, "NoUsager": 42537002424, "LatEffective": 45.93778, "LongDebut": 4.72695, "LatDebut": 45.9873, "LongEffective": 4.62676, "IdVehicule": "246Veh", "LatArrivee": 45.93784, "NoDemande": 42537000006, "LongArrivee": 4.62625}, 
{"NoAdresse": 42537004215, "NoUsager": 42537002720, "LatEffective": 45.93778, "LongDebut": 4.72695, "LatDebut": 45.9873, "LongEffective": 4.62676, "IdVehicule": "246Veh", "LatArrivee": 45.93784, "NoDemande": 42537000004, "LongArrivee": 4.62625}, 
{"NoAdresse": 42537004215, "NoUsager": 42537000086, "LatEffective": 45.93778, "LongDebut": 4.72695, "LatDebut": 45.9873, "LongEffective": 4.62676, "IdVehicule": "246Veh", "LatArrivee": 45.93784, "NoDemande": 42537000003, "LongArrivee": 4.62625}, 
{"NoAdresse": 42537000007, "NoUsager": 42537002425, "LatEffective": 45.72941, "LongDebut": 4.77845, "LatDebut": 45.77335, "LongEffective": 4.88396, "IdVehicule": "164Veh", "LatArrivee": 45.72815, "NoDemande": 42537000070, "LongArrivee": 4.88241}, 
{"NoAdresse": 42537000007, "NoUsager": 42537002425, "LatEffective": 45.69349, "LongDebut": 4.88396, "LatDebut": 45.72941, "LongEffective": 4.94466, "IdVehicule": "164Veh", "LatArrivee": 45.69429, "NoDemande": 42537000070, "LongArrivee": 4.94216}]

And I use this code to achieve this:

import json

from xlrd import open_workbook

book = open_workbook('forum.xlsx')
sheet = book.sheet_by_index(0)

keys = [sheet.cell(0,col_index).value for col_index in xrange(sheet.ncols)]

dict_list = []
for row_index in xrange(1,sheet.nrows):
    d = {keys[col_index]: sheet.cell(row_index,col_index).value
        for col_index in xrange(sheet.ncols)}
    dict_list.append(d)

j = json.dumps(dict_list)

with open('data.json','w') as f:
    f.write(j)

then I want to make it into geojson file in the form:

{ "type": "FeatureCollection",
"features": [{ 
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[LatDebut, LongDebut],[LatEffective,LongEffective]]
    },
    "properties": {
        "NoAdresse": "XXX",
        "NoUsager": "XXX",
        "NoDemand":"XXX",
        "IdVehicule":"XXX"
    }
}, { 
...
}]
}

I don't know how to do so and if there is another way to directly convert from excel to geojson file. Besides, I want to add a property "Tour", it changes every time the "IdVehicule" changes. I know it's so much to ask but I'm stuck for so long and any help would be appreciated.

Thanks

ch36r5s
  • 119
  • 2
  • 13
  • Your question might profit from more detail and a small sample with real "numbers" to understand where one to one and where "aggregate" mappings are needed. Basically, your above code seems to produce a data.json file, but you want "some" geojson mapped json file. Right? So I suggest, you do the things with dict_list directly and instead of dunping json serialization of it to a string and then writing it to file, directly dump the json serialization to file. So please add a concrete sample input dict part a (and b? ...) shall map to geojson modelled part / whole z. Thanks – Dilettant Jun 28 '16 at 20:27
  • In fact, that's real numbers. It's number about clients and their coordinates. And the first part is already the dictionary. and how to directly dump it to file? Besides the coordinates part, all the others will be into the "properties" in the geojson – ch36r5s Jun 29 '16 at 07:52
  • Ok, if I understand this now correctly, I hope my offered answer helps. If I misunderstand the problem, please comment there. Thanks. – Dilettant Jun 29 '16 at 10:38

2 Answers2

1

So if you know how to "massage" the dict into your wanted final structure - moving all the "other elements" into some properties member - then I think it is as simple as

# work on dict_list to produce some dict_in_geojson_form
# ... then:
with open('data.json','w') as f:
    json.dump(dict_in_geojson_form, f)

In case the serialization is still unclear, you might want to look also at Python dump dict to json file or How do I write JSON data to a file in Python? ...

Advanced Update: Per amended request some simple transformations and the tour enumeration added. Please see below sample code (split in different code boxes to not have to scroll ...:

#! /usr/bin/env python
"""Map a specific stream to seom GeoJSON target structure and stream the
output in chunks as elements are complete (any new vehicle id increments
the tour label which works like an enumeration in the properties of the
features GeoJSON representation)."""
from __future__ import print_function

import copy
import json
import sys

# Some declarations to confine literals and prepare structure

NO_ADDRESS = 'NoAdresse'
NO_USAGE = 'NoUsager'
ID_VEHICLE = 'IdVehicule'
NO_DEMAND = 'NoDemande'

TOUR_LABEL = 'Tour'

PROPERTY_KEYS = (NO_ADDRESS, NO_USAGE, ID_VEHICLE, NO_DEMAND)

LAT_DEBUT = 'LatDebut'
LONG_DEBUT = 'LongDebut'
LAT_EFFECTIVE = 'LatEffective'
LONG_EFFECTIVE = 'LongEffective'

COORD_KEYS_DEBUT = (LAT_DEBUT, LONG_DEBUT)
COORD_KEYS_EFFECTIVE = (LAT_EFFECTIVE, LONG_EFFECTIVE)

PROPERTIES_KEY = 'properties'
GEOMETRY_KEY = 'geometry'
COORDINATES_KEY = 'coordinates'
FEATURES_KEY = 'features'

GEOJSON_FRAME_PREFIX = """{
"type": "FeatureCollection",
"features": [
"""

FEATURE_TEMPLATE = {
    "type": "Feature",
    GEOMETRY_KEY: {
        "type": "LineString",
        "coordinates": []
    },
    "properties": {
        NO_ADDRESS: None,
        NO_USAGE: None,
        NO_DEMAND: None,
        ID_VEHICLE: None,
        TOUR_LABEL: None
    }}

GEOJSON_LIST_SEP = ',\n'
GEOJSON_FRAME_POSTFIX = ']\n}\n'

# A simple feature emitting source mock

def event_source():
    """Sample input line source generator. Might anything yielding
    python dictionaries matching "this questions" event specification."""

    event_seq = [
        {NO_ADDRESS: 42537006584, NO_USAGE: 42537000086,
         LAT_EFFECTIVE: 45.83675, LONG_DEBUT: 4.91956,
         LAT_DEBUT: 45.75529, LONG_EFFECTIVE: 4.84574,
         ID_VEHICLE: "246Veh", "LatArrivee": 45.83492,
         NO_DEMAND: 42537000003, "LongArrivee": 4.84762},
        {NO_ADDRESS: 42537007718, NO_USAGE: 42537002720,
         LAT_EFFECTIVE: 45.83955, LONG_DEBUT: 4.84574,
         LAT_DEBUT: 45.83675, LONG_EFFECTIVE: 4.83098,
         ID_VEHICLE: "246Veh", "LatArrivee": 45.83935,
         NO_DEMAND: 42537000004, "LongArrivee": 4.83084},
        # ...
        {NO_ADDRESS: 42537000007, NO_USAGE: 42537002425,
         LAT_EFFECTIVE: 45.69349, LONG_DEBUT: 4.88396,
         LAT_DEBUT: 45.72941, LONG_EFFECTIVE: 4.94466,
         ID_VEHICLE: "164Veh", "LatArrivee": 45.69429,
         NO_DEMAND: 42537000070, "LongArrivee": 4.94216}]

    for event in event_seq:
        yield event

# The context free transformations:

def feature_from(event):
    """Transform event to feature, applying the "business" rules."""
    feature = copy.deepcopy(FEATURE_TEMPLATE)
    for property_key in PROPERTY_KEYS:
        feature[PROPERTIES_KEY][property_key] = event[property_key]
    coords_debut = [event[k] for k in COORD_KEYS_DEBUT]
    coords_effective = [event[k] for k in COORD_KEYS_EFFECTIVE]
    feature[GEOMETRY_KEY][COORDINATES_KEY].append(coords_debut)
    feature[GEOMETRY_KEY][COORDINATES_KEY].append(coords_effective)
    return feature

# The separated emitter of features (overkill here, but ...)

def feature_gen(events):
    """Generator creates features from events (might be a good place
    to hook into validty checks in real lif processing)."""
    for event in events:
        yield feature_from(event)

# The context aware generator for feature sequences sharing a tour

def tour_gen(features):
    """Generator emits the feature in chunks per complete tour as detected
    by a change in vehicle id."""
    id_vehicle_active = None
    tour_enumeration = None
    for feature in features:
        id_vehicle_received = feature[PROPERTIES_KEY][ID_VEHICLE]
        if id_vehicle_active is None:
            id_vehicle_active = id_vehicle_received
            tour_enumeration = 1
            tour = []
        if id_vehicle_active != id_vehicle_received:
            yield tour
            tour = []
            tour_enumeration += 1
            id_vehicle_active = id_vehicle_received
            feature[PROPERTIES_KEY][TOUR_LABEL] = tour_enumeration
            tour.append(feature)
        else:
            feature[PROPERTIES_KEY][TOUR_LABEL] = tour_enumeration
            tour.append(feature)
    if tour:
        yield tour

# Mock function to channel the output to stdout or file ...

def geojson_out(text, stream=sys.stdout):
    """Expected JSON text is output here."""
    stream.write(text)

# Sample processing logic driving the transformation

def main():
    """Poor man's streaming falls back on hardcoded GeoJSON "frame"
    string as pre and post fix to the feature stream. the latter
    elements are accumulated in chunks carrying common "tour"
    enumeration label.

    The frame structure in Python lingo is:
        geo_dict = {"type": "FeatureCollection", "features": []}

    The features will be injected in the list and in this implementation
    need some stateful separator injection hack, to yield valid JSON
    (which does not allow trailing comma after last array elememt).
    """

# ... here as sample writing to a hard coded file path

    with open('yumyum.geojson', 'wt') as f_out:
        geojson_out(GEOJSON_FRAME_PREFIX, stream=f_out)
        json_array_nanny_needed = False  # Semi-auto, means semi-manual =(
        for features in tour_gen(feature_gen(event_source())):
            for feature in features:
                if json_array_nanny_needed:
                    geojson_out(GEOJSON_LIST_SEP, stream=f_out)
                geojson_out(json.dumps(feature, sort_keys=True), stream=f_out)
                json_array_nanny_needed = True  # HACK A DID ACK

        geojson_out(GEOJSON_FRAME_POSTFIX, stream=f_out)

if __name__ == '__main__':
    main()

The idea in processing should be in the docstrings ...

One could also collect/aggregate/filter the coordinates per ID of vehicle, or ... it's Python a general purpose programming language ;-)

On my machine the output inside yumyum.txt is:

$ cat yumyum.geojson 
{
"type": "FeatureCollection",
"features": [
{"geometry": {"coordinates": [[45.75529, 4.91956], [45.83675, 4.84574]], "type": "LineString"}, "properties": {"IdVehicule": "246Veh", "NoAdresse": 42537006584, "NoDemande": 42537000003, "NoUsager": 42537000086, "Tour": 1}, "type": "Feature"},
{"geometry": {"coordinates": [[45.83675, 4.84574], [45.83955, 4.83098]], "type": "LineString"}, "properties": {"IdVehicule": "246Veh", "NoAdresse": 42537007718, "NoDemande": 42537000004, "NoUsager": 42537002720, "Tour": 1}, "type": "Feature"},
{"geometry": {"coordinates": [[45.72941, 4.88396], [45.69349, 4.94466]], "type": "LineString"}, "properties": {"IdVehicule": "164Veh", "NoAdresse": 42537000007, "NoDemande": 42537000070, "NoUsager": 42537002425, "Tour": 2}, "type": "Feature"}]
}
Community
  • 1
  • 1
Dilettant
  • 3,267
  • 3
  • 29
  • 29
  • the "massage", in fact is m problem. I don't know how to moving the elements. and how to make two 'keys' in the dict to a coordinate point? – ch36r5s Jun 29 '16 at 11:53
  • I assume that I don't have to type everything manually, but the result returns null.{} The "Tour" counts 1 when the "IdVehicule" = 246 and when the "IdVehicule" = 164 , the "Tour" counts 2. that's the general idea of it. – ch36r5s Jun 30 '16 at 09:23
  • So "Tour" would be a feature count per vehicle id, but where to place it? As info from processing script "printed" or injected into (and where exactly) the GeoJSON? – Dilettant Jun 30 '16 at 10:48
  • it's used to identify and different ride (or journey). I think it will be in the "properties" part as info. – ch36r5s Jun 30 '16 at 13:30
  • @ch36r5s Please check if my answer now matches the updated/amended question. If so, then please mark it as accepted else comment. Thanks. – Dilettant Jun 30 '16 at 16:26
  • I tested it. Most of the parts worked perfectly well. There is a small problem with the "Tour". In fact, it works only in the "Tour 1". From "Tour 2", it adds up every time a new list coming in. – ch36r5s Jul 01 '16 at 14:39
  • Yup. The variable with the current vehicle id should be updated after every detected change. You already spotted that, right? ;-) I updated the code above by that line. – Dilettant Jul 01 '16 at 14:53
0

Here's package [geojson][1]:https://github.com/frewsxcv/python-geojson.

allen
  • 19
  • 4