Managing Data Sources
DataSourcesConfig is an XML tag, similar to the Stylesheet tag, that lets you externalize datasource definitions in an easy-to-manage format.
Datasources pose a number of challenges.
- Production and development environments often differ enough to require separate database connection parameters, paths, etc.
- Local development is often easier with shapefiles - and you don't want 10 duplicates of a 400GB file floating around.
- A common technique for simplifying Datasources and managing the environment changes is to use XML entities. This works, but isn't easy.
- Sharing stylesheets is easy, but sharing datasource definitions is very cumbersome.
- At a certain point brains start exploding as an MML grows too large.
Sold already? You can quickly convert your existing MML:
$ cascadenik-extract-dscfg.py existing.mml new.mml datasources.cfg
- Datasources, their parameters, and SRS are represented in an INI-style file, and each is given a name
- MML files declare the data source config files that define the datasources they use
- Layers associate themselves with a Datasource by declaring a source_name= attribute
- Cascadenik handles the rest
<Map>
<DataSourcesConfig>
# either inline or as a separate file using the src= attribute, like a Stylesheet tag
# name a data source, and define its parameters
[natural_earth_land_110m]
type = shape
file = http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/110m-land.zip
layer_srs = epsg:4326
</DataSourcesConfig>
<Stylesheet ... />
<!-- Reference the datasource by name in any number of layers -->
<Layer class="land" source_name="natural_earth_land_110m" />
</Map>
Below is a quick example:
# this is a comment
[DEFAULT] # this declares variables for the document
postgis_dbname = gis
[this_is_a_datasource_declaration] # and the <Datasource> parameters follow
dbname = %(postgis_dbname)s # this dereferences a variable
estimate_extent = false
port = 5432
table = (SELECT *, y(astext(way)) AS latitude
FROM planet_osm_point
WHERE (railway IN ('station', 'subway_entrance')
OR aeroway IN ('aerodrome', 'airport'))
AND name IS NOT NULL
ORDER BY z_order ASC, latitude DESC) AS rail_points
etc=etc...
For more detail on syntax, see Python's configparser module.
Declare variables in a [DEFAULT] section, then use them in values as %(variable_name)s:
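To see how such a file is read, here is a minimal sketch using Python 3's standard configparser module. It is not Cascadenik's own loading code, and the `rail_points` section is a shortened, hypothetical version of the example above. Note that continuation lines of a multi-line value such as `table` must be indented.

```python
import configparser

# Shortened, hypothetical version of the example config above.
CONFIG_TEXT = """
# this is a comment
[DEFAULT]
postgis_dbname = gis

[rail_points]
dbname = %(postgis_dbname)s
port = 5432
table = (SELECT *
    FROM planet_osm_point
    WHERE name IS NOT NULL) AS rail_points
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# [DEFAULT] is not reported as a section, so only datasources are listed.
datasource_names = parser.sections()

# items() returns the section's own keys plus the keys inherited from [DEFAULT].
rail_points = dict(parser.items("rail_points"))

# Indented continuation lines are joined with newlines; collapse them for display.
table_sql = " ".join(rail_points["table"].split())

print(datasource_names)
print(rail_points["dbname"])
print(table_sql)
```

A literal percent sign in a value (for example in a SQL LIKE pattern) has to be written as `%%`, because configparser treats `%` as the start of a variable reference.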
[DEFAULT]
natural_earth_110m_base_url = http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m
[natural_earth_land_110m]
file = %(natural_earth_110m_base_url)s/physical/110m-land.zip
[natural_earth_admin0_110m]
file = %(natural_earth_110m_base_url)s/cultural/110m-admin-0-countries.zip
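The substitution above can be checked with Python 3's configparser. This is a sketch rather than Cascadenik's implementation; the helper `own_parameters` is a hypothetical name, and it assumes that the [DEFAULT] variables themselves should not end up as datasource parameters.

```python
import configparser

CONFIG_TEXT = """
[DEFAULT]
natural_earth_110m_base_url = http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m

[natural_earth_land_110m]
file = %(natural_earth_110m_base_url)s/physical/110m-land.zip

[natural_earth_admin0_110m]
file = %(natural_earth_110m_base_url)s/cultural/110m-admin-0-countries.zip
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)


def own_parameters(parser, name):
    """Return a datasource's interpolated values without the [DEFAULT] variables."""
    variables = parser.defaults()
    return {key: value for key, value in parser.items(name) if key not in variables}


land = own_parameters(parser, "natural_earth_land_110m")
admin0 = own_parameters(parser, "natural_earth_admin0_110m")

print(land["file"])
print(admin0["file"])
```

Filtering by `parser.defaults()` matters because configparser exposes every [DEFAULT] key in every section, so the base URL variable would otherwise appear as a parameter of both datasources.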
Mapnik XML declares SRS values on the Layer element, yet they are by definition a property of the Datasource. Cascadenik will set a Layer's srs to a Datasource's layer_srs, if it is provided.
The compiler also accepts a shorthand notation for EPSG codes that Proj.4 recognizes.
[source1] # long hand
layer_srs = +proj=merc +a=637... # set to a proj.4 string
[source2] # short hand
layer_srs = epsg:4326 # specify 'epsg:...' and cascadenik will create a proj.4 string for it
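One way such a shorthand could be expanded is sketched below. The function name `expand_srs` is hypothetical, and the choice of Proj.4's `+init=epsg:NNNN` form is an assumption about the output, not a statement of what Cascadenik emits.

```python
def expand_srs(layer_srs):
    """Expand 'epsg:NNNN' shorthand to a Proj.4 string; pass other values through."""
    value = layer_srs.strip()
    if value.lower().startswith("epsg:"):
        # Assumption: the shorthand maps to Proj.4's '+init=epsg:NNNN' form.
        return "+init=" + value.lower()
    # Anything else is taken to be a full Proj.4 string already.
    return value


print(expand_srs("epsg:4326"))
print(expand_srs("+proj=longlat +datum=WGS84 +no_defs"))
```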
Sometimes you want to use a different data source, or at least a differently configured one. There are two mechanisms for this:
- Compile-time overrides - the typical case for development
- Permanent redefinition - explicitly redeclare a datasource in the MML after its original definition
Most config files will declare variables akin to postgis_host, postgis_dbname and shapedir in their [DEFAULT] section. These can be overridden at compile time by passing the --datasources-config= command-line option and pointing it to a file that redefines those values in its own [DEFAULT] section.
Further, you can redefine entire datasources, for example when you want to use a shapefile instead of a database. To do so, define a datasource with the same name in the same file in which you override the variables.
$ cat my.cfg
[DEFAULT]
shapedir = /opt/geodata
[processed_p]
type = shape
file = /data/processed_p.shp
layer_srs = epsg:900913
$ cascadenik-compile.py in.mml outdir/out.xml --datasources-config=my.cfg
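The effect of such an override file can be sketched with Python 3's configparser. This is an illustration, not Cascadenik's code: the base config, the `coastline` datasource, the path `/usr/share/geodata` and the function `load_datasources` are all hypothetical, and the sketch assumes that a datasource redefined in the override file replaces the original entirely.

```python
import configparser

# Hypothetical base config, as it might be referenced from the MML.
BASE_CFG = """
[DEFAULT]
shapedir = /usr/share/geodata

[coastline]
type = shape
file = %(shapedir)s/coastline.shp

[processed_p]
type = postgis
table = processed_p
"""

# Contents of my.cfg from the example above.
OVERRIDE_CFG = """
[DEFAULT]
shapedir = /opt/geodata

[processed_p]
type = shape
file = /data/processed_p.shp
layer_srs = epsg:900913
"""


def parse(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def load_datasources(base_text, override_text):
    base, override = parse(base_text), parse(override_text)
    # Variables from the override file win over the base file's variables.
    variables = {**base.defaults(), **override.defaults()}
    datasources = {}
    for parser in (base, override):
        for name in parser.sections():
            own_keys = [k for k in parser.options(name) if k not in parser.defaults()]
            # Assigning a fresh dict replaces any earlier definition wholesale.
            datasources[name] = {
                k: parser.get(name, k, vars=variables) for k in own_keys
            }
    return datasources


datasources = load_datasources(BASE_CFG, OVERRIDE_CFG)
print(datasources["coastline"]["file"])
print(datasources["processed_p"])
```

In this sketch the untouched `coastline` datasource picks up the new `shapedir`, while `processed_p` loses its PostGIS `table` parameter because the shapefile definition replaces it completely.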
DataSourcesConfig elements are processed sequentially to build up a full list of datasources. To redefine a datasource you must redeclare a complete replacement after the initial declaration, e.g.
<!-- This file defines [land_polygon] -->
<DataSourcesConfig src="master.cfg" />
<DataSourcesConfig>
[land_polygon] # this second definition will be used
...
</DataSourcesConfig>
<Layer source_name="land_polygon" />
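The sequential processing described above can be sketched as follows, using only Python's standard library. It handles inline DataSourcesConfig elements only (no src= attribute), the parameter values are made up for illustration, and `collect_datasources` is a hypothetical name rather than a Cascadenik function.

```python
import configparser
import textwrap
import xml.etree.ElementTree as ET

# Hypothetical MML with two inline definitions of the same datasource.
MML = """
<Map>
  <DataSourcesConfig>
    [land_polygon]
    type = postgis
    table = land
  </DataSourcesConfig>
  <DataSourcesConfig>
    [land_polygon]
    type = shape
    file = land.shp
  </DataSourcesConfig>
  <Layer source_name="land_polygon" />
</Map>
"""


def collect_datasources(root):
    """Walk DataSourcesConfig elements in document order; later definitions win."""
    datasources = {}
    for element in root.iter("DataSourcesConfig"):
        parser = configparser.ConfigParser()
        # Remove the XML indentation so section headers start at column 0.
        parser.read_string(textwrap.dedent(element.text or ""))
        for name in parser.sections():
            # A redeclared datasource replaces the earlier one completely.
            datasources[name] = dict(parser.items(name))
    return datasources


root = ET.fromstring(MML)
datasources = collect_datasources(root)
layer = root.find("Layer")
layer_source = datasources[layer.get("source_name")]
print(layer_source)
```

Because the second definition is a complete replacement, the `table` parameter from the first one does not carry over to the layer.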
Mapnik supports datasource templates in a Mapnik XML file: http://trac.mapnik.org/changeset/574. You declare templates in a similar way, by providing a template = value.
[DEFAULT] # declare these so they can be overridden easily
postgis_dbname = osm_belgium
postgis_user = gis
postgis_host = localhost
postgis_port = 5432
postgis_pass =
[postgis_conn_0]
type = postgis
user = %(postgis_user)s
dbname = %(postgis_dbname)s
estimate_extent = false
extent = -20037508,-19929239,20037508,19929239
host = %(postgis_host)s
layer_srs = epsg:900913
password = %(postgis_pass)s
port = %(postgis_port)s
[water_area]
template = postgis_conn_0
table = (SELECT *
FROM planet_osm_polygon
WHERE landuse IN ('reservoir', 'water')
OR "natural" IN ('lake', 'water', 'land')
OR waterway IN ('canal', 'riverbank', 'river')
ORDER BY z_order ASC) AS water
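Conceptually, a datasource with a template value inherits the template's parameters and adds or replaces its own. The sketch below shows that merge on plain dictionaries. It is an assumption about the semantics rather than Cascadenik's or Mapnik's code: it resolves a single level of templates only, and `resolve_templates` is a hypothetical name.

```python
# Already-interpolated parameters, shortened from the example above.
DATASOURCES = {
    "postgis_conn_0": {
        "type": "postgis",
        "host": "localhost",
        "port": "5432",
        "dbname": "osm_belgium",
        "layer_srs": "epsg:900913",
    },
    "water_area": {
        "template": "postgis_conn_0",
        "table": "(SELECT * FROM planet_osm_polygon) AS water",
    },
}


def resolve_templates(datasources):
    """Merge each datasource with the template it names (one level deep)."""
    resolved = {}
    for name, params in datasources.items():
        own = dict(params)
        template_name = own.pop("template", None)
        if template_name is not None:
            # The datasource's own values take precedence over the template's.
            own = {**datasources[template_name], **own}
        resolved[name] = own
    return resolved


resolved = resolve_templates(DATASOURCES)
print(resolved["water_area"])
```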
A script is provided that takes an existing MML file and writes two new files: a new MML file and a config file containing all the datasources defined in the original.
$ cascadenik-extract-dscfg.py existing.mml new.mml datasources.cfg