Watermap

2024 –

Drinking water, plus toilets and benches, on a 3D map using data from OpenStreetMap.

Problem

When traveling or hiking, I often look for fountains to refill my water bottle. Nature also sometimes calls at inopportune times.

Outcome

A website that displays drinking water sources and toilets across Europe. It is optimized for mobile: it loads quickly and lets users (optionally) see what's nearby.


Switzerland and many European countries are blessed with public drinking water sources. I usually have a reusable bottle with me when out and about, but sometimes it takes some searching to find a nearby fountain to fill up.

Continuing with my frontend mentorship streak, I decided to build a map myself.

I had a few goals in mind:

Keep it as simple as possible (both in terms of code and features).
Make it work well on mobile devices (primarily, make it load quickly).
Have zero operating costs.

How it works

There were several steps needed to get this working.

Get the data
Process the data
Optimize the data
Show the data on a map
Host the site

1. Get the data

OpenStreetMap is always a good place to start with this specific type of geospatial data.

OSM uses tags to describe map elements. I don't know of a way to download only the elements with specific tags from OSM, and having the most up-to-date information isn't required, so I looked at downloading all OSM data. A full dump is provided, but at 78GB, it's quite a lot of data to work with on this supposedly simple project.

Luckily, Geofabrik offers regional extracts of the OSM Planet data. The extract for Switzerland, where I live, is less than 500MB. This seemed like a good place to start.
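A minimal sketch of this download step, assuming Geofabrik's `<region>/<country>-latest.osm.pbf` URL pattern. The helper names `geofabrik_url` and `download_extract` are hypothetical, and the file is saved under the `switzerland.osm.pbf` name used in the rest of this post.

```python
import shutil
import urllib.request

GEOFABRIK_BASE = "https://download.geofabrik.de"


def geofabrik_url(region: str, country: str) -> str:
    """Build the URL of a country extract (assumed URL pattern)."""
    return f"{GEOFABRIK_BASE}/{region}/{country}-latest.osm.pbf"


def download_extract(url: str, destination: str) -> None:
    """Stream the extract to disk without loading it all into memory."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)


if __name__ == "__main__":
    url = geofabrik_url("europe", "switzerland")
    print(url)
    # Uncomment to actually download the extract (several hundred MB):
    # download_extract(url, "switzerland.osm.pbf")
```

The download call is left commented out so the snippet can be run without fetching the whole file.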

2. Process the data

Once I had a nice switzerland.osm.pbf, I then tried to figure out how to extract only the data I needed (drinking water, and later toilets and benches). Osmium is a powerful CLI tool to process .pbf files and extract data based on tags.

osmium tags-filter switzerland.osm.pbf drinking_water=yes amenity=drinking_water -o drinking_water.pbf

The tags-filter command lets you pass one or more tag expressions to extract. Here, Osmium keeps any element that has a drinking_water key with the value yes, or an amenity key with the value drinking_water. OSM tags are user-defined, but amenity=drinking_water and drinking_water=yes seem to be the two used in practice. I repeated the process with amenity=toilets and building=toilets for public restrooms, and amenity=bench and leisure=picnic_table for places to sit.
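Since the same command is repeated for each category, it could be scripted. The following is only a sketch: the tag expressions come from the paragraph above, while the function names and the `<category>.pbf` output naming are assumptions made for illustration. Actually running it requires osmium to be installed and on the PATH.

```python
import subprocess

# Tag expressions per category, taken from the text above.
CATEGORY_TAGS = {
    "drinking_water": ["drinking_water=yes", "amenity=drinking_water"],
    "toilets": ["amenity=toilets", "building=toilets"],
    "benches": ["amenity=bench", "leisure=picnic_table"],
}


def tags_filter_command(input_pbf: str, category: str) -> list[str]:
    """Return the osmium command that extracts one category."""
    return [
        "osmium", "tags-filter", input_pbf,
        *CATEGORY_TAGS[category],
        "-o", f"{category}.pbf",
    ]


def run_all(input_pbf: str) -> None:
    """Run the filter once per category (requires osmium on PATH)."""
    for category in CATEGORY_TAGS:
        subprocess.run(tags_filter_command(input_pbf, category), check=True)


if __name__ == "__main__":
    # Only print the commands; call run_all() to execute them.
    for category in CATEGORY_TAGS:
        print(" ".join(tags_filter_command("switzerland.osm.pbf", category)))
```

Passing the command as a list keeps each tag expression a separate argument, which is how Osmium expects multiple expressions to be given.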

Now our output pbf files (drinking_water.pbf, toilets.pbf and benches.pbf) are each around 1MB, a 99%+ reduction.
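As a quick sanity check on that figure, here is the arithmetic with the rounded sizes mentioned above (a roughly 500MB extract and roughly 1MB outputs). The exact file sizes will differ, so treat the result as approximate.

```python
def percent_reduction(original_mb: float, filtered_mb: float) -> float:
    """Percentage by which the filtered file is smaller than the original."""
    return (1 - filtered_mb / original_mb) * 100


# Approximate sizes from the text: a ~500MB extract and ~1MB outputs.
print(f"{percent_reduction(500, 1):.1f}% smaller")
```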

3. Optimize the data

The data should be optimized in at least two ways:

The amount of data shown at one point shouldn't be overwhelming.
The amount of data the browser needs to download should be minimized.

Given my limited technical ability, and to keep things simple (goal 1), tiling seemed like the best approach. That way, enough data is shown without the browser having to download the whole dataset.
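To make the idea of tiling concrete, here is a small sketch of the standard Web Mercator ("slippy map") tile arithmetic: the world is split into 2^z by 2^z tiles at zoom level z, and the browser only requests the tiles that cover the current view. The function names are hypothetical and the coordinates in the usage example are just a rough location in Zurich.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) index of the Web Mercator tile containing a point."""
    n = 2 ** zoom  # number of tiles per axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_count(zoom: int) -> int:
    """Total number of tiles covering the world at one zoom level."""
    return 4 ** zoom


if __name__ == "__main__":
    # Hypothetical coordinates, roughly central Zurich.
    print(lonlat_to_tile(8.54, 47.37, 14))
    print(tile_count(14))
```

Because the tile count quadruples with every zoom level, a single tile at a high zoom covers a small area and stays small to download.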

Tippecanoe is an open-source CLI tool to tile geospatial datasets. PBF is not a supported input filetype (AFAIK), so the exported PBF needed to be converted into a format that Tippecanoe supports.

Python and GeoPandas seemed like good tools for this task. After some trial and error, I came up with an approach:

Load exported PBF
Loop through "layers" in the PBF (element types, which include points, lines, multilinestrings (not commonly used AFAIK), and multipolygons)
Export layers as individual GeoJSON files (as {layer}.geojson)

import geopandas as gpd

layers = ['points', 'lines', 'multilinestrings', 'multipolygons']

gdf_list = []

# Iterate through the layers and read each one
for layer in layers:
    try:
        # Read the layer from the PBF file
        gdf = gpd.read_file("../data/raw/output/europe_toilets.pbf", engine="pyogrio", layer=layer)

        # Add a new column to indicate the layer
        gdf['layer'] = layer

        # Append the GeoDataFrame to the list
        gdf_list.append(gdf)

        # Report how many features were read from this layer
        print(f"Layer: {layer}, Number of features: {len(gdf)}")

        # Export the layer as a separate .geojson, to later add as tile layers in tippecanoe

        if len(gdf) > 0:
            export = gdf[["osm_id", "geometry"]]
            export.to_file(f"../data/raw/geojsons/{layer}.geojson", driver="GeoJSON")
        else:
            print(f"Nothing to write to {layer}.geojson. Skipping...")
    except Exception as e:
        print(f"Failed to read layer {layer}: {e}")

The full process is written in this Jupyter Notebook, and is repeated for each of the three categories (drinking water, toilets, and benches).

Then, Tippecanoe is used to tile the GeoJSON files. One tile set is created for each data type, although this could probably be a single tile set covering all categories. I experimented with various flags and settings to balance data accuracy against overwhelming users with too many features.

tippecanoe -z14 --drop-densest-as-needed --extend-zooms-if-still-dropping --no-tile-compression --output-to-directory=drinking_water/ raw/geojsons/lines.geojson raw/geojsons/multipolygons.geojson raw/geojsons/points.geojson -B 12

Let's break this down:

-z14 creates tiles up to zoom level 14. Each extra zoom level can quadruple the number of tiles, so higher maximum zooms quickly increase the file size or number of tile files. -z14 adequately showed all data without missing detail in this use case.
--drop-densest-as-needed and --extend-zooms-if-still-dropping help to reduce the file size of the tile set, especially at low zoom levels.
--no-tile-compression is set because apparently GitHub Pages doesn't support compressed tiles.
--output-to-directory gives us a bunch of small files, instead of a single file. I think this is also needed to get this working with GitHub Pages.
-B 12 sets the base zoom to 12, so all data is included from zoom level 12 upward. I was having some issues with not all data appearing at higher zoom levels, and this seemed to fix it.
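Because one tile set is built per category, the command could be generated instead of typed out each time. This is a sketch only: the flags are the ones discussed above, while the function name and the idea of passing the GeoJSON paths in as a list are assumptions for illustration. Running the command requires Tippecanoe to be installed.

```python
def tippecanoe_command(category: str, geojson_paths: list[str]) -> list[str]:
    """Build the Tippecanoe command for one category's tile set."""
    return [
        "tippecanoe",
        "-z14",                              # maximum zoom level
        "-B", "12",                          # base zoom: keep all points from zoom 12 up
        "--drop-densest-as-needed",          # thin out dense areas when tiles get too big
        "--extend-zooms-if-still-dropping",
        "--no-tile-compression",             # write uncompressed tiles
        f"--output-to-directory={category}/",
        *geojson_paths,
    ]


if __name__ == "__main__":
    paths = [
        "raw/geojsons/lines.geojson",
        "raw/geojsons/multipolygons.geojson",
        "raw/geojsons/points.geojson",
    ]
    # Print the command; pass the list to subprocess.run() to execute it.
    print(" ".join(tippecanoe_command("drinking_water", paths)))
```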

4. Show the data on a map

Thinking of goal 1 again, the map is built with JavaScript (no frameworks) using MapLibre. Even more simply, the JS code is written directly in the index.html file.

When the page loads, the map opens with the viewport defaulting to a view of Europe.


All three tile sets (drinking water, toilets and benches) are added as individual sources in MapLibre. Then, each layer from each source (points, lines, multilinestrings and multipolygons) is added as a layer in MapLibre. Layers from the same source share a color, while each geometry type has its own styling, which is the same across all sources.
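The site does this in JavaScript, but the source and layer structure is plain JSON from the MapLibre style specification, so it can be sketched in Python as well. Everything specific here is an assumption for illustration: the tile base URL, the colors, and the source-layer names, which assume Tippecanoe named each layer after its input GeoJSON file.

```python
import json

# Hypothetical base URL and colors; the real site may use different values.
TILE_BASE_URL = "https://example.com/tiles"
SOURCE_COLORS = {
    "drinking_water": "#1f78b4",
    "toilets": "#b15928",
    "benches": "#33a02c",
}
# MapLibre layer type used for each geometry layer in the tiles.
LAYER_TYPES = {
    "points": "circle",
    "lines": "line",
    "multilinestrings": "line",
    "multipolygons": "fill",
}


def build_sources() -> dict:
    """One vector tile source per category."""
    return {
        name: {
            "type": "vector",
            "tiles": [f"{TILE_BASE_URL}/{name}/{{z}}/{{x}}/{{y}}.pbf"],
            "maxzoom": 14,
        }
        for name in SOURCE_COLORS
    }


def build_layers() -> list[dict]:
    """One style layer per (source, geometry layer) pair."""
    layers = []
    for source, color in SOURCE_COLORS.items():
        for source_layer, layer_type in LAYER_TYPES.items():
            layers.append({
                "id": f"{source}-{source_layer}",
                "type": layer_type,
                "source": source,
                "source-layer": source_layer,
                # Color is shared per source; the paint key depends on the type.
                "paint": {f"{layer_type}-color": color},
            })
    return layers


if __name__ == "__main__":
    print(json.dumps({"sources": build_sources(), "layers": build_layers()}, indent=2))
```

Three sources with four geometry layers each give twelve style layers in total, which is why sharing the styling rules across sources keeps the code short.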

Attentive readers might have noticed that the osm_id was kept in the data optimization step: if a point is clicked, its osm_id is shown in a tooltip. It would be better to link to the OSM feature from the tooltip.
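A possible shape for that link, as a sketch: openstreetmap.org addresses elements as `/node/<id>`, `/way/<id>` or `/relation/<id>`, so the element type is needed alongside the ID. The exported data only keeps osm_id, so the type would have to be inferred (for example, assuming features in the points layer are nodes). The function name is hypothetical.

```python
OSM_BASE_URL = "https://www.openstreetmap.org"
VALID_TYPES = ("node", "way", "relation")


def osm_feature_url(element_type: str, osm_id: int) -> str:
    """Build the openstreetmap.org URL for a single OSM element."""
    if element_type not in VALID_TYPES:
        raise ValueError(f"Unknown OSM element type: {element_type!r}")
    return f"{OSM_BASE_URL}/{element_type}/{osm_id}"


if __name__ == "__main__":
    # Hypothetical ID, used only to show the URL format.
    print(osm_feature_url("node", 123))
```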

The basemap is from OpenFreeMap.

MapLibre has a built-in GeolocateControl which lets the map access the user's location if they choose to permit it.

5. Host the site

As alluded to earlier, this site is hosted on GitHub Pages. Every time I commit to main, the site is rebuilt. Tiles are stored on and loaded from GitHub, and the total hosting cost is $0. Easy!

Extras

Going against goal 1, I thought about adding more features. The first was to add toilets and benches (already mentioned), which means the name doesn't really make sense anymore...

I then realized that I'm often looking for water / toilets / benches while doing some sport in the mountains. Adding elevation data to create a 3D map would help determine if the desired feature was uphill or downhill. I added this from Tilezen and sources, but then realized it was going against goal 2 by having to load terrain data, so a terrain toggle was added to reduce initial loading times.

Thoughts and next steps

I think this turned out quite well! It was a great way to learn how to work with large OSM datasets, create map tiles, and build a map using MapLibre. While the scope did increase to include more data types, I think it still sticks to the core vision of quickly seeing where key features are in your proximity. The map and data load in under 4MB (most of that coming from the basemap, from what I can tell). There are no operating costs. Thank you to the open source contributors of the various libraries and datasets used, and to GitHub for supporting free hosting.

This post covers adding data for Switzerland, but data for all of Europe is currently on the map. Adding data globally would be nice, but I think I was pushing the limits of the number of changes in a commit that GitHub appreciates. I'm not sure if I'd run into diff size limitations or timeouts if trying to add more tiles.

Updating the data periodically would be nice as well. Automating the data processing and tiling steps would help make this easier.

For other ideas, please see the Issues section of the repo. Feel free to open an issue yourself!