How to filter OpenStreetmap ways by tag from .osm.pbf file using Python & Osmium

In our previous post Minimal example how to read .osm.pbf file using Python & osmium we investigated how to read a .osm.pbf file and count all nodes, ways and relations.

Today, we’ll investigate how to filter for specific ways by tag using osmium working on .osm.pbf files. In this example, we’ll filter by highwaytrunk

#!/usr/bin/env python3
import osmium as osm
from collections import namedtuple
import itertools

Way = namedtuple("Way", ["id", "nodes", "tags"])

class FindHighwayTrunks(osm.SimpleHandler):
    """
    Find ways with "highway: trunk" tag
    """

    def __init__(self):
        osm.SimpleHandler.__init__(self)
        self.ways = []

    def way(self, way):
        # If this way is a highway: trunk, ...
        if way.tags.get("highway") == "trunk":
            # ... add it to self.ways
            # NOTE: We may not keep a reference to the way object,
            # so we have to create a new Way object
            nodes = [node.ref for node in way.nodes]
            self.ways.append(Way(way.id, nodes, dict(way.tags)))

# Find ways with the given tag
way_finder = FindHighwayTrunks()
way_finder.apply_file("kenya-latest.osm.pbf")

print(f"Found {len(way_finder.ways)} ways")

This example uses kenya-latest.osm.pbf downloaded from Geofabrik, but you can use any .osm.pbf file. Parsing said file takes about 10 seconds on my Desktop.

Note that we can’t just self.ways.append(way) because way is a temporary/internal Protobuf object and may not be used directly since. Hence, we create our own object that just wraps the extracted contents of way: Its id, its list of nodes (node IDs) and a dictionary containing the tags. Otherwise, you’ll see this error message:

Traceback (most recent call last):
  File "run.py", line 32, in <module>
    trunk_handler.apply_file("kenya-latest.osm.pbf")
RuntimeError: Way callback keeps reference to OSM object. This is not allowed.