Reading a Shapefile directly from a ZIP using pyshp

Problem

You have a ZIP file containing a single ESRI shapefile database (i.e. three files: .shp, .dbf and .prj), for example the Natural Earth Countries database. Without unzipping the ZIP, you want to use pyshp to read the data contained in the shapefile.

Solution

This solution currently only works with the correct branch of my fork. I’ve submitted a pull request and I’ll update this post once it has been merged & released.

Install the correct version (plus UliEngineering which we will use for a simpler solution later) using

pip3 install https://github.com/ulikoehler/pyshp/archive/ulikoehler-patch-2.zip --upgrade

pip3 install git+https://github.com/ulikoehler/UliEngineering.git --upgrade

As it turns out, it’s more complex (in terms of SLOC) to find the prefix of the shapefile than to actually read the shapefile. Luckily, UliEngineering provides utility functions for this:

from UliEngineering.Utils.Files import *
from UliEngineering.Utils.ZIP import *
import shapefile

# List files inside ZIP
zipcontents = list(list_zip("ne_110m_admin_0_countries.zip"))
# Find one filename that is present with ".shp", ".dbf" and ".prj" extensions
dataset_filenames = list(find_datasets_by_extension(zipcontents, (".shp", ".dbf", ".prj")))
# Read the files (copy to memory)
dataset = read_from_zip("ne_110m_admin_0_countries.zip", dataset_filenames[0])
# Read shapefile format
reader = shapefile.Reader(shp=dataset[0], dbf=dataset[1], prj=dataset[2])
# Do something useful with the reader
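
To illustrate why finding the prefix takes more code than reading the shapefile, here is a minimal sketch of the same lookup using only the Python standard library. It is not the UliEngineering implementation: the function names find_shapefile_prefixes and read_members are hypothetical and were made up for this example, and the sketch assumes that all files of a dataset share the same name apart from the extension.

```python
import io
import posixpath
import zipfile


def find_shapefile_prefixes(names, extensions=(".shp", ".dbf", ".prj")):
    """Map each prefix that exists with ALL given extensions to its member names.

    Hypothetical helper for illustration, not part of UliEngineering or pyshp.
    """
    by_prefix = {}
    for name in names:
        # ZIP member names always use forward slashes, hence posixpath
        prefix, ext = posixpath.splitext(name)
        by_prefix.setdefault(prefix, {})[ext.lower()] = name
    return {
        prefix: [found[ext] for ext in extensions]
        for prefix, found in by_prefix.items()
        if all(ext in found for ext in extensions)
    }


def read_members(zip_source, member_names):
    """Copy the given ZIP members into in-memory file-like objects."""
    with zipfile.ZipFile(zip_source) as archive:
        return [io.BytesIO(archive.read(name)) for name in member_names]


if __name__ == "__main__":
    # Same archive name as in the post; assumed to be in the current directory
    zip_path = "ne_110m_admin_0_countries.zip"
    with zipfile.ZipFile(zip_path) as archive:
        datasets = find_shapefile_prefixes(archive.namelist())
    for prefix, members in datasets.items():
        print(prefix, members)
```

The list returned by read_members is in the order of the extensions tuple (.shp, .dbf, .prj), so its elements can be passed to shapefile.Reader in the same way as dataset[0], dataset[1] and dataset[2] above.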