Reading a Shapefile directly from a ZIP using pyshp
Problem
You have a ZIP file containing a single ESRI shapefile database (i.e. three files: .shp, .dbf and .prj), for example the Natural Earth Countries database. Without unzipping the ZIP, you want to use pyshp to read the data contained in the shapefile.
Solution
This solution currently only works with the ulikoehler-patch-2 branch of my pyshp fork. I’ve submitted a pull request and I’ll update this post once it has been merged & released.
Install the correct version (plus UliEngineering which we will use for a simpler solution later) using
pip3 install https://github.com/ulikoehler/pyshp/archive/ulikoehler-patch-2.zip --upgrade
pip3 install UliEngineering --upgrade
As it turns out, it’s more complex (in terms of lines of code) to find the prefix of the shapefile than to actually read the shapefile. Luckily, UliEngineering provides find_datasets_by_extension() and read_from_zip() to make this process more approachable.
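To illustrate what "finding the prefix" involves, here is a rough standard-library sketch of the same idea. It is not UliEngineering's actual implementation; the names find_datasets() and read_members() are made up for this example, and I'm assuming that all files of a dataset share one prefix and differ only in their extension.

```python
import io
import os.path
import zipfile


def find_datasets(names, extensions=(".shp", ".dbf", ".prj")):
    """Group ZIP member names by prefix and keep only the complete datasets.

    Hypothetical helper: returns one list of member names per prefix,
    ordered like `extensions`.
    """
    by_prefix = {}
    for name in names:
        prefix, ext = os.path.splitext(name)
        # Compare extensions case-insensitively but remember the original name
        by_prefix.setdefault(prefix, {})[ext.lower()] = name
    return [
        [members[ext] for ext in extensions]
        for _prefix, members in sorted(by_prefix.items())
        if all(ext in members for ext in extensions)
    ]


def read_members(zip_path, member_names):
    """Copy the given ZIP members into in-memory file-like objects."""
    with zipfile.ZipFile(zip_path) as archive:
        return [io.BytesIO(archive.read(name)) for name in member_names]
```

Calling find_datasets(zipfile.ZipFile(path).namelist()) yields the candidate datasets, and read_members(path, candidates[0]) gives file-like objects in the same order as the extensions.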
from UliEngineering.Utils.Files import *
from UliEngineering.Utils.ZIP import *
import shapefile
# List files inside ZIP
zipcontents = list(list_zip("ne_110m_admin_0_countries.zip"))
# Find one filename that is present with ".shp", ".dbf" and ".prj" extensions
dataset_filenames = list(find_datasets_by_extension(zipcontents, (".shp", ".dbf", ".prj")))
# Read the files (copy to memory)
dataset = read_from_zip("ne_110m_admin_0_countries.zip", dataset_filenames[0])
# Read shapefile format
reader = shapefile.Reader(shp=dataset[0], dbf=dataset[1], prj=dataset[2])
# Do something useful with the reader
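As one possible "something useful", the following sketch collects the attribute field names and the first few records from the reader. summarize_reader() is a name invented for this example; it relies only on pyshp's reader.fields and reader.iterRecords(), and assumes the first entry of reader.fields is pyshp's DeletionFlag pseudo-field.

```python
def summarize_reader(reader, limit=3):
    """Return (field_names, rows) for the first `limit` records.

    Hypothetical helper for illustration; `reader` is expected to behave
    like a pyshp shapefile.Reader.
    """
    # Skip the leading DeletionFlag entry that pyshp puts into reader.fields
    field_names = [field[0] for field in reader.fields[1:]]
    rows = []
    for index, record in enumerate(reader.iterRecords()):
        if index >= limit:
            break
        # Pair each attribute value with its field name
        rows.append(dict(zip(field_names, record)))
    return field_names, rows
```

With the reader from above, field_names, rows = summarize_reader(reader) gives you a quick look at which attributes the dataset contains before you process the geometries.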