How to fix Python pyarrow pip install error: Could NOT find Arrow (missing: Arrow_DIR)
Problem:
When trying to install pyarrow, for example using
pip install pyarrow
you see an error log like
-- Found Python3Alt: /home/uli/.pypy3-virtualenv/bin/pypy3
CMake Warning (dev) at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
The package name passed to `find_package_handle_standard_args` (PkgConfig)
does not match the name of the calling package (Arrow). This can lead to
problems in calling code that expects `find_package` result variables
(e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
/usr/share/cmake-3.18/Modules/FindPkgConfig.cmake:59 (find_package_handle_standard_args)
cmake_modules/FindArrow.cmake:39 (include)
cmake_modules/FindArrowPython.cmake:46 (find_package)
CMakeLists.txt:229 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.2")
-- Could NOT find Arrow (missing: Arrow_DIR)
-- Checking for module 'arrow'
-- No package 'arrow' found
CMake Error at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:165 (message):
Could NOT find Arrow (missing: ARROW_INCLUDE_DIR ARROW_LIB_DIR
ARROW_FULL_SO_VERSION ARROW_SO_VERSION)
Call Stack (most recent call first):
/usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:458 (_FPHSA_FAILURE_MESSAGE)
cmake_modules/FindArrow.cmake:450 (find_package_handle_standard_args)
cmake_modules/FindArrowPython.cmake:46 (find_package)
CMakeLists.txt:229 (find_package)
-- Configuring incomplete, errors occurred!
See also "/tmp/pip-install-409dctif/pyarrow_b70cde6894c3469483f7360493fc2e65/build/temp.linux-x86_64-pypy39/CMakeFiles/CMakeOutput.log".
error: command '/usr/bin/cmake' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pyarrow
Failed to build pyarrow
ERROR: Could not build wheels for pyarrow, which is required to install pyproject.toml-based projects
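Solution:
pip is compiling pyarrow from source here, and the build fails because neither CMake nor pkg-config can find the Arrow C++ library. You need to install the Arrow development packages before pyarrow can be compiled. On Ubuntu, this can be done using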
sudo apt install -y -V ca-certificates lsb-release wget
wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb -O /tmp/apache-arrow.deb
sudo apt -y install /tmp/apache-arrow.deb
sudo apt -y update
sudo apt -y install libarrow-dev libarrow-python-dev
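Before retrying the pip install, it can help to confirm that the build will now be able to locate Arrow. The log above shows the build asking pkg-config for the module 'arrow', so one way to check is to ask pkg-config the same question. The following is a minimal sketch: the function name arrow_status is made up for illustration, and it assumes the installed packages register a pkg-config module named arrow, as the log suggests.

```shell
#!/bin/sh
# Hypothetical helper: report whether pkg-config can locate the Arrow
# C++ development files (the same lookup that failed in the build log).
arrow_status() {
    if ! command -v pkg-config >/dev/null 2>&1; then
        # pkg-config itself is missing, so nothing can be checked
        echo "pkg-config not installed"
    elif pkg-config --exists arrow; then
        # Module found: print the version pkg-config reports for it
        echo "arrow found: $(pkg-config --modversion arrow)"
    else
        echo "arrow not found"
    fi
}

arrow_status
```

If this prints a line starting with "arrow found:", running pip install pyarrow again should get past the "Could NOT find Arrow" step. If it prints "arrow not found", the development packages are probably not installed correctly yet.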