Programming languages

How to convert collections.Counter to pandas DataFrame

Pandas can convert a Counter to a DataFrame by itself, but you need to add a column label:

pd.DataFrame({"YourColumnLabelGoesHere": counterObject})

Full example

import pandas as pd
from collections import Counter

ctr = Counter()
ctr["a"] += 1
ctr["b"] += 1
ctr["a"] += 1
ctr["a"] += 1
ctr["b"] += 1
ctr["a"] += 1
ctr["c"] += 1

pd.DataFrame({"ctr": ctr})

This will result in the following DataFrame:

   ctr
a    4
b    2
c    1

Posted by Uli Köhler in pandas, Python

structlog minimal example

import structlog

logger = structlog.get_logger()

# Usage example
logger.info("Test log")

 

Posted by Uli Köhler in Python

How to fix matplotlib OSError: ‘xkcd’ not found in the style library

Problem:

While trying to enable the matplotlib xkcd style using

plt.style.use("xkcd")

you see the following error message:

OSError: 'xkcd' not found in the style library and input is not a valid URL or path; see `style.available` for list of available styles

Solution:

You can’t enable xkcd-style plots by running plt.style.use("xkcd"). Instead, use with plt.xkcd():

with plt.xkcd():
    # TODO your plotting code goes here!
    # plt.plot(x, y) # Example

 

Posted by Uli Köhler in Python

Recommended library for executing shell commands in Python

I recommend using invoke instead of the built-in subprocess to handle executing any shell command in Python.

Not only does invoke’s run() provide a more user-friendly syntax compared to e.g. subprocess.check_output():

run('make')

but it also tends to behave more like you’d expect, especially regarding the output of the command, and has easy-to-use parameters such as hide=True to hide the output of shell commands.

Furthermore, it provides a bunch of really useful features such as automatically responding to prompts from the shell command.
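
For comparison, here is a minimal sketch of what capturing a command's output looks like with the built-in subprocess module. The echo command and the variable names are only illustrative assumptions. The comment at the end shows the roughly equivalent invoke call, which is not executed here.

```python
import subprocess

# Built-in approach: capture stdout instead of printing it to the terminal.
# check=True raises CalledProcessError on a non-zero exit code.
result = subprocess.run(
    ["echo", "hello"],
    capture_output=True,
    text=True,
    check=True,
)
output = result.stdout.strip()
print(output)

# Roughly equivalent with invoke (assumes `pip install invoke`):
#   from invoke import run
#   output = run("echo hello", hide=True).stdout.strip()
```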

Posted by Uli Köhler in Python

How to test if MongoDB database exists on command line (bash)

Use this command to test if a given MongoDB database exists:

mongo --quiet --eval 'db.getMongo().getDBNames().indexOf("mydb")'

This will return an index such as 0 or 241 if the database is found. On the other hand, it will return -1 if the database does not exist.

docker-compose version:

docker-compose exec mongodb mongo --quiet --eval 'db.getMongo().getDBNames().indexOf("mydb")'

where mongodb is the name of the MongoDB service in your docker-compose configuration.

Now we can put it together in a bash script to test if the database exists:

# Query if DB exists in MongoDB
mongo_indexof_db=$(mongo --quiet --eval 'db.getMongo().getDBNames().indexOf("mydb")')
if [ "$mongo_indexof_db" -ne -1 ]; then
    echo "MongoDB database exists"
else
    echo "MongoDB database does not exist"
fi

 

docker-compose variant:

# Query if DB exists in MongoDB
mongo_indexof_db=$(docker-compose -f inspect.yml exec -T mongodb mongo --quiet --eval 'db.getMongo().getDBNames().indexOf("mydb")')
if [ "$mongo_indexof_db" -ne -1 ]; then
    echo "MongoDB database exists"
else
    echo "MongoDB database does not exist"
fi
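
If you want to run the same check from Python instead of bash, the following is a minimal sketch. It assumes the mongo shell is available on your PATH, just like the commands above. The function names are hypothetical and only chosen for illustration.

```python
import json
import subprocess


def index_output_means_exists(output: str) -> bool:
    """Interpret the printed indexOf() result: -1 means the database was not found."""
    return int(output.strip()) != -1


def mongo_database_exists(name: str) -> bool:
    """Run the same query as the bash script (assumes `mongo` is on PATH)."""
    # json.dumps() quotes the name so it is a valid JavaScript string literal
    script = f"db.getMongo().getDBNames().indexOf({json.dumps(name)})"
    result = subprocess.run(
        ["mongo", "--quiet", "--eval", script],
        capture_output=True,
        text=True,
        check=True,
    )
    return index_output_means_exists(result.stdout)


# The parsing step can be tried without a running MongoDB:
print(index_output_means_exists("241\n"))
print(index_output_means_exists("-1\n"))
```

Call mongo_database_exists("mydb") on a machine where the mongo command from above works.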

 

Posted by Uli Köhler in MongoDB, Shell

How to fix Python pyarrow pip install error: Could NOT find Arrow (missing: Arrow_DIR)

Problem:

When trying to install pyarrow, for example using

pip install pyarrow

you see an error log like

      -- Found Python3Alt: /home/uli/.pypy3-virtualenv/bin/pypy3
      CMake Warning (dev) at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
        The package name passed to `find_package_handle_standard_args` (PkgConfig)
        does not match the name of the calling package (Arrow).  This can lead to
        problems in calling code that expects `find_package` result variables
        (e.g., `_FOUND`) to follow a certain pattern.
      Call Stack (most recent call first):
        /usr/share/cmake-3.18/Modules/FindPkgConfig.cmake:59 (find_package_handle_standard_args)
        cmake_modules/FindArrow.cmake:39 (include)
        cmake_modules/FindArrowPython.cmake:46 (find_package)
        CMakeLists.txt:229 (find_package)
      This warning is for project developers.  Use -Wno-dev to suppress it.
      
      -- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.2")
      -- Could NOT find Arrow (missing: Arrow_DIR)
      -- Checking for module 'arrow'
      --   No package 'arrow' found
      CMake Error at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:165 (message):
        Could NOT find Arrow (missing: ARROW_INCLUDE_DIR ARROW_LIB_DIR
        ARROW_FULL_SO_VERSION ARROW_SO_VERSION)
      Call Stack (most recent call first):
        /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:458 (_FPHSA_FAILURE_MESSAGE)
        cmake_modules/FindArrow.cmake:450 (find_package_handle_standard_args)
        cmake_modules/FindArrowPython.cmake:46 (find_package)
        CMakeLists.txt:229 (find_package)
      
      
      -- Configuring incomplete, errors occurred!
      See also "/tmp/pip-install-409dctif/pyarrow_b70cde6894c3469483f7360493fc2e65/build/temp.linux-x86_64-pypy39/CMakeFiles/CMakeOutput.log".
      error: command '/usr/bin/cmake' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pyarrow
Failed to build pyarrow
ERROR: Could not build wheels for pyarrow, which is required to install pyproject.toml-based projects

Solution:

pip compiles pyarrow from source if no prebuilt wheel is available for your interpreter (PyPy in the log above), and this requires the Apache Arrow C++ libraries to be installed. On Ubuntu, this can be done using

sudo apt install -y -V ca-certificates lsb-release wget
wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb -O /tmp/apache-arrow.deb
sudo apt -y install /tmp/apache-arrow.deb
sudo apt -y update
sudo apt -y install libarrow-dev libarrow-python-dev

 

Posted by Uli Köhler in Python

How to fix Python Pillow pip install exception: RequiredDependencyException: jpeg

Problem:

When trying to install Pillow, for example using

pip install Pillow

you see an error log like

      running build_ext
      
      
      The headers or library files could not be found for jpeg,
      a required dependency when compiling Pillow from source.
      
      Please see the install instructions at:
         https://pillow.readthedocs.io/en/latest/installation.html
      
      Traceback (most recent call last):
        File "/tmp/pip-install-_g5fa7ox/pillow_7cb18c0d6bec468e8844184b98c8bf45/setup.py", line 989, in <module>
          setup(
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/__init__.py", line 87, in setup
          return distutils.core.setup(**attrs)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/core.py", line 148, in setup
          return run_commands(dist)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/core.py", line 163, in run_commands
          dist.run_commands()
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/dist.py", line 967, in run_commands
          self.run_command(cmd)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/dist.py", line 1214, in run_command
          super().run_command(command)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
          cmd_obj.run()
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/command/install.py", line 68, in run
          return orig.install.run(self)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/command/install.py", line 670, in run
          self.run_command('build')
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
          self.distribution.run_command(command)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/dist.py", line 1214, in run_command
          super().run_command(command)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
          cmd_obj.run()
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/command/build.py", line 136, in run
          self.run_command(cmd_name)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
          self.distribution.run_command(command)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/dist.py", line 1214, in run_command
          super().run_command(command)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
          cmd_obj.run()
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/command/build_ext.py", line 79, in run
          _build_ext.run(self)
        File "/home/uli/.pypy3-virtualenv/lib/pypy3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 339, in run
          self.build_extensions()
        File "/tmp/pip-install-_g5fa7ox/pillow_7cb18c0d6bec468e8844184b98c8bf45/setup.py", line 804, in build_extensions
          raise RequiredDependencyException(f)
      RequiredDependencyException: jpeg
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-_g5fa7ox/pillow_7cb18c0d6bec468e8844184b98c8bf45/setup.py", line 1009, in <module>
          raise RequiredDependencyException(msg)
      RequiredDependencyException:
      
      The headers or library files could not be found for jpeg,
      a required dependency when compiling Pillow from source.
      
      Please see the install instructions at:
         https://pillow.readthedocs.io/en/latest/installation.html
      
      
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> Pillow

Solution:

Pillow needs a bunch of libraries to be installed in order to be compiled from source. Use the following command from the official Pillow website on Ubuntu:

sudo apt-get install cmake libtiff5-dev libjpeg8-dev libopenjp2-7-dev zlib1g-dev libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk libharfbuzz-dev libfribidi-dev libxcb1-dev

or check out the installation guide for commands for other operating systems.

Posted by Uli Köhler in Python

How to fix Python MongoDB TypeError: Object of type ObjectId is not JSON serializable

Problem:

When trying to export data as JSON that has originally been queried from MongoDB using code like

with open("alle.json", "w") as outfile:
    json.dump(alle, outfile)

you see the following error message:

File /usr/lib/python3.9/json/__init__.py:179, in dump(obj, fp, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    173     iterable = cls(skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    174         check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    175         separators=separators,
    176         default=default, sort_keys=sort_keys, **kw).iterencode(obj)
    177 # could accelerate with writelines in some versions of Python, at
    178 # a debuggability cost
--> 179 for chunk in iterable:
    180     fp.write(chunk)

File /usr/lib/python3.9/json/encoder.py:429, in _make_iterencode.<locals>._iterencode(o, _current_indent_level)
    427     yield _floatstr(o)
    428 elif isinstance(o, (list, tuple)):
--> 429     yield from _iterencode_list(o, _current_indent_level)
    430 elif isinstance(o, dict):
    431     yield from _iterencode_dict(o, _current_indent_level)

File /usr/lib/python3.9/json/encoder.py:325, in _make_iterencode.<locals>._iterencode_list(lst, _current_indent_level)
    323         else:
    324             chunks = _iterencode(value, _current_indent_level)
--> 325         yield from chunks
    326 if newline_indent is not None:
    327     _current_indent_level -= 1

File /usr/lib/python3.9/json/encoder.py:405, in _make_iterencode.<locals>._iterencode_dict(dct, _current_indent_level)
    403         else:
    404             chunks = _iterencode(value, _current_indent_level)
--> 405         yield from chunks
    406 if newline_indent is not None:
    407     _current_indent_level -= 1

File /usr/lib/python3.9/json/encoder.py:438, in _make_iterencode.<locals>._iterencode(o, _current_indent_level)
    436         raise ValueError("Circular reference detected")
    437     markers[markerid] = o
--> 438 o = _default(o)
    439 yield from _iterencode(o, _current_indent_level)
    440 if markers is not None:

File /usr/lib/python3.9/json/encoder.py:179, in JSONEncoder.default(self, o)
    160 def default(self, o):
    161     """Implement this method in a subclass such that it returns
    162     a serializable object for ``o``, or calls the base implementation
    163     (to raise a ``TypeError``).
   (...)
    177 
    178     """
--> 179     raise TypeError(f'Object of type {o.__class__.__name__} '
    180                     f'is not JSON serializable')

TypeError: Object of type ObjectId is not JSON serializable

Solution:

This error occurs because documents queried using PyMongo contain an _id field of type ObjectId by default, and the standard json library (or drop-in replacements like simplejson) does not know how to create JSON representations of objects of type ObjectId.

In order to fix this, use pymongo’s json_util instead of json. Note that the bson.json_util module contains dumps but does not contain dump, so use the following snippet to write to a file:

 

import bson.json_util as json_util

with open("alle.json", "w") as outfile:
    outfile.write(json_util.dumps(alle))
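
If you would rather keep using the standard json module, its default= hook is a possible alternative: json calls it for every object it cannot serialize by itself. The following is a minimal sketch of that technique, not the json_util approach shown above. FakeObjectId is a hypothetical stand-in so the example runs without pymongo. A real ObjectId also converts to its hex string via str(). Unlike json_util, this writes plain strings, so the information that the value was an ObjectId is lost.

```python
import io
import json


class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId, so the sketch runs without pymongo."""

    def __init__(self, hex_string):
        self.hex_string = hex_string

    def __str__(self):
        return self.hex_string


alle = [{"_id": FakeObjectId("507f1f77bcf86cd799439011"), "name": "example"}]

# default=str is called for every object json cannot serialize by itself
outfile = io.StringIO()  # in-memory stand-in for open("alle.json", "w")
json.dump(alle, outfile, default=str)
print(outfile.getvalue())
```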

 

Posted by Uli Köhler in MongoDB, Python

How to fix RcppGSL installation error gsl-config: Command not found

Problem:

While trying to install RcppGSL using

BiocManager::install("RcppGSL")

you see the following error message:

checking for gcc option to accept ISO C89... none needed
checking for gsl-config... no
configure: error: gsl-config not found, is GSL installed?
ERROR: configuration failed for package ‘RcppGSL’
* removing ‘/usr/local/lib/R/site-library/RcppGSL’

The downloaded source packages are in
        ‘/tmp/RtmpqSzFab/downloaded_packages’
Warning message:
In .inet_warning(msg) :
  installation of package ‘RcppGSL’ had non-zero exit status

Solution:

You need to install the GSL development package (libgsl-dev), which includes the gsl-config executable.

On Ubuntu, you can install it using

sudo apt -y install libgsl-dev

 

Posted by Uli Köhler in R

How to iterate all databases in PyMongo

This short example shows how to iterate all databases or list all database names for a MongoDB in Python using pymongo:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost")

for database in client.list_databases():
    print(database['name'])


Posted by Uli Köhler in Python

How to start Jupyter Lab for remote access

This starts Jupyter Lab listening on all network interfaces (i.e. bound to all IP addresses), so you can access it in the browser not only from localhost but from any remote host that has network access to the machine where you’re running Jupyter:

jupyter lab --ip=0.0.0.0

 

Posted by Uli Köhler in Networking, Python

How to fix R package installation fatal error: lzma.h: No such file or directory

Problem:

While installing some R package, you see an error message like

cram/cram_io.c:61:10: fatal error: lzma.h: No such file or directory
   61 | #include <lzma.h>
      |          ^~~~~~~~
compilation terminated.

Solution:

You need to install the liblzma development headers. On Ubuntu, you can do that using

sudo apt -y install liblzma-dev

 

Posted by Uli Köhler in R

How to fix R package installation fatal error: bzlib.h: No such file or directory

Problem:

While installing some R package, you see an error message like

cram/cram_io.c:57:10: fatal error: bzlib.h: No such file or directory
   57 | #include <bzlib.h>
      |          ^~~~~~~~~
compilation terminated.

Solution:

You need to install the libbz2 development headers. On Ubuntu, you can do that using

sudo apt -y install libbz2-dev

 

Posted by Uli Köhler in R

How to fix Jupyter Lab ImportError: cannot import name ‘soft_unicode’ from ‘markupsafe’

Problem:

When running

jupyter lab

you see the following error message:

Traceback (most recent call last):
  File "/usr/local/bin/jupyter-lab", line 5, in <module>
    from jupyterlab.labapp import main
  File "/usr/local/lib/python3.8/dist-packages/jupyterlab/labapp.py", line 13, in <module>
    from jupyter_server.serverapp import flags
  File "/usr/local/lib/python3.8/dist-packages/jupyter_server/serverapp.py", line 39, in <module>
    from jinja2 import Environment, FileSystemLoader
  File "/usr/lib/python3/dist-packages/jinja2/__init__.py", line 33, in <module>
    from jinja2.environment import Environment, Template
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 15, in <module>
    from jinja2 import nodes
  File "/usr/lib/python3/dist-packages/jinja2/nodes.py", line 23, in <module>
    from jinja2.utils import Markup
  File "/usr/lib/python3/dist-packages/jinja2/utils.py", line 656, in <module>
    from markupsafe import Markup, escape, soft_unicode
ImportError: cannot import name 'soft_unicode' from 'markupsafe' (/usr/local/lib/python3.8/dist-packages/markupsafe/__init__.py)

Solution:

soft_unicode was removed in markupsafe 2.1.0, but older versions of jinja2 still import it. You need to install an older version of markupsafe using

sudo pip3 install markupsafe==2.0.1

until other packages have been updated.
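
To check whether your environment is affected, a small sketch like the following can help. It only uses the standard library (Python 3.8 or newer) and tries the same import that fails in the traceback above. The function name markupsafe_status is a hypothetical name chosen for this example.

```python
from importlib import metadata


def markupsafe_status():
    """Return (installed_version, has_soft_unicode); version is None if not installed."""
    try:
        version = metadata.version("markupsafe")
    except metadata.PackageNotFoundError:
        return None, False
    try:
        # Same import that jinja2/utils.py attempts in the traceback
        from markupsafe import soft_unicode  # noqa: F401
        return version, True
    except ImportError:
        return version, False


version, has_soft_unicode = markupsafe_status()
print(f"markupsafe version: {version}, soft_unicode importable: {has_soft_unicode}")
```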

Posted by Uli Köhler in Python

How to fix R package installation /bin/bash: gfortran: command not found

Problem:

While installing some R package, you see an error message like

/bin/bash: gfortran: command not found

Solution:

You need to install the gfortran compiler. On Ubuntu, you can do that using

sudo apt -y install gfortran

 

Posted by Uli Köhler in R

How to fix R package installation /usr/bin/ld: cannot find -lblas

Problem:

While installing some R package, you see an error message like

/usr/bin/ld: cannot find -lblas

Solution:

You need to install the libblas development headers. On Ubuntu, you can do that using

sudo apt -y install libblas-dev

 

Posted by Uli Köhler in R

How to fix R package installation /usr/bin/ld: cannot find -llapack

Problem:

While installing some R package, you see an error message like

/usr/bin/ld: cannot find -llapack

Solution:

You need to install the liblapack development headers. On Ubuntu, you can do that using

sudo apt -y install liblapack-dev

 

Posted by Uli Köhler in R

How to fix R package installation fatal error: png.h: No such file or directory

Problem:

While installing some R package, you see an error message like

/bin/bash: libpng-config: command not found
read.c:3:10: fatal error: png.h: No such file or directory
    3 | #include <png.h>
      |          ^~~~~~~
compilation terminated.
make: *** [/usr/lib/R/etc/Makeconf:168: read.o] Error 1

Solution:

You need to install the libpng development headers. On Ubuntu, you can do that using

sudo apt -y install libpng-dev

 

Posted by Uli Köhler in R

How to fix R package installation fatal error: zlib.h: No such file or directory

Problem:

While installing some R package, you see an error message like

io_utils.c:16:10: fatal error: zlib.h: No such file or directory
   16 | #include <zlib.h>
      |          ^~~~~~~~
compilation terminated.
make: *** [/usr/lib/R/etc/Makeconf:168: io_utils.o] Error 1

Solution:

You need to install the zlib development headers. On Ubuntu, you can do that using

sudo apt -y install zlib1g-dev

 

Posted by Uli Köhler in R

How to fix R openssl error fatal error: openssl/opensslv.h: File or directory not found

Problem:

When installing the R openssl package using

BiocManager::install("openssl")

or any package depending on the openssl package, you see an error message like

-------------------------- [ERROR MESSAGE] ---------------------------
tools/version.c:1:10: fatal error: openssl/opensslv.h: File or directory not found
    1 | #include <openssl/opensslv.h>
      |          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
--------------------------------------------------------------------
ERROR: configuration failed for package ‘openssl’
* removing ‘/home/uli/R/x86_64-pc-linux-gnu-library/4.0/openssl’

Solution:

You need to install the OpenSSL development headers. On Ubuntu, install them using

sudo apt -y install libssl-dev

For other systems, see the description just above the error message by the authors of the package:

--------------------------- [ANTICONF] --------------------------------
Configuration failed because openssl was not found. Try installing:
 * deb: libssl-dev (Debian, Ubuntu, etc)
 * rpm: openssl-devel (Fedora, CentOS, RHEL)
 * csw: libssl_dev (Solaris)
 * brew: openssl@1.1 (Mac OSX)
If openssl is already installed, check that 'pkg-config' is in your
PATH and PKG_CONFIG_PATH contains a openssl.pc file. If pkg-config
is unavailable you can set INCLUDE_DIR and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
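
The R build errors covered in the posts above all follow the same pattern: a missing header, library or compiler maps to one Ubuntu package. As a rough sketch, this mapping can be collected in a small lookup table. The package names are taken from the posts above, while the names UBUNTU_PACKAGE_FOR_ERROR and suggest_packages are hypothetical and only chosen for illustration.

```python
# Error fragment -> Ubuntu package, collected from the posts above
UBUNTU_PACKAGE_FOR_ERROR = {
    "fatal error: lzma.h:": "liblzma-dev",
    "fatal error: bzlib.h:": "libbz2-dev",
    "gfortran: command not found": "gfortran",
    "cannot find -lblas": "libblas-dev",
    "cannot find -llapack": "liblapack-dev",
    "fatal error: png.h:": "libpng-dev",
    "fatal error: zlib.h:": "zlib1g-dev",
    "fatal error: openssl/opensslv.h:": "libssl-dev",
}


def suggest_packages(build_log: str) -> list:
    """Return the packages whose error fragment occurs in the build log."""
    return [
        package
        for fragment, package in UBUNTU_PACKAGE_FOR_ERROR.items()
        if fragment in build_log
    ]


log = "/usr/bin/ld: cannot find -lblas\n/bin/bash: gfortran: command not found"
print("sudo apt -y install " + " ".join(suggest_packages(log)))
```
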

Posted by Uli Köhler in R