Technologies

How to build & upload a Dockerized application to Google Container Registry in 5 minutes

This post provides a quick example of how to build & upload your application to the private Google Container Registry. We assume you have already set up your Google Cloud project and installed Docker. In this example, we'll build & upload pseudo-perseus v1.0. Since this is a NodeJS-based application, we also assume that you have installed a recent version of NodeJS and NPM (see our previous article on how to do that on Ubuntu).

First we configure docker to be able to authenticate to Google:

gcloud auth configure-docker

Now we can check out the repository and install the NPM packages:

git clone https://github.com/ulikoehler/pseudo-perseus.git
cd pseudo-perseus
git checkout v1.0
npm install

Now we can build the local Docker image (we directly name it so that it can be uploaded to the Google Container Registry; be sure to use the correct Google Cloud project ID!):

docker build -t eu.gcr.io/myproject-123456/pseudo-perseus:v1.0 .

The next step is to upload the image:

docker push eu.gcr.io/myproject-123456/pseudo-perseus:v1.0
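
Optionally, you can verify that the image has arrived in the registry by listing the images and tags for your project (a quick check; myproject-123456 is our example project ID):

gcloud container images list --repository=eu.gcr.io/myproject-123456
gcloud container images list-tags eu.gcr.io/myproject-123456/pseudo-perseus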

For reference see the official Container Registry documentation.

Posted by Uli Köhler in Cloud, Container, Docker

Fixing gcloud WARNING: `docker-credential-gcloud` not in system PATH

Problem:

You want to configure docker to be able to access Google Container Registry using

gcloud auth configure-docker

but you see this warning message:

WARNING: `docker-credential-gcloud` not in system PATH.
gcloud's Docker credential helper can be configured but it will not work until this is corrected.
gcloud credential helpers already registered correctly.

Solution:

Install docker-credential-gcr using

sudo gcloud components install docker-credential-gcr

In case you see this error message:

ERROR: (gcloud.components.install) You cannot perform this action because this Cloud SDK installation is managed by an external package manager.
Please consider using a separate installation of the Cloud SDK created through the default mechanism described at: https://cloud.google.com/sdk/

use this alternate installation command instead (this command is for Linux, see the official documentation for other operating systems):

VERSION=1.5.0
OS=linux
ARCH=amd64

curl -fsSL "https://github.com/GoogleCloudPlatform/docker-credential-gcr/releases/download/v${VERSION}/docker-credential-gcr_${OS}_${ARCH}-${VERSION}.tar.gz" \
  | tar xz --to-stdout ./docker-credential-gcr \
  | sudo tee /usr/bin/docker-credential-gcr > /dev/null && sudo chmod +x /usr/bin/docker-credential-gcr

After that, configure docker using

docker-credential-gcr configure-docker

Now you can retry running your original command.
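
You can verify that the credential helper is now found and registered (a quick check; configure-docker writes a credHelpers entry to ~/.docker/config.json):

which docker-credential-gcr
cat ~/.docker/config.json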

For reference, see the official documentation.

Posted by Uli Köhler in Cloud, Container, Docker, Linux

How to fix kubectl ‘The connection to the server localhost:8080 was refused – did you specify the right host or port?’

Problem:

You want to configure a Kubernetes service with kubectl using a command like

kubectl patch service/"my-elasticsearch-svc" --namespace "default" --patch '{"spec": {"type": "LoadBalancer"}}'

but you only see this error message:

The connection to the server localhost:8080 was refused - did you specify the right host or port?

Solution:

kubectl does not have the correct credentials to access the cluster.

Add the correct credentials to the kubectl config using

gcloud container clusters get-credentials [cluster name] --zone [cluster zone]

e.g.

gcloud container clusters get-credentials cluster-1 --zone europe-west3-c

After that, retry your original command.

In case you don’t know your cluster name or zone, use

gcloud container clusters list

to display the cluster metadata.
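
You can verify that kubectl is now pointed at the right cluster (a quick sanity check):

kubectl config current-context
kubectl get nodes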

Credits to this StackOverflow answer for the original solution.

Posted by Uli Köhler in Allgemein, Cloud, Container, Kubernetes

ElasticSearch equivalent to MongoDB .distinct(…)

Let’s say we have an ElasticSearch index called strings with a field pattern of {"type": "keyword"}.

Now we want to do the equivalent of MongoDB db.getCollection('...').distinct('pattern'):

Solution:

In Python you can use the iterate_distinct_field() helper from this previous post on ElasticSearch distinct. Full example:

from elasticsearch import Elasticsearch

es = Elasticsearch()

def iterate_distinct_field(es, fieldname, pagesize=250, **kwargs):
    """
    Helper to get all distinct values from ElasticSearch
    (ordered by number of occurrences)
    """
    compositeQuery = {
        "size": pagesize,
        "sources": [{
                fieldname: {
                    "terms": {
                        "field": fieldname
                    }
                }
            }
        ]
    }
    # Iterate over pages
    while True:
        result = es.search(**kwargs, body={
            "aggs": {
                "values": {
                    "composite": compositeQuery
                }
            }
        })
        # Yield each bucket
        for aggregation in result["aggregations"]["values"]["buckets"]:
            yield aggregation
        # Set "after" field
        if "after_key" in result["aggregations"]["values"]:
            compositeQuery["after"] = \
                result["aggregations"]["values"]["after_key"]
        else: # Finished!
            break

# Usage example
for result in iterate_distinct_field(es, fieldname="pattern.keyword", index="strings"):
    print(result) # e.g. {'key': {'pattern': 'mypattern'}, 'doc_count': 315}

Posted by Uli Köhler in Databases, ElasticSearch, Python

How to query distinct field values in ElasticSearch

Let’s say we have an ElasticSearch index called strings with a field pattern of {"type": "keyword"}.

Get the top N values of the column

If we want to get the top N (12 in our example) entries, i.e. the patterns that are present in the most documents, we can use this query:

{
    "aggs" : {
        "patterns" : {
            "terms" : {
                "field" : "pattern.keyword",
                "size": 12
            }
        }
    }
}

Full example in Python:

from elasticsearch import Elasticsearch

es = Elasticsearch()

result = es.search(index="strings", body={
    "aggs" : {
        "patterns" : {
            "terms" : {
                "field" : "pattern.keyword",
                "size": 12
            }
        }
    }
})
for aggregation in result["aggregations"]["patterns"]["buckets"]:
    print(aggregation) # e.g. {'key': 'mypattern', 'doc_count': 2802}

See the terms aggregation documentation for more information.

Get all the distinct values of the column

Getting all the values is slightly more complicated since we need to use a composite aggregation that returns an after_key to paginate the query.

This Python helper function will automatically paginate the query with configurable page size:

from elasticsearch import Elasticsearch

es = Elasticsearch()

def iterate_distinct_field(es, fieldname, pagesize=250, **kwargs):
    """
    Helper to get all distinct values from ElasticSearch
    (ordered by number of occurrences)
    """
    compositeQuery = {
        "size": pagesize,
        "sources": [{
                fieldname: {
                    "terms": {
                        "field": fieldname
                    }
                }
            }
        ]
    }
    # Iterate over pages
    while True:
        result = es.search(**kwargs, body={
            "aggs": {
                "values": {
                    "composite": compositeQuery
                }
            }
        })
        # Yield each bucket
        for aggregation in result["aggregations"]["values"]["buckets"]:
            yield aggregation
        # Set "after" field
        if "after_key" in result["aggregations"]["values"]:
            compositeQuery["after"] = \
                result["aggregations"]["values"]["after_key"]
        else: # Finished!
            break

# Usage example
for result in iterate_distinct_field(es, fieldname="pattern.keyword", index="strings"):
    print(result) # e.g. {'key': {'pattern': 'mypattern'}, 'doc_count': 315}

Posted by Uli Köhler in Databases, ElasticSearch, Python

How to fix ElasticSearch ‘Fielddata is disabled on text fields by default’ for keyword field

Problem:

You have a field in ElasticSearch named e.g. pattern of type keyword. However, when you query for an aggregation of this field, e.g.

es.search(index="strings", body={
    "size": 0,
    "aggs" : {
        "patterns" : {
            "terms" : { "field" : "pattern" }
        }
    }
})

you see this error message:

elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'Fielddata is disabled on text fields by default. Set fielddata=true on [pattern] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.'

Solution:

This error message is confusing since you already have a keyword field. However, the ElasticSearch fielddata documentation tells us that you need to use pattern.keyword in the query instead of just pattern.

Full example:

es.search(index="strings", body={
    "size": 0,
    "aggs" : {
        "patterns" : {
            "terms" : { "field" : "pattern.keyword" }
        }
    }
})

Posted by Uli Köhler in Databases, ElasticSearch

How to fix ModuleNotFoundError: No module named ‘grpc’ in Python

Problem:

You want to run a Python script that is using some Google Cloud services. However, you see an error message similar to this:

[...]
  File "/usr/local/lib/python3.6/dist-packages/google/api_core/gapic_v1/__init__.py", line 16, in <module>
    from google.api_core.gapic_v1 import config
  File "/usr/local/lib/python3.6/dist-packages/google/api_core/gapic_v1/config.py", line 23, in <module>
    import grpc
ModuleNotFoundError: No module named 'grpc'

Solution:

Install the grpcio Python module:

sudo pip3 install grpcio

or, for Python 2.x

sudo pip install grpcio
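
You can verify that the module is now importable (a quick check; grpcio installs the grpc Python package):

python3 -c "import grpc; print(grpc.__version__)"
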
Posted by Uli Köhler in Cloud, Linux, Python

How to fix ‘elasticsearch exited with code 78’

Problem:

You want to run ElasticSearch using docker, but the container immediately stops again with this error message:

elasticsearch exited with code 78

or

elasticsearch2 exited with code 78

Solution:

If you look through the entire log message, you’ll find lines like

elasticsearch     | [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

Therefore we need to increase the vm.max_map_count limit:

sudo sysctl -w vm.max_map_count=524288

Now we need to edit /etc/sysctl.conf so the setting will also be in effect after a reboot.

Look for any vm.max_map_count line in /etc/sysctl.conf. If you find one, set its value to 524288. If there is no such line present, add the line

vm.max_map_count=524288

to the end of /etc/sysctl.conf.
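
You can apply the settings from /etc/sysctl.conf without rebooting and verify the current value using:

sudo sysctl -p
sysctl vm.max_map_count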

Original source: GitHub

Posted by Uli Köhler in Container, Databases, Docker, Linux

MongoDB: How to run db.adminCommand() in NodeJS

Problem:

You want to run a db.adminCommand() in NodeJS using the node-mongodb-native client, e.g. you want to run the NodeJS equivalent of

db.adminCommand({setParameter: 1, internalQueryExecMaxBlockingSortBytes: 100151432});

Solution:

Use db.executeDbAdminCommand() where db is a MongoDB database object:

db.executeDbAdminCommand({setParameter: 1, internalQueryExecMaxBlockingSortBytes: 100151432});

Full example:

// To install, use npm i --save mongodb
const MongoClient = require('mongodb').MongoClient;

async function configureMongoDB() {
    // Connect to MongoDB
    const conn = await MongoClient.connect('mongodb://localhost:27017/', { useNewUrlParser: true });
    const db = await conn.db('mydb');
    // Configure MongoDB settings
    await db.executeDbAdminCommand({
        setParameter: 1,
        internalQueryExecMaxBlockingSortBytes: 100151432
    });
    // Cleanup
    return conn.close();
}

// Run configureMongoDB()
configureMongoDB().then(() => {}).catch(console.error)
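
To verify that the parameter was actually set, you can read it back using the getParameter admin command (a sketch, to be run inside an async function like the one above):

// Read the parameter back (getParameter: 1 plus <parameter name>: 1)
const result = await db.executeDbAdminCommand({
    getParameter: 1,
    internalQueryExecMaxBlockingSortBytes: 1
});
console.log(result.internalQueryExecMaxBlockingSortBytes); // e.g. 100151432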

Posted by Uli Köhler in Databases, NodeJS

How to fix NodeJS MongoDB ‘Cannot read property ‘high_’ of null’

When encountering an error message like

TypeError: Cannot read property 'high_' of null
    at Long.equals (/home/uli/dev/NMUN/node_modules/bson/lib/bson/long.js:236:31)
    at nextFunction (/home/uli/dev/NMUN/node_modules/mongodb-core/lib/cursor.js:473:16)
    at Cursor.next (/home/uli/dev/NMUN/node_modules/mongodb-core/lib/cursor.js:763:3)
    at Cursor._next (/home/uli/dev/NMUN/node_modules/mongodb/lib/cursor.js:211:36)
    at nextObject (/home/uli/dev/NMUN/node_modules/mongodb/lib/operations/cursor_ops.js:192:10)
    at hasNext (/home/uli/dev/NMUN/node_modules/mongodb/lib/operations/cursor_ops.js:135:3)
    (...)

you likely have code like this:

const cursor = db.collection('mycollection').find({})
while (cursor.hasNext()) {
    const doc = cursor.next();
    // ... handle doc ...
}

The solution is quite simple: since cursor.hasNext() and cursor.next() both return Promises, you can't use their results directly.

This example shows you how to do it properly using async/await:

const cursor = db.collection('mycollection').find({})
while (await cursor.hasNext()) {
    const doc = await cursor.next();
    // ... handle doc ...
}

In order to do this, remember that the function containing this code will need to be an async function. See the Mozilla documentation or search for a JavaScript async tutorial in order to learn about the details!
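
Alternatively, recent versions of the node-mongodb-native driver expose cursors as async iterables, so you can use for await ... of (a sketch, assuming a driver version with async iterator support):

// Must also run inside an async function
const cursor = db.collection('mycollection').find({});
for await (const doc of cursor) {
    // ... handle doc ...
}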

Posted by Uli Köhler in Databases, Javascript

How to install MongoDB CE on Ubuntu in 1 minute

Quick install using

wget -qO- https://techoverflow.net/scripts/install-mongodb.sh | bash

Run these shell commands on your Ubuntu computer to install the current MongoDB community edition and automatically start it (both immediately and on boot-up):

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 4B7C549A058F8B6B
echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
sudo apt-get update
sudo apt-get install -y mongodb-org
sudo systemctl enable mongod
sudo systemctl start mongod
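
After that, you can check that the MongoDB service is up and running (a quick check):

sudo systemctl status mongod
mongo --eval 'db.version()'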

Source: Official MongoDB documentation

Posted by Uli Köhler in Databases, Linux

How to download a file or directory from a LXC container

To download files, use

lxc file pull <container name>/<path>/<filename> <target directory>

To download directories, use

lxc file pull --recursive <container name>/<path>/<directory> <target directory>

Examples:

Download /root/myfile.txt from mycontainer to the current directory (.):

lxc file pull mycontainer/root/myfile.txt .

Download /root/mydirectory from mycontainer to the current directory (.):

lxc file pull -r mycontainer/root/mydirectory .
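
The reverse direction works analogously using lxc file push (for reference):

lxc file push myfile.txt mycontainer/root/
lxc file push -r mydirectory mycontainer/root/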

Posted by Uli Köhler in Container, Linux, LXC, Virtualization

Puppeteer: Get text content / inner HTML of an element

Problem:

You want to use puppeteer to automate testing a webpage. You need to get either the text or the inner HTML of some element, e.g. of

<div id="mydiv">
</div>

on the page.

Solution:

// Get inner text
const innerText = await page.evaluate(() => document.querySelector('#mydiv').innerText);

// Get inner HTML
const innerHTML = await page.evaluate(() => document.querySelector('#mydiv').innerHTML);

Note that .innerText includes the text of sub-elements. You can use the complete DOM API inside page.evaluate(...). You can use any CSS selector as an argument for document.querySelector(...).
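
For context, here is a minimal self-contained sketch of how this fits together (assuming https://example.com as the target URL and that the page actually contains a #mydiv element):

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://example.com');
    // Extract the text content of #mydiv (assumed to exist on the page)
    const innerText = await page.evaluate(() => document.querySelector('#mydiv').innerText);
    console.log(innerText);
    await browser.close();
})();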

Posted by Uli Köhler in Javascript, Puppeteer

How to fix ModuleNotFoundError: No module named ‘google.cloud.iam’

Problem:

You want to run a Python script that uses one of the Google Cloud Python APIs but you get this error message:

ModuleNotFoundError: No module named 'google.cloud.iam'

Solution:

Reinstall any google cloud package using pip:

sudo pip install --upgrade google-cloud-storage

or

sudo pip3 install --upgrade google-cloud-storage

That will also reinstall the relevant google.cloud.iam module.

After that, re-run your script. If that didn't work, try to pip install --upgrade some other google-cloud-* module, especially the modules you actually use in your script.

Posted by Uli Köhler in Cloud, Python

How to set cv2.VideoCapture() image size in Python

Use cv2.CAP_PROP_FRAME_WIDTH and cv2.CAP_PROP_FRAME_HEIGHT in order to tell OpenCV which image size you would like.

import cv2

video_capture = cv2.VideoCapture(0)
# Check success
if not video_capture.isOpened():
    raise Exception("Could not open video device")
# Set properties. Each set() returns True on success (i.e. the resolution is supported)
video_capture.set(cv2.CAP_PROP_FRAME_WIDTH, 160)
video_capture.set(cv2.CAP_PROP_FRAME_HEIGHT, 120)
# Read picture. ret is True on success
ret, frame = video_capture.read()
# Close device
video_capture.release()
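
Note that set() may silently fall back to a different resolution. Before calling release(), you can read back which resolution was actually applied (a quick check):

# Read back the resolution that was actually applied (run this before release())
actual_width = video_capture.get(cv2.CAP_PROP_FRAME_WIDTH)
actual_height = video_capture.get(cv2.CAP_PROP_FRAME_HEIGHT)
print("Actual resolution: {:.0f}x{:.0f}".format(actual_width, actual_height))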

Note that most video capture devices (like webcams) only support specific sets of widths & heights. Use uvcdynctrl -f to find out which resolutions are supported:

$ uvcdynctrl -f
Listing available frame formats for device video0:
Pixel format: YUYV (YUYV 4:2:2; MIME type: video/x-raw-yuv)
  Frame size: 640x480
    Frame rates: 30, 20, 10
  Frame size: 352x288
    Frame rates: 30, 20, 10
  Frame size: 320x240
    Frame rates: 30, 20, 10
  Frame size: 176x144
    Frame rates: 30, 20, 10
  Frame size: 160x120
    Frame rates: 30, 20, 10

Posted by Uli Köhler in OpenCV, Python, Video

How to take a webcam picture using OpenCV in Python

This code opens /dev/video0 and takes a single picture, closing the device afterwards:

import cv2

video_capture = cv2.VideoCapture(0)
# Check success
if not video_capture.isOpened():
    raise Exception("Could not open video device")
# Read picture. ret is True on success
ret, frame = video_capture.read()
# Close device
video_capture.release()

You can also use cv2.VideoCapture("/dev/video0"), but this approach is platform-dependent. cv2.VideoCapture(0) will also open the first video device on non-Linux platforms.

In Jupyter you can display the picture using

from matplotlib import pyplot as plt

frameRGB = frame[:,:,::-1] # BGR => RGB
plt.imshow(frameRGB)
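
If you want to store the captured frame on disk instead, you can use cv2.imwrite() (for reference; OpenCV expects BGR channel order, so pass the unconverted frame):

# Save the frame as a JPEG file
cv2.imwrite("frame.jpg", frame)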

Posted by Uli Köhler in OpenCV, Python, Video

Launching Debian containers using LXC on Ubuntu

Problem:

You know you can launch an Ubuntu LXC container using

lxc launch ubuntu:18.04 myvm

Now you want to launch a Debian container using

lxc launch debian:jessie myvm

but you only get this error message:

Error: The remote "debian" doesn't exist

Solution:

The Debian images are (by default) available from the images: remote, not the debian remote, so you need to use this:

lxc launch images:debian/jessie myvm
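
You can list which Debian images are available on the images: remote using (a quick check; the filter syntax may vary slightly between lxc versions):

lxc image list images:debian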

Posted by Uli Köhler in Container, Linux, LXC, Virtualization

Routing public IPv6 addresses to your lxc/lxd containers

The enormous amount of IPv6 addresses available to most commercially hosted VPS / root servers with a public IPv6 prefix allows you to route a public IPv6 address to every container that is running on your server. This tutorial shows you how to do that, even if you have no prior experience with routing.

Step 0: Create your LXC container

We assume you have already done this – just for reference, here’s how you can create a container:

lxc launch ubuntu:18.04 my-container

Step 1: Which IP address do you want to assign to your container?

First you need to find out what prefix is routed to your host. Usually you can do that by checking in your provider's control panel. You're looking for something like 2a01:4f9:c010:278::1/64. Another option is to run sudo ifconfig and look for an inet6 line in the section of your primary network interface (this only works if you have configured your server to have an IPv6 address). Note that addresses starting with fe80:: and addresses starting with fd, among others, are not public IPv6 addresses.

Then you can choose a new IPv6 address for your container. Which one you choose – as long as it's within the prefix – is entirely your decision.

Often, <prefix>::1 is used for the host itself, therefore you could, for example, choose <prefix>::2. Note that some providers use some IP addresses for other purposes. Check your provider’s documentation for details.

If you don't want to make it easy to find your container's public IPv6 address, don't choose <prefix>::1, <prefix>::2, <prefix>::3 etc., but something more random like <prefix>:af15:99b1:0b05:1, for example 2a01:4f9:c010:278:af15:99b1:0b05:0001. Ensure your IPv6 address has 8 groups of 4 hex digits each!

For this example, we choose the IPv6 address 2a01:4f9:c010:278::8.

Step 2: Find out the ULA of your container

We need to find the ULA (unique local address – similar to a private IPv4 address which is not routed on the internet) of the container. Using lxc, this is quite easy:

uli@myserver:~$ lxc list
+--------------+---------+-----------------------+-----------------------------------------------+
|     NAME     |  STATE  |         IPV4          |                     IPV6                      |
+--------------+---------+-----------------------+-----------------------------------------------+
| my-container | RUNNING | 10.144.118.232 (eth0) | fd42:830b:36dc:3691:216:3eff:fed1:9058 (eth0) |
+--------------+---------+-----------------------+-----------------------------------------------+

You need to look in the IPv6 column and copy the address listed there. In this example, the address is fd42:830b:36dc:3691:216:3eff:fed1:9058.
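
If you want to extract that address in a script instead of copying it from the table, something like this should work (a sketch; assumes your lxc version supports CSV output and the 6 column shorthand):

lxc list my-container -c 6 --format csv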

Step 3: Setup IPv6 routing

Now we can tell the host Linux to route your chosen public IPv6 to the container’s private IPv6. This is quite easy:

sudo ip6tables -t nat -A PREROUTING -d <public IPv6> -j DNAT --to-destination <container private IPv6>

In our example, this would be

sudo ip6tables -t nat -A PREROUTING -d 2a01:4f9:c010:278::8 -j DNAT --to-destination fd42:830b:36dc:3691:216:3eff:fed1:9058

First, test the command by running it in a shell. If it works (i.e. it doesn't print any error message), you can store it permanently, e.g. by adding it to /etc/rc.local (after #!/bin/bash, before exit 0). Advanced users might prefer to add it to /etc/network/interfaces.
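
For reference, a minimal /etc/rc.local could look like this (a sketch using the addresses from our example; rc.local runs as root, so no sudo is needed):

#!/bin/bash
ip6tables -t nat -A PREROUTING -d 2a01:4f9:c010:278::8 -j DNAT --to-destination fd42:830b:36dc:3691:216:3eff:fed1:9058
exit 0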

Step 4: Connect to your container using SSH on your public IPv6 (optional)

Note: This step requires that you have working IPv6 connectivity at your local computer. If you are unsure, check at ipv6-test.com

First, open a shell on your container:

lxc exec my-container bash

After running this, you should see a root shell prompt inside your container:

root@my-container:~#

The following commands should be entered in the container shell, not the host!

Now we can create a user to log in as (in this example, we create the uli user):

root@my-container:~# adduser uli
Adding user `uli' ...
Adding new group `uli' (1001) ...
Adding new user `uli' (1001) with group `uli' ...
Creating home directory `/home/uli' ...
Copying files from `/etc/skel' ...
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully
Changing the user information for uli
Enter the new value, or press ENTER for the default
        Full Name []: 
        Room Number []: 
        Work Phone []: 
        Home Phone []: 
        Other []: 
Is the information correct? [Y/n]

You only need to enter a password (you won't see anything on screen while typing it) twice; for all other prompts you can just press Enter.

The ubuntu:18.04 lxc image used in this example does not allow SSH password authentication in its default configuration. In order to fix this, change PasswordAuthentication no to PasswordAuthentication yes in /etc/ssh/sshd_config and restart the SSH server by running service sshd restart. Be sure you understand the security implications before you do that!

Now, logout of your container shell by pressing Ctrl+D. The following commands can be entered on your desktop or any other server with IPv6 connectivity.

Now login to your server:

ssh <username>@<public IPv6 address>

in this example:

ssh uli@2a01:4f9:c010:278::8

If you configured everything correctly, you’ll see the shell prompt for your container:

uli@my-container:~$

Note: Don't forget to configure a firewall for your container, e.g. ufw! Your container's IPv6 address is exposed to the internet, and just assuming no one will guess it is not good security practice.

Posted by Uli Köhler in Cloud, Container, Linux, LXC, Networking