Request Entity Too Large when trying to import a GitLab project

Problem

You would like to import an existing GitLab project from an export file into a new self-hosted GitLab instance, but when using the Web UI, even after increasing the max-body-size in the ingress deployment, you still end up with the error message

Request Entity Too Large
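
For reference, the max-body-size change mentioned above is usually made with the ingress-nginx proxy-body-size annotation on the GitLab webservice ingress; a sketch, where the ingress name and namespace below are placeholders for your install:

kubectl -n gitlab-system annotate ingress gitlab-webservice-default \
  nginx.ingress.kubernetes.io/proxy-body-size=0 --overwrite

Even with this in place, the Web UI import can still fail with the error above.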

Solution

There is another way to import the export file, but it is not well documented, as GitLab classes it as EXPERIMENTAL.

You can copy the export file to the gitlab-toolbox pod

kubectl --kubeconfig ~/.kube/gitlab_config -n gitlab-system cp local_export.tar.gz gitlab-toolbox-xxx-xxx:/tmp/

You can then log in to the gitlab-toolbox pod

kubectl --kubeconfig ~/.kube/gitlab_config -n gitlab-system exec -it gitlab-toolbox-xxx-xxx -- bash

change to the application directory

cd /srv/gitlab

and finally use the rake task gitlab:import_export:import to import your project

git@gitlab-toolbox-xxx-xxx:/srv/gitlab$ bundle exec rake gitlab:import_export:import[your_new_gitlab_username,namespace_path,project_path,/tmp/2022-06-14_14-53-007_export.tar.gz]
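
Optionally, you can check the outcome of the import from a Rails console in the same pod; a hedged sketch, where namespace_path/project_path are the same placeholders passed to the rake task above:

bundle exec rails console

# inside the console (placeholder path):
Project.find_by_full_path('namespace_path/project_path').import_state&.status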

Changing gitlab-runner configuration options

It is possible to change the configuration of a running gitlab-runner by editing the file ~/.gitlab-runner/config.toml.

For example, to switch the log_level from ‘info’ to ‘debug’ and back again, log in to the gitlab-runner host and edit the file.

The file is reloaded automatically, without the need to restart the gitlab-runner.
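
The relevant setting is the global log_level entry at the top of ~/.gitlab-runner/config.toml; a minimal sketch, with the runner details as placeholders:

concurrent = 4
log_level = "debug"   # switch back to "info" when you are done debugging

[[runners]]
  name = "example-runner"               # placeholder
  url = "https://gitlab.example.com/"   # placeholder
  token = "REDACTED"
  executor = "kubernetes"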

Google Autopilot and GitLab failed builds

Problem

You want to use Google’s Autopilot for your GitLab runners, but your jobs/builds fail because of low resources (i.e. ephemeral storage).

Solution

You can use a LimitRange to increase the default limits for ephemeral storage and/or memory, so that Google’s Autopilot picks them up and scales the nodes appropriately.

Create a limit range file (e.g. limit_range.yaml) like:

apiVersion: v1
kind: LimitRange
metadata:
  name: limit-ephemeral-storage
spec:
  limits:
  - default:
      ephemeral-storage: "10Gi"
      memory: "16Gi"
    defaultRequest:
      ephemeral-storage: "10Gi"
      memory: "16Gi"
    type: Container

And then apply it to your cluster

kubectl -n namespace apply -f limit_range.yaml
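
You can then verify that the defaults are in place (the LimitRange name matches the manifest above):

kubectl -n namespace describe limitrange limit-ephemeral-storage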

googlecloudsdk.calliope.exceptions.HttpException: ResponseError: code=400, message=Autopilot clusters must be regional clusters.

Problem

Trying to create an Autopilot cluster, either using Terraform or the gcloud CLI, and passing a zone name (e.g. europe-west6-b) as the region returns the error ‘Autopilot clusters must be regional clusters.’

So with gcloud, this is the command and output

kosmas: (master %)$ gcloud container clusters create-auto test-cluster --region=europe-west6-b
Note: The Pod address range limits the maximum size of the cluster. Please refer to https://cloud.google.com/kubernetes-engine/docs/how-to/flexible-pod-cidr to learn how to optimize IP address allocation.
ERROR: (gcloud.container.clusters.create-auto) ResponseError: code=400, message=Autopilot clusters must be regional clusters.

Solution

Use the actual region name, which can be taken from the list of available zones/regions

gcloud compute zones list

NAME                       REGION                   STATUS  NEXT_MAINTENANCE  TURNDOWN_DATE
us-east1-b                 us-east1                 UP
us-east1-c                 us-east1                 UP
...
europe-west6-b             europe-west6             UP
...

And then use the correct region name (without the -b zone suffix)

gcloud container clusters create-auto test-cluster --region=europe-west6 --verbosity debug

...
Created [https://container.googleapis.com/v1/projects/gitlab-runner-343714/zones/europe-west6/clusters/test-cluster].
...
NAME          LOCATION      MASTER_VERSION   MASTER_IP     MACHINE_TYPE  NODE_VERSION     NUM_NODES  STATUS
test-cluster  europe-west6  1.21.6-gke.1503  xxx.xxx.xxx.xxx  e2-medium     1.21.6-gke.1503  3          RUNNING
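
The same applies to Terraform; a minimal sketch (project and name are placeholders), where location must be a region, not a zone, for an Autopilot cluster:

resource "google_container_cluster" "autopilot" {
  name     = "test-cluster"
  project  = "my-project"      # placeholder
  location = "europe-west6"    # a region, not a zone such as europe-west6-b

  enable_autopilot = true
}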

Adding a Loki report in Grafana counting the number of http_user_agents from NGINX logs

To add a Loki report in Grafana, for example a count of the different http_user_agents that have logged in, you can use the following query:

count by (http_agent) (rate({namespace="ingress-nginx",stream="stdout"} |= "https://domain.name.com/session/new" |~ "GET\\s/\\s" | pattern "<ip> - - [<timestamp>] \"<method> <path> <version>\" <result> <_> \"<url>\" \"<http_agent>\" <_>" [$__interval]))

You select the data source (Loki) and then use the stream selector {namespace="ingress-nginx",stream="stdout"} to get the logs from NGINX.

Then you can filter by two conditions:

  1. the instance login
|= "https://domain.name.com/session/new"

and

  2. a regular expression to select only the GET / calls
|~ "GET\\s/\\s"

Then you can use the pattern parser on the log line in order to extract the http_agent label

pattern "<ip> - - [<timestamp>] \"<method> <path> <version>\" <result> <_> \"<url>\" \"<http_agent>\" <_>"

<_> is used where we are not interested in capturing a field, so everything after http_agent is not extracted in this case.

Finally, use the

count by (http_agent)

aggregation to get the data (note that you also need to wrap the log pipeline in rate with the [$__interval] range, as in the full query above).
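
If you prefer raw counts per interval rather than a per-second rate, a hedged variant of the same query uses count_over_time in place of rate:

count by (http_agent) (count_over_time({namespace="ingress-nginx",stream="stdout"} |= "https://domain.name.com/session/new" |~ "GET\\s/\\s" | pattern "<ip> - - [<timestamp>] \"<method> <path> <version>\" <result> <_> \"<url>\" \"<http_agent>\" <_>" [$__interval]))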