
Weekly: It has been a week?

The past week has been pretty hectic, switching between dev and ops roles. Helping out with other projects till 2-3am every day has really taken its toll, and I feel old.

Unsurprisingly, I haven’t been able to really work on any of my own projects, but I did learn something interesting that I want to write about.

I recently faced an issue with a GitLab CI pipeline where I wanted to run integration/regression tests on the latest Docker build. However, since each image is meant to be production-ready, it runs as a non-root user, which restricts what the user can do when the container starts. Here’s why this problem caused me such a headache.

Beware: what follows is really more of a rant about the troubles I faced.

Pulling of images

  1. The tests exist within the image itself
  2. To run the image, I need permission to pull it from a private registry
  3. To pull from the private registry, I need to log in to the registry
  4. The runner does not have permission to pull by default
  5. So I need to run a public Docker image like docker:19.03 to log in, pull, and run
  6. But that would mean I am running Docker in Docker
  7. Which means it cannot integrate properly with that pipeline’s DB instance (sketched below)
  8. This means I need to find a workaround to link the image with the DB instance, or whatever extra services are required to run the app
  9. Therefore, need to find workarounds
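To make points 6 and 7 concrete, here is roughly what the naive docker-in-docker job would look like. This is only a sketch with made-up names ($ECR_REGISTRY, $ECR_TOKEN, my-app, postgres are all placeholders): containers started inside the dind daemon live on dind’s network, not the job’s, so the app can never reach the postgres service GitLab started for the job.

```yaml
# Naive attempt (sketch, hypothetical names): docker-in-docker can pull
# and run the image, but the inner container cannot see the job's services.
integration-test-naive:
  stage: test
  image: docker:19.03
  services:
    - docker:19.03-dind   # inner Docker daemon
    - postgres:11         # the app's DB: reachable from the job, NOT from dind
  variables:
    DOCKER_HOST: tcp://docker:2375   # talk to the dind service (TLS off for brevity)
    DOCKER_TLS_CERTDIR: ""
  script:
    - docker login -u AWS -p "$ECR_TOKEN" "$ECR_REGISTRY"
    - docker pull "$ECR_REGISTRY/my-app:$CI_COMMIT_SHA"
    # This container runs inside dind, on dind's network; "postgres" does
    # not resolve there, so the integration tests cannot reach the DB.
    - docker run --rm "$ECR_REGISTRY/my-app:$CI_COMMIT_SHA"
```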

Running of tests in the image

  1. To run the tests, I need to install the test dependencies
  2. Because the user in the container is non-root, I am unable to install anything
  3. So the tests cannot run
  4. Therefore, need to find workarounds

It was such a chicken-and-egg problem that we spent a few days just figuring out how to even pull the image properly. In the end, the solution we landed on was this (sketched as a full job after the list).

  1. Run the service dependencies as services in GitLab CI
  2. Run the stage with the public docker:19.03 image
  3. Rely on docker ps --filter to find the correct service containers in CI
  4. Rely on docker login to AWS ECR with credentials from CI variables
  5. Rely on docker run to create the app container
  6. Rely on --link to grant access to those services from within the app
  7. Rely on --user root to override the default user in the container
  8. Rely on overriding the command to execute the list of commands we want the container to run when it starts up
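Stitched together, the job looks roughly like the sketch below. The registry, image, variables, and test commands ($ECR_REGISTRY, $ECR_TOKEN, my-app, postgres, pip/pytest) are all placeholders, and it assumes the runner mounts the host Docker socket rather than using dind, which is why the job’s own service containers show up in docker ps.

```yaml
integration-test:
  stage: test
  image: docker:19.03
  services:
    - postgres:11   # hypothetical DB dependency, started by GitLab per job
  script:
    # GitLab name-mangles service containers, so filter docker ps to find ours
    - DB_ID=$(docker ps --filter "name=postgres" --format "{{.ID}}" | head -n 1)
    # Log in to the private registry; the token comes from a CI variable,
    # however you choose to source it
    - docker login -u AWS -p "$ECR_TOKEN" "$ECR_REGISTRY"
    - docker pull "$ECR_REGISTRY/my-app:$CI_COMMIT_SHA"
    # Run as root, linked to the DB, with the startup command overridden
    # to install the test dependencies and then run the tests
    - >
      docker run --rm
      --link "$DB_ID:postgres"
      --user root
      "$ECR_REGISTRY/my-app:$CI_COMMIT_SHA"
      sh -c "pip install -r test-requirements.txt && pytest"
```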

The reason we stuck with this was that it gives us greater control over what is happening to the container, and it’s a lot more obvious. To be perfectly honest, I don’t know of many alternatives for solving this problem. However, my colleague did suggest something interesting.

So, when GitLab starts a stage, it clones the current branch into the working directory. What she did, instead of pulling the current build, was to take the volume that is mounted into the stage and swap it into the container that she wants to run custom code in.

This is basically what it looks like: she manually swapped out just /app with the volume from the current stage’s /app directory. This means the app on volume B can be run as root, because it is not built the same way, and we can do anything we want with the app, like adding extra test dependencies. Security-wise it is still okay, because technically the code being shipped to production is whatever is in volume A (note: untouched). This was extremely confusing before she explained the gist of what she was doing.
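As far as I can tell, the swap boils down to bind-mounting the stage’s checked-out working directory (volume B) over /app in some container she controls, roughly like the sketch below. The base image and test commands are made up, and it assumes the runner’s build directory is visible on the host at $CI_PROJECT_DIR so the bind mount resolves.

```yaml
test-on-swap:
  stage: test
  image: docker:19.03
  script:
    # Volume B: the job's fresh git checkout, mounted over the container's
    # /app. Volume A (the app baked into the production image) is untouched.
    - >
      docker run --rm
      --user root
      -v "$CI_PROJECT_DIR:/app"
      some-base-image:latest
      sh -c "cd /app && pip install -r test-requirements.txt && pytest"
```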

But it was a very interesting (and weird) way of approaching the problem. In a way, she made her stage independent of the previous build step, while mine depends on the build (which I think is clearer).

So yes, I learned a new way of doing something. In any case, I hope next week will be less tiring for me, because this circuit breaker is really taking its toll on how alive I feel.
