5 Keys to Running Workloads Resiliently with Rancher and Docker – Part 1

Thursday, 4 August, 2016

Containers and orchestration frameworks like Rancher will soon allow
every organization to have access to efficient cluster management. This
brave new world frees operations from managing application configuration
and allows development to focus on writing code; containers abstract
complex dependency requirements, which enables ops to deploy immutable
containerized applications and allows devs a consistent runtime for
their code. If the benefits are so clear, then why do companies with
existing infrastructure practices not switch? One of the key issues is
risk. The risk of new unknowns brought by an untested technology, the
risk of inexperience operating a new stack, and the risk of downtime
impacting the brand. Planning for risks and demonstrating that the ops
team can maintain a resilient workload whilst moving into a
containerized world is the key social aspect of a container migration
project. This matters all the more because, when done correctly, Docker
and Rancher provide a solid framework for quickly iterating on
infrastructure improvements, such as [Rancher
catalogs](https://docs.rancher.com/rancher/latest/en/catalog/) for
quickly spinning up popular distributed applications like
Elasticsearch.
With risk management in mind, we will identify five keys to running a
resilient workload on Rancher and Docker. The topics covered are as
follows:

  • Running Rancher in HA Mode (covered in this post)
  • Using Service Load Balancers in Rancher
  • Setting up Rancher service health checks and monitoring
  • Providing developers with their own Rancher setup
  • Discussing Convoy for data resiliency

I had originally hoped to perform experiments on a Rancher cluster
built on a laptop, using Docker Machine with a Rancher Server and
various Rancher Agents on Raspberry Pis (setup instructions here). The
problem is that most Docker images are made for Intel-based CPUs, so
nothing works properly on the Pi’s ARM processors. Instead, I will use
AWS directly for our experiments with resilient Rancher clusters. With
our initial setup, we have one Rancher Server and one Agent. Let’s
deploy a simple multi-container application.
Rancher HA Experiment Diagram
The above diagram illustrates the setup I am going to use to experiment
with Rancher. I chose AWS because I am familiar with the service, but
you can choose any other provider for setting up Rancher according to
the Quick Start Guide.
Rancher Machine Creation
Let’s test our stack with the WordPress compose described in the
Rancher Quick Start instructions.
Rancher HA
So now our application is up and running. But what happens if the
Rancher Server malfunctions? Or a network issue occurs? What happens to
our application? Will it still continue serving requests?
WordPress up
For this experiment, I will perform the following and document the
results.

  • Cutting the Internet from Rancher Agent to Rancher Server
  • Stopping the Rancher Server Container
  • Peeking under the hood of the Rancher Server Container

Afterwards we will address each of these issues, and then look at
Rancher HA as a means of addressing these risks.

Cutting the Internet from Rancher Agent to Rancher Server

So let’s go onto AWS and block all access to the Rancher Server from my
Rancher Agents (a rough sketch of doing this with the AWS CLI follows
the list below).

  • Block access from Rancher Server to Rancher Agent
  • Note down what happens
  • Kill a few WordPress containers
  • Re-instantiate the connection
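
For reference, here is a rough sketch of the blocking step using the AWS
CLI; the security group ID and CIDR below are placeholders, and it
assumes the Rancher Server listens on its default port 8080 behind its
own security group.

#!/bin/bash
# Placeholders -- substitute the Rancher Server's security group and the
# CIDR block your agent hosts live in.
SERVER_SG="sg-0123456789abcdef0"
AGENT_CIDR="172.30.0.0/16"

# Drop the rule that lets agents reach the Rancher Server API (port 8080
# by default), simulating a network partition between agents and server.
aws ec2 revoke-security-group-ingress \
  --group-id "$SERVER_SG" --protocol tcp --port 8080 --cidr "$AGENT_CIDR"

# Later, re-establish the connection by adding the rule back.
aws ec2 authorize-security-group-ingress \
  --group-id "$SERVER_SG" --protocol tcp --port 8080 --cidr "$AGENT_CIDR"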

Observations:

Firstly, after a few seconds our Rancher hosts end up in a reconnecting
state.
Turn off Rancher Server
Browsing to my WordPress URL I can still access all my sites properly.
There is no service outage as the containers are still running on the
remote hosts. The IPSec tunnel between my two agents is still
established, thus allowing my lone WordPress container to still connect
to the DB. Now let’s kill a WordPress container and see what happens.
Since I can’t access my Rancher Agents from the UI, I will be SSHing
into the agent hosts to run Docker commands. (Instructions for SSHing
into Rancher-created hosts can be found
here)
Turning off Rancher Server
The WordPress container does not get restarted. This is troublesome; we
will need our Rancher Server back online. Let’s re-establish the network
connection and see if the Rancher Server notices that one of our
WordPress services is down. After a few moments, our Rancher Server
re-establishes its connection with the agents and restarts the WordPress
container. Excellent. The takeaway here is that Rancher Server can
handle intermittent connection issues, reconnect to the agents, and
continue on as usual. However, for reliable uptime of our containers, we
would need multiple instances of Rancher Server on different hosts for
resiliency against networking issues in the data center. Now, what would
happen if the Rancher Server dies? Would we lose all ability to manage
our hosts after it comes back? Let’s find out!

Killing the Rancher Server

In this second experiment, I will go into the Rancher Server host and
manually terminate the process. Generally a failure will result in the
Docker process restarting due to --restart=always being set, but let’s
assume that your host ran out of disk space or otherwise borked itself.

Observations:

Let’s simulate catastrophic failure and nuke our Rancher container.
sudo docker stop rancher-server
As with the network experiment, our WordPress applications still run on
the agents and serve traffic normally. The Rancher UI and any semblance
of control is now gone. We don’t like this world, so we will start the
rancher-server back up.
sudo docker start rancher-server
After starting up again, the Rancher server picks up where it left off.
Wow, that is cool; how does this magic work?

Peeking under the hood of the Rancher Server Container

So how does the Rancher Server operate? Let’s take a brief tour into the
inner workings of the Rancher server container to get a sense of what
makes it tick, starting with the Rancher Server Docker build file found
here.
Rancher Server Components

# Dockerfile contents
FROM ...
...
...
CMD ["/usr/bin/s6-svscan", "/service"]

What is s6-svscan? It is a supervisor process that keeps services
running based on commands found in files in a folder; the key files are
named Run, Down, and Finish. If we look inside the service directory, we
can see that the container will install dependencies and use s6-svscan
to start up two services: the Cattle service, which is the core Rancher
scheduler, and a MySQL instance.
Rancher Server Components - Service
Inside our container, the following services are running.

PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:00 /usr/bin/s6-svscan /service
    7 ?        S      0:00 s6-supervise cattle
    8 ?        S      0:00 s6-supervise mysql
    9 ?        Ssl    0:57 java -Xms128m -Xmx1g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/cattle/logs -Dlogback.bootstrap.level=WARN -cp /usr/share/cattle/1792f92ccdd6495127a28e16a685da7
  135 ?        Sl     0:01 websocket-proxy
  141 ?        Sl     0:00 rancher-catalog-service -catalogUrl library=https://github.com/rancher/rancher-catalog.git,community=https://github.com/rancher/community-catalog.git -refreshInterval 300
  142 ?        Sl     0:00 rancher-compose-executor
  143 ?        Sl     0:00 go-machine-service
 1517 ?        Ss     0:00 bash
 1537 ?        R+     0:00 ps x

We see that our Rancher brain is a Java application named Cattle, which
uses a MySQL database embedded within its container to store state. This
is quite convenient, but it would seem that we have found the single
point of failure of our quick-start setup: all the state for our cluster
lives in one MySQL instance, which no one knew existed. What happens if
I nuke some data files?

Corrupting the MySQL Store

Inside my Rancher server container I executed MySQL commands. There is a
certain rush of adrenaline as you execute commands you know will break
everything.
docker exec -it rancher-server bash
$ mysql
mysql> use cattle;
mysql> SET FOREIGN_KEY_CHECKS = 0;
mysql> truncate service;
mysql> truncate network;
Lo and behold, my Rancher service tracking is broken; even when I kill
my WordPress containers, they do not come back up, because Rancher no
longer remembers them.
Loss of data - 1
Since I also truncated the network setup tables, my WordPress
application no longer knows how to route to its DB.
Loss of data - 2
Clearly, to have confidence in running Rancher in production, we need a
way to protect our Rancher Server’s data integrity. This is where
Rancher HA comes in.

Rancher HA Setup Process

The first order of business is to secure the cluster data. I chose AWS
RDS for this because it is what I am familiar with; you can manage your
own MySQL or choose another managed provider. We will proceed assuming
we have a trusted MySQL management system with backups and monitoring,
and follow the HA setup steps documented by Rancher:
Rancher HA Setup
As per the setup guide, we create an AWS RDS instance to be our data
store. Once we have our database’s public endpoint, the next step is to
dump your current Rancher installation’s data and export it to the new
database.
High Availability Setup
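
If you prefer to do the dump by hand rather than through the UI, a
minimal sketch looks like the following; it assumes the quick-start
container name of rancher-server, the embedded database named cattle
with passwordless local access (as in the quick-start image), and
placeholder RDS credentials.

# Placeholders -- substitute your own RDS endpoint and credentials.
RDS_HOST="myrancher.xxxxxxxx.us-east-1.rds.amazonaws.com"
RDS_USER="rancher"
RDS_PASS="secret"

# Dump the embedded "cattle" database out of the quick-start container...
docker exec rancher-server mysqldump cattle > cattle-dump.sql

# ...then create the database on RDS and import the dump.
mysql -h "$RDS_HOST" -u "$RDS_USER" -p"$RDS_PASS" -e "CREATE DATABASE IF NOT EXISTS cattle"
mysql -h "$RDS_HOST" -u "$RDS_USER" -p"$RDS_PASS" cattle < cattle-dump.sql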
For this I created an RDS instance with a public IP address. For your
first Rancher HA setup, I recommend simply making the database public
and then securing it later with VPC rules. Since Rancher provides an
easy way to dump the state, you can move it to a secured database at a
later time. Next, we will set up our Rancher Server to use the new
database.
Rancher HA Setup - Database
Once Rancher detects that it is using an external database, it will open
up two more options as part of setting up HA mode. (At this point, we
have already removed our single point of failure, but for larger-scale
deployments we need to go further to lower the risk of failure.)
Rancher HA Setup - Config
Oh no, decisions! But no worries, let’s go through each of these
options and their implications. Cluster size: notice how everything is
odd? Behind the scenes, Rancher HA sets up a ZooKeeper quorum to keep
locks in sync (more on this in the appendix). ZooKeeper recommends odd
numbers because an even number of servers does not provide additional
fault tolerance. Let’s pick 3 hosts to test out the feature, as it is a
middle ground between usefulness and ease of setup. Host registration
URL: this section asks us to provide the Fully Qualified Domain Name
(FQDN) of our Rancher HA cluster. The instructions recommend an external
load balancer or a DNS record that round-robins between the 3 hosts.
Rancher HA Setup - DNS
The options would be to use an SRV record on your DNS provider to
balance between the 3 hosts; an ELB on AWS with the 3 Rancher EC2
instances attached; or just a plain old DNS record pointing to the 3
hosts. I chose the plain DNS record for my HA setup, as it is the
simplest to set up and debug. Now any time I hit
https://rancher.example.com, my DNS hosting provider will round-robin
requests between the 3 Rancher hosts that I defined above. SSL
certificate is the last item on the list. If you have your own SSL
certificate for your domain, you can use it here; otherwise Rancher will
provide a self-signed certificate instead.
Once all options are filled, Rancher will update fields in its database
to prepare for HA setup. You will then be prompted to download a
rancher-ha.sh script.

WARNING Be sure to kill the Rancher container you used to generate the
rancher-ha.sh script. It will be using ports that are needed by the
Rancher-HA container that will be spun up by the script.

Next up, copy the rancher-ha.sh script onto each of the participating
instances in the cluster and then execute it on each node to set up HA.

Caveat! Docker v1.10.3 is required at the time of writing. Newer
versions of Docker are currently unsupported by the rancher-ha.sh
script.

You can provision the correct Docker version on your hosts with the
following commands:

#!/bin/bash
apt-get install -y -q apt-transport-https ca-certificates
apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
echo "deb https://apt.dockerproject.org/repo ubuntu-trusty main" > /etc/apt/sources.list.d/docker.list
apt-get update
apt-get install -y -q docker-engine=1.10.3-0~trusty

# run the command below to show all available versions
# apt-cache showpkg docker-engine

After installing Docker, we need to make sure that our instances can
talk to each other, so open the ports listed on the Rancher multi-node
requirements page.

Advice! For your first test setup, I recommend opening all ports to
avoid networking-related blockers.

Once you have the correct prerequisites, you can run the rancher-ha.sh
script on each participating host. You will see the following output.

...
ed5d8e75b7be: Pull complete
ed5d8e75b7be: Pull complete
7ebc9fcbf163: Pull complete
7ebc9fcbf163: Pull complete
ffe47ea37862: Pull complete
ffe47ea37862: Pull complete
b320962f9dbe: Pull complete
b320962f9dbe: Pull complete
Digest: sha256:aff7c52e52a80188729c860736332ef8c00d028a88ee0eac24c85015cb0e26a7
Status: Downloaded newer image for rancher/server:latest
Started container rancher-ha c41f0fb7c356a242c7fbdd61d196095c358e7ca84b19a66ea33416ef77d98511
Run the below to see the logs

docker logs -f rancher-ha

This is where the rancher-ha.sh script pulls the additional images that
support the HA feature. Due to the added components in the Rancher
Server, it is recommended to run it on a host with at least 4 GB of
memory. The output of docker ps after running the rancher-ha.sh script
is shown here.
Rancher HA Setup - Enabled

Common Problems and Solutions

You may see some connection errors, so try to run the script on all 3
hosts first. You should see logs showing members being added to the
Rancher HA Cluster.

time="2016-07-22T04:13:22Z" level=info msg="Cluster changed, index=0, members=[172.30.0.209, 172.30.0.111, ]" component=service
...
time="2016-07-22T04:13:34Z" level=info msg="Cluster changed, index=3, members=[172.30.0.209, 172.30.0.111, 172.30.0.69]" component=service

Sometimes you will see a stream of the following error lines.

time="2016-07-23T14:37:02Z" level=info msg="Waiting for server to be available" component=cert
time="2016-07-23T14:37:02Z" level=info msg="Can not launch agent right now: Server not available at http://172.17.0.1:18080/ping:" component=service

This is the top-level symptom of many issues. Here are some other
issues I have identified by going through the GitHub issues list and
various forum posts:

  • Security group / network issues: Sometimes your nodes are binding
    on the wrong IP, so you will want to coerce Rancher to broadcast the
    correct IP.
  • ZooKeeper not being up: It is possible that the ZooKeeper Docker
    container is not able to communicate with the other nodes, so you
    will want to verify ZooKeeper, and you should expect to see this
    sample output.
  • Leftover files in the /var/lib/rancher/state directory from a
    previous HA attempt: If you ran rancher-ha.sh multiple times, you
    may need to clean up old state files.
  • Broken Rancher HA setup state from multiple reattempts: Drop the
    database and try again. There is a previous issue with detailed
    steps to try to surface the problem.
  • Insufficient resources on the machine: Since Rancher HA runs
    multiple Java processes on the machine, you will want to have at
    least 4 GB of memory. While testing with a t2.micro instance with
    1 GB, the instance became inaccessible due to being memory
    constrained. Another issue is that your database host needs to
    support 50 connections per HA node. You will see messages like the
    following when you attempt to spin up additional nodes.

time="2016-07-25T11:01:02Z" level=fatal msg="Failed to create manager" err="Error 1040: Too many connections"

  • Mismatched rancher/server version: By default the rancher-ha.sh
    script pulls rancher/server:latest, but this kicked me in the back:
    during my setup, Rancher pushed out rancher/server:1.1.2, so I had
    two hosts running rancher/server:1.1.1 and my third host running
    rancher/server:1.1.2. This caused quite a headache, but a good
    takeaway is to always specify the version of rancher/server when
    running the rancher-ha.sh script on subsequent hosts.
    ./rancher-ha.sh rancher/server:
  • Docker virtual network bridge returning the wrong IP: This was the
    issue I ran into; my HA setup was trying to check agent health on
    the wrong Docker interface.
    curl localhost:18080/ping > pong
    curl http://172.17.0.1:18080/ping > curl: (7) Failed to connect to 172.17.0.1 port 18080: Connection refused
    The error line is found in rancher/cluster-manager/service, and the
    offending call is found in rancher/cluster-manager/docker. The code
    locates the Docker bridge and attempts to ping port 18080 on it.
    Since my Docker bridge is actually set up on 172.17.42.1, this will
    always fail. To resolve it, I re-instantiated the host, because the
    multiple Docker installations seemed to have caused the wrong bridge
    IP to be fetched. After restarting the instance and setting the
    correct Docker bridge, I now see the expected log lines for HA.

After Setting Up HA

time="2016-07-24T19:51:53Z" level=info msg="Waiting for 3 host(s) to be active" component=cert

Excellent. With one node up and ready, repeat the procedure for the rest
of the hosts. After 3 hosts are up, you should be able to access the
Rancher UI on the URL you specified for step 3 of the setup.

time="2016-07-24T20:00:11Z" level=info msg="[0/10] [zookeeper]: Starting "
time="2016-07-24T20:00:12Z" level=info msg="[1/10] [zookeeper]: Started "
time="2016-07-24T20:00:12Z" level=info msg="[1/10] [tunnel]: Starting "
time="2016-07-24T20:00:13Z" level=info msg="[2/10] [tunnel]: Started "
time="2016-07-24T20:00:13Z" level=info msg="[2/10] [redis]: Starting "
time="2016-07-24T20:00:14Z" level=info msg="[3/10] [redis]: Started "
time="2016-07-24T20:00:14Z" level=info msg="[3/10] [cattle]: Starting "
time="2016-07-24T20:00:15Z" level=info msg="[4/10] [cattle]: Started "
time="2016-07-24T20:00:15Z" level=info msg="[4/10] [go-machine-service]: Starting "
time="2016-07-24T20:00:15Z" level=info msg="[4/10] [websocket-proxy]: Starting "
time="2016-07-24T20:00:15Z" level=info msg="[4/10] [rancher-compose-executor]: Starting "
time="2016-07-24T20:00:15Z" level=info msg="[4/10] [websocket-proxy-ssl]: Starting "
time="2016-07-24T20:00:16Z" level=info msg="[5/10] [websocket-proxy]: Started "
time="2016-07-24T20:00:16Z" level=info msg="[5/10] [load-balancer]: Starting "
time="2016-07-24T20:00:16Z" level=info msg="[6/10] [rancher-compose-executor]: Started "
time="2016-07-24T20:00:16Z" level=info msg="[7/10] [go-machine-service]: Started "
time="2016-07-24T20:00:16Z" level=info msg="[8/10] [websocket-proxy-ssl]: Started "
time="2016-07-24T20:00:16Z" level=info msg="[8/10] [load-balancer-swarm]: Starting "
time="2016-07-24T20:00:17Z" level=info msg="[9/10] [load-balancer-swarm]: Started "
time="2016-07-24T20:00:18Z" level=info msg="[10/10] [load-balancer]: Started "
time="2016-07-24T20:00:18Z" level=info msg="Done launching management stack" component=service
time="2016-07-24T20:00:18Z" level=info msg="You can access the site at https://" component=service

Rancher HA Setup - Enabled
To get around issues regarding the self-signed HTTPS certificate, you
will need to add it to your trusted certificates. After waiting and
fixing up resource constraints on the DB, I then see all 3 hosts up and
running. Rancher HA Setup - Done

Conclusion

Wow, that was a lot more involved than originally thought. This is why
scalable distributed systems are a realm of PhD study. After resolving
all the failure points, I think setting up and getting to know Rancher
HA is a great starting point for getting hands-on with state-of-the-art
distributed systems. I will eventually script this out with Ansible to
make provisioning Rancher HA a trivial task. Stay tuned!

Appendix

For any distributed system, there must be an explicit way to manage
state and changes, and multiple servers need a process to coordinate
updates. Rancher’s management process works by keeping the current and
desired state in the database, then emitting events to be handled by
processing entities to realize the desired state. When an event is being
processed, there is a lock on it, and it is up to the processing entity
to update the state in the database. In the single-server setup, all of
this coordination happens in memory on the host. Once you go to a
multi-server setup, additional components like ZooKeeper and Redis are
needed.

Nick Ma is an Infrastructure Engineer who blogs about Rancher and Open
Source. You can visit Nick’s blog, CodeSheppard.com, to catch up on
practical guides for keeping your services sane and reliable with
open-source solutions.


Matador Deploy – Making Deployment on Rancher Fun

Tuesday, 26 July, 2016

By Timon Sotiropoulos, software engineer at
SEED. SEED is a leading product development
company that builds design-driven web and mobile applications for
startup founders and enterprise innovators.
Seed Logo
Deployment days can be quite confronting and scary for new developers. We
realized through onboarding some of our developers and introducing them
to the world of DevOps that the complexity and stress of deployment days
could take a toll on morale and productivity, with everyone always half
dreading a deployment on the upcoming calendar. After learning the no. 1
rule of “never deploy on a Friday” the hard way, the team at SEED
decided there had to be a better way than the traditional “pull down
from a Git repository and deploy to a server” method.

The Road to Matador

This journey started with a trip down the lane of the hottest
containerisation framework in the business, the flying whale Docker. For
those who haven’t heard of it, Docker essentially allows you to create
a blueprint for your application inside its own contained virtual
a blueprint for your application inside its own contained virtual
machine image. What this means is that you can create a working version
of your app on any server that has Docker installed and be confident
that everything will work as expected. The next link in the chain we
discovered was Rancher, an excellent tool for
automatically connecting and configuring Docker containers. Rancher
allows you to break your application up into multiple, separate
components the same way you would break up a program into different
classes, allowing single responsibility as well as the ability to scale
certain services up and down as required. This process and procedure
became second nature to us, but it was easy to mess things up. It was
easy to accidentally update the wrong Rancher environment and as we
planned on moving to a more continuous development lifecycle, the manual
updating of the Rancher environments had to stop. Our long term plan for
our continuous deployment process is to get to a point where a developer
can push their code to GitHub, build a copy of the Docker container, tag
it with that commit ID, and then push that code to their desired Rancher
environment. All the separate parts work independently, but we are
working towards integrating all of these tools into a fully-fledged
continuous deployment service. The first step is Matador Deploy. Matador
is a tool we have created to handle the creation and building of our
Docker containers and deploying them to the Rancher
environments
. The complication here is
that for each of our projects, we would have two or three separate
environments, one each for Production, Staging and Development. To do
this, we would have to duplicate all of our DevOps configurations and
scripts for each of our environments and then build the application
using a Makefile that set specific variables for each of the Rancher
Compose commands. However, we found that these Makefiles were simply
starting to duplicate themselves across all of our projects and we knew
there had to be a better way.

So what does Matador do?

The first thing we wanted Matador to do was combine the similar parts of
our environments and Docker/Rancher configurations into one file, while
still also allowing us to have the environment-specific parts when
required, such as the environment variables that connected to our
production or staging database. This led to the creation of three files
that Matador requires to run: two generic templates that set up the
basics of the project, and one configuration file that holds all our
environment-specific configuration:

  • docker-compose-template.yml: The generic Docker Compose file for
    the application. This file contains all the configuration that
    builds the docker containers that together create your application
    stack, as well as the connections between them.
  • rancher-compose-template.yml: The generic Rancher Compose file
    for the application. This file contains all the configuration that
    is specific to your Rancher environments, such as the scale for each
    of your docker containers or your SSL certificates that have been
    setup for the Rancher environment.
  • config.yml: The config file is where you can define your
    environment-specific configuration for the production, staging
    and development environments that have been set up on Rancher. Below
    is a short example of how this config file should be structured:

image_base: seed/example-image
project_name: tester
global:
 web:
  environment:
   - TEST=forall
dev:
 web:
  environment:
   - NODE_ENV=dev
  labels:
   io.rancher.scheduler.affinity:host_label: client=ibackpacker,env=development
staging:
 web:
  environment:
   - NODE_ENV=staging
  labels:
   # io.rancher.scheduler.affinity:host_label: client=alessi,env=staging
   com.alessimutants.pods: version=0.1,branch=dev
prod:
 lb:
  labels:
   io.rancher.scheduler.local: 'false'
 web:
  image: seed/web
  environment:
   - NODE_ENV=prod
  labels:
   io.rancher.scheduler.local: 'false'

Everything defined in the config.yml file will be added to your
docker-compose-template depending on the environment that you pass to
the application at run time. Matador will take the additional config
provided, then append to or overwrite what is in the
docker-compose-template file and write out a new docker-compose file for
you automatically. The same is done with your rancher-compose-template,
although at this point in time there are no configuration options to
alter that template; this will be added in future releases. These output
files are then used as part of the Rancher Compose process to update
your environment on Rancher. They are also saved locally so that you can
review the configuration that Matador has created for you.

So How Do I Use Matador?

We have put together some extremely detailed usage instructions on the
GitHub repository, but the general gist is pretty straightforward. You
will need to download the latest version of Matador Deploy from the
Python Package Index (PyPI), as well as Rancher Compose, which can be
downloaded from its releases page on GitHub. Once that is done, there
are a few required fields that you must supply in the configuration file
to make things work. These are the first two entries in the config.yml:

  • project_name: This field will be the name that your stack receives
    when it is deployed to Rancher. It will also be automatically
    namespaced with the environment that you pass to Matador when you
    deploy. Note, this is not the Rancher environment name, but rather
    the Rancher stack name.
  • image_base: This field is the most important because it provides
    the DockerHub registry that your application will attempt to load
    your docker images from. These also have a naming convention that is
    required for each of your respective environment images as follows:

seed/example-image:latest    // Production Image
seed/example-image:staging   // Staging Image
seed/example-image:dev       // Development Image

We do plan to include the building of your Docker images within Matador
itself in future releases, however for now you will need to add these
tags manually when pushing your images to DockerHub. Once your
config.yml, docker-compose-template.yml and rancher-compose-template.yml
have been configured, place them inside a folder called “templates” in
the root of your project directory. Finally, from the root of your
project directory call the following command:

$ matador-deploy --url http://rancher.url.co --key RANCHER_KEY --secret RANCHER_SECRET --env dev

The fields themselves are explained here:

  • --url: The Rancher URL that you are uploading your Rancher
    configuration to.
  • --key: The API key that needs to be created specifically for the
    Rancher environment that you are trying to update.
  • --secret: The secret key (or password) that is provided to you when
    you create a new API key for your Rancher environment.
  • --env: The environment that you wish to update. It takes one of the
    following options: dev, staging or prod.

The benefit of Matador in this instance is that it forces you to provide
the authentication information for your Rancher environment. One of the
issues with Rancher Compose is that it will search your local shell
environment for the Rancher environment keys, so if you are pushing a
lot of different stacks to Rancher (for example, pushing to Staging,
then to Production), it can be easy to make a mistake and push the wrong
image to the wrong environment. If these fields aren’t provided to
Matador, the process will simply fail. There are also plans to improve
this even further by querying your Rancher server with your API keys and
having Matador tell you which environment it is attempting to update.
Look for that in a future release!

Where To From Here?

We have a few ideas of things we want the application to be able to do
as we work our way into building a full continuous deployment tool. A
few basic examples would be:

  • Adding support for building Docker images and pushing them to
    Docker Hub
  • Adding a tagging system that connects your Docker Hub images to the
    image currently loaded in your Rancher environment
  • Adding a simplified rollback option, most likely using the tagging
    system

However, what we really want to know are the features that you would
find most useful. We have open sourced Matador because we think that it
could be really helpful in integrating all these excellent services
together in the future. So please give it a try, and if you have any
ideas, either write an issue and we will have a look into it, or just
fork the repository and give it a go. We can’t wait to see what you
come up with.


Running Elasticsearch on Rancher

Tuesday, 31 May, 2016

Elasticsearch is one of the most popular analytics platforms for large
datasets. It is useful for a range of use cases, from log aggregation
and business intelligence to machine learning. Elasticsearch is popular
because of its simple REST-based API, which makes it trivial to create
indices, add data and make complex queries. However, before you get up
and running building your dataset and running queries, you need to set
up an Elasticsearch cluster, which can be a somewhat daunting prospect.
Today, we look at how Rancher Catalogs make it trivial to set up a
scalable, highly available Elasticsearch cluster.

Assuming you already have a Rancher cluster up and running, getting
Elasticsearch running on your cluster is a simple matter of browsing to
Catalog in the top menu and searching for Elasticsearch. There are two
versions of the Elasticsearch catalog entry; we assume that you are
using 2.x, the latest stable release version. To launch the stack,
select View Details, and in the subsequent screen choose a Stack Name
and a Cluster Name, then select Launch.

Elastic Search
Catalog

The stack should launch the following services: kopf, client(s),
datanode(s) and master(s). The kopf container provides a web interface
to manage your Elasticsearch cluster. Datanodes store the actual
indices. The master node runs cluster management tasks, and the client
nodes originate and coordinate your searches and other operations.
Initially, your Elasticsearch cluster will have one container of each
type (the master, client and datanode containers have two sidekick
containers each). However, you can scale out each of those components
based on query load and the size of the indices. Note that each datanode
container needs its own physical host to function correctly, so you may
have to register more Rancher compute nodes.

ElasticSearch
Service

Once all your containers are active, you can bring up the kopf interface
by browsing to the host running the kopf container. If you click on the
nodes tab, you will see the various components I mentioned above listed.
As you can see, I have launched a second data node in order to provide
redundant storage for my indices. As we will see shortly, when creating
indices we can control the number of data shards and the number of
copies of each shard. This helps provide redundancy as well as speed up
query processing.

kopf
nodes

From the menu at the top of kopf, select more and then create index.
In the resulting screen, you will be asked to enter the Index Name,
the Number of Shards and the Number of Replicas. The defaults for
these are 5 shards and 1 replica respectively. The number of shards and
replicas to set up for an index is highly dependent on the data set and
query model. Shards help spread data onto multiple nodes and allow
parallel processing of queries; hence, if you only have a single
datanode, you will not see much benefit from multiple shards. In
addition, if you expect the data to grow rapidly, you may want more
shards so that you can add nodes later and have data move to them.
Another thing to keep in mind is that Elasticsearch recommends a max
heap size of 32 GB, and hence a shard should be at most about that size
so that it can be kept in memory as much as possible.
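
As an aside, the settings kopf asks for map directly onto
Elasticsearch’s index-creation REST call. Here is a sketch, assuming one
of the client nodes is reachable over HTTP on the standard port 9200;
the host below is a placeholder.

# Create an index with explicit shard and replica settings via the REST API.
curl -XPUT "http://<client-host>:9200/test-index" -d '{
  "settings": { "number_of_shards": 5, "number_of_replicas": 1 }
}'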

Replicas, on the other hand, are less related to data size and more to
redundancy and performance. All queries for your index need to look at
one copy of each shard. If you have multiple copies of a shard, the data
is resilient to one node going down. Furthermore, with multiple copies,
the query load for a given shard is split between multiple nodes. Having
multiple replicas only makes sense when you have more than one data
container/node in your cluster, and it becomes more important as you
scale to larger cluster sizes.

As an example, let’s define an index called movies with 2 shards and 2
replicas. Now select the rest tab from the top menu so that we can add
some documents to our index and test some queries. Elasticsearch is
schema-free, so we can add free-form data into our index as long as it
is valid JSON. Update the path field to /movies/movie/1. The format of
the path is /INDEX_NAME/TYPE/ID, where movies is the index we just
created, movie is the name we are giving to the type of document we are
about to submit, and 1 is a unique ID for the document within the index.
Note that the ID is optional; if you omit it from the path, a random ID
will be created for your document. Once you have added the path, select
POST as the method, enter your JSON document in the bottom text field
and hit send. This will add the document to the index and send you a
confirmation.
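
The kopf rest tab is simply issuing standard Elasticsearch HTTP calls,
so the same document can be added with curl; the host and the sample
document below are placeholders.

# Index a sample movie document at /movies/movie/1, the same path used
# in the kopf rest tab above.
curl -XPOST "http://<client-host>:9200/movies/movie/1" -d '{
  "title": "Apocalypse Now",
  "director": "Francis Ford Coppola",
  "year": 1979,
  "genres": ["Drama", "War"]
}'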

movie index
data

Once you have added a few movies to the index, we can use the same rest
interface to search and aggregate data from our index. Update the path
field to /movies/movie/_search. The format of the path is
/INDEX_NAME/TYPE/_search, where both INDEX_NAME and TYPE are optional.
If you skip the type, your search will apply to all types in the index,
and if you also skip the index name, then your search will apply to all
indices.
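
The same search endpoint can also be hit with curl from outside kopf;
for example (host placeholder again), with the body being any of the
queries discussed below.

# Run a search against the movies index; match_all simply returns everything.
curl -XPOST "http://<client-host>:9200/movies/movie/_search" -d '{
  "query": { "match_all": {} }
}'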

There are a number of different types of queries supported by
Elasticsearch; here we cover some of the common ones. The first type is
the free-text query. The query string parameter allows for fairly
complicated queries using the Elasticsearch Query DSL. However, we can
also enter a simple string to match. This will match the specified word
or words in any field in the documents over which the query applies.

{
    "query": {
        "query_string": {
            "query": "Apocalypse"
        }
    }
}

For example, the query above will return the result shown below. It
contains details about the time taken to process the query, the shards
that were searched, the total number of results, and then details of
each result.

{
  "took": 139,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.5291085,
    "hits": [{
      "_index": "movies",
      "_type": "movie",
      "_id": "AVSvEC1fG_1bjVtG66mm",
      "_score": 0.5291085,
      "_source": {
        "title": "Apocalypse Now",
        "director": "Francis Ford Coppola",
        "year": 1979,
        "genres": [
          "Drama",
          "War"
        ]
      }
    }
....

In addition to the query text, you can also specify a field or set of
fields to limit your query to searching a subset of the document. For
example, the search below should return the same result as before, but
only has to look at a subset of each document, and so should perform
better on larger data sets. There are many other operations.

{
  "query": {
    "query_string": {
      "query": "Apocalypse",
      "fields": ["title"]
    }
  }
}

We can wrap the query string in a filtered query and then specify a
filter to apply to the results. This allows us to retain the free-form
search over the initial dataset, but then filter the results down to the
specific data we are looking for.

{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "Apocalypse",
          "fields": ["title"]
        }
      },
      "filter": {
        "term": { "year": 1979 }
      }
    }
  }
}

Lastly, another type of query you may run is an aggregation. This is
useful for computing summary statistics about the data. Two examples of
these types of aggregations are shown below. The first will return a
count of the movies directed by each director. The second will return
the average rating for all the movies in our data set.

{
  "aggs": {
    "group_by_director": {
      "terms": {
        "field": "director"
      }
    }
  }
}

{
  "aggs" : {
    "avg_rating" : { "avg" : { "field" : "rating" } }
  }
}

Elasticsearch is one of the best ways of running analytics over large
unstructured datasets and is used extensively in many domains, from log
aggregation and machine learning to business intelligence. In this
article, we have looked at how simple it is to set up a fully
functioning Elasticsearch cluster on Rancher using the catalog. In
addition, we have taken a quick look at the power of Elasticsearch using
the REST API. Once you have Elasticsearch up and running, you can use it
for a host of different use cases with the many available visualization
and aggregation frameworks, such as Kibana for real-time visualization
or Pentaho for business analytics.


Lessons Learned Building a Deployment Pipeline with Docker, Docker-Compose, and Rancher (Part 1)

Monday, 4 April, 2016


John Patterson (@cantrobot) and Chris
Lunsford run This End Out, an operations and infrastructure services
company. You can find them online at
https://www.thisendout.com and follow them on Twitter @thisendout.

Update: All four parts of the series are now live; you can find them
here:
Part 1: Getting started with CI/CD and Docker
Part 2: Moving to Compose blueprints
Part 3: Adding Rancher for Orchestration
Part 4: Completing the Cycle with Service Discovery

This post is the first in a series in which we’d like to share the story
of how we implemented a container deployment workflow using Docker,
Docker-Compose and Rancher. Instead of just giving you the polished
retrospective, though, we want to walk you through the evolution of the
pipeline from the beginning, highlighting the pain points and decisions
that were made along the way. Thankfully, there are many great resources
to help you set up a continuous integration and deployment workflow with
Docker. This is not one of them! A simple deployment workflow is
relatively easy to set up. But our own experience has been that building
a deployment system is complicated mostly because the easy parts must be
done alongside the legacy environment, with many dependencies, and while
changing your dev team and ops organization to support the new
processes. Hopefully, our experience of building our pipeline the hard
way will help you with the hard parts of building yours. In this first
post, we’ll go back to the beginning and look at the initial workflow we
developed using just Docker. In future posts, we’ll progress through the
introduction of Docker-compose and eventually Rancher into our workflow.
To set the stage, the following events all took place at a
Software-as-a-Service provider where we worked on a long-term services
engagement. For the purpose of this post, we’ll call the company Acme
Business Company, Inc., or ABC. This project started while ABC was in
the early stages of migrating its mostly-Java micro-services stack from
on-premise bare metal servers to Docker deployments running in Amazon
Web Services (AWS). The goals of the project were not unique: lower lead
times on features and better reliability of deployed services. The plan
to get there was to make software deployment look something like this:
CD Part 1
software-deployment-workflow_edited
The process starts with the code being changed, committed, and pushed to
a git repository. This would notify our CI system to run unit tests,
and if successful, compile the code and store the result as an
artifact. If the previous step was successful, we trigger another job
to build a Docker image with the code artifact and push it to a private
Docker registry. Finally, we trigger a deployment of our new image to
an environment. The necessary ingredients are these:

  • A source code repository. ABC already had its code in a private
    GitHub repository.
  • A continuous integration and deployment tool. ABC had a local
    installation of Jenkins already.
  • A private registry. We deployed a Docker registry container, backed
    by S3 (a rough example of running one follows this list).
  • An environment with hosts running Docker. ABC had several target
    environments, with each target containing both a staging and
    production deployment.
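
As an aside, a private registry backed by S3 can be stood up with the
official registry image along these lines; the bucket, region and
credentials below are placeholders, and this is only a sketch rather
than the exact configuration ABC ran.

# Run a private Docker registry that stores its images in an S3 bucket.
# Bucket, region and credentials are placeholders.
docker run -d --name registry -p 5000:5000 \
  -e REGISTRY_STORAGE=s3 \
  -e REGISTRY_STORAGE_S3_BUCKET=abc-docker-registry \
  -e REGISTRY_STORAGE_S3_REGION=us-east-1 \
  -e REGISTRY_STORAGE_S3_ACCESSKEY=AKIA... \
  -e REGISTRY_STORAGE_S3_SECRETKEY=... \
  registry:2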

When viewed this way, the process is deceptively simple. The reality
on the ground, though, is a bit more complicated. Like many other
companies, there was (and still is) an organizational divide between
development and operations. When code is ready for deployment, a ticket
is created with the details of the application and a target
environment. This ticket is assigned to operations and scheduled for
execution during the weekly deployment window. Already, the path to
continuous deployment and delivery is not exactly clear. In the
beginning, the deployment ticket might have looked something like this:

DEPLOY-111:
App: JavaService1, branch "release/1.0.1"
Environment: Production

The deployment process:

  • The deploy engineer for the week goes to Jenkins and clicks “Build
    Now” for the relevant project, passing the branch name as a
    parameter. Out pops a tagged Docker image which is automatically
    pushed into the registry. The engineer selects a Docker host in the
    environment that is not currently active in the load balancer. The
    engineer logs in and pulls the new version from the registry
docker pull registry.abc.net/javaservice1:release-1.0.1
  • Finds the existing container
docker ps
  • Stops the existing container.
docker stop [container_id]
  • Starts a new container with all of the necessary flags to launch the
    container correctly. This can be borrowed from the previous running
    container, the shell history on the host, or it may be documented
    elsewhere.
docker run -d -p 8080:8080 … registry.abc.net/javaservice1:release-1.0.1
  • Pokes the service and does some manual testing to verify that it is
    working.
curl localhost:8080/api/v1/version
  • During the production maintenance window, updates the load-balancer
    to point to the updated host.

  • Once verified, the update is applied to all of the other hosts in
    the environment in case a failover is required.

Admittedly, this deployment process isn’t very impressive, but it’s a
great first step towards continuous deployment. There are plenty of
places to improve, but consider the benefits:

  • The ops engineer has a recipe for deployment and every application
    deploys using the same steps. The parameters for the Docker run
    step have to be looked up for each service, but the general cadence
    is always the same: Docker pull, Docker stop, Docker run. This is
    super simple and makes it hard to forget a step.
  • With a minimum of two hosts in the environment, we have manageable
    blue-green deployments. A production window is simply a cutover in
    the load-balancer configuration with an obvious and quick way to
    roll back (a hypothetical cutover script is sketched after this
    list). As the deployments become more dynamic, upgrades, rollbacks,
    and backend server discovery get increasingly difficult and require
    more coordination. Since deployments are manual, the costs of
    blue-green are minimal but provide major benefits over in-place
    upgrades.
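
To make the blue-green cutover idea concrete, here is a purely
hypothetical sketch of what the cutover could look like if the load
balancer were nginx with one upstream file per color; ABC’s actual load
balancer and configuration are not described here, so treat this only as
an illustration.

#!/bin/bash
# Hypothetical blue/green cutover for an nginx front end.
# /etc/nginx/upstreams/blue.conf and green.conf each define the same
# upstream name, pointing at different sets of backend hosts.
NEW_COLOR=${1:?usage: cutover.sh blue|green}

ln -sf "/etc/nginx/upstreams/${NEW_COLOR}.conf" /etc/nginx/upstreams/active.conf
nginx -t && nginx -s reload   # only reload if the new configuration validates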

Alright, on to the pain points:

  • Retyping the same commands. Or, more accurately, hitting up and
    enter at your bash prompt repeatedly. This one is easy: automation
    to the rescue! There are lots of tools available to help you launch
    Docker containers. The most obvious solution for an ops engineer is
    to wrap the repetitive logic in a bash script so you have a single
    entry point. If you call yourself a devops engineer, instead, you
    might reach for Ansible, Puppet, Chef, or SaltStack. Writing the
    scripts or playbooks is easy, but there are a few questions to
    answer: where does the deployment logic live? And how do you keep
    track of the different parameters for each service? That brings us
    to our next point.
  • Even an ops engineer with super-human abilities to avoid typos and
    reason clearly in the middle of the night after a long day at the
    office won’t know that the service is now listening on a different
    port and the Docker port parameter needs to be changed. The crux of
    the issue is that the details of how the application works are
    (hopefully) well-known to developers, and that information needs to
    be transferred to the operations team. Often times, the operations
    logic lives in a separate code repository or no repository at all.
    Keeping the relevant deployment logic in sync with the application
    can be difficult. For this reason, it’s a good practice to just
    commit your deployment logic into the code repo with your
    Dockerfile. If there are situations where this isn’t possible,
    there are ways to make it work (more on this later). The important
    thing is that the details are committed somewhere. Code is better
    than a deploy ticket, but a deploy ticket is still much better than
    in someone’s brain.
  • Visibility. Troubleshooting a container requires logging into
    the host and running commands. In reality, this means logging into
    a number of hosts and running a combination of ‘docker ps’ and
    ‘docker logs –tail=100’. There are many good solutions for
    centralizing logs and, if you have the time, they are definitely
    worth setting up. What we found to generally be lacking, though,
    was the ability to see what containers were running on which hosts.
    This is a problem for developers, who want to know what versions are
    deployed and at what scale. And this is a major problem for
    operations, who need to hunt down containers for upgrades and
    troubleshooting.

Given this state of affairs, we started to implement changes to address
the pain points. The first advancement was to write a bash script
wrapping the common steps for a deployment. A simple wrapper might look
something like this:

#!/bin/bash
APPLICATION=$1
VERSION=$2

docker pull "registry.abc.net/${APPLICATION}:${VERSION}"
docker rm -f $APPLICATION
docker run -d --name "${APPLICATION}" "registry.abc.net/${APPLICATION}:${VERSION}"

This works, but only for the simplest of containers: the kind that users
don’t need to connect to. In order to enable host port mapping and
volume mounts, we need to add application-specific logic. Here’s the
brute force solution that was implemented:

#!/bin/bash
APPLICATION=$1
VERSION=$2

case "$APPLICATION" in
java-service-1)
  EXTRA_ARGS="-p 8080:8080";;
java-service-2)
  EXTRA_ARGS="-p 8888:8888 --privileged";;
*)
  EXTRA_ARGS="";;
esac

docker pull "registry.abc.net/${APPLICATION}:${VERSION}"
docker stop $APPLICATION
# remove the stopped container so its name can be reused by the new one
docker rm $APPLICATION
docker run -d --name "${APPLICATION}" $EXTRA_ARGS "registry.abc.net/${APPLICATION}:${VERSION}"

This script was installed on every Docker host to facilitate
deployments. The ops engineer would login and pass the necessary
parameters and the script would do the rest. Deployment time was
simplified because there was less for the engineer to do. The problem
of encoding the deployment logic didn’t go away, though. We moved it
back in time and turned it into a problem of committing changes to a
common script and distributing those changes to hosts. In general, this
is a great trade. Committing to a repo gives you great benefits like
code review, testing, change history, and repeatability. The less you
have to think about at crucial times, the better. Ideally, the relevant
deployment details for an application would live in the same source repo
as the application itself. There are many reasons why this may not be
the case, not the least of which being that developers may object to
having “ops” stuff in their java repo. This is especially true for
something like a deployment bash script, but also pertains to the
Dockerfile itself. This comes down to a cultural issue and is worth
working through, if at all possible. Although it’s certainly doable to
maintain separate repositories for your deployment code, you’ll have to
spend extra energy making sure that the two stay in sync. But, of
course, this is an article about doing it the hard way. At ABC, the
Dockerfiles started life in a dedicated repository with one folder per
project, and the deploy script lived in its own repo.
cd 3
The Dockerfiles repository had a working copy checked out at a
well-known location on the Jenkins host (say,
‘/opt/abc/Dockerfiles’). In order to build the Docker image for an
application, Jenkins would first check for a Dockerfile in a local
folder ‘docker’. If not present, Jenkins would search the Dockerfiles
path, copying over the Dockerfile and accompanying scripts before
running the ‘docker build’. Since the Dockerfiles are always at
master, it’s possible to find yourself in a situation where the
Dockerfile is ahead of (or behind) the application configuration, but in
practice this mostly just works. Here’s an excerpt from the Jenkins
build logic:

if [ -f docker/Dockerfile ]; then
  docker_dir=docker
elif [ -f /opt/abc/dockerfiles/$APPLICATION/Dockerfile ]; then
  docker_dir=/opt/abc/dockerfiles/$APPLICATION
else
  echo "No docker files. Can't continue!"
  exit 1
fi
docker build -t $APPLICATION:$VERSION $docker_dir

Over time, dockerfiles and supporting scripts were migrated into the
application source repositories. Since Jenkins was already looking in
the local repo first, no changes were required to the build pipeline.
After migrating the first service, the repo layout looked roughly like:
cd 4
One problem we ran into with having a separate repo was getting Jenkins
to trigger a rebuild of the application if either the application source
or packaging logic changed. Since the ‘dockerfiles’ repo contained
code for many projects, we didn’t want to trigger all repos when a
change occurred. The solution: a well-hidden option in the Jenkins Git
plugin
called
Included Regions.
When configured, Jenkins isolates the build trigger to a change in a
specific sub-directory inside the repository. This allows us to keep all
Dockerfiles in a single repository and still be able to trigger specific
builds when a change is made (compared to building all images when a
change is made to a specific directory inside the repo).
CD Part 1_2_jenkins-included-regions_edited
Another aspect of the initial workflow was that the deploy engineer
had to force a build of the application image before deployment. This
resulted in extra delays, especially if there was a problem with the
build and the developer needed to be engaged. To reduce this delay, and
pave the way to more continuous deployment, we started building Docker
images on every commit to a well-known branch. This required that every
image have a unique version identifier, which was not always the case if
we relied solely on the official application version string. We ended
up using a combination of official version string, commit count, and
commit sha:

commit_count=$(git rev-list --count HEAD)
commit_short=$(git rev-parse --short HEAD)
version_string="${version}-${commit_count}-${commit_short}"

This resulted in a version string that looks like ‘1.0.1-22-7e56158’.
Before we end our discussion of the Docker file phase of our pipeline,
there are a few parameters that are worth mentioning. Before we
operated a large number of containers in production we had little use
for these, but they have proven helpful in maintaining the uptime of our
Docker cluster.

  • Restart Policy: A restart policy allows you to specify,
    per-container, what action to take when a container exits. Although
    this can be used to recover from an application panic or keep the
    container retrying while dependencies come online, the big win for
    Ops is automatic recovery after a Docker daemon or host restart. In
    the long run, you’ll want to implement a proper scheduler that can
    restart failed containers on new hosts. Until that day comes, save
    yourself some work and set a restart policy. At this stage in ABC,
    we defaulted to ‘--restart=always’, which will cause the container
    to restart indefinitely. Simply having a restart policy will make
    planned (and unplanned) host restarts much less painful.
  • Resource Constraints: With runtime resource constraints, you can
    set the maximum amount of memory or CPU that a container can
    consume. It won’t save you from general over-subscription of a
    host, but it can keep a lid on memory leaks and runaway containers.
    We started out by applying a generous memory limit (e.g.
    ‘--memory="8g"’) to containers that were known to have issues with
    memory growth. Although having a hard limit means the application
    will eventually hit an Out-of-Memory situation and panic, the host
    and other containers keep right on humming (a combined run command
    is shown just after this list).
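
Putting both together, the run command from the deployment script
earlier only needs two extra flags. The 8 GB value mirrors the generous
limit mentioned above and should of course be tuned per service; this is
a sketch, not the exact command ABC used.

# Same deployment run command as before, now with a restart policy and a memory cap.
docker run -d --name "${APPLICATION}" \
  --restart=always \
  --memory="8g" \
  $EXTRA_ARGS \
  "registry.abc.net/${APPLICATION}:${VERSION}"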

Combining restart policies and resource limits gives you greater cluster
stability while minimizing the impact of the failure and improving time
to recovery. In practice, this type of safeguard gives you the time to
work with the developer on the root cause, instead of being tied up
fighting a growing fire. To summarize, we started with a rudimentary
build pipeline that created tagged Docker images from our source repo.
We went from deploying containers using the Docker CLI to deploying them
using scripts and parameters defined in code. We also looked at how we
organized our deployment code, and highlighted a few Docker parameters
to help Ops keep the services up and running. At this point, we still
had a gap between our build pipelines and deployment steps. The
deployment engineer was bridging that gap by logging into a server to
run the deployment script. Although an improvement from where we
started, there was still room for a more automated approach. All of the
deployment logic was centralized in a single script, which made testing
locally much more difficult when developers needed to install the script
and muddle through its complexity. At this point, handling any
environment-specific information by way of environment variables was
also contained in our deployment script. Tracking down which
environment variables were set for a service and adding new ones was
tedious and error-prone. In the next post, we take a look at how we
addressed these pain points by deconstructing the common wrapper script,
bringing the deployment logic closer to the application using Docker
Compose. Go to Part 2 >>

Please also download your free copy of “Continuous Integration and
Deployment with Docker and Rancher,” a detailed eBook that walks through
leveraging containers throughout your CI/CD process.

Build a CI/CD Pipeline with Kubernetes and Rancher
Recorded Online Meetup of best practices and tools for building pipelines with containers and kubernetes.


Announcing Rancher 1.0 GA

Tuesday, 29 March, 2016

Rancher 1.0 Logo
Today we achieved a major milestone by shipping Rancher 1.0, our first
generally available release. After more than one and a half years of
development, Rancher has reached the quality and feature completeness
required for production deployment. We first unveiled a preview of Rancher
to the world at the November 2014 Amazon re:Invent conference. We followed
that with a Beta release in June 2015. I’d like to congratulate the entire
Rancher development team for this achievement.

As an open source project, Rancher was developed in close collaboration
with the user and developer community. Some of you have been with us
from the very beginning and have seen dramatic enhancements and changes
since our initial preview release. Some of you were even bold enough to
deploy Rancher in mission-critical production environments prior to our
GA release! Rancher simply would not be where it is today without your
feedback, suggestions, and contributions. With your help, Rancher now has
over a million downloads and over 2,500 beta program participants.
We can’t thank our community enough and are looking forward to working
with each and every one of you as we continue to improve and shape
Rancher into the best container management platform.

In this blog post I would like to also reflect upon what we have built
by explaining what problems Rancher is designed to solve, how the
community is using Rancher in practice, why users find Rancher to be
uniquely suited for their applications, and how we plan to continue to
develop Rancher post 1.0 GA.

We’ve created a quick demo to introduce the new release.


The problem Rancher solves

Rancher is a complete and turn-key container management platform. As
organizations start to deploy Docker containers in production, the
immediate challenge becomes the integration of large collections of
open-source technologies. As illustrated in the following figure,
container management involves solving problems spanning across storage,
networking, monitoring, orchestration, and scheduling.

container
management

Rancher develops, integrates, and distributes all of the technologies
necessary to run containers in production. At Rancher Labs, we integrate
and distribute market-leading container orchestration and scheduling
frameworks such as Docker Swarm and Kubernetes, while developing the
necessary app catalog, enterprise user management, access control,
container networking and storage technologies ourselves. The result is a
complete platform that offers superb developer and operator experience.
With Rancher, organizations no longer need to worry about keeping up
with and integrating a myriad of technologies from the fast-moving
container ecosystem. Instead, they deploy Rancher once and can then
focus on developing the applications that make their business better.


How organizations use Rancher

Most organizations today employ or are moving toward an agile software
development pipeline. As users adopt containers, tools like GitHub,
Jenkins, and Docker Hub solve the front-half of that pipeline. Users
deploy Rancher so that they can test, deploy, upgrade, and operate
containerized applications on any public cloud or private data center.

Container Devops
Pipeline


Why Rancher is unique

We set out to build Rancher because we saw the need for a complete and
turn-key container management platform. The resulting product, Rancher,
has a number of unique qualities:

  • Open source. Rancher is 100% open source. We believe leveraging the
    power of an open source community is the best way to build platform
    software, and we are confident that organizations will pay for
    enterprise support and thus help fund our development effort.

  • Easy to use. Time and time again, developers have told us they love
    Rancher’s simple and intuitive experience, which enables them to focus
    on building applications as opposed to having to build the underlying
    infrastructure software themselves.

  • Enterprise grade. Rancher implements enterprise management features
    such as LDAP and AD integration, role-based access control, unified
    infrastructure visibility and audit, and a unified application catalog.

  • Infrastructure agnostic. Rancher runs on computing resources in the
    form of Linux servers, which are a commodity offered by all clouds and
    data centers. Rancher does not rely on proprietary features supported
    by one cloud provider and not others. Rancher builds a rich set of
    storage, networking, load balancing, DNS, and metadata services that
    work consistently for containers running on any cloud.

  • Support for both Swarm and Kubernetes. Modern DevOps practices do not
    impose the choice of application frameworks across the organization. As
    a result, different teams tend to choose their own container
    orchestration and scheduling frameworks. Rancher is the only container
    management platform today that can support both Swarm and Kubernetes.


What you can expect after 1.0

We are lucky to have passionate users and open source community members.
The community wants us to continue to improve networking and storage
features, implement richer enterprise-grade visibility and control
features, onboard and certify more application catalog entries, and
support additional container orchestration and scheduling frameworks.
You can get a good idea of what users and community members want by
looking at the Rancher Forums and the list of open issues on Rancher’s
GitHub page. With 1.0 behind us, you should expect the
feature development velocity to increase. While maintaining a stable
release branch, we will continue to release new features and
capabilities on an aggressive release schedule. Stay tuned!

To see Rancher 1.0 in action, join us this Wednesday at 1:00 pm Eastern
time for an Online Meetup on building your own Containers-as-a-Service
platform with Rancher 1.0. We’ll be joined by the container team at Sony
PlayStation, who will be sharing some of their experiences using
Rancher.


Rancher and Spotinst partner to introduce a new model for utilizing Docker orchestration tools on spot instances

Monday, 16 November, 2015

![spotinstlogo](https://cdn.rancher.com/wp-content/uploads/2015/11/16025649/spotinstlogo.png)
We are very excited to announce a new partnership with Spotinst today to
deliver intelligent management and migration of container workloads
running on spot instances. With this new solution, we have developed a
simple, intuitive way of using spot instances to run any container
workload reliably and for a fraction of the cost of traditional
applications. Since the dawn of data centers we’ve seen continuous
improvements in utilization and cost efficiency. But as [Jevons’
Paradox](https://en.wikipedia.org/wiki/Jevons_paradox) observes, the more
efficient we become in consuming a resource, the more of that resource we
consume. So we are always seeking the newest, fastest, uber-optimized
version of everything.

How it works:

Spotinst is a SaaS platform that enables reliable, highly available use
of AWS Spot Instances and Google Preemptible VMs with typical savings of
70-90%.

We’ve worked with the team at Spotinst to integrate with the Rancher API
directly. The integration utilizes Docker “checkpoint and resume” (the
CRIU project). Based on metrics and business rules provided by Spotinst,
Rancher can freeze any container and resume it on any other instance,
automating the process a typical DevOps team might implement to manage
container deployment.
![rancher-spotinst-1](https://www.suse.com/c/wp-content/uploads/2021/09/rancher_blog_rancher-spotinst-1.png)
For example, if Spotinst identifies that the spot instance a container is
running on is about to terminate (with a 4–7 minute heads-up), Spotinst
will instruct Rancher to pause that container and relocate it to another
relevant instance.
![rancher-spotinst-2](https://www.suse.com/c/wp-content/uploads/2021/09/rancher_blog_rancher-spotinst-2.png)
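For readers curious about the underlying mechanism, below is a minimal,
hedged sketch of checkpoint and resume using Docker’s experimental
checkpoint CLI (which wraps CRIU). This is an illustration only: it requires
an experimental Docker daemon with CRIU installed, the container and
checkpoint names are placeholders, and it is not necessarily the exact API
the Spotinst integration calls.

# checkpoint a running container (freezes its full process state to disk)
docker checkpoint create mycontainer cp1

# later, resume the same container from that checkpoint
docker start --checkpoint cp1 mycontainer

# migrating to another host additionally requires copying the checkpoint
# data (e.g. written with --checkpoint-dir) to the target machine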

Unprecedented Availability for Online Gaming

While pizza servers, blade racks, and eventually virtualization technology
paved the way for modern data centers, today’s cloud computing customer
expects increasingly higher performance and availability in everything
from online gaming to genome sequencing. An awesome example of how Docker
is utilizing live migration to deliver high availability can be seen in
this presentation from DockerCon earlier this year.

The presenters show how they containerized Quake, had it running on a
DigitalOcean server in Singapore, and then live migrated it to Amsterdam
with the player experiencing practically zero interruption to his game.
Using “checkpoint and resume”, they
didn’t just stop the container, but
took an entire running process with all its memory, file descriptors,
etc. and effortlessly moved it and resumed it halfway around the
world.

How it works

![rancher-spotinst-4](https://www.suse.com/c/wp-content/uploads/2021/09/rancher_blog_rancher-spotinst-4.png)

We’re really excited about the potential of live migrating containers,
and this partnership with Spotinst. By moving workloads to spot
instances, organizations can dramatically reduce the cost of cloud
resources.

To try out the new service, you can sign up for a Spotinst account and directly connect it to your
running Rancher deployment, via your API keys.

To learn more, please request a demonstration from one of our engineers.


Introducing Hyper-Converged Infrastructure for Containers, Powered by Rancher and Redapt

Wednesday, 11 November, 2015

converged nodes
Hyper-Converged Infrastructure is one of the greatest innovations in the
modern data center. I have been a big fan ever since I heard the analogy
“iPhone for the data center” from Nutanix, the company that invented
hyper-converged infrastructure. In my previous roles as CEO of Cloud.com,
creator of CloudStack, and CTO of Citrix’s CloudPlatform Group, I helped
many organizations transform their data centers into infrastructure
clouds. The biggest challenge was always how to integrate a variety of
technologies from multiple vendors into a coherent and reliable cloud
platform. Hyper-converged infrastructure is an elegant solution to this
complexity that makes infrastructure consumable by offering a simple
turn-key experience. Hyper-convergence hides the underlying complexity and
makes the lives of data center operators much better. Typically,
hyper-converged infrastructure is used to run virtual machines (VMs), the
most popular workload running in data centers today. The nature of data
center workloads, however, is changing. In the last year, Docker
containers have become a significant type of workload in data centers.
Because of this, we are beginning to see market demand for purpose-built
and optimized infrastructure solutions for containers. Today, our team at
Rancher announced a hyper-converged infrastructure platform for
containers, powered by Rancher and Redapt. This is a turn-key solution to
stand up a complete container service platform in the data center.
Organizations no longer need to source hardware, deploy virtualization and
cloud platforms, and integrate separate container orchestration systems.

Support for both VMs and Container Infrastructure

We designed the solution to support both VMs and containers, following the
approach used by Google to run virtual machines in containers. We have
experimented with this approach in our RancherVM project since April and
have received a lot of positive feedback from users. A benefit of running
VMs inside containers is the ability to leverage the same tools to manage
both VMs and containers. Because VMs and containers in fact behave in
similar ways, the Rancher CLI and UI we have developed for Docker
containers apply seamlessly to VMs. We use RancherOS as the base operating
system for the converged infrastructure platform. The RancherOS kernel has
built-in support for KVM. The following figure depicts how Rancher and
RancherOS work together to form the complete software stack for our
hyper-converged infrastructure solution.
Rancher Converged Infrastructure

Containerized Storage Services

All hyper-converged infrastructure solutions include a distributed storage
implementation. By leveraging our other major announcement today,
Persistent Storage Services, this hyper-converged infrastructure solution
has the unique ability to use multiple distributed storage
implementations. Users have the freedom to deploy the software storage
platform that suits the needs of their applications. This approach reduces
the failure domain and improves reliability: failure of a distributed
storage deployment can only impact the application that consumes that
storage. Users can deploy open source and commercial storage software, as
long as the storage software is packaged as Docker containers. We are
incorporating Gluster and NexentaEdge into our hyper-converged
infrastructure platform, and plan to support additional storage products
in the future.
converged infrastructure for containers

Access to the Docker Image Ecosystem

Successful hyper-converged infrastructure solutions often target popular
application workloads, such as databases or virtual desktops. The Docker
ecosystem offers a rich set of applications that can run on the Rancher
hyper-converged infrastructure solution. Docker Hub alone, for example,
contains hundreds of thousands of Docker images. In addition, Rancher
makes it easy to run not just single containers, but large application
clusters orchestrated by new container frameworks such as Compose,
Swarm, and Kubernetes. Rancher Labs has certified and packaged a set of
popular DevOps tools. With a single click, users can deploy, for
example, an entire ELK cluster on the hyper-converged infrastructure.
catalog

Our Partnership with Redapt

redaptlogo
We have known and worked with the Redapt team for many years. Back in 2011, my
team at Cloud.com collaborated with Redapt to build one of the largest
CloudStack-powered private clouds at the time, consisting of over 40,000
physical servers. We were deeply impressed by the technical skills, the
ability to innovate, and the professionalism of the Redapt team.
Creating a hyper-converged infrastructure solution requires close
collaboration between the hardware and software vendors. We are
fortunate to be able to work with Redapt again to bring to market the
industry’s first hyper-converged infrastructure for containers solution.

Availability

Rancher and Redapt are working with early access customers now. We plan to
make the hyper-converged infrastructure solution generally available in
the first half of 2016. Please request a demo if you would like to speak
with one of our engineers about converged infrastructure, or register for
our next online meetup, where we will be demonstrating this new
functionality.

Sheng Liang is the CEO and co-founder of Rancher Labs. You can follow him
on Twitter at @shengliang.


Deploying a scalable Jenkins cluster with Docker and Rancher

Thursday, 5 November, 2015

Build a CI/CD Pipeline with Kubernetes and Rancher
Recorded Online Meetup of best practices and tools for building pipelines with containers and kubernetes.

Containerization brings several benefits to traditional CI platforms where
builds share hosts: build dependencies can be isolated, applications can
be tested against multiple environments (for example, testing a Java app
against multiple versions of the JVM), on-demand build environments can be
created with minimal stickiness to ensure test fidelity, and Docker
Compose can be used to quickly bring up environments which mirror
development environments. Lastly, the inherent isolation offered by Docker
Compose-based stacks allows for concurrent builds — a sticking point for
traditional build environments with shared components.

One of the immediate benefits of containerization for CI is that we can
leverage tools such as Rancher to manage distributed build environments
across multiple hosts. In this article, we’re going to launch a
distributed Jenkins cluster with Rancher Compose. This work builds upon
the earlier
work
** **by
one of the authors, and further streamlines the process of spinning up
and scaling a Jenkins stack.

Our Jenkins Stack

jenkins_master_slave
For our stack, we’re using Docker in Docker (DIND) images for the Jenkins
master and slave, running on top of Rancher compute nodes launched in
Amazon EC2. With DIND, each Jenkins container runs a Docker daemon within
itself. This allows us to create build pipelines for dockerized
applications with Jenkins.

Prerequisites

  • AWS EC2 account
  • IAM credentials for docker machine
  • Rancher Server v0.32.0+
  • Docker 1.7.1+
  • Rancher Compose
  • Docker Compose

Setting up Rancher

Step 1: Setup an EC2 host for Rancher server

First things first, we need an EC2 instance to run the Rancher server. We
recommend going with the Ubuntu 14.04 AMI for its up-to-date kernel. Make
sure to configure the security group for the EC2 instance with access to
port 22 (SSH) and 8080 (the Rancher web interface):

launch_ec2_instance_for_rancher_step_2

Once the instance starts, the first order of business is to install the
latest version of Docker by following the steps below (for Ubuntu 14.04):

  1. sudo apt-get update
  2. curl -sSL https://get.docker.com/ | sh (requires sudo password)
  3. sudo usermod -aG docker ubuntu
  4. Log out and log back in to the instance

At this point you should be able to run docker without sudo.
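As an optional sanity check, you can confirm that the daemon is reachable
without sudo before moving on:

# should print client and server versions without permission errors
docker version

# optionally run a throwaway container
docker run --rm hello-world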

Step 2: Run and configure Rancher

To install and run the latest version of Rancher (v0.32.0 at the time of
writing), follow the instructions in the docs. In a few minutes your
Rancher server should be up and ready to serve requests on port 8080. If
you browse to http://YOUR_EC2_PUBLIC_IP:8080/ you will be greeted with a
welcome page and a notice asking you to configure access. This is an
important step to prevent unauthorized access to your Rancher server. Head
over to the settings section and follow the instructions to configure
access control.
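For reference, the Rancher quick start of that era boiled down to launching
the server as a single container, roughly as follows (a hedged sketch, not
the authoritative command; check the docs for the exact image tag to use):

# launch the Rancher server container and expose the UI/API on port 8080
sudo docker run -d --restart=always -p 8080:8080 rancher/server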

rancher_setup_step_1

We typically create a separate environment for hosting all developer-facing
tools (e.g., Jenkins, Seyren, Graphite) to isolate them from the
public-facing live services. To this end, we’re going to create an
environment called *Tools*. From the environments menu (top left), select
“manage environments” and create a new environment. Since we’re going to
be working in this environment exclusively, let’s go ahead and make this
our default environment by selecting “set as default login environment”
from the environments menu.

rancher_setup_step_2_add_tools_env

The next step is to tell Rancher about our hosts. For this tutorial,
we’ll launch all hosts with Ubuntu 14.04. Alternatively, you can add an
existing host using the custom host option in Rancher. Just make sure that
your hosts are running Docker 1.7.1+.

rancher_setup_step_3_add_ec2_host

One of the hosts (JENKINS_MASTER_HOST) is going to run the Jenkins master
and needs some additional configuration. First, we need to open up access
to port 8080 (the default Jenkins port). You can do that by updating the
security group used by that instance from the AWS console. In our case, we
updated the security group (“rancher-machine”) which was created by
Rancher. Second, we need to attach an additional EBS-backed volume to host
the Jenkins configuration. Make sure that you allocate enough space for
the volume, based on how large your build workspaces tend to get. In
addition, make sure the flag “delete on termination” is unchecked. That
way, the volume can be re-attached to another instance and backed up
easily:

[![launch_ec2_ebs_volume_for_jenkins](https://cdn.rancher.com/wp-content/uploads/2015/08/01132712/launch_ec2_ebs_volume_for_jenkins.png)](https://cdn.rancher.com/wp-content/uploads/2015/08/01132712/launch_ec2_ebs_volume_for_jenkins.png)
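If you prefer the command line over the console, a hedged sketch of the
same volume setup with the AWS CLI looks roughly like this. The size,
availability zone, and IDs are placeholders, and volumes attached after
launch are not deleted on termination by default:

# create a volume in the same availability zone as the Jenkins master host
aws ec2 create-volume --size 100 --availability-zone us-east-1a

# attach it to the Jenkins master instance (IDs are placeholders)
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
  --instance-id i-0123456789abcdef0 --device /dev/xvdf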

Lastly, let’s add a couple of labels for the JENKINS_MASTER_HOST: 1) add a
label called “profile” with the value “jenkins”, and 2) add a label called
“jenkins-master” with the value “true”. We’re going to use these labels
later to schedule master and slave containers on our hosts.
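Labels can be set from the host registration screen in the Rancher UI. If
you register a custom host from a shell instead, the same labels can, to
our knowledge, be passed to the agent container via the CATTLE_HOST_LABELS
environment variable. A hedged sketch follows; the agent version, server
URL, and registration token come from your own Rancher UI and are
placeholders here:

# register a custom host with both labels attached (tag/URL/token are placeholders)
sudo docker run -d --privileged \
  -e CATTLE_HOST_LABELS='profile=jenkins&jenkins-master=true' \
  -v /var/run/docker.sock:/var/run/docker.sock \
  rancher/agent:vX.Y.Z http://YOUR_RANCHER_SERVER:8080/v1/scripts/YOUR_TOKEN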

Step 3: Download and install rancher-compose CLI

As a last step, we need to install the rancher-compose CLI on our
development machine. To do that, head over to the applications tab in
Rancher and download the rancher compose CLI for your system. All you
need to do is add the directory containing the rancher-compose binary to
your *PATH* environment variable.
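For example, assuming the binary was downloaded to ~/Downloads (a
placeholder location), something along these lines works on Linux or macOS:

# move the binary onto the PATH and make it executable
chmod +x ~/Downloads/rancher-compose
sudo mv ~/Downloads/rancher-compose /usr/local/bin/

# verify the CLI is picked up
rancher-compose --version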

rancher_setup_step_5_install_rancher_compose

With that, our rancher server is ready and we can now launch and manage
containers with it.

Launching Jenkins stack with Rancher

Step 1: Stack configuration

Before we launch the Jenkins stack, we need to create a new Rancher API
key from the API & Keys section under settings. Save the API key pair
someplace safe as we’re going to need it with rancher-compose. For the
rest of the article, we refer to the API key pair as RANCHER_API_KEY and
RANCHER_API_KEY_SECRET. Next, open up a terminal to fetch the latest
versions of the Docker and Rancher Compose templates from GitHub:

git clone https://github.com/rancher/jenkins-rancher.git
cd jenkins-rancher

Before we can use these templates, let’s quickly update the configuration.
First, open up the Docker Compose file and update the Jenkins username and
password to a username and password of your choice. Let’s call these
credentials JENKINS_USER and JENKINS_PASSWORD. These credentials will be
used by the Jenkins slave to talk to the master. Second, update the host
labels for slave and master to match the labels you specified for your
Rancher compute hosts. Make sure that
io.rancher.scheduler.affinity:host_label has a value of “profile=jenkins”
for jenkins-slave. Similarly, for jenkins-master, make sure that the value
for io.rancher.scheduler.affinity:host_label is “jenkins-master=true”.
This will ensure that containers are only launched on the hosts that you
want to limit them to. For example, we are limiting our Jenkins master to
only run on a host with an attached EBS volume and access to port 8080.

jenkins-slave:
  environment:
    JENKINS_USERNAME: jenkins
    JENKINS_PASSWORD: jenkins
    JENKINS_MASTER: http://jenkins-master:8080
  labels:
    io.rancher.scheduler.affinity:host_label: profile=jenkins
  tty: true
  image: techtraits/jenkins-slave
  links:
  - jenkins-master:jenkins-master
  privileged: true
  volumes:
  - /var/jenkins
  stdin_open: true
jenkins-master:
  restart: 'no'
  labels:
    io.rancher.scheduler.affinity:host_label: jenkins-master=true
  tty: true
  image: techtraits/jenkins-master
  privileged: true
  stdin_open: true
  # persist the Jenkins home directory as a container volume
  # (volume_driver expects a driver name, not a path)
  volumes:
  - /var/jenkins_home
jenkins-lb:
  ports:
  - '8080'
  tty: true
  image: rancher/load-balancer-service
  links:
  - jenkins-master:jenkins-master
  stdin_open: true

Step 2: Create the Jenkins stack with Rancher compose

Now we’re all set to launch the Jenkins stack. Open up a terminal,
navigate to the “jenkins-rancher” directory, and type:

rancher-compose --url http://RANCHER_HOST:RANCHER_PORT/v1/ --access-key RANCHER_API_KEY --secret-key RANCHER_API_KEY_SECRET --project-name jenkins --verbose create

The output of the rancher-compose command should look something like:

DEBU[0000] Opening compose file: docker-compose.yml
DEBU[0000] Opening rancher-compose file: /home/mbsheikh/jenkins-rancher/rancher-compose.yml
DEBU[0000] [0/3] [jenkins-slave]: Adding
DEBU[0000] Found environment: jenkins(1e9)
DEBU[0000] Launching action for jenkins-master
DEBU[0000] Launching action for jenkins-slave
DEBU[0000] Launching action for jenkins-lb
DEBU[0000] Project [jenkins]: Creating project
DEBU[0000] Finding service jenkins-master
DEBU[0000] [0/3] [jenkins-master]: Creating
DEBU[0000] Found service jenkins-master
DEBU[0000] [0/3] [jenkins-master]: Created
DEBU[0000] Finding service jenkins-slave
DEBU[0000] Finding service jenkins-lb
DEBU[0000] [0/3] [jenkins-slave]: Creating
DEBU[0000] Found service jenkins-slave
DEBU[0000] [0/3] [jenkins-slave]: Created
DEBU[0000] Found service jenkins-lb
DEBU[0000] [0/3] [jenkins-lb]: Created

Next, verify that we have a new stack with three services:

rancher_compose_2_jenkins_stack_created

Before we start the stack, let’s make sure that the services are
properly linked. Go to your stack’s settings and select “View Graph”
which should display the links between various services:

rancher_compose_3_jenkins_stack_graph

Step 3: Start the Jenkins stack with Rancher compose

To start the stack and all of the Jenkins services, we have a couple of
options: 1) select the “Start Services” option from the Rancher UI, or 2)
invoke the rancher-compose CLI with the following command:

rancher-compose --url http://RANCHER_HOST:RANCHER_PORT/v1/ --access-key RANCHER_API_KEY --secret-key RANCHER_API_KEY_SECRET --project-name jenkins --verbose start

Once everything is running, find out the public IP of the host running
“jenkins-lb” from the Rancher UI and browse
to http://HOST_IP_OF_JENKINS_LB:8080/. If everything is configured
correctly, you should see the Jenkins landing page. At this point, both
your Jenkins master and slave(s) should be running; however, if you
check the logs for your Jenkins slave, you would see 404 errors where
the Jenkins slave is unable to connect to the Jenkins master. We need to
configure Jenkins to allow for slave connections.

Configuring and Testing Jenkins

In this section, we’ll go through the steps needed to configure and secure
our Jenkins stack. First, let’s create a Jenkins user with the same
credentials (JENKINS_USER and JENKINS_PASSWORD) that you specified in your
Docker Compose configuration file. Next, to enable security for Jenkins,
navigate to “manage Jenkins” and select “enable security” from the
security configuration. Make sure to specify 5000 as a fixed port for “TCP
port for JNLP slave agents”. Jenkins slaves communicate with the master
node on this port.

setup_jenkins_1_security

For the Jenkins slave to be able to connect to the master, we first need
to install the Swarm plugin. The plugin can be installed from the “manage
plugins” section in Jenkins. Once you have the Swarm plugin installed,
your Jenkins slave should show up in the “Build Executor Status” tab:

setup_jenkins_2_slave_shows_up
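For context, the slave container connects to the master using the Swarm
plugin’s client. Under the hood it runs something along these lines (a
hedged sketch; the jar name, filesystem root, and executor count are
assumptions, not necessarily what the techtraits image uses):

# connect this node to the Jenkins master as a swarm slave
java -jar swarm-client-jar-with-dependencies.jar \
  -master "$JENKINS_MASTER" \
  -username "$JENKINS_USERNAME" \
  -password "$JENKINS_PASSWORD" \
  -fsroot /var/jenkins \
  -executors 2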

Finally, to complete the master-slave configuration, head over to “manage
Jenkins”. You should now see a notice about enabling the master security
subsystem. Go ahead and enable the subsystem; it can be used to control
access between the master and slaves:

setup_jenkins_3_master_slave_security_subsystem

Before moving on, let’s configure Jenkins to work with Git and Java-based
projects. To configure Git, simply install the Git plugin. Then, select
“Configure” from the “Manage Jenkins” settings and set up the JDK and
Maven installers you want to use for your projects:

setup_jenkins_4_jdk_7

setup_jenkins_5_maven_3

The steps above should be sufficient for building Docker- or Maven-based
Java projects. To test our new Jenkins stack, let’s create a Docker-based
job. Create a new “Freestyle Project” type job named “docker-test”, add a
build step of type “execute shell”, and use the following commands:

docker -v
docker run ubuntu /bin/echo hello world
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
docker rmi $(docker images -q)

Save the job and run. In the console output, you should see the version
of docker running inside your Jenkins container and the output for other
docker commands in our job.

Note: The stop, rm, and rmi commands used in the above shell script stop
and clean up all containers and images on the host. Each Jenkins job
should only touch its own containers, and therefore we recommend deleting
this job after a successful test.

Scaling Jenkins with Rancher

This is an area where Rancher really shines; it makes managing and
scaling Docker containers trivially easy. In this section we’ll show
you how to scale up and scale down the number of Jenkins slaves based on
your needs.

In our initial setup, we only had one EC2 host registered with Rancher
and all three services (Jenkins load balancer, Jenkins master and
Jenkins slave) running on the same host. It looks like this:

rancher_one_host

We’re now going to register another host by following the instructions:

rancher_setup_step_4_hosts

jenkins_scale_up
To launch more Jenkins slaves, simply click “Scale up” from your “Jenkins”
stack in Rancher. That’s it! Rancher will immediately launch a new Jenkins
slave container. As soon as the slave container starts, it will connect
with the Jenkins master and show up in the list of build hosts:

jenkins_scale_up_2

To scale down, select “edit” from jenkins-slave settings and adjust
the number of slaves to your liking:

jenkins_scale_down

In a few seconds you’ll see the change reflected in the Jenkins list of
available build hosts. Behind the scenes, Rancher uses labels to schedule
containers on hosts. For more details on Rancher’s container scheduling,
we encourage you to check out the documentation.

Conclusion

In this article, we built Jenkins with Docker and Rancher. We deployed a
multi-node Jenkins platform with Rancher Compose which can be launched
with a couple of commands and scaled as needed. Rancher’s cross-node
networking allows us to seamlessly scale the Jenkins cluster on multiple
nodes and potentially across multiple clouds with just a few clicks.
Another significant aspect of our Jenkins stack is the DIND containers for
the Jenkins master and slave, which allows the Jenkins setup to be readily
used for dockerized and non-dockerized applications.

In future articles, we’re going to use this Jenkins stack to create
build pipelines and highlight CI best practices for dockerized
applications. To learn more about managing applications through the
upgrade process, please join our next online meetup where we’ll dive
into the details of how to manage deployments and upgrades of
microservices with Docker and Rancher.

Bilal and Usman are server and infrastructure engineers with experience in
building large-scale distributed services on top of various cloud
platforms. You can read more of their work at techtraits.com, or follow
them on Twitter at @mbsheikh and @usman_ismail respectively.

Build a CI/CD Pipeline with Kubernetes and Rancher
Recorded Online Meetup of best practices and tools for building pipelines with containers and kubernetes.
