Announcing Preview Support for Istio

Thursday, 20 June, 2019

 

Today we are announcing support for Istio with Rancher 2.3 in Preview mode.

Why Istio?

Istio, and service mesh generally, has developed a huge amount of excitement
in the Kubernetes ecosystem. Istio promises to add fault tolerance, canary rollouts, A/B testing, monitoring
and metrics, tracing and observability, and authentication and authorization, eliminating the need for
developers to instrument or write specific code to enable these capabilities. In effect, developers can just
focus on their business logic and leave the rest to Kubernetes and Istio.

The claims above aren’t new. About 10 years ago, PaaS vendors made exactly the same claim and even delivered
on it to an extent. The problem was that their offerings required specific languages, frameworks, and, for
the most part, only worked with very simple applications. The workloads were also tied to the vendor’s
unique implementation, which meant that if you wanted your applications to use the PaaS services, you were
potentially locked-in for a very long time.

With containers and Kubernetes, these limitations are virtually nonexistent. As long as you can containerize
your application, Kubernetes can run it for you.

How Istio Works in Rancher 2.3 Preview 2

Our users count on us to make managing and operating Kubernetes and related tools and technologies easy,
without locking them in to a specific cloud vendor. With Istio, we take the same approach.

In this Preview mode, we provide users with a simple UI to enable Istio under the Tools menu. Reasonable
default configurations are provided but can be changed as required:

Announcing Istio

In order to monitor your traffic, Istio needs to inject an Envoy sidecar. In Rancher 2.3 Preview, users can
enable automatic sidecar injection for each namespace. Once this option is selected, Rancher will inject the
sidecar container into each workload:

Announcing Istio

Rancher’s simplified installation and configuration of Istio comes with a built-in, supported Kiali dashboard for traffic and telemetry visualization, Jaeger for tracing, and even its own Prometheus and Grafana (separate
instances than the ones used for Advanced Monitoring).

After you deploy workloads in the namespaces with automatic sidecar injection enabled, head over to the Istio
menu entry and observe the traffic as it flows across your microservice applications:

Announcing Istio

Clicking on Kiali, Jaeger, Prometheus, or Grafana will take you to the respective UI of each tool, where you
can find more details and options:

Announcing Istio

As mentioned earlier, the power of Istio is its ability to bring features like fault tolerance, circuit
breaking, canary deployment, and more to your services. To enable these, you will need to develop and apply
the appropriate YAML files. Istio is not supported for Windows workloads yet, so it should not be enabled in
Windows clusters.

Conclusion

Istio is one of the most talked about and requested features in the Rancher and Kubernetes communities today.
However, there are also a lot of questions around the best way to deploy and manage it. With Rancher 2.3.0
Preview 2, our goal is to make this journey quick and easy.

For release notes and installation steps, please visit
https://github.com/rancher/rancher/releases/tag/v2.3.0-alpha5

An Introduction to Big Data Concepts

Wednesday, 27 March, 2019

Gigantic amounts of data are being generated at high speeds by a variety of sources such as mobile devices, social media, machine logs, and multiple sensors surrounding us. All around the world, we produce vast amount of data and the volume of generated data is growing exponentially at a unprecedented rate. The pace of data generation is even being accelerated by the growth of new technologies and paradigms such as Internet of Things (IoT).

What is Big Data and How Is It Changing?

The definition of big data is hidden in the dimensions of the data. Data sets are considered “big data” if they have a high degree of the following three distinct dimensions: volume, velocity, and variety. Value and veracity are two other “V” dimensions that have been added to the big data literature in the recent years. Additional Vs are frequently proposed, but these five Vs are widely accepted by the community and can be described as follows:

  • Velocity: the speed at which the data is been generated
  • Volume: the amount of the data that is been generated
  • Variety: the diversity or different types of the data
  • Value: the worth of the data or the value it has
  • Veracity: the quality, accuracy, or trustworthiness of the data

Large volumes of data are generally available in either structured or unstructured formats. Structured data can be generated by machines or humans, has a specific schema or model, and is usually stored in databases. Structured data is organized around schemas with clearly defined data types. Numbers, date time, and strings are a few examples of structured data that may be stored in database columns. Alternatively, unstructured data does not have a predefined schema or model. Text files, log files, social media posts, mobile data, and media are all examples of unstructured data.

Based on a report provided by Gartner, an international research and consulting organization, the application of advanced big data analytics is part of the Gartner Top 10 Strategic Technology Trends for 2019, and is expected to drive new business opportunities. The same report also predicts that more than 40% of data science tasks will be automated by 2020, which will likely require new big data tools and paradigms.

By 2017, global internet usage reached 47% of the world’s population based on an infographic provided by DOMO. This indicates that an increasing number of people are starting to use mobile phones and that more and more devices are being connected to each other via smart cities, wearable devices, Internet of Things (IoT), fog computing, and edge computing paradigms. As internet usage spikes and other technologies such as social media, IoT devices, mobile phones, autonomous devices (e.g. robotics, drones, vehicles, appliances, etc) continue to grow, our lives will become more connected than ever and generate unprecedented amounts of data, all of which will require new technologies for processing.

The Scale of Data Generated by Everyday Interactions

At a large scale, the data generated by everyday interactions is staggering. Based on research conducted by DOMO, for every minute in 2018, Google conducted 3,877,140 searches, YouTube users watched 4,333,560 videos, Twitter users sent 473,400 tweets, Instagram users posted 49,380 photos, Netflix users streamed 97,222 hours of video, and Amazon shipped 1,111 packages. This is just a small glimpse of a much larger picture involving other sources of big data. It seems like the internet is pretty busy, does not it? Moreover, it is expected that mobile traffic will experience tremendous growth past its present numbers and that the world’s internet population is growing significantly year-over-year. By 2020, the report anticipates that 1.7MB of data will be created per person per second. Big data is getting even bigger.

At small scale, the data generated on a daily basis by a small business, a start up company, or a single sensor such as a surveillance camera is also huge. For example, a typical IP camera in a surveillance system at a shopping mall or a university campus generates 15 frame per second and requires roughly 100 GB of storage per day. Consider the storage amount and computing requirements if those camera numbers are scaled to tens or hundreds.

Big Data in the Scientific Community

Scientific projects such as CERN, which conducts research on what the universe is made of, also generate massive amounts of data. The Large Hadron Collider (LHC) at CERN is the world’s largest and most powerful particle accelerator. It consists of a 27-kilometer ring of superconducting magnets along with some additional structures to accelerate and boost the energy of particles along the way.

During the spin, particles collide with LHC detectors roughly 1 billion times per second, which generates around 1 petabyte of raw digital “collision event” data per second. This unprecedented volume of data is a great challenge that cannot be resolved with CERN’s current infrastructure. To work around this, the generated raw data is filtered and only the “important” events are processed to reduce the volume of data. Consider the challenging processing requirements for this task.

The four big LHC experiments, named ALICE, ATLAS, CMS, and LHCb, are among the biggest generators of data at CERN, and the rate of the data processed and stored on servers by these experiments is expected to reach about 25 GB/s (gigabyte per second). As of June 29, 2017, the CERN Data Center announced that they had passed the 200 petabytes milestone of data archived permanently in their storage units.

Why Big Data Tools are Required

The scale of the data generated by famous well-known corporations, small scale organizations, and scientific projects is growing at an unprecedented level. This can be clearly seen by the above scenarios and by remembering again that the scale of this data is getting even bigger.

On the one hand, the mountain of the data generated presents tremendous processing, storage, and analytics challenges that need to be carefully considered and handled. On the other hand, traditional Relational Database Management Systems (RDBMS) and data processing tools are not sufficient to manage this massive amount of data efficiently when the scale of data reaches terabytes or petabytes. These tools lack the ability to handle large volumes of data efficiently at scale. Fortunately, big data tools and paradigms such as Hadoop and MapReduce are available to resolve these big data challenges.

Analyzing big data and gaining insights from it can help organizations make smart business decisions and improve their operations. This can be done by uncovering hidden patterns in the data and using them to reduce operational costs and increase profits. Because of this, big data analytics plays a crucial role for many domains such as healthcare, manufacturing, and banking by resolving data challenges and enabling them to move faster.

Big Data Analytics Tools

Since the compute, storage, and network requirements for working with large data sets are beyond the limits of a single computer, there is a need for paradigms and tools to crunch and process data through clusters of computers in a distributed fashion. More and more computing power and massive storage infrastructure are required for processing this massive data either on-premise or, more typically, at the data centers of cloud service providers.

In addition to the required infrastructure, various tools and components must be brought together to solve big data problems. The Hadoop ecosystem is just one of the platforms helping us work with massive amounts of data and discover useful patterns for businesses.

Below is a list of some of the tools available and a description of their roles in processing big data:

  • MapReduce: MapReduce is a distributed computing paradigm developed to process vast amount of data in parallel by splitting a big task into smaller map and reduce oriented tasks.
  • HDFS: The Hadoop Distributed File System is a distributed storage and file system used by Hadoop applications.
  • YARN: The resource management and job scheduling component in the Hadoop ecosystem.
  • Spark: A real-time in-memory data processing framework.
  • PIG/HIVE: SQL-like scripting and querying tools for data processing and simplifying the complexity of MapReduce programs.
  • HBase, MongoDB, Elasticsearch: Examples of a few NoSQL databases.
  • Mahout, Spark ML: Tools for running scalable machine learning algorithms in a distributed fashion.
  • Flume, Sqoop, Logstash: Data integration and ingestion of structured and unstructured data.
  • Kibana: A tool to visualize Elasticsearch data.

Conclusion

To summarize, we are generating a massive amount of data in our everyday life, and that number is continuing to rise. Having the data alone does not improve an organization without analyzing and discovering its value for business intelligence. It is not possible to mine and process this mountain of data with traditional tools, so we use big data pipelines to help us ingest, process, analyze, and visualize these tremendous amounts of data.

Learn to deploy databases in production on Kubernetes

For more training in big data and database management, watch our free online training on successfully running a database in production on kubernetes.

Tags: ,,, Category: Uncategorized Comments closed

Considerations When Designing Distributed Systems

Monday, 11 March, 2019

Introduction

Today’s applications are marvels of distributed systems development. Each function or service that makes up
an application may be executing on a different system, based upon a different system architecture, that is
housed in a different geographical location, and written in a different computer language. Components of
today’s applications might be hosted on a powerful system carried in the owner’s pocket and communicating
with application components or services that are replicated in data centers all over the world.

What’s amazing about this, is that individuals using these applications typically are not aware of the
complex environment that responds to their request for the local time, local weather, or for directions to
their hotel.

Let’s pull back the curtain and look at the industrial sorcery that makes this all possible and contemplate
the thoughts and guidelines developers should keep in mind when working with this complexity.

The Evolution of System Design

Designing Distributed

Figure 1: Evolution of system design over time

Source: Interaction Design Foundation, The
Social Design of Technical Systems: Building technologies for communities

Application development has come a long way from the time that programmers wrote out applications, hand
compiled them into the language of the machine they were using, and then entered individual machine
instructions and data directly into the computer’s memory using toggle switches.

As processors became more and more powerful, system memory and online storage capacity increased, and
computer networking capability dramatically increased, approaches to development also changed. Data can now
be transmitted from one side of the planet to the other faster than it used to be possible for early
machines to move data from system memory into the processor itself!

Let’s look at a few highlights of this amazing transformation.

Monolithic Design

Early computer programs were based upon a monolithic design with all of the application components were
architected to execute on a single machine. This meant that functions such as the user interface (if users
were actually able to interact with the program), application rules processing, data management, storage
management, and network management (if the computer was connected to a computer network) were all contained
within the program.

While simpler to write, these programs become increasingly complex, difficult to document, and hard to update
or change. At this time, the machines themselves represented the biggest cost to the enterprise and so
applications were designed to make the best possible use of the machines.

Client/Server Architecture

As processors became more powerful, system and online storage capacity increased, and data communications
became faster and more cost-efficient, application design evolved to match pace. Application logic was
refactored or decomposed, allowing each to execute on different machines and the ever-improving networking
was inserted between the components. This allowed some functions to migrate to the lowest cost computing
environment available at the time. The evolution flowed through the following stages:

Terminals and Terminal Emulation

Early distributed computing relied on special-purpose user access devices called terminals. Applications had
to understand the communications protocols they used and issue commands directly to the devices. When
inexpensive personal computing (PC) devices emerged, the terminals were replaced by PCs running a terminal
emulation program.

At this point, all of the components of the application were still hosted on a single mainframe or
minicomputer.

Light Client

As PCs became more powerful, supported larger internal and online storage, and network performance increased,
enterprises segmented or factored their applications so that the user interface was extracted and executed
on a local PC. The rest of the application continued to execute on a system in the data center.

Often these PCs were less costly than the terminals that they replaced. They also offered additional
benefits. These PCs were multi-functional devices. They could run office productivity applications that
weren’t available on the terminals they replaced. This combination drove enterprises to move to
client/server application architectures when they updated or refreshed their applications.

Midrange Client

PC evolution continued at a rapid pace. Once more powerful systems with larger storage capacities were
available, enterprises took advantage of them by moving even more processing away from the expensive systems
in the data center out to the inexpensive systems on users’ desks. At this point, the user interface and
some of the computing tasks were migrated to the local PC.

This allowed the mainframes and minicomputers (now called servers) to have a longer useful life, thus
lowering the overall cost of computing for the enterprise.

Heavy client

As PCs become more and more powerful, more application functions were migrated from the backend servers. At
this point, everything but data and storage management functions had been migrated.

Enter the Internet and the World Wide Web

The public internet and the World Wide Web emerged at this time. Client/server computing continued to be
used. In an attempt to lower overall costs, some enterprises began to re-architect their distributed
applications so they could use standard internet protocols to communicate and substituted a web browser for
the custom user interface function. Later, some of the application functions were rewritten in Javascript so
that they could execute locally on the client’s computer.

Server Improvements

Industry innovation wasn’t focused solely on the user side of the communications link. A great deal of
improvement was made to the servers as well. Enterprises began to harness together the power of many
smaller, less expensive industry standard servers to support some or all of their mainframe-based functions.
This allowed them to reduce the number of expensive mainframe systems they deployed.

Soon, remote PCs were communicating with a number of servers, each supporting their own component of the
application. Special-purpose database and file servers were adopted into the environment. Later, other
application functions were migrated into application servers.

Networking was another area of intense industry focus. Enterprises began using special-purpose networking
servers that provided fire walls and other security functions, file caching functions to accelerate data
access for their applications, email servers, web servers, web application servers, distributed name servers
that kept track of and controlled user credentials for data and application access. The list of networking
services that has been encapsulated in an appliance server grows all the time.

Object-Oriented Development

The rapid change in PC and server capabilities combined with the dramatic price reduction for processing
power, memory and networking had a significant impact on application development. No longer where hardware
and software the biggest IT costs. The largest costs were communications, IT services (the staff), power,
and cooling.

Software development, maintenance, and IT operations took on a new importance and the development process was
changed to reflect the new reality that systems were cheap and people, communications, and power were
increasingly expensive.

Designing Distributed

Figure 2: Worldwide IT spending forcast

Source: Gartner Worldwide IT
Spending Forecast, Q1 2018

Enterprises looked to improved data and application architectures as a way to make the best use of their
staff. Object-oriented applications and development approaches were the result. Many programming languages
such as the following supported this approach:

  • C++
  • C#
  • COBOL
  • Java
  • PHP
  • Python
  • Ruby

Application developers were forced to adapt by becoming more systematic when defining and documenting data
structures. This approach also made maintaining and enhancing applications easier.

Open-Source Software

Opensource.com offers the following definition for open-source
software: “Open source software is software with source code that anyone can inspect, modify, and enhance.”
It goes on to say that, “some software has source code that only the person, team, or organization who
created it — and maintains exclusive control over it — can modify. People call this kind of software
‘proprietary’ or ‘closed source’ software.”

Only the original authors of proprietary software can legally copy, inspect, and alter that software. And in
order to use proprietary software, computer users must agree (often by accepting a license displayed the
first time they run this software) that they will not do anything with the software that the software’s
authors have not expressly permitted. Microsoft Office and Adobe Photoshop are examples of proprietary
software.

Although open-source software has been around since the very early days of computing, it came to the
forefront in the 1990s when complete open-source operating systems, virtualization technology, development
tools, database engines, and other important functions became available. Open-source technology is often a
critical component of web-based and distributed computing. Among others, the open-source offerings in the
following categories are popular today:

  • Development tools
  • Application support
  • Databases (flat file, SQL, No-SQL, and in-memory)
  • Distributed file systems
  • Message passing/queueing
  • Operating systems
  • Clustering

Distributed Computing

The combination of powerful systems, fast networks, and the availability of sophisticated software has driven
major application development away from monolithic towards more highly distributed approaches. Enterprises
have learned, however, that sometimes it is better to start over than to try to refactor or decompose an
older application.

When enterprises undertake the effort to create distributed applications, they often discover a few pleasant
side effects. A properly designed application, that has been decomposed into separate functions or services,
can be developed by separate teams in parallel.

Rapid application development and deployment, also known as DevOps, emerged as a way to take advantage of the
new environment.

Service-Oriented Architectures

As the industry evolved beyond client/server computing models to an even more distributed approach, the
phrase “service-oriented architecture” emerged. This approach was built on distributed systems concepts,
standards in message queuing and delivery, and XML messaging as a standard approach to sharing data and data
definitions.

Individual application functions are repackaged as network-oriented services that receive a message
requesting they perform a specific service, they perform that service, and then the response is sent back to
the function that requested the service.

This approach offers another benefit, the ability for a given service to be hosted in multiple places around
the network. This offers both improved overall performance and improved reliability.

Workload management tools were developed that receive requests for a service, review the available capacity,
forward the request to the service with the most available capacity, and then send the response back to the
requester. If a specific service doesn’t respond in a timely fashion, the workload manager simply forwards
the request to another instance of the service. It would also mark the service that didn’t respond as failed
and wouldn’t send additional requests to it until it received a message indicating that it was still alive
and healthy.

What Are the Considerations for Distributed Systems

Now that we’ve walked through over 50 years of computing history, let’s consider some rules of thumb for
developers of distributed systems. There’s a lot to think about because a distributed solution is likely to
have components or services executing in many places, on different types of systems, and messages must be
passed back and forth to perform work. Care and consideration are absolute requirements to be successful
creating these solutions. Expertise must also be available for each type of host system, development tool,
and messaging system in use.

Nailing Down What Needs to Be Done

One of the first things to consider is what needs to be accomplished! While this sounds simple, it’s
incredibly important.

It’s amazing how many developers start building things before they know, in detail, what is needed. Often,
this means that they build unnecessary functions and waste their time. To quote Yogi Berra, “if you don’t
know where you are going, you’ll end up someplace else.”

A good place to start is knowing what needs to be done, what tools and services are already available, and
what people using the final solution should see.

Interactive Versus Batch

Since fast responses and low latency are often requirements, it would be wise to consider what should be done
while the user is waiting and what can be put into a batch process that executes on an event-driven or
time-driven schedule.

After the initial segmentation of functions has been considered, it is wise to plan when background, batch
processes need to execute, what data do these functions manipulate, and how to make sure these functions are
reliable, are available when needed, and how to prevent the loss of data.

Where Should Functions Be Hosted?

Only after the “what” has been planned in fine detail, should the “where” and “how” be considered. Developers
have their favorite tools and approaches and often will invoke them even if they might not be the best
choice. As Bernard Baruch was reported to say, “if all you have is a hammer, everything looks like a nail.”

It is also important to be aware of corporate standards for enterprise development. It isn’t wise to select a
tool simply because it is popular at the moment. That tool just might do the job, but remember that
everything that is built must be maintained. If you build something that only you can understand or
maintain, you may just have tied yourself to that function for the rest of your career. I have personally
created functions that worked properly and were small and reliable. I received telephone calls regarding
these for ten years after I left that company because later developers could not understand how the
functions were implemented. The documentation I wrote had been lost long earlier.

Each function or service should be considered separately in a distributed solution. Should the function be
executed in an enterprise data center, in the data center of a cloud services provider or, perhaps, in both.
Consider that there are regulatory requirements in some industries that direct the selection of where and
how data must be maintained and stored.

Other considerations include:

  • What type of system should be the host of that function. Is one system architecture better for that
    function? Should the system be based upon ARM, X86, SPARC, Precision, Power, or even be a Mainframe?
  • Does a specific operating system provide a better computing environment for this function? Would Linux,
    Windows, UNIX, System I, or even System Z be a better platform?
  • Is a specific development language better for that function? Is a specific type of data management tool?
    Is a Flat File, SQL database, No-SQL database, or a non-structured storage mechanism better?
  • Should the function be hosted in a virtual machine or a container to facilitate function mobility,
    automation and orchestration?

Virtual machines executing Windows or Linux were frequently the choice in the early 2000s. While they offered
significant isolation for functions and made it easily possible to restart or move them when necessary,
their processing, memory and storage requirements were rather high. Containers, another approach to
processing virtualization, are the emerging choice today because they offer similar levels of isolation, the
ability to restart and migrate functions and consume far less processing power, memory or storage.

Performance

Performance is another critical consideration. While defining the functions or services that make up a
solution, the developers should be aware if they have significant processing, memory or storage
requirements. It might be wise to look at these functions closely to learn if that can be further subdivided
or decomposed.

Further segmentation would allow an increase in parallelization which would potentially offer performance
improvements. The trade off, of course, is that this approach also increases complexity and, potentially,
makes them harder to manage and to make secure.

Reliability

In high stakes enterprise environments, solution reliability is essential. The developer must consider when
it is acceptable to force people to re-enter data, re-run a function, or when a function can be unavailable.

Database developers ran into this issue in the 1960s and developed the concept of an atomic function. That
is, the function must complete or the partial updates must be rolled back leaving the data in the state it
was in before the function began. This same mindset must be applied to distributed systems to ensure that
data integrity is maintained even in the event of service failures and transaction disruptions.

Functions must be designed to totally complete or roll back intermediate updates. In critical message passing
systems, messages must be stored until an acknowledgement that a message has been received comes in. If such
a message isn’t received, the original message must be resent and a failure must be reported to the
management system.

Manageability

Although not as much fun to consider as the core application functionality, manageability is a key factor in
the ongoing success of the application. All distributed functions must be fully instrumented to allow
administrators to both understand the current state of each function and to change function parameters if
needed. Distributed systems, after all, are constructed of many more moving parts than the monolithic
systems they replace. Developers must be constantly aware of making this distributed computing environment
easy to use and maintain.

This brings us to the absolute requirement that all distributed functions must be fully instrumented to allow
administrators to understand their current state. After all, distributed systems are inherently more complex
and have more moving parts than the monolithic systems they replace.

Security

Distributed system security is an order of magnitude more difficult than security in a monolithic
environment. Each function must be made secure separately and the communication links between and among the
functions must also be made secure. As the network grows in size and complexity, developers must consider
how to control access to functions, how to make sure than only authorized users can access these function,
and to to isolate services from one other.

Security is a critical element that must be built into every function, not added on later. Unauthorized
access to functions and data must be prevented and reported.

Privacy

Privacy is the subject of an increasing number of regulations around the world. Examples like the European
Union’s GDPR and the U.S. HIPPA regulations are important considerations for any developer of
customer-facing systems.

Mastering Complexity

Developers must take the time to consider how all of the pieces of a complex computing environment fit
together. It is hard to maintain the discipline that a service should encapsulate a single function or,
perhaps, a small number of tightly interrelated functions. If a given function is implemented in multiple
places, maintaining and updating that function can be hard. What would happen when one instance of a
function doesn’t get updated? Finding that error can be very challenging.

This means it is wise for developers of complex applications to maintain a visual model that shows where each
function lives so it can be updated if regulations or business requirements change.

Often this means that developers must take the time to document what they did, when changes were made, as
well as what the changes were meant to accomplish so that other developers aren’t forced to decipher mounds
of text to learn where a function is or how it works.

To be successful as a architect of distributed systems, a developer must be able to master complexity.

Approaches Developers Must Master

Developers must master decomposing and refactoring application architectures, thinking in terms of teams, and
growing their skill in approaches to rapid application development and deployment (DevOps). After all, they
must be able to think systematically about what functions are independent of one another and what functions
rely on the output of other functions to work. Functions that rely upon one other may be best implemented as
a single service. Implementing them as independent functions might create unnecessary complexity and result
in poor application performance and impose an unnecessary burden on the network.

Virtualization Technology Covers Many Bases

Virtualization is a far bigger category than just virtual machine software or containers. Both of these
functions are considered processing virtualization technology. There are at least seven different types of
virtualization technology in use in modern applications today. Virtualization technology is available to
enhance how users access applications, where and how applications execute, where and how processing happens,
how networking functions, where and how data is stored, how security is implemented, and how management
functions are accomplished. The following model of virtualization technology might be helpful to developers
when they are trying to get their arms around the concept of virtualization:

Designing Distributed

Figure 3: Architure of virtualized systems

Source: 7 Layer Virtualizaiton Model, VirtualizationReview.com

Think of Software-Defined Solutions

It is also important for developers to think in terms of “software defined” solutions. That is, to segment
the control from the actual processing so that functions can be automated and orchestrated.

Tools and Strategies That Can Help

Developers shouldn’t feel like they are on their own when wading into this complex world. Suppliers and
open-source communities offer a number of powerful tools. Various forms of virtualization technology can be
a developer’s best friend.

Virtualization Technology Can Be Your Best Friend

  • Containers make it possible to easily develop functions that can execute without
    interfering with one another and can be migrated from system to system based upon workload demands.
  • Orchestration technology makes it possible to control many functions to ensure they are
    performing well and are reliable. It can also restart or move them in a failure scenario.
  • Supports incremental development: functions can be developed in parallel and deployed
    as they are ready. They also can be updated with new features without requiring changes elsewhere.
  • Supports highly distributed systems: functions can be deployed locally in the
    enterprise data center or remotely in the data center of a cloud services provider.

Think In Terms of Services

This means that developers must think in terms of services and how services can communicate with one another.

Well-Defined APIs

Well defined APIs mean that multiple teams can work simultaneously and still know that everything will fit
together as planned. This typically means a bit more work up front, but it is well worth it in the end. Why?
Because overall development can be faster. It also makes documentation easier.

Support Rapid Application Development

This approach is also perfect for rapid application development and rapid prototyping, also known as DevOps.
Properly executed, DevOps also produces rapid time to deployment.

Think In Terms of Standards

Rather than relying on a single vendor, the developer of distributed systems would be wise to think in terms
of multi-vendor, international standards. This approach avoids vendor lock-in and makes finding expertise
much easier.

Summary

It’s interesting to note how guidelines for rapid application development and deployment of distributed
systems start with “take your time.” It is wise to plan out where you are going and what you are going to do
otherwise you are likely to end up somewhere else, having burned through your development budget, and have
little to show for it.

Sign up for Online Training

To continue to learn about the tools, technologies, and practices in the modern development landscape, sign up for free online training sessions. Our engineers host
weekly classes on Kubernetes, containers, CI/CD, security, and more.

Microservices vs. Monolithic Architectures

Friday, 1 March, 2019

Enterprises are increasingly pressured by competitors and their own customers to get applications working and online quicker while also minimizing development costs. These divergent goals have forced enterprise IT organization to evolve rapidly. After undergoing one forced evolution after another since the 1960s, many are prepared to take the step away from monolithic application architectures to embrace the microservices approach.

Figure 1: Architecture differences between traditional monolithic applications and microservices

Figure 1: Architecture differences between traditional monolithic applications and microservices

Image courtesy of BMC

Higher Expectations and More Empowered Customers

Customers that are used to having worldwide access to products and services now expect enterprises to quickly respond to whatever other suppliers are doing.

CIO magazine, in reporting upon Ovum’s research, pointed out:

“Customers now have the upper hand in the customer journey. With more ways to shop and less time to do it, they don’t just gather information and complete transactions quickly. They often want to get it done on the go, preferably on a mobile device, without having to engage in drawn-out conversations.”

IT Under Pressure

This intense worldwide competition also forces enterprises to find new ways to cut costs or find new ways to be more efficient. Developers have seen this all before. This is just the newest iteration of the perennial call to “do more with less” that enterprise IT has faced for more than a decade. Even though IT budgets grow, they’ve learned, the investments are often in new IT services or better communications.

Figure 2: Forcasted 2018 worldwide IT spending growth

Figure 2: Forcasted 2018 worldwide IT spending growth

Source: Gartner Market Databook, 4Q17

As enterprise IT organizations face pressure to respond, they have had to revisit their development processes. The traditional two-year development cycle, previously acceptable, is no longer satisfactory. There is simply no time for that now.

Enterprise IT has also been forced to respond to a confluence of trends that are divergent and contradictory.

  • The introduction of inexpensive but high-performance network connectivity that allows distributed functions to communicate with one another across the network as fast as processes previously could communicate with one another inside of a single system.
  • The introduction of powerful microprocessors that offer mainframe-class performance in inexpensive and small packages. After standardizing on the X86 microprocessor architecture, enterprises are now being forced to consider other architectures to address their need for higher performance, lower cost, and both lower power consumption and heat production.
  • Internal system memory capacity continues to increase making it possible to deploy large-scale applications or application components in small systems.
  • External storage use is evolving away from the use of rotating media to solid state devices to increase capability, reduce latency, decrease overall cost, and deliver enormous capacity.
  • The evolution of open-source software and distributed computing functions make it possible for the enterprise to inexpensively add a herd of systems when new capabilities are needed rather than facing an expensive and time-consuming forklift upgrade to expand a central host system.
  • Customers demand instant and easy access to applications and data.

As enterprises address these trends, they soon discover that the approach that they had been relying on — focusing on making the best use of expensive systems and networks — needs to change. The most significant costs are now staffing, power, and cooling. This is in addition to the evolution they made nearly two decades ago when their focus shifted from monolithic mainframe computing to distributed, X86-based midrange systems.

The Next Steps in a Continuing Saga

Here’s what enterprise IT has done to respond to all of these trends.

They are choosing to move from using the traditional waterfall development approach to various forms of rapid application development. They also are moving away from compiled languages to interpreted or incrementally compiled languages such as Java, Python, or Ruby to improve developer productivity.

IDC, for example, predicts that:

“By 2021 65% of CIOs will expand agile/DevOps practices into the wider business to achieve the velocity necessary for innovation, execution, and change.”

Complex applications are increasingly designed as independent functions or “services” that can be hosted in several places on the network to improve both performance and application reliability. This approach means that it is possible to address changing business requirements as well as to add new features in one function without having to change anything else in parallel. NetworkWorld’s Andy Patrizio pointed out in his predictions for 2019 that he expects “Microservices and serverless computing take off.”

Another important change is that these services are being hosted in geographically distributed enterprise data centers, in the cloud, or both. Furthermore, functions can now reside in a customer’s pocket or in some combination of cloud-based or corporate systems.

What Does This Mean for You?

Addressing these trends means that enterprise developers and operations staff have to make some serious changes to their traditional approach including the following:

  • Developers must be willing to learn technologies that better fits today’s rapid application development methodology. An experienced “student” can learn quickly through online schools. For example, Learnpython.org offers free courses in Python, while codecademy offers free courses in Ruby, Java, and other languages.
  • They must also be willing to learn how to decompose application logic from a monolithic, static design to a collection of independent, but cooperating, microservices. Online courses are available for this too. One example of a course designed to help developers learn to “think in microservices” comes from IBM. Other courses are available from Lynda.com.
  • Developers must adopt new tools for creating and maintaining microservices that support quick and reliable communication between them. The use of various commercial and open-source messaging and management tools can help in this process. Rancher Labs, for example, offers open-source software for delivering Kurbernetes-as-a-service.
  • Operations professionals need to learn orchestration tools for containers and Kubernetes to understand how they allow teams to quickly develop and improve applications and services without losing control over data and security. Operations has long been the gatekeepers for enterprise data centers. After all, they may find their positions on the line if applications slow down or fail.
  • Operations staff must allow these functions to be hosted outside of the data centers they directly control. To make that point, analysts at Market Research Future recently published a report saying that, “the global cloud microservices market was valued at USD 584.4 million in 2017 and is expected to reach USD 2,146.7 million by the end of the forecast period with a CAGR of 25.0%”.
  • Application management and security issues must now be part of developers’ thinking. Once again, online courses are available to help individuals to develop expertise in this area. LinkedIn, for example, offers a course in how to become an IT Security Specialist.

It is important for both IT and operations staff to understand that the world of IT is moving rapidly and everyone must be focused on upgrading their skills and enhancing their expertise.

How Do Microservices Benefit the Enterprise?

This latest move to distributed computing offers a number of real and measurable benefits to the enterprise. Development time and cost can be sharply reduced after the IT organization incorporates this form of distributed computing. Afterwards, each service can be developed in parallel and refined as needed without requiring an entire application to be stopped or redesigned.

The development organization can focus on developer productivity and still bring new application functions or applications online quickly. The operations organization can focus on defining acceptable rules for application execution and allowing the orchestration and management tools to enforce them.

What New Challenges Do Enterprises Face?

Like any approach to IT, the adoption of a microservices architecture will include challenges as well as benefits.

Monitoring and managing many “moving parts” can be more challenging than dealing with a few monolithic applications. The adoption of an enterprise management framework can help address these challenges. Security in this type of distributed computing needs to be top of mind as well. As the number of independent functions grows on the network, each must be analyzed and protected.

Should All Monolithic Applications Migrate to Microservices?

Some monolithic applications can be difficult to change. This may be due to technological challenges or may be due to regulatory constraints. Some components in use today may have come from defunct suppliers, making changes difficult or impossible.

It can be both time consuming and costly for the organization to go through a complete audit process. Often, organizations continue investing in older applications much longer than is appropriate in the belief that they’re saving money.

It is possible to evaluate what an monolithic application does to learn if some individual functions can be separated and run as smaller, independent services. These can be implemented either as cloud-based services or as container-based microservices.

Rather than waiting and attempting to address older technology as a whole, it may be wise to undertake a series of incremental changes to make enhancing or replacing an established system more acceptable. This is very much like the old proverb, “the best time to plant a tree was 20 years ago. The second best time is now.”

Is the Change Worth It?

Enterprises that have made the move towards the adoption of microservices-based application architectures have commented that their IT costs are often reduced. They also often point out that once their team mastered this approach, it was far easier and quicker to add new features and functions when market demands changed.

If your enterprise hasn’t adopted this approach, it would be wise to learn more about it. Suppliers like Rancher Labs have helped their clients safely make this journey and they may be able to help your organization.

Go Deeper with Online Training

Get free online training on our container management software, Rancher, or continue your education with more advaned topics in the Kubernetes master classes.

Tags: ,,, Category: Products Comments closed

What Do People Love About Rancher?

Thursday, 28 February, 2019
Read the Guide to Kubernetes with Rancher
This guide shows the challenges in running Kubernetes in production and how Rancher helps.

More than 20,000 environments have chosen Rancher as the solution to make the Kubernetes adventure painless in as many ways as possible. More than 200 businesses across finance, health care, military, government, retail, manufacturing, and entertainment verticals engage with Rancher commercially because they recognize that Rancher simply works better than other solutions.

Why is this? Is it really about one feature set versus another feature set, or is it about the freedom and breathing room that come from having a better way?

A Tale of Two Houses

Imagine that you’re walking down a street, and each side of the street is lined with houses. The houses on one side were constructed over time by different builders, and you can see that although every house contains walls, a floor, a roof, doors, and windows, they’re all completely different. Some were built from custom plans, while others were modified over time by the owner to fit a personal need.

You see a person working on his house, and you stop to ask him about the construction. You learn that the company that built his house did so with special red bricks that only come from one place. He paid a great deal of money to import the bricks and have the house built, and he beams with pride as he tells you about it.

“It’s artesenal,” he tells you. “The company who built my house is one of the biggest companies in the world. They’ve been building houses for years, so they know what they’re doing. My house only took a month to build!”

“What if you want to expand?” You point to other houses on his side of the street. “Does the builder come out and do the work?”

“Nope! I decide what I want to build, and then I build it. I like doing it this way. Being hands-on makes me feel like I’m in control.”

Your gaze moves to the other side of the street, where the houses were built by following a different strategy. Each house has an identical core, and where an owner made customizations, each house with that customization has it constructed the same.

You see a man outside of one of the houses, relaxing on his porch and drinking tea. He waves at you, so you walk over and strike up a conversation with him.

“Can you tell me about your house?”

“My house?” He smiles at you. “Sure thing! All of the houses on this side were built by one company. They use pre-fabricated components that are built off-site, brought in and assembled. It only takes a day to build one!”

“What about adding rooms and other features?”

“It’s easy,” he replies. The company has a standard interface for rooms, terraces, and any other add-on. When I want to expand, I just call them, and they come out and connect the room. Everything is pre-wired, so it goes in and comes online almost as fast as I can think of it.”

You ask if he had to do any extra work to connect to public utilities.

“Not at all!” he exclaims. “There’s a panel inside where I can choose which provider I want to connect to. I just had to pick one. If I want to change it in the future, I make a different selection. The house lets me choose everything – lawn care service provider, window cleaner, painter, everything I need to make the house liveable and keep it running. I just go to the panel, make my choice, and then go back to living.

“And best of all, my house was free.”

Rancher Always Works For You

Rancher Labs has designed Rancher to do the heaviest tasks around building and maintaining Kubernetes clusters.

Easily Launch Secure Clusters

Let’s start with the installation. Are you installing on bare metal? Cloud instances? Hosted provider? A mix? Do you want to give others the ability to deploy their own clusters, or do you want the flexibility to use multiple providers?

Maybe you just want to use AWS, or GCP, so multiple providers isn’t a big deal. Flexibility is still important. Your requirements today might be different in a month or a year.

With Rancher you can simply fire up a new cluster in another provider and begin migrating workloads, all from within the same interface.

Global Identity and RBAC

Whether you’re using multiple providers or not, the normal way of configuring access to a single cluster in one provider requires work. Access control policies take time to configure and maintain, and generally, once provisioned, are forgotten. If using multiple providers, it’s like learning multiple languages. Russian for AWS, Swahili for Google, Flemish for Azure, Uzbek for DigitalOcean or Rackspace…and if someone leaves the organization, who knows what they had access to? Who remembers how to speak Latin?

Rancher connects to backend identity providers, and from a global configuration it applies roles and policies to all of the clusters that it manages.

When you can deploy and manage multiple clusters as easily as you can a single one, and when you can do so securely, then it’s no big deal to spin up a cluster for UAT as part of the CI/CD test suite. It’s trivial to let developers have their own cluster to work on. You could even let multiple teams can share one cluster.

Solutions for Cluster Multi-Tenancy

How do you keep people from stepping on each other?

You can use Kubernetes Namespaces, but provisioning Roles across multiple Namespaces is tedious. Rancher collects Namespaces into Projects and lets you map Roles to the Project. This creates single-cluster multi-tenancy, so now you can have multiple teams, each only able to interact with their own Namespaces, all on the same cluster. You can have a dev/staging environment built exactly like production, and then you can easily get into the CD part of CI/CD.

Tools for Day Two Operations

What about all of the add-on tools? Monitoring. Alerts. Log shipping. Pipelines. You could provision and configure all of this yourself for every cluster, but it takes time. It’s easy to do wrong. It requires skills that internal staff may not have – do you want your staff learning all of the tools above, or do you want them focusing on business initiatives that generate revenue? To put it another way, do you want to spend your day spinning copper wire to connect to the phone system, or would you rather press a button and be done with it?

Rancher ships with tools for monitoring your clusters, dashboards for visualizing metrics, an engine for generating alerts and sending notifications, a pipeline system to enable CI/CD for those not already using an external system. With a click it ships logs off to Elasticsearch, Kafka, Fluentd, Splunk, or syslog.

Designed to Grow With You

The more a Kubernetes solution scales (the bigger or more complicated that it gets), the more important it is to have fast, repeatable ways to do things. What about using scripts like Ansible, Terraform, kops, or kubespray to launch clusters? They stop once the cluster is launched. If you want more, you have to script it yourself, and this adds a dependency on an internal asset to maintain and support those scripts. We’ve all been at companies where the person with the special powers left, and everyone who stayed had to scramble to figure out how to keep everything running. Why go down that path when there’s a better way?

Rancher makes everything related to launching and managing clusters easy, repeatable, fast, and within the skill set of everyone on the team. It spins up clusters reliably in any provider in minutes, and then it gives you a standard, unified interface for interacting with those clusters via UI or CLI. You don’t need to learn each provider’s nuances. You don’t need to manage credentials in each provider. You don’t need to create configuration files to add the clusters to monitoring systems. You don’t need to do a bunch of work on the hosts before installing Kubernetes. You don’t need to go to multiple places to do different things – everything is in one place, easy to find, and easy to use.

No Vendor Lock-In

This is significant. Companies who sell you a Kubernetes solution have a vested interest in keeping you locked to their platform. You have to run their operating system or use their facilities. You can only run certain software versions or use certain components. You can only buy complementary services from vendors they partner with.

Rancher Labs believes in something different. They believe that your success comes from the freedom to choose what’s best for you. Choose the operating system that you want to use. Build your systems in the provider you like best. If you want to build in multiple providers, Rancher gives you the tools to manage them as easily as you manage one. Use any provisioner.

What Rancher accelerates is the time between your decision to do something and when that thing is up and running. Rancher gets you out the gate and onto the track faster than any other solution.

The Wolf in a DIY Costume

Those who say that they want to “go vanilla” or “DIY” are usually looking at the cost of an alternative solution. Rancher is open source and free to use, so there’s no risk in trying it out and seeing what it does. It will even uninstall cleanly if you decide not to continue with it.

If you’re new to Kubernetes or if you’re not in a hands-on, in-the-trenches role, you might not know just how much work goes into correctly building and maintaining a single Kubernetes cluster, let alone multiple clusters. If you go the “vanilla Kubernetes” route with the hope that you’ll get a better ROI, it won’t work out. You’ll pay for it somewhere else, either in staff time, additional headcount, lost opportunity, downtime, or other places where time constraints interfere with progress.

Rancher takes all of the maintenance tasks for clusters and turns them into a workflow that saves time and money while keeping everything truly enterprise-grade. It will do this for single and multi-cluster Kubernetes environments, on-premise or in the cloud, for direct use or for business units offering Kubernetes-as-a-service, all from the same installation. It will even import the Kubernetes clusters you’ve already deployed and start managing them.

Having more than 20,000 deployments in production is something that we’re proud of. Being the container management platform for mission-critical applications in over 200 companies across so many verticals also makes us proud.

What we would really like is to have you be part of our community.

Join us in showing the world that there’s a better way. Download Rancher and start living in the house you deserve.

Read the Guide to Kubernetes with Rancher
This guide shows the challenges in running Kubernetes in production and how Rancher helps.
Tags: ,,, Category: Uncategorized Comments closed

Kubernetes in the Region: Observations and an Offer

Tuesday, 19 February, 2019

Find a Rodeo workshop near you
Rancher Rodeos are free, in-depth workshops where you can learn to deploy containers and Kubernetes in production.

Since joining Rancher Labs to head up the Australia, New Zealand, and Singapore region, my day revolves around discussing containers/Kubernetes use cases and adoption with many of the top enterprises, DevOps groups, and executives in the area. Not only is this a great learning experience and a fantastic way to meet people, it is also a huge eye opener into the many reasons why Kubernetes adoption is growing so rapidly and what the current challenges are. I want to quickly share some of my observations and make an offer for you to join us for some free hands-on training.

Some Observations

Everyone is Doing Something with Kubernetes

It doesn’t matter which event, meetup, or customer discussion I’m in — every enterprise is doing something with Kubernetes. It’s like the adoption of virtualization, only the discussion is slightly different. It’s not so much about which vendor or standard — Kubernetes is the focus. Instead, it’s about how to do Kubernetes and what are the associated best practices, scalable architectures, and security considerations.

Kubernetes Native, but How to Do It at Operational Scale?

The community and ecosystem around Kubernetes is growing every day, with strong capabilities, so there is a strong desire to stay on “native” Kubernetes and not get sucked down a branch, fork, or vendor-specific offshoot of Kubernetes. It seems that most enterprise and groups begin this way and get into production with Kubernetes. However, there is a clear point at which scale becomes an operational challenge and basic tools need to be supplemented or worked on to help manage multiple Kubernetes namespaces, multiple clusters, authentication, RBAC, policy, monitoring, and logging across many development teams.

It’s About Consuming Kubernetes, Not “Making” Kubernetes

Nobody wants to be in the business of creating Kubernetes snowflakes, or be in the business of allocating their resources to do work that adds no value. There is a learning curve for operationalizing Kubernetes, using Kubernetes, and deploying workloads into Kubernetes environments. Many enterprises are looking for ways to eliminate the learning curve or the need for specialized skills and instead just consume Kubernetes, using a Kubernetes-as-a-Service model. Much larger and faster gains can be made if consuming Kubernetes becomes the focus instead of making Kubernetes.

Both On-Premise and Public Cloud Kubernetes

As enterprises grow, iterate, and merge, an ever-increasing mixture of infrastructure environments and needs emerges. The same enterprise may create Kubernetes clusters using on-premise bare metal, with OpenStack and VMware-type infrastructures, as well as out on public clouds using Amazon, Google, Azure, Alibaba, and others. The portability and rapid pace of containers lends itself to these hybrid or multi-cloud scenarios (more so than VMs) and is quite quickly sprawling in this way. There is also quite an urgent need for air-gapped Kubernetes environments.

Public Cloud Kubernetes Providers

Most enterprises are now seriously looking at the Kubernetes services offered by public cloud providers, like EKS (now available in Australia & Singapore), GKE, and AKS. These are viable options and really do support some of the notions mentioned in my other observations, like consumability. Technical discussions here become much less about the Kubernetes cluster control planes and architecture, and more about integration of these clusters into enterprise management capabilities like authentication domains, security models, deployment pipelines, and multi-cloud strategies (e.g. on-premise or multiple public clouds).

Our Offer

We run free, half-day training sessions called Rancher Rodeos throughout the world. Among others, this month we have Rodeos in Sydney, Melbourne, and Singapore (registration for Singapore is not open yet). During these sessions, DevOps and IT professionals can get hands-on experience with how to quickly deploy an enterprise-ready Kubernetes environment on any infrastructure or cloud provider (or multiples of these) using Rancher. We will show how Rancher helps make enterprise Kubernetes consumable and native, with rapid results for development and infrastructure teams.

Please take us up on the offer, register here, and join us!

Find a Rodeo workshop near you
Rancher Rodeos are free, in-depth workshops where you can learn to deploy containers and Kubernetes in production.

Introduction to Kubernetes Namespaces

Monday, 28 January, 2019
Expert Training in Kubernetes and Rancher
Join our free online training sessions to learn more about Kubernetes, containers, and Rancher.

Introduction

Kubernetes clusters can manage large numbers of unrelated workloads concurrently and organizations often choose to deploy projects created by separate teams to shared clusters. Even with relatively light use, the number of deployed objects can quickly become unmanageable, slowing down operational responsiveness and increasing the chance of dangerous mistakes.

Kubernetes uses a concept called namespaces to help address the complexity of organizing objects within a cluster. Namespaces allow you to group objects together so you can filter and control them as a unit. Whether applying customized access control policies or separating all of the components for a test environment, namespaces are a powerful and flexible concept for handling objects as a group.

In this article, we’ll discuss how namespaces work, introduce a few common use cases, and cover how to use namespaces to manage your Kubernetes objects. Towards the end, we’ll also take a look at a Rancher feature called projects that builds on and extends the namespaces concept.

What are Namespaces and Why Are They Important?

Namespaces are the organizational mechanism that Kubernetes provides to categorize, filter by, and manage arbitrary groups of objects within a cluster. Each workload object added to a Kubernetes cluster must be placed within exactly one namespace.

Namespaces impart a scope for object names within a cluster. While names must be unique within a namespace, the same name can be used in different namespaces. This can have some important practical benefits for certain scenarios. For example, if you use namespaces to segment application life cycle environments — like development, staging, and production — you can maintain copies of the same objects, with the same names, in each environment.

Namespaces also allow you to easily apply policies to specific slices of your cluster. You can control resource usage by defining ResourceQuota objects, which set limits on consumption on a per-namespace basis. Similarly, when using a CNI (container network interface) that supports network policies on your cluster, like Calico or Canal (Calico for policy with flannel for networking), you can apply a NetworkPolicy to the namespace with rules that dictate how pods can be communicate with one another. Different namespaces can be given different policies.

One of the greatest benefits of using namespaces is being able to take advantage of Kubernetes RBAC (role-based access control). RBAC allows you to develop roles, which group a list of permissions or abilities, under a single name. ClusterRole objects exist to define cluster-wide usage patterns, while the Role object type is applied to a specific namespace, giving greater control and granularity. Once a Role is created, a RoleBinding can grant the defined capabilities to a specific user or group of users within the context of a single namespace. In this way, namespaces let cluster operators map the same policies to organized sets of resources.

Common Namespace Usage Patterns

Namespaces are an incredibly flexible feature that doesn’t impose a specific structure or organizational pattern. That being said, there are some common patterns that many teams find useful.

Mapping Namespaces to Teams or Projects

One convention to use when setting up namespaces is to create one for each discrete project or team. This melds well with many of the namespace characteristics we mentioned earlier.

By giving a team a dedicated namespace, you can allow self-management and autonomy by delegating certain responsibilities with RBAC policies. Adding and removing members from the namespace’s RoleBinding objects is a simple way to control access to the team’s resources. It is also often useful to set resource quotas for teams and projects. This way, you can ensure equitable access to resources based the organization’s business requirements and priorities.

Using Namespaces to Partition Life Cycle Environments

Namespaces are well suited for carving out development, staging, and production environments within cluster. While it recommended to deploy production workloads to an entirely separate cluster to ensure maximum isolation, for smaller teams and projects, namespaces can be a workable solution.

As with the previous use case, network policies, RBAC policies, and quotas are big factors in why this can be successful. The ability to isolate the network to control communication to your components is a fundamental requirement when managing environments. Likewise, namespace-scoped RBAC policies allow operators to set strict permissions for production environments. Quotas help you guarantee access to important resources for your most sensitive environments.

The ability to reuse object names is also helpful here. Objects can be rolled up to new environments as they they are tested and released while retaining their original name. This helps avoid confusion around which objects are analogous across environments and reduces cognitive overhead.

Using Namespaces to Isolate Different Consumers

Another use case that namespaces can help with is segmenting workloads by their intended consumers. For instance, if your cluster provides infrastructure for multiple customers, segmenting by namespace allows you to manage each independently while keeping track of usage for billing purposes.

Once again, namespace features allow you to control network and access policies and define quotas for your consumers. In cases where the offering is fairly generic, namespaces allow you to develop and deploy a different instance of the same templated environment for each of your users. This consistency can make management and troubleshooting significantly easier.

Understanding the Preconfigured Kubernetes Namespaces

Before we take a look at how to create your own namespaces, let’s discuss what Kubernetes sets up automatically. By default, three namespaces are available on new clusters:

  • default: Adding an object to a cluster without providing a namespace will place it within the default namespace. This namespace acts as the main target for new user-added resources until alternative namespaces are established. It cannot be deleted.
  • kube-public: The kube-public namespace is intended to be globally readable to all users with or without authentication. This is useful for exposing any cluster information necessary to bootstrap components. It is primarily managed by Kubernetes itself.
  • kube-system: The kube-system namespace is used for Kubernetes components managed by Kubernetes. As a general rule, avoid adding normal workloads to this namespace. It is intended to be managed directly by the system and as such, it has fairly permissive policies.

While these namespaces effectively segregate user workloads the system-managed workloads, they do not impose any additional structure to help categorize and manage applications. Thankfully, creating and using additional namespaces is very straightforward.

Working with Namespaces

Managing namespaces and the resources they contain is fairly straightforward with kubectl. In this section we will demonstrate some of the most common namespace operations so you can start effectively segmenting your resources.

Viewing Existing Namespaces

To display all namespaces available on a cluster, use use the kubectl get namespaces command:

kubectl get namespaces
NAME            STATUS    AGE
default         Active    41d
kube-public     Active    41d
kube-system     Active    41d

The command will show all available namespaces, whether they are currently active, and the resource’s age.

To get more information about a specific namespace, use the kubectl describe command:

kubectl describe namespace default
Name:         default
Labels:       field.cattle.io/projectId=p-cmn9g
Annotations:  cattle.io/status={"Conditions":[{"Type":"ResourceQuotaInit","Status":"True","Message":"","LastUpdateTime":"2022-11-17T23:17:48Z"},{"Type":"InitialRolesPopulated","Status":"True","Message":"","LastUpda...
              field.cattle.io/projectId=c-7tf7d:p-cmn9g
              lifecycle.cattle.io/create.namespace-auth=true
Status:       Active

No resource quota.

No resource limits.

This command can be used to display the labels and annotations associated with the namespace, as well as any quotas or resource limits that have been applied.

Creating a Namespace

To create a new namespace from the command line, use the kubectl create namespace command. Include the name of the new namespace as the argument for the command:

kubectl create namespace demo-namespace
namespace "demo-namespace" created

You can also create namespaces by applying a manifest from a file. For instance, here is a file that defines the same namespace that we created above:

# demo-namespace.yml
apiVersion: v1
kind: Namespace
metadata:
  name: demo-namespace

Assuming the spec above is saved to a file called demo-namespace.yml, you can apply it by typing:

kubectl apply -f demo-namespace.yml

Regardless of how we created the namespace, if we check our available namespaces again, the new namespace should be listed (we use ns, a shorthand for namespaces, the second time around):

kubectl get ns
NAME             STATUS    AGE
default          Active    41d
demo-namespace   Active    2m
kube-public      Active    41d
kube-system      Active    41d

Our namespace is available and ready to use.

Filtering and Performing Actions by Namespace

If we deploy a workload object to the cluster without specifying a namespace, it will be added to the default namespace:

kubectl create deployment --image nginx demo-nginx
deployment.extensions "demo-nginx" created

We can verify the deployment was created in the default namespace with kubectl describe:

kubectl describe deployment demo-nginx | grep Namespace
Namespace:              default

If we try to create a deployment with the same name again, we will get an error because of the namespace collision:

kubectl create deployment --image nginx demo-nginx
Error from server (AlreadyExists): deployments.extensions "demo-nginx" already exists

To apply an action to a different namespace, we must include the --namespace= option in the command. Let’s create a deployment with the same name in the demo-namespace namespace:

kubectl create deployment --image nginx demo-nginx --namespace=demo-namespace
deployment.extensions "demo-nginx" created

This newest deployment was successful even though we’re still using the same deployment name. The namespace provided a different scope for the resource name, avoiding the naming collision we experienced earlier.

To see details about the new deployment, we need to specify the namespace with the --namespace= option again:

kubectl describe deployment demo-nginx --namespace=demo-namespace | grep Namespace
Namespace:              demo-namespace

This confirms that we have created another deployment called demo-nginx within our demo-namespace namespace.

Selecting Namespace by Setting the Context

If you want to avoid providing the same namespace for each of your commands, you can change the default namespace that commands will apply to by configuring your kubectl context. This will modify the namespace that actions will apply to when that context is active.

To list your context configuration details, type:

kubectl config get-contexts
CURRENT   NAME      CLUSTER   AUTHINFO   NAMESPACE
*         Default   Default   Default

The above indicates that we have a single context called Default that is being used. No namespace is specified by the context, so the default namespace applies.

To change the namespace used by that context to our demo-context, we can type:

kubectl config set-context $(kubectl config current-context) --namespace=demo-namespace
Context "Default" modified.

We can verify that the demo-namespace is currently selected by viewing the context configuration again:

kubectl config get-contexts
CURRENT   NAME      CLUSTER   AUTHINFO   NAMESPACE
*         Default   Default   Default    demo-namespace

Validate that our kubectl describe command now uses demo-namespace by default by asking for our demo-nginx deployment without specifying a namespace:

kubectl describe deployment demo-nginx | grep Namespace
Namespace:              demo-namespace

Deleting a Namespace and Cleaning Up

If you no longer require a namespace, you can delete it.

Deleting a namespace is very powerful because it not only removes the namespaces, but it also cleans up any resources deployed within it. This can be very convenient, but also incredibly dangerous if you are not careful.

It is always a good idea to list the resources associated with a namespace before deleting to verify the objects that will be removed:

kubectl get all --namespace=demo-namespace
NAME                              READY     STATUS    RESTARTS   AGE
pod/demo-nginx-676fc7d85d-gkdz2   1/1       Running   0          56m

NAME                         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/demo-nginx   1         1         1            1           56m

NAME                                    DESIRED   CURRENT   READY     AGE
replicaset.apps/demo-nginx-676fc7d85d   1         1         1         56m

Once we are comfortable with the scope of the action, we can delete the demo-namespace namespace and all of the resources within it by typing:

kubectl delete namespace demo-namespace

The namespace and its resources will be removed from the cluster:

kubectl get namespaces
NAME            STATUS    AGE
default         Active    41d
kube-public     Active    41d
kube-system     Active    41d

If you previously changed the selected namespace in your kubectl context, you can clear the namespace selection by typing:

kubectl config set-context $(kubectl config current-context) --namespace=
Context "Default" modified.

While cleaning up demo resources, remember to remove the original demo-nginx deployment we initially provisioned to the default namespace:

kubectl delete deployment demo-nginx

Your cluster should now be in the state you began with.

Extending Namespaces with Rancher Projects

If you are using Rancher to manage your Kubernetes clusters, you have access to the extended functionality provided by the projects feature. Rancher projects are an additional organizational layer used to bundle multiple namespaces together.

Rancher projects overlay a control structure on top of namespaces that allow you to group namespaces into logical units and apply policy to them. Projects mirror namespaces in most ways, but act as a container for namespaces instead of for individual workload resources. Each namespace in Rancher exists in exactly one project and namespaces inherit all of the policies applied to the project.

By default, Rancher clusters define two projects:

  • Default: This project contains the default namespace.
  • System: This project contains all of the other preconfigured namespaces, including kube-public, kube-system, and any namespaces provisioned by the system.

You can see the projects available within your cluster by visiting the Projects/Namespaces tab after selecting your cluster:

Fig. 1: Rancher projects/namespaces view

Fig. 1: Rancher projects/namespaces view

From here, you can add projects by clicking on the Create Project button. When creating a project, you can configure the project members and their access rights and can configure security policies and resource quotas.

You can add a namespace to an existing project by clicking the project’s Create Namespace button. To move a namespace to a different project, select the namespace and then click the Move button. Moving a namespace to a new project switches immediately modifies the permissions and policies applied to the namespace.

Rather than introducing new organizational models, Rancher projects simply apply the same abstractions to namespaces that namespaces apply to workload objects. They fill in some usability gaps if you appreciate namespaces functionality but need an additional layer of control.

Conclusion

In this article, we introduced the concept of Kubernetes namespaces and how they can help organize cluster resources. We discussed how namespaces segment and scope resource names within a cluster and how policies applied at the namespace level can influence user permissions and resource allotment.

Afterwards, we covered some common patterns that teams employ to segment their clusters into logical pieces and we described Kubernetes’ preconfigured namespaces and their purpose. Then we took a look at how to create and work with namespaces within a cluster. We ended by taking a look at Rancher projects and how they extend the namespaces concept by grouping namespaces themselves.

Namespaces are an incredibly straightforward concept that help teams organize cluster resources and compartmentalize complexity. Taking a few minutes to get familiar with their benefits and characteristics can help you configure your clusters effectively and avoid trouble down the road.

Expert Training in Kubernetes and Rancher
Join our free online training sessions to learn more about Kubernetes, containers, and Rancher.
Tags: , Category: Products, Rancher Kubernetes Comments closed

Kubernetes vs Docker: What’s the difference?

Tuesday, 9 October, 2018

Expert Training in Kubernetes and Rancher
Join our free online training sessions to learn more about Kubernetes, containers, and Rancher.

Docker vs Kubernetes: The Journey from Docker to Kubernetes

The need to deploy applications from one computing environment to another quickly, easily, and reliably has become a critical part of enterprise’s business requirements and DevOps team’s daily workflow.

It’s unsurprising then, that container technologies, which make application deployment and management easier for teams of all sizes, have risen dramatically in recent years. At the same time, however, virtual machines (VM) as computing resources have reached their peak use in virtualized data centers. Since VMs existed long before containers, you may wonder what the need is for containers and why they have become so popular.

The Benefits and Limitations of Virtual Machines

Virtual machines allow you to run a full copy of an operating system on top of virtualized hardware as if it is a separate machine. In cloud computing, physical hardware on a bare metal server are virtualized and shared between virtual machines running on a host machine in a data center by the help of hypervisor (i.e. virtual machine manager).

Even though virtual machines bring us great deal of advantages such as running different operating systems or versions, VMs can consume a lot of system resources and also take longer boot time. On the other hand, containers share the same operating system kernel with collocated containers each one running as isolated processes. Containers are lightweight alternative by taking up less space (MBs) and can be provisioning rapidly (milliseconds) as opposed to VM’s slow boot time (minutes) and more storage space requirements (GBs). This allows containers to operate at an unprecedented scale and maximize the number of applications running on minimum number of servers. Therefore, containerization shined drastically in the recent years because of all these advantages for many software projects of enterprises.

Need for Docker Containers and Container Orchestration Tools

Since its initial release in 2013, Docker has become the most popular container technology worldwide, despite a host of other options, including RKT from CoreOS, LXC, LXD from Canonical, OpenVZ, and Windows Containers.

However, Docker technology alone is not enough to reduce the complexity of managing containerized applications, as software projects get more and more complex and require the use tens of thousands of Docker containers. To address these larger container challenges, substantial number of container orchestration systems, such as Kubernetes and Docker Swarm, have exploded onto the scene shortly after the release of Docker.

There has been some confusion surrounding Docker and Kubernetes for awhile: “what they are?”, “what they are not?”, “where are they used?”, and “why are both needed?”

This post aims to explain the role of each technology and how each technology helps companies ease their software development tasks. By the end of this article, you’ll understand that the choice is not Docker vs Kubernetes, but Kubernetes vs alternative container orchestrators.

Let’s use a made-up company, NetPly (sounds familiar?), as a case study to highlight the issues we are addressing.

NetPly is an online and on-demand entertainment movie streaming company with 30 million members in over 100 countries. NetPly delivers video streams to your favorite devices and provides personalized movie recommendations to their customers based on their previous activities, such as sharing or rating a movie. To run their application globally, at scale, and provide quality of service to their customers, NetPly runs 15,000 production servers worldwide and follow agile methodology to deploy new features and bug fixes to the production environment at a fast clip.

However, NetPly has been struggling with two fundamental issues in their software development lifecycle:

Issue 1- Code that runs perfectly in a development box, sometimes fails on test and/or production environments. Therefore, NetPly would like to keep code and configuration consistent across their development, test, and production environments to reduce the issues arising from application hosting environments.

Issue 2- Viewers experience a lot of lags as well as poor quality and degraded performance for video streams during weekends, nights, and holidays, when incoming requests spike. To resolve this potentially-devastating issue, NetPly would like to use load-balancing and auto scaling techniques and automatically adjust the resource capacity (e.g. increase or decrease number of computing resources) to maintain application availability, provide stable application performance, and optimize operational costs as computing demand increases or decreases. These requests also require NetPly to manage the complexity of computing resources and the connections between the flood of these resources in production.

Docker can be used to resolve Issue 1 by following a container-based approach; in other words, packaging application code along with all of its dependencies, such as libraries, files, and necessary configurations, together in a Docker image.

Docker is an open-source operating system level virtualized containerization platform with a light-weight application engine to run, build and distribute applications in Docker containers that run nearly anywhere. Docker containers, as part of Docker, are portable and light-weight alternative to virtual machines, and eliminate the waste of esources and longer boot times of the virtual-machine approach. Docker containers are created using Docker images, which consist of a prebuilt application stack required to launch the applications inside the container.

With that explanation of a Docker container in mind, let’s go back our successful company that is under duress: NetPly. As more users simultaneously request movies to watch on the site, NetPly needs to scale up more Docker containers at a reasonably fast rate and scale down when the traffic lowers. However, Docker alone is not capable of taking care of this job, and writing simple shell scripts to scale the number of Docker containers up or down by monitoring the network traffic or number of requests that hit to the server would not be a viable and practicable solution.

As the number of containers increases to tens of hundreds to thousands, and the NetPly IT team starts managing fleets of containers across multiple heterogeneous host machines, it becomes a nightmare to execute Docker commands like “docker run”, “docker kill”, and “docker network” manually.

Right at the point where the team starts launching containers, wiring them together, ensuring high availability even when a host goes down, and distributing the incoming traffic to the appropriate containers, the team wishes they had something that handled all these manual tasks with no or minimal intervention. Exit human, enter program.

To sum up: Docker by itself is not enough to handle these resources demands at scale. Simple shell commands alone are not sufficient to handle tasks for a tremendous number of containers on a cluster of bare metal or virtual servers. Therefore, another solution is needed to handle all these hurdles for the NetPly team.

This is where the magic starts with Kubernetes. Kubernetes is as container orchestration engine (COE), originally developed by Google and used to resolve NetPly’s Issue 2. Kubernetes allows you to handle fleets of containers. Kubernetes automatically manages the deployment, scaling and networking of containers, as well as container failovers by launching a new one with ease.

The following are some of the fundamental features of Kubernetes.

  • Load balancing

  • Configuration management

  • Automatic IP assignment

  • Container scheduling

  • Health checks and self healing

  • Storage management

  • Auto rollback and rollout

  • Auto scaling

Container Orchestration Alternatives

Although Kubernetes seems to solve the challenges our NetPly team faces, there are a good deal of container management tool alternatives for Kubernetes out there.

Docker Swarm, Marathon on Apache Mesos, and Nomad are all container orchestration engines that can also be used for managing your fleet of containers.

Why choose anything other than Kubernetes? Although Kubernetes has a lot of great qualities, it has challenges too. The most arresting issues people face with Kubernetes are:

  1. the steep learning curve to its commands;

  2. setting Kubernetes up for different operating systems.

As opposed to Kubernetes, Docker Swarm uses the Docker CLI to manage all container services. Docker Swarm is easy to set up, has less commands to learn to get started rapidly, and is cheaper to train employees. A drawback of Docker Swarm bounds you to the limitations of the Docker API.

Another option is the Marathon framework on Apache Mesos. It’s extremely fault-tolerant and scalable for thousands of servers. However, it may be too complicated to set up and manage small clusters with Marathon, making it impractical for many teams.

Each container management tool comes with its own set of advantages and disadvantages. However, Kubernetes with its heritage based in Google’s Borg system, has been greatly adopted and supported by the large community as well as industry for many years and become the most popular container management solution among other players. With the power of both Docker and Kubernetes, it seems like journey of the power and popularity of these technologies will continue to rise and being adopted by even larger communities.

In our next article in this series, we will compare in more depth Kubernetes and Docker Swarm.

Expert Training in Kubernetes and Rancher
Join our free online training sessions to learn more about Kubernetes, containers, and Rancher.

The Metrics that Matter: Horizontal Pod Autoscaling with Metrics Server

Tuesday, 26 June, 2018

Take a deep dive into Best Practices in Kubernetes Networking
From overlay networking and SSL to ingress controllers and network security policies, we’ve seen many users get hung up on Kubernetes networking challenges. In this video recording, we dive into Kubernetes networking, and discuss best practices for a wide variety of deployment options.

Sometimes I feel that those of us with a bend toward distributed systems engineering like pain. Building distributed systems is hard. Every organization regardless of industry, is not only looking to solve their business problems, but to do so at potentially massive scale. On top of the challenges that come with scale, they are also concerned with creating new features and avoiding regression. And even if they achieve all of those objectives with excellence, there’s still concerns about information security, regulatory compliance, and building value into all the investment of the business.

If that picture sounds like your team and your system is now in production – congratulations! You’ve survived round 1.

Regardless of your best attempts to build a great system, sometimes life happens. There’s lots of examples of this. A great product, or viral adoption, may bring unprecidented success, and bring with it an end to how you thought your system may handle scale.

Pokémon GO Cloud Datastore Transactions Per Second Expected vs. Actual

Source: Bringing Pokémon GO to life on Google Cloud, pulled 30 May 2018

You know this may happen, and you should be prepared. That’s what this series of posts is about. Over the course of this series we’re going to cover things you should be tracking, why you should track it, and possible mitigations to handle possible root causes.

We’ll walk through each metric, methods for tracking it and things you can do about it. We’ll be using different tools for gathering and analyzing this data. We won’t be diving into too many details, but we’ll have links so you can learn more. Without further ado, let’s get started.

Metrics are for Monitoring, and More

These posts are focused upon monitoring and running Kubernetes clusters. Logs are great, but at scale they are more useful for post-mortem analysis than alerting operators that there’s a growing problem. Metrics Server allows for the monitoring of container CPU and memory usage as well as on the nodes they’re running.

This allows operators to set and monitor KPIs (Key Performance Indicators). These operator-defined levels give operations teams a way to determine when an application or node is unhealthy. This gives them all the data they need to see problems as they manifest.

In addition, Metrics Server allows Kubernetes to enable Horizontal Pod Autoscaling. This capability allows Kubernetes autoscaling to scale pod instance count for a number of API objects based upon metrics reported by the Kubernetes Metrics API, reported by Metrics Server.

If you’re just getting underway with Kubernetes, read the Introduction to Kubernetes Monitoring, which will help you get the most out of the rest of this article.

Setting up Metrics Server in Rancher-Managed Kubernetes Clusters

Metrics Server became the standard for pulling container metrics starting with Kubernetes 1.8 by plugging into the Kubernetes Monitoring Architecture. Prior to this standardization, the default was Heapster, which has been deprecated in favor of Metrics Server.

Today, under normal circumstances, Metrics Server won’t run on a Kubernetes Cluster provisioned by Rancher 2.0.2. This will be fixed in a later version of Rancher 2.0. Check our Github repo for the latest version of Rancher.

In order to make this work, you’ll have to modify the cluster definition via the Rancher Server API. Doing so will allow the Rancher Server to modify the Kubelet and KubeAPI arguments to include the flags required for Metrics Server to function properly.

Instructions for doing this on a Rancher Provisioned cluster, as well as instructions for modifying other hyperkube-based clusters is availabe on github here.

Take a deep dive into Best Practices in Kubernetes Networking
From overlay networking and SSL to ingress controllers and network security policies, we’ve seen many users get hung up on Kubernetes networking challenges. In this video recording, we dive into Kubernetes networking, and discuss best practices for a wide variety of deployment options.