Showing posts with label open source.

June 14, 2023

Changing the focus of this blog: now... Observability

My previous post about Infrastructure as Code concludes the exploration of Data Center and Cloud solutions and the methodologies that are related to automation and optimization of IT processes.

I've been working in this area for 15 years, after spending the previous 15 in software development. 
It's been an amazing adventure and I really enjoyed learning new things, exploring and challenging some limits - and sharing the experience with you.

Now I am starting to focus on a new adventure... possibly for the next 15 years 😜

I have taken on a new professional role: technical lead for Full Stack Observability, EMEA sales, at Cisco AppDynamics. From now on, I will tell you stories about my experience with the ultimate evolution of monitoring: it's all about collecting telemetry from every single component of your business architecture, including digital services (= distributed software applications), computing infrastructure, network, cloud resources, IoT, etc.

It's not just about putting all those data together, but about correlating them to create insight: transforming raw data into information about the health of your processes, and matching business KPIs with the state of the infrastructure that supports the services.

To visualize and navigate that information, you can subscribe to different domain models (or create your own). These are views of the world built specifically for each stakeholder: from lines of business to application managers, from SREs to network operations and security teams...

A domain model is made of entities and their relationships, where entities represent what is relevant in your (business or technical) domain. They might be the usual entities in an APM domain (applications, services, business transactions...) or infrastructure entities (servers, VMs, clusters, K8s nodes, etc.).
You can also model entities in a business domain (e.g. trains, stations, passengers, tickets, etc.).
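
To make the idea of entities and relationships more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the class names, the relation labels (`runs_on`, `depends_on`) and the sample entities are hypothetical and do not reflect how the Cisco platform stores its domain models.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    kind: str  # e.g. "service", "k8s_node", "train"
    name: str


@dataclass
class DomainModel:
    entities: set = field(default_factory=set)
    # Relationships are stored as (source, relation, target) triples.
    relationships: set = field(default_factory=set)

    def relate(self, source: Entity, relation: str, target: Entity) -> None:
        """Register both entities and the relationship between them."""
        self.entities.update({source, target})
        self.relationships.add((source, relation, target))

    def related(self, source: Entity, relation: str) -> list:
        """Return the targets linked to `source` by `relation`, sorted by name."""
        return sorted(
            (t for s, r, t in self.relationships if s == source and r == relation),
            key=lambda e: e.name,
        )


# Hypothetical sample data mixing technical and business entities.
model = DomainModel()
ticketing = Entity("service", "ticketing")
node = Entity("k8s_node", "node-1")
train = Entity("train", "FR9600")
model.relate(ticketing, "runs_on", node)
model.relate(train, "depends_on", ticketing)
```

Calling `model.related(ticketing, "runs_on")` returns the node that hosts the service, which is the kind of navigation a stakeholder view would build on.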




Unlike Application Performance Monitoring (APM), where solutions like AppDynamics and its competitors excel at drilling down into the application architecture and its topology, Full Stack Observability gives you full end-to-end control and a context shared among all the teams that collaborate on building, running and operating the business ecosystem.

New standards like OpenTelemetry make it easy to convey Metrics, Events, Logs and Traces (MELT) to a single platform from every part of the system, possibly including robots in manufacturing, GPS tracing from your supply chain, etc.

All these data will be filtered according to the domain model, and those that are relevant will feed the databases containing information about the domain entities and their relationships, which are used to populate the dashboards.
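
The filtering step can be pictured with a small Python sketch. It is a simplification based on my own assumptions: the record fields (`entity_type`, `entity`, `signal`) and the set of relevant types are made up for the example, and a real platform would of course do much more than this.

```python
from collections import defaultdict

# Entity types that the (hypothetical) domain model cares about.
RELEVANT_TYPES = {"service", "k8s_node"}


def ingest(records, relevant_types=RELEVANT_TYPES):
    """Keep only records about modeled entity types, grouped per entity."""
    store = defaultdict(list)
    for record in records:
        if record["entity_type"] in relevant_types:
            store[(record["entity_type"], record["entity"])].append(record)
    return dict(store)


# Made-up telemetry: the printer is not part of the domain model.
records = [
    {"entity_type": "service", "entity": "ticketing", "signal": "metric",
     "name": "latency_ms", "value": 120},
    {"entity_type": "printer", "entity": "lobby-1", "signal": "log",
     "name": "paper_low", "value": 1},
    {"entity_type": "k8s_node", "entity": "node-1", "signal": "metric",
     "name": "cpu_pct", "value": 71},
]
store = ingest(records)
```

Only the records about the service and the node end up in `store`; the printer log is dropped because no entity type in the model matches it.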




Those data will be matched with any other source of information that is relevant in your business domain (CRM, sales, weather forecast, logistics...) so that you can analyse and forecast the health of the business and relate it to the technologies and the processes behind it. You can quickly remediate problems because the root cause is easier to detect, and you can even be proactive in preventing problems before they occur (or before they are perceived by end users). At the same time, you are able to spot opportunities for optimising the performance and the cost efficiency of the system.

To see the official messaging from Cisco about Full Stack Observability, check this page describing the FSO Platform.

Stay tuned, interesting news is coming...


July 19, 2019

Just one button to provision a production-grade Kubernetes cluster

(this is a guest post, authored by my esteemed colleague Fabio Di Niro)

Do you remember?


I bet all of you who are working or playing with Kubernetes remember perfectly the first time you tried to install it.
And the second.
And the third.
...
And the one that finally worked out.

And if you’re a professional, you also remember the long path that led you to the Kubernetes expertise you need to install and fine-tune production-grade clusters.
Or, if you’re not a Kubernetes professional, you probably remember how much time it took you to find someone able to perform a valid Kubernetes install... and how much it cost.

To save our customers all this time and effort, Cisco released the Cisco Container Platform (CCP), a turnkey solution to easily provision production-grade Kubernetes clusters on-prem or in the cloud in minutes, with a few mouse clicks and requiring little to no knowledge of K8s.
All the needed integrations with network, storage, computing and security are done automatically by CCP so that the provisioned K8s clusters are ready to run in production.
Clusters provisioned by CCP are already equipped with finely configured monitoring and logging tools like Fluentd, Grafana, Elasticsearch and Kibana.
Through the Container Network Interface (CNI) you can choose whether to leverage Cisco ACI as network infrastructure or Calico (no dependence on the underlying infrastructure).

This is already great, but I thought I would create a demo that pushes the simplicity of those “few mouse clicks” to its limit, making it possible to create a production-grade cluster in just one click.







Introducing the Kubernetes dash button.

The concept is fairly simple: build a dash button that, once pressed, creates a production grade Kubernetes cluster ready to use.

Leveraging the rich set of Cisco Container Platform (CCP) APIs, this is almost too easy, so I decided to add a few more features on top:

- I wanted to provision the cluster and access it just through the dash button, so I wanted CCP to display the IP address of the master node of the new cluster on the dash button itself
- The start and finish of the cluster provisioning process had to be confirmed, so communication with the dash button had to be bi-directional
- I wanted a decent battery life that would spare me from recharging the button every day, so I needed electronics able to sleep or hibernate
- My lab, where I have the infrastructure and the CCP, is behind a proxy, so I can’t listen for incoming calls inside the lab: I can only initiate communications from the lab. So, I needed a way to turn the “push” of the button into a “pull” of the button press information
- I wanted to use the button everywhere I go without worrying about the local Wi-Fi settings



How it works

To satisfy all the above requirements I added a couple of elements to the picture, ending up with the following architecture:



The button is based on an Arduino ESP32 board. It connects via Wi-Fi to my smartphone and uses its internet connection, so I can use the button wherever my phone has a data signal. The button leverages a publish-subscribe message service (MQTT) in the cloud to get around the proxy in front of my lab and reach a couple of scripts that call the right API in the Cisco Container Platform to trigger the provisioning of a shiny new Kubernetes cluster.
Once the cluster is provisioned, the IP address of the master node is returned to the dash button, which shows it on its display. At this point the cluster is ready to accept connections and be used.
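
The flow can be sketched in a few lines of Python. This is not the code from the project: it is an in-memory stand-in that only shows how a message queue lets the lab initiate every exchange. The topic names, the `provision_cluster` placeholder and the returned address are my assumptions for the example. A real MQTT client would not poll; it keeps an outbound connection open to the broker and receives messages through it, which is what makes the approach proxy-friendly.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Stand-in for a cloud MQTT broker: per-topic FIFO queues."""

    def __init__(self):
        self._topics = defaultdict(deque)

    def publish(self, topic, payload):
        self._topics[topic].append(payload)

    def poll(self, topic):
        """Return the next message on a topic, or None if it is empty."""
        queue = self._topics[topic]
        return queue.popleft() if queue else None


def provision_cluster(target):
    # Placeholder for the scripts that call the CCP API; the address is made up.
    return "192.0.2.10"


def lab_worker_step(broker):
    """One cycle run from inside the lab, which only makes outbound calls."""
    request = broker.poll("button/pressed")
    if request is None:
        return False
    broker.publish("button/status", "provisioning started")
    master_ip = provision_cluster(request)
    broker.publish("button/status", f"ready: {master_ip}")
    return True


broker = InMemoryBroker()
broker.publish("button/pressed", "on-prem")  # the button press
handled = lab_worker_step(broker)            # the lab pulls the request
first = broker.poll("button/status")         # the button reads the confirmations
second = broker.poll("button/status")
```

The two status messages cover the bi-directional requirement: one confirms the start of provisioning, the other carries the master node address for the display.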

A 3D-printed enclosure completed my project. I started from an existing model, but then I decided to highlight the capability of CCP to deploy K8s clusters on-prem or in the cloud, so I designed the two different enclosures you can see in the picture: two dash buttons for the two deployment targets.
All the code and 3D designs have been released and are publicly available at: https://github.com/fdiniro/CCPDashButton




Now, before doing my demo, I can ask my customers: “How much time and effort does it take you to install a production-grade, fully operationalized and secured Kubernetes cluster?” Whatever answer I get, I know I can reply: “I can do it in 2 minutes, blindfolded and cuffed”.

You can see the recorded demo here: https://youtu.be/-F-xR0XNPBs



June 15, 2017

The best open source solution for microservices networking



Introduction

There is a big (justified) hype around containers and microservices.
Indeed, many people speak about the subject but few have implemented a real project.
There are also a lot of excellent resources on the web, so there is no need for my additional contribution.
I just want to offer my few readers another proof that a great solution exists for container networking, and that it works well.
You will find evidence in this post, along with pointers to resources and tutorials.
I will explain it in very basic terms, as I did for Cisco ACI and SDN, because I’m not talking to network specialists (you know I’m not one either) but to software developers and designers.
Most of the content here is reused from my sessions at Codemotion 2017 in Rome and Amsterdam (you can see the recording on YouTube).

Containers Networking

When the world moved from bare metal servers to virtual machines, virtual networks were also created and added great value (plus some need for management).


physical hosts connected to a network
Initially networking was simple

Of course virtual networks make life easier for developers and server managers, but they also add complexity for network managers: now there are two distinct networks that need to be managed and integrated.
You cannot simply assume that an overlay network runs on a dumb physical pipe with infinite bandwidth, zero latency and no need for end-to-end troubleshooting.


Virtual Machines connected to an overlay network

With the advent of containers, their virtual networking was layered on top of the VM virtual network (the majority of containers run inside VMs for a number of reasons), though there are good examples of container runtimes on physical hosts.
So now you have 3 network layers stacked on top of each other, and a need to manage the network end to end that makes your work even more complex.


Containers inside VM: many layers of overlay networks

This increased abstraction creates some issues when you try to leverage the value of resources in the physical environment:
- connectivity: it's difficult to insert network services, like load balancers and firewalls, in the data path of microservices (regardless of the virtual or physical nature of the appliances).
- performance: every overlay tier brings its own encapsulation (e.g. VXLAN). Encapsulation over encapsulation over encapsulation starts penalizing performance... just a little  ;-)
- hardware integration: some advanced features of your network (performance optimization, security) cannot be leveraged
Do not despair: we will see that a solution exists for this mess.
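
To put a number on the cost of stacked encapsulation, here is a small back-of-the-envelope calculation in Python. It assumes a VXLAN overlay on an IPv4 underlay without VLAN tags, where each layer adds 50 bytes of headers (Ethernet 14, IPv4 20, UDP 8, VXLAN 8). It only counts bytes on the wire: the CPU cost of encapsulating and decapsulating each packet comes on top of this.

```python
# Per-layer VXLAN overhead over IPv4 without VLAN tags, in bytes.
VXLAN_OVERHEAD = 14 + 20 + 8 + 8  # Ethernet + IPv4 + UDP + VXLAN headers


def inner_mtu(underlay_mtu, overlay_layers):
    """Largest inner IP packet that fits after nesting the given overlays."""
    return underlay_mtu - overlay_layers * VXLAN_OVERHEAD


def overhead_pct(underlay_mtu, overlay_layers):
    """Share of each full-size packet spent on encapsulation headers."""
    lost = underlay_mtu - inner_mtu(underlay_mtu, overlay_layers)
    return round(100 * lost / underlay_mtu, 2)
```

With a 1500-byte underlay MTU, `inner_mtu(1500, 1)` is 1450 and `inner_mtu(1500, 2)` is 1400, so two nested overlays spend about 6.67% of every full-size packet on headers.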

Microservices Networking


This short paragraph describes the existing implementation of the networking layer inside the container runtime.
Generally it is based on a pluggable architecture, so that the container engine can delegate the management of the container's traffic to a plugin. You can choose among a number of good solutions from the open source community, including the default implementation from Docker.

Minimally the networking layer provides:
- IP Connectivity in Container’s Network Namespace
- IPAM, and Network Device Creation (eth0)
- Route Advertisement or Host NAT for external connectivity
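
As an illustration of the IPAM part, here is a minimal Python sketch of an address allocator built on the standard `ipaddress` module. It is my own simplification, not Docker's implementation: the class name and the subnet are assumptions, and it ignores details such as reserving an address for the gateway.

```python
import ipaddress


class SimpleIpam:
    """Hands out host addresses from one subnet, lowest free address first."""

    def __init__(self, cidr):
        self.network = ipaddress.ip_network(cidr)
        self.allocated = {}  # container id -> address

    def allocate(self, container_id):
        """Return the container's address, allocating one if needed."""
        if container_id in self.allocated:
            return self.allocated[container_id]
        in_use = set(self.allocated.values())
        for address in self.network.hosts():
            if address not in in_use:
                self.allocated[container_id] = address
                return address
        raise RuntimeError("address pool exhausted")

    def release(self, container_id):
        """Free the container's address so it can be reused."""
        self.allocated.pop(container_id, None)


ipam = SimpleIpam("172.17.0.0/29")
a = ipam.allocate("web-1")
b = ipam.allocate("web-2")
ipam.release("web-1")
c = ipam.allocate("db-1")  # reuses the address freed by web-1
```

The point of the example is the lifecycle: addresses are tied to containers, and a released address goes back to the pool for the next container.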



containers networking
The networking for containers

There are two main architectures that allow you to plug in an external implementation for networking: CNM and CNI. Let's have a look at them.

The Container Network Model (CNM)    


Proposed by Docker to provide networking abstractions and APIs for container networking.
It is based on the concept of a Sandbox that contains the configuration of a container's network stack (a Linux network namespace).
An endpoint is a container's interface into a network (a pair of virtual Ethernet interfaces).
A network is a collection of arbitrary endpoints that can communicate.
A container can have multiple endpoints (and therefore belong to multiple networks).
  
CNM allows for the co-existence of multiple drivers, with each network managed by one driver.
It provides Driver APIs for IPAM and for Endpoint creation/deletion:
- IPAM Driver APIs: Create/Delete Pool, Allocate/Free IP Address
- Network Driver APIs: Network Create/Delete, Endpoint Create/Delete/Join/Leave

Used by Docker Engine, Docker Swarm, and Docker Compose.
It also works with other schedulers that run standard containers, e.g. Nomad or Mesos.
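
The three CNM concepts (sandbox, endpoint, network) can be modeled in a few lines of Python. This is just a conceptual sketch of the object model described above, not Docker's code: the class layout, the `join`/`leave` methods and the sample addresses are my assumptions.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Network:
    name: str
    driver: str  # each network is managed by exactly one driver
    endpoints: list = field(default_factory=list)


@dataclass(eq=False)
class Endpoint:
    network: Network
    ip: str


@dataclass(eq=False)
class Sandbox:
    """A container's network stack; it can hold several endpoints."""
    container: str
    endpoints: list = field(default_factory=list)

    def join(self, network, ip):
        """Create an endpoint that attaches this sandbox to a network."""
        endpoint = Endpoint(network, ip)
        network.endpoints.append(endpoint)
        self.endpoints.append(endpoint)
        return endpoint

    def leave(self, network):
        """Remove every endpoint that attaches this sandbox to a network."""
        for endpoint in [e for e in self.endpoints if e.network is network]:
            self.endpoints.remove(endpoint)
            network.endpoints.remove(endpoint)

    def networks(self):
        return [e.network.name for e in self.endpoints]


frontend = Network("frontend", driver="bridge")
backend = Network("backend", driver="overlay")
web = Sandbox("web-1")
web.join(frontend, "10.0.1.5")
web.join(backend, "10.0.2.5")
```

Here `web.networks()` lists both networks: one container, two endpoints, two networks, each network with its own driver.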


Container Network Model



 

The Container Network Interface (CNI)

 

Proposed by CoreOS as part of the appc specification, and also used by Kubernetes.
It is a common interface between the container runtime and the network plugin.
It gives the driver freedom to manipulate the network namespace.
The network is described by a JSON configuration.

Plugins support two commands: 
- Add Container to Network 
- Remove Container from Network     
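
To show what the two commands look like from the plugin side, here is a simplified Python sketch. In CNI the runtime executes the plugin, passes parameters such as `CNI_COMMAND`, `CNI_NETNS` and `CNI_IFNAME` as environment variables, and sends the JSON network configuration on standard input. The sketch takes those two inputs as function arguments so that it can run anywhere. The plugin name, the hard-coded address and the reduced result are assumptions for the example: a real plugin would actually configure the namespace and return the full result defined by the specification.

```python
import json


def handle(env, stdin_text):
    """Dispatch one CNI-style invocation and return the result as a dict."""
    config = json.loads(stdin_text)
    command = env["CNI_COMMAND"]
    if command == "ADD":
        # A real plugin would create the interface inside env["CNI_NETNS"] here.
        return {
            "cniVersion": config["cniVersion"],
            "interfaces": [{"name": env["CNI_IFNAME"], "sandbox": env["CNI_NETNS"]}],
            "ips": [{"address": "10.1.0.5/24", "interface": 0}],  # made-up address
        }
    if command == "DEL":
        # A real plugin would remove the interface and free its address.
        return {}
    raise ValueError(f"unsupported command: {command}")


# Hypothetical network configuration and invocation parameters.
config_text = json.dumps({"cniVersion": "0.4.0", "name": "demo-net", "type": "demo-plugin"})
env = {
    "CNI_COMMAND": "ADD",
    "CNI_CONTAINERID": "abc123",
    "CNI_NETNS": "/var/run/netns/abc123",
    "CNI_IFNAME": "eth0",
}
result = handle(env, config_text)
```

The small surface is the design point: the runtime only needs to know how to ask a plugin to add or remove a container, and everything else is left to the plugin.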

Container Network Interface


Many good implementations of the models above are available on the web.
You can pick one to complement the default implementation with a more sophisticated solution and benefit from better features.  

It looks so easy on my laptop. Why is it complex?


When developers set up the environment on their laptop, everything is simple.
You test your code and the infrastructure just works (you can also enjoy managing... the infrastructure as code).
No issues with performance, security, bandwidth, logs, or conflicts on resources (IP addresses, VLANs, names…).
But when you move to an integration test environment, or to a production environment, it’s no longer that easy.
IT administrators and the operations team are well aware of the need for stability, security, multi tenancy and other enterprise grade features.
So not all solutions are equal, especially for networking. Let's discuss their impact on Sally and Mike:





Sally (software developer) - she expects:
- Develop and test fast
- Agility and Elasticity
- Does not care about other users




Mike (IT Manager) - he cares for:
- Manage infrastructure
- Stability and Security
- Isolation and Compliance

These conflicting goals and priorities make collaboration harder and get in the way of adopting a DevOps approach.
A possible solution is Policy-based Container Networking.
Policy-based management is simpler thanks to Declarative Tags (used instead of complex command syntax), and it is faster because you manage Groups of resources instead of single objects (think of the cattle vs pets example).

What is Contiv


Contiv unifies containers, VMs, and bare metal servers with a single networking fabric, allowing container networks to be addressable from VM and bare-metal network endpoints.  Contiv combines strong network performance, support for industry-leading hardware, and an application-oriented policy that can move across networks together with the application.

Contiv's goal is to manage the "operational intent" of your deployment in a declarative way, as you generally do for the "application intent" of your microservices. This allows for a true infrastructure as code management and easy implementation of DevOps practices.



 


Contiv provides an IP address per container and eliminates the need for host-based port NAT. It works with different kinds of networks like pure layer 3 networks, overlay networks, and layer 2 networks, and provides the same virtual network view to containers regardless of the underlying technology. 

Contiv works with all major schedulers like Kubernetes, Docker Swarm, Mesos and Nomad. These schedulers provide compute resources to your containers and Contiv provides networking to them. Contiv supports both CNM (Docker networking Architecture) and CNI (CoreOS and Kubernetes networking architecture). 
Contiv has L2, L3 (BGP), Overlay (VXLAN) and ACI modes. It has built-in east-west service load balancing. Contiv also provides traffic isolation by separating control and data traffic.
It manages global resources: IPAM, VLAN/VXLAN pools.


Contiv Architecture


Contiv is made of a master node and an agent that runs on every host of your server farm:

Contiv and its clustered architecture
Contiv's support for clustered deployments

The master node(s) offer tools to manipulate Contiv objects. It is called Netmaster and implements CRUD (create, read, update, delete) operations using a REST interface. It is expected to be used by infra/ops teams and offers RBAC (role based access control).

The host agent (Netplugin) implements cluster-wide network and policy enforcement. It is stateless: very useful in case of a node failure/restart and upgrade.

A command line utility (that is a client of the master's REST API) is provided: it's named netctl.

Contiv and its cluster wide architecture
Contiv's architecture


Examples


Learning Contiv is very easy: the Contiv website offers a great tutorial that you can download and run locally.
For your convenience, I executed it on my computer and copied some screenshots here, with my comments to explain it step by step.

First, let's look at normal Docker networks (without Contiv) and how you create a new container and connect it to the default network:



Networks in Docker


You can inspect the virtual bridge (in the Linux server) that is managed by Docker: look at the IPAM section of the configuration and its Subnet, then at the vanilla-c container and its IP address.


How Docker sees its networks


You can also look at the network config from within the container:


the network config from within the container

Now we want to create a new network with Contiv, using its netctl command line interface:

Contiv's netctl command line interface

Here you can see how Docker lists and uses a Contiv network:


how Docker lists and uses a Contiv network

Look at the IPAM section, the name of the Driver, the name of the network and of the tenant:



We now connect a new container to the contiv-net network as it is seen by Docker: the command is identical when you use a network created by Contiv.



Multi tenancy

You can create a new Tenant using the netctl tenant create command:

Creating tenants in Contiv

A Tenant has its own networks, whose names and even subnets can overlap with those of other tenants. In the example below, the two networks are completely isolated, and the default tenant and the blue tenant ignore each other, even though the two networks have the same name and use the same subnet.
Everything works as if the other network did not exist (look at the "-t blue" argument in the commands).
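
The reason why identical names and subnets do not clash can be shown with a tiny Python model. This is only my sketch of the idea, not how Contiv stores its state: the class, its methods and the sample subnet are assumptions. The key point is that a network is identified by the pair (tenant, name) rather than by its name alone.

```python
import ipaddress


class TenantNetworks:
    """Networks are keyed by (tenant, name), so names and subnets may repeat."""

    def __init__(self):
        self._networks = {}

    def create(self, name, subnet, tenant="default"):
        key = (tenant, name)
        if key in self._networks:
            raise ValueError(f"network {name!r} already exists in tenant {tenant!r}")
        self._networks[key] = ipaddress.ip_network(subnet)

    def can_talk(self, a, b):
        """Two (tenant, network) attachments communicate only if identical."""
        return a == b and a in self._networks


nets = TenantNetworks()
nets.create("contiv-net", "10.1.2.0/24")                 # default tenant
nets.create("contiv-net", "10.1.2.0/24", tenant="blue")  # same name, same subnet
```

Two containers attached to `("default", "contiv-net")` can reach each other, while a container on `("blue", "contiv-net")` cannot reach them even though the name and the subnet are the same.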

Two different networks, with identical name and subnet, can exist in different tenants
Two different networks, with identical name and subnet


Let’s attach a new container to the contiv-net network in the blue tenant (the tenant name is explicitly used in the command, to specify the tenant's network):


All the containers connected to this network can communicate with each other. The network extends across the whole cluster and benefits from all the features of the Contiv runtime (see the website for a complete description).


The policy model: working with Groups


Contiv provides a way to apply isolation policies among groups of containers (regardless of the tenants, and possibly within a tenant). To do that we create a simple policy called db-policy, add some rules to it to define which ports are allowed, and then associate it with a group (db-group, which will contain all the containers that need to be treated the same).

Creating a policy in Contiv



Adding rules to a policy


Finally, we associate the policy with a group (a group is an arbitrary collection of containers, e.g. a tier for a microservice) and then run some containers that belong to db-group:


Creating a group in Contiv, so that many containers get the same policies
Creating a group




The policy named db-policy (defining, in this case, which ports are open and closed) is now applied to all three containers. Managing many endpoints as a single object makes operations easy and fast: just think about auto-scaling (especially when integrated with Swarm, Kubernetes, etc.).
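
The benefit for auto-scaling is easy to see in a small Python model. It is a conceptual sketch under my own assumptions, not Contiv's data model: the classes, the container names and the allowed port are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    name: str
    allowed_tcp_ports: set = field(default_factory=set)


@dataclass
class Group:
    name: str
    policy: Policy
    members: list = field(default_factory=list)

    def allows(self, container, port):
        """A port is open for a container only through its group's policy."""
        return container in self.members and port in self.policy.allowed_tcp_ports


db_policy = Policy("db-policy", {8888})  # the port number is an assumption
db_group = Group("db-group", db_policy)
db_group.members += ["db-1", "db-2", "db-3"]

# Scaling out: a new member inherits the policy with no extra rule.
db_group.members.append("db-4")
```

The rules are written once, against the group. When the scheduler adds `db-4`, the policy is already there waiting for it.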

The tutorial shows many other interesting features in Contiv, but I don't want to make this post too long  :-)


Features that make Contiv the best solution for microservices networking


  • Support for grouping applications or applications' components.
  • Easy scale-out: instances of containerized applications are grouped together and managed consistently.
  • Policies are specified on a micro-service tier, rather than on individual container workloads.
  • Efficient forwarding between microservice tiers.
  • Contiv allows for a fixed VIP (DNS published) for a micro-service
  • Containers within the micro-services can come and go fast, as resource managers auto-scale them, but policies are already there... waiting for them.
  • Containers' IP addresses are immediately mapped to the service IP for east-west traffic.
  • Contiv eliminates the single point of forwarding (proxy) between micro-service tiers.
  • Application visibility is provided at the services level (across the cluster).
  • Performance is great (see references below).
  • It mirrors the policy model that made Cisco ACI an easy and efficient solution for SDN, regardless of the availability of an ACI fabric (Contiv also works with other hardware and even with all-virtual networks).

I really invite you to have a look and test it yourself using the tutorial.

It's easy and not invasive at all: seeing is believing.


May 10, 2016

A simpler framework for hybrid cloud

Hybrid cloud is one of the top-of-mind projects for most IT managers, and there's little content that one can add to be original   ;-)

The hype and the attempts of many vendors (including... Cisco) to provide relevant solutions have filled the space with an incredible number of offers, which makes it hard to distinguish what works and is manageable and cost effective from what is only marketecture.




Recently Cisco decided to invest even more in cloud and, with the advent of a new CTO and some acquisitions, a revision of our approach to hybrid cloud made it easier and more effective. This post is not from official marketing and does not echo the company's direction: it's my attempt to rationalize my understanding of the new framework and to solicit your comments and feedback, so that I can leverage it in discussions with my customers and partners.
The following picture represents the area where Cisco plays a role, offering hardware and software solutions.
When it comes to the software stack to manage the infrastructure and provide services to the users, we have a mix of Cisco products, open source solutions and integration with 3rd parties. The objective is to offer a set of pre-validated stacks that can match the different needs, guaranteeing a deterministic result.



I shared some thoughts with a group of colleagues because we're planning educational activities for our field people: instead of just providing a reference architecture (that would end up being a list of products to be forced into every deal) we tried to represent the functions in the system as components of a framework, from which we'll pull the specific architecture for a given project. This, used cum grano salis, should help us to be pragmatic and realize quick wins (for both the customers - think of Fast IT initiatives - and of course for Cisco).

As a result, the next picture separates the different functional layers so that they can be explained to salespeople and to customers.
It could also help to manage the possible overlap with alternative solutions that customers may choose – or already have – because every element in the picture is replaceable, based on the open API it exposes/consumes (as is any well designed 3rd party product).

It is important to note that the top two layers in the picture are optional, since not all customers need those functions in their system. Based on the level of Governance that they want to have, the existing processes and the way they develop business applications (or use commercial software that only needs a resource pool to be deployed), the entry point could be directly at the third layer (Multi-Cloud Management) and ITSM and PaaS would be removed.




So, while we explain all the possibilities described above, we need to make customers feel confident that it’s doable and not overly complex.
In that regard, my motto is that “cloud is not a product (or a set of products), it’s a project and it’s complex in nature… regardless of the product set you choose”. Generally the cost of hardware and software products is lower than that of development and consulting services, and customers know it.
If we can claim that a pre-built integration makes the project easier (and we can), I would stress the value of reducing the project risk and delivering outcomes faster rather than a cheaper implementation.

Selling licenses can be (almost) easy, but driving adoption with business outcomes for customers is different. Finally Cisco has built a practice that can deliver IT projects effectively and recruited partners that do the same: customers have different options to choose from.

Now, in the context of an end-to-end strategy defined with the customer, we can deliver projects based on agile methodologies (e.g. Scrum) and implement the architecture layers with a bottom-up approach: from a strong capability to automate the Data Center (and the hybrid cloud) you can create services that are surfaced up to the consumption layers, including a self-service catalog.


Software Defined What?

The bottom-up approach stresses the value of the APIs exposed by UCS and ACI (with the further evolution from basic programmability to policy-based management, which I'm not covering yet - look out for the next post). With the power and the granularity of those APIs, you can really build a fully Software Defined Data Center (SDDC): servers and networks can be shaped via software interfaces.
By the way, I take the opportunity here to clarify that Software Defined Data Center does not mean Software Implemented Data Center: you don't necessarily need a software overlay that mimics the behavior of the hardware (living as a separate entity); you need software controllers that drive the shape and the behavior of both physical and virtual resources in the DC as a single system.
Better if they do that based on policies... like the Cisco architecture does  :-)
You will see a post dedicated to policies and application intent soon on this blog.



Competition?

We also recognize that many customers already have an ITSM solution in place, or some other form of governance. So we don't engage in competitive fights, like imposing Cisco Prime Service Catalog vs ServiceNow; rather, we integrate our solution with the existing components. This is a sort of compromise with a competitor that hurts my pride, but since it's for our customers' benefit... it's a good solution.

Cisco Cloud Center as a broker: the recent acquisition of CliQr brings Cisco a great solution to address the multi-cloud management use cases, the most important ones for the majority of customers. In the logical schema above you can see that the hybrid cloud scenario has been qualified better as Multi-Cloud management.
This reflects the fact that having an application deployed partly in your Data Center and partly in the public cloud is still a relevant use case, but many companies are more attracted by other scenarios... like moving from one project stage to the next (e.g. Dev-Test-QA-Prod) using different resource pools (on premise or in cloud), or moving their assets from one cloud provider to a different one.


Cloud Brokering and Multi Cloud Management

In the first one (promotion to the next stage) it could be useful to leverage resources that are allocated based on business convenience (e.g. cost or flexibility) or compliance (e.g. data sovereignty), so the application and all the needed infrastructure are moved back and forth to the public cloud.
In the second the driver could be a dual provider strategy, or maybe a change in the market conditions that makes one provider more appealing than the current one, or a strategic switch from private cloud to public (or vice versa).


In all these cases, we offer a solution to deploy a software stack (a complete custom application, a development platform, or a commercial software product) as a self-service option, where the target can be selected dynamically from a list of available clouds.
You can deploy to your local private cloud, based on VMware or any other virtualization solution, or to an OpenStack-based cloud, or to any of the public cloud providers if you have an account there.
Any resource pool is a possible destination for the deployment (and the life cycle management, including autoscaling or retirement of the application).
The model of the deployment of the application is completely decoupled from the selection of the target, thanks to the capabilities of the orchestrator that can configure the needed resources in almost any cloud transparently.
It uses the APIs exposed by the element managers of a multi-vendor infrastructure on premise (e.g. vCenter, UCS Manager, the ACI controller, etc.) and those exposed by public clouds like AWS, Azure, etc.
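
The decoupling between the deployment model and the target can be illustrated with a short Python sketch. It is a generic pattern written under my own assumptions, not the way Cisco Cloud Center is implemented: the profile fields, the two target classes and the resource naming are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppProfile:
    """Cloud-agnostic description of what to deploy."""
    name: str
    tiers: tuple  # e.g. ("web", "db")


class PrivateCloud:
    def deploy(self, profile):
        # Stand-in for calls to on-premise element managers.
        return [f"vm:{profile.name}-{tier}" for tier in profile.tiers]


class PublicCloud:
    def deploy(self, profile):
        # Stand-in for calls to a public cloud API.
        return [f"instance:{profile.name}-{tier}" for tier in profile.tiers]


TARGETS = {"private": PrivateCloud(), "public": PublicCloud()}


def deploy(profile, target_name):
    """The same profile is handed unchanged to whichever target is chosen."""
    return TARGETS[target_name].deploy(profile)


shop = AppProfile("shop", ("web", "db"))
```

The profile `shop` never mentions a cloud. `deploy(shop, "private")` and `deploy(shop, "public")` create different resources from the same model, and adding a new target only means adding one more adapter.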



From a logical schema to a real deployment

So we can offer users a different entry point, based on their business needs (they might need a ticketing system, a self-service catalog, a PaaS solution or directly the web portal of the multi-cloud manager to model deployments and deliver them).
Customers can have one or more resource pools, allocated wherever they like (local or in cloud), and let the broker direct the selection of the target based on predefined policies.
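
A policy-driven selection of the target could look like the following Python sketch. It is only an example of the idea, with assumptions of my own: the pool attributes, the prices and the rule (cheapest pool that accepts the project stage and respects data sovereignty) are invented for illustration.

```python
def select_target(pools, stage, allowed_regions):
    """Pick the cheapest pool that accepts the stage and meets data sovereignty."""
    candidates = [
        pool for pool in pools
        if stage in pool["stages"] and pool["region"] in allowed_regions
    ]
    if not candidates:
        raise LookupError("no resource pool satisfies the policy")
    return min(candidates, key=lambda pool: pool["cost_per_hour"])["name"]


# Hypothetical resource pools with made-up prices.
pools = [
    {"name": "on-prem", "region": "eu", "cost_per_hour": 0.30,
     "stages": {"dev", "test", "prod"}},
    {"name": "public-a", "region": "us", "cost_per_hour": 0.10,
     "stages": {"dev", "test"}},
    {"name": "public-b", "region": "eu", "cost_per_hour": 0.20,
     "stages": {"dev", "test", "prod"}},
]
```

With these pools, a dev deployment with no regional constraint lands on `public-a` because it is the cheapest, while a production deployment that must stay in the EU lands on `public-b`.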

The schema in the next picture presents different products at every layer: a solution can be based on one of them, or a combination. We have the flexibility to match the specific needs with products from Cisco, from 3rd party vendors or open source.
As an example, MANTL is a new open source project that makes the development of microservices easier if you build cloud native applications.




I will expand the detail of the single products and the open source solutions shown in this picture in my next post.
Stay tuned...


References

http://www.cisco.com/c/en/us/solutions/executive-perspectives/fast_it.html
http://www.cisco.com/web/solutions/trends/futureofit/why-cisco.html
http://MANTL.io
http://Github.com/CiscoCloud/microservices-infrastucture 
http://lucarelandini.blogspot.it/2015/10/devops-docker-and-cisco-aci-part-1.html
http://lucarelandini.blogspot.it/2015/03/aci-for-dummies.html
http://lucarelandini.blogspot.it/2015/09/the-phoenix-project-how-devops-can.html