Wednesday, September 25, 2024

The impact of micro-services and containers on application and enterprise architecture - Part I

Historical paradigm shifts in programming

Since its inception in the mid-20th century, the field of software development has undergone several paradigm shifts. For example, the introduction of structured programming concepts in early coding languages (such as Fortran, Pascal and C) made the resulting code far more manageable than it had been originally.

Later, the advent of object-oriented programming techniques allowed encapsulation to be routinely applied in the next generation of languages, like C++, Smalltalk, Eiffel, etc. Similarly, the subsequent introduction of software design patterns added a significant level of sophistication to the practice of both application and enterprise architecture.

The resulting paradigm shift could be likened to:

  • Replacing “spaghetti” code (i.e. tangled modules with no structure or encapsulation)
  • With “lasagna” code (i.e. layered modules, with data abstraction and information hiding)

As desktop apps gradually became ubiquitous, various types of Windows forms were introduced to develop the interfaces for end users, for interactions with databases and so on. Then, as the Internet became more pervasive, these were replaced by Web forms.

This became common in popular multi-purpose languages like C# and Java, as well as in tools intended for more specialized purposes (e.g. Oracle Developer, for programming database applications).

A common-sense approach that led to implementing Agile development methodologies

Meanwhile, the advent of Agile development methodologies revolutionized the management of the software life cycle by providing a viable alternative to the traditional “waterfall” model. Over time, the progression of methodologies looked something like this:

The Waterfall method:  

  • This approach involved the sequential development of large, highly coupled modules
  • Thus the start of work on subsequent modules often depended on the completion of the previous ones
Extreme Programming:
  • This approach made use of pair programming and aggressive testing
  • The idea was to ensure the implementation of smaller modules that worked reliably and were easy to test, instead of huge monolithic blocks that were difficult to maintain
  • This was achieved by having programmers work in pairs, so that they could constantly compare ideas and ensure adequate test coverage
Incremental development and deployment, with constant customer feedback:
  • This is the essence of the “Agile” approach, which was an outgrowth of Extreme Programming
  • It involved deploying apps after relatively short “Sprints”, which each included their own analysis, design, development, testing and deployment phases
  • In other words, each Sprint became a microcosm of the entire software development life cycle
  • In order to ensure constant feedback from clients (and resulting adjustments as needed), the length of Sprints was typically measured in weeks
  • Hence this represented a huge change from the Waterfall model, where deployments were scheduled over periods of months or even years
  • Meanwhile the typical inefficient, boring and excessively long weekly status meetings were replaced by short, daily “stand up” team meetings (generally referred to as “Scrums”)
  • This ensured that developers always had timely feedback (and if necessary, help) from their peers
  • The three daily questions became:
i. What did you do yesterday?
ii. What will you do today?
iii. What are your blockers (i.e. where can you use some help)?
  • Where possible, subject matter experts (SMEs) were embedded into the dev teams, to ensure that the code continued to conform to what the clients wanted
Hybrid approaches: 
  • Evolving Agile methodologies have resulted in sub-specialties, such as:
    • “Lean” (i.e. minimal or just-in-time) development and
    • “Scrum Masters” (i.e. the people who manage the Scrums and Sprints)
  • By and large though, a hybrid approach to software development has emerged
  • Therefore, companies today will often pick and choose whatever development methodology (or combination thereof) works best for them.

Continuous integration and automated database management lead to the evolution of DevOps

As a result of the evolving Agile approaches to software development, the risky practice of infrequent monolithic deployments has gradually been replaced by smaller, periodic ones. Thus, highly coupled and monolithic legacy apps were gradually either refactored or replaced by a series of decoupled “black box” modules, which only talked to each other via strict APIs.

Many specialized languages also began to emerge for various highly specific needs (e.g. SQL, Perl, PHP, R and Python), along with frameworks such as Angular and React. Meanwhile, various object-relational mapping (ORM) tools made database programming much more accessible to programmers who were not familiar with longstanding SQL coding practices. Similarly, with the introduction of concepts like business intelligence, data mining and analytics, the administration of databases started to become more of an ongoing, automated integration process.
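
To illustrate the ORM idea, here is a minimal C# sketch using Entity Framework Core as a representative tool (the model, the SQLite connection string and the data are invented for illustration, and the EF Core packages are assumed). The developer writes LINQ against ordinary C# objects, and the ORM generates the underlying SQL:

    using System;
    using System.Linq;
    using Microsoft.EntityFrameworkCore;

    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; } = "";
        public string Region { get; set; } = "";
    }

    public class AppDbContext : DbContext
    {
        public DbSet<Customer> Customers => Set<Customer>();

        protected override void OnConfiguring(DbContextOptionsBuilder options) =>
            options.UseSqlite("Data Source=app.db"); // placeholder connection string
    }

    public class OrmDemo
    {
        public static void Main()
        {
            using var db = new AppDbContext();
            db.Database.EnsureCreated(); // creates the schema from the C# model

            // No hand-written SQL: the ORM translates this LINQ query for us.
            var names = db.Customers
                          .Where(c => c.Region == "East")
                          .Select(c => c.Name)
                          .ToList();

            Console.WriteLine(string.Join(", ", names));
        }
    }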

Thus the various data types from relational databases, data warehouses, data lakes and so on all had to be combined and optimized. This was necessary so that the underlying data could be effectively harnessed and explored. Then, as DevOps came to be increasingly automated, the use of continuous testing, integration and deployment became the norm.

This resulted in test-driven development and a greater emphasis on quality control. Lately this approach has been tempered somewhat by the concept of minimum viable products (MVPs), which ensure that “the perfect does not become the enemy of the good”. Nevertheless, the trend remains to modularize software as much as possible.

Further replacing monolithic applications with modular “micro-services”

Another major paradigm shift is occurring, as software apps are being split up into ever smaller components, through the increasing use of micro-services, containers and orchestration. These components are designed to provide highly specialized services, which generally communicate via RESTful APIs or asynchronous message queues.

The advantage of breaking up an application into small, manageable components (i.e. micro-services) is that the resulting modules can be maintained, enhanced or replaced independently of each other. As long as the interface “contracts” between them are respected by their APIs, the inner workings of each micro-service can be fully encapsulated.
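
For example, such a “contract” might look something like the following minimal C# sketch, where the service name, record types and business rule are invented purely for illustration:

    using System;

    // The "contract": as long as this interface and its types are honoured,
    // the implementation behind it can be refactored or replaced freely.
    public interface IPaymentService
    {
        PaymentReceipt ProcessPayment(PaymentRequest request);
    }

    public record PaymentRequest(string AccountId, decimal Amount);
    public record PaymentReceipt(string ConfirmationCode, bool Approved);

    // One possible implementation; callers never see (or depend on) its internals.
    public class PaymentService : IPaymentService
    {
        public PaymentReceipt ProcessPayment(PaymentRequest request)
        {
            bool approved = request.Amount > 0; // placeholder business rule
            return new PaymentReceipt(Guid.NewGuid().ToString("N"), approved);
        }
    }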

This result is true regardless of whether the modules communicate via so-called “RESTful” HTTPS service calls, or via other API paradigms, such as asynchronous message queues. The former type of service call is becoming more common and typically involves an HTTPS GET, PUT or POST of some kind.

This generates a “response code”, which indicates the status of the HTTP request, along with some data (e.g. in JSON, YAML, etc.), if applicable.
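
As a rough illustration, such a call from C# might look like the following sketch (the endpoint URL and response shape are hypothetical):

    using System;
    using System.Net.Http;
    using System.Text.Json;
    using System.Threading.Tasks;

    public class InventoryClient
    {
        private static readonly HttpClient Http = new HttpClient();

        public static async Task Main()
        {
            // Hypothetical endpoint exposed by an "inventory" micro-service.
            HttpResponseMessage response =
                await Http.GetAsync("https://inventory.example.com/api/items/42");

            // The response code indicates the status of the HTTP request.
            Console.WriteLine($"Status: {(int)response.StatusCode}");

            if (response.IsSuccessStatusCode)
            {
                // The payload, if any, is typically JSON.
                string json = await response.Content.ReadAsStringAsync();
                using JsonDocument doc = JsonDocument.Parse(json);
                Console.WriteLine(doc.RootElement);
            }
        }
    }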

Asynchronous message queues are very useful when the services being provided involve a waiting period. Hence the message queue is polled periodically to determine whether the results of a given request are available, but in the meantime the calling programs can continue to go about their business.
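
Here is a hedged C# sketch of that polling pattern, with an in-memory queue standing in for a real broker (such as RabbitMQ or Kafka):

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    public class QueuePollingDemo
    {
        // In-memory stand-in for a real message broker; illustrative only.
        private static readonly ConcurrentQueue<string> ResultsQueue = new ConcurrentQueue<string>();

        public static async Task Main()
        {
            // Simulate a slow service that posts its result after a delay.
            _ = Task.Run(async () =>
            {
                await Task.Delay(2000);
                ResultsQueue.Enqueue("{ \"orderId\": 42, \"status\": \"complete\" }");
            });

            string result;
            while (!ResultsQueue.TryDequeue(out result))
            {
                // No result yet: the caller goes about its business, then polls again.
                Console.WriteLine("Still waiting; doing other work...");
                await Task.Delay(500);
            }

            Console.WriteLine($"Received: {result}");
        }
    }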

This approach represents a significant step forward, since it hugely diminishes the risk of making changes to a given service. For example, if a change breaks a given micro-service, then only that service goes offline until it’s fixed, while the rest of the application keeps running. This approach was pioneered by Netflix when it first introduced streaming video services, but it’s now common among Big Data users, like Google, Facebook, Amazon and so on.

The ensuing need for containers and orchestration

The resulting micro-service modules lend themselves naturally to being “containerized”, using tools such as Docker. In particular, this approach allows the micro-services to be easily instantiated, tested and refactored independently … as long as the underlying communication contracts between them (i.e. their APIs) are respected. [1]

In fact, as a result of the increasing implementation of micro-services, containers and orchestration have actually become essential (rather than optional) tools. In particular, “orchestration” refers to the process of managing the containers automatically, via tools such as Kubernetes. This introduces the next level of features, such as automated scalability and “self-healing”, to the containerized micro-services (more on this later).

The key here is that micro-services involve numerous small modules communicating with each other via RESTful or other APIs. Once these modules are deployed at scale, they become virtually impossible to manage manually. Hence the need to run them in containers, distributed across nodes, along with the accompanying need for orchestration of those nodes (i.e. both are required, rather than optional).

This means that a tradeoff results between the convenience of being able to modify each containerized micro-service independently and the ensuing need for some automated way to manage them all.

Containers are implemented virtually but, unlike the “virtual machines” which preceded them, they do not require their own operating system. Hence multiple containers can share a single OS, and thus they consume very little in the way of resources. They can be “spun up” or “spun down” (i.e. created or destroyed) effortlessly, making them ideal for continuous integration and automated testing.

This means that continuous integration tools (e.g. Jenkins) can be configured to automatically build, test and integrate the containers.  Databases can also be spun up or down in this way, by implementing tools such as SQL Server Integration Services (SSIS) via containers. For example, tools like SSIS can automate the management of scripts that are used to deploy and maintain databases.

In the case of persistent data, state must be maintained, which is a problem for containers that are constantly being spun up or down. Hence “volumes” are instantiated, which connect seamlessly to the containers (i.e. unlike containers, volumes persist on real media, such as disks).

This allows the data to persist, even when the containers are regularly created and destroyed. As far as the container is concerned, the connection to the database is essentially the same, whether the data lives inside the container or in a volume.
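
To make that concrete, here is a hedged C# sketch, assuming the Microsoft.Data.SqlClient package; the server name and credentials are placeholders. The code would be identical whether the database files live inside a container or on a mounted volume:

    using System;
    using Microsoft.Data.SqlClient;

    public class DbConnectionDemo
    {
        public static void Main()
        {
            // Placeholder server, database and credentials; "sql-service" could resolve
            // to a containerized SQL Server whose data directory sits on a volume.
            const string connectionString =
                "Server=sql-service,1433;Database=AppDb;User Id=app;" +
                "Password=example;TrustServerCertificate=True;";

            using var connection = new SqlConnection(connectionString);
            connection.Open();
            Console.WriteLine($"Connected: {connection.State}");
        }
    }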

Implementing micro-services via cloud-native orchestrated containers

The process of deploying applications “to the cloud” is intended to ensure that additional resources and scalability are automatically invoked, on demand, by the providers of the “platform as a service” (PaaS).

Tasks such as these are generally managed by DevOps engineers, who specialize in working with tools like Docker, Kubernetes, Jenkins, SSIS, Kafka and so on. This, in turn, frees up software developers to focus on what they do best, namely maintaining and evolving their application code.

As alluded to earlier, implementing numerous parallel micro-services requires orchestration, because it’s simply not realistic to manage multiple containers manually. Thus the orchestration tools provide DevOps engineers with the ability to precisely configure the apps that they deploy. As a result, the number of instances of a given micro-service that are available at any given time is adjusted automatically.

For example, if it’s determined (e.g. through metrics, instrumentation, etc.) that N containers are needed to address the demand for a given service, then Kubernetes will ensure that there are always N containers available. In other words, if one or more Docker containers fail, then Kubernetes will gradually replace them until the desired number is restored.  This process is called “self-healing”. [2]
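
One common way for the orchestrator to detect such failures is a “liveness probe” that periodically calls an HTTP endpoint exposed by the service. A minimal, hedged ASP.NET Core sketch of such an endpoint (the “/healthz” path is a convention, not a requirement):

    // A minimal ASP.NET Core service with a health endpoint; if the probe's calls
    // to "/healthz" start failing, the orchestrator kills and replaces the container.
    var builder = WebApplication.CreateBuilder(args);
    var app = builder.Build();

    app.MapGet("/healthz", () => Results.Ok(new { status = "healthy" }));

    app.Run();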

Similarly, if the demand increases or decreases, then the number of containers will automatically increase or decrease (as needed). This approach optimizes the use of resources, since the number of containers at any given time will be no more or less than what is dictated by demand. 



[1] Hence the remainder of this article will explore how the advent of containerized micro-services requires orchestration, which results in cloud-native apps that feature automatic scaling and provisioning.

[2] In the remainder of the article we’ll use Docker and Kubernetes as examples of containers and orchestration tools, respectively, for convenience. However, these concepts generally apply equally to the similar products provided by other vendors.

Update on my latest activities

In this article, we'll consider some of the software development environments that I've worked with lately. This is a precursor to the subsequent posts on the impact that microservices and containers have had on software development.

The .NET tech stack

Over the past 13 years I’ve used the typical .NET tools extensively, both as a developer and an architect. For example, these include C#, SQL Server, SQL Server Reporting Services, Visual Studio, TypeScript and Angular.

Most of my projects have been built on the .NET platform, such that C# has been the primary programming tool for business logic. Similarly, SQL Server was the principal RDBMS for these projects, although I’ve often used other tools interchangeably (e.g. Oracle).

Visual Studio has been the editor of choice, giving us access to tools such as Team Foundation Server, which was later replaced by Azure DevOps. Thus, we could use the built-in tools to track projects using an Agile approach, within the TFS/Azure DevOps framework.

For user interfaces, I’ve used JavaScript in various forms, notably TypeScript. However, my preference is Angular, because it lends itself well to object-oriented design and development, very much like C#.

In all the cases referred to above, as an architect I’ve been able to share my hands-on experience with developers, allowing them to understand nuances that I came across as a programmer. I still enjoy being hands-on to help my team solve problems and navigate proofs of concept for approaches that they are not familiar with.

Serverless applications using microservices

I’ve been using microservices for virtually all of my applications since around 2019. The most extensive use has been for Simulation Magic, as described on the web site at www.SimulationMagic.com. In the various projects for which I’ve been the architect, we’ve primarily used Kubernetes to manage the containers and Docker to create the containers themselves.

However, variations on this theme include various tools from AWS, which essentially mimic the Kubernetes/Docker paradigm (see below).

From my perspective as an architect, the primary advantage of this approach is being able to spin up or destroy containers as needed and to automate the connectivity between them. I also like how microservices can be used to create virtual servers for testing proofs of concept, load management, etc.

Finally, I like the way that responsibilities are compartmentalized within containers. To me, this is simply the next step in the OO paradigm, which aims to ensure that processes are black boxes that communicate via input/output parameters. This allows us to easily replace components within any system that we’re working on, either to repair or improve them.

AWS as microservices

I worked with AWS last summer for IBM on a project for Immigration Canada. This involved creating the infrastructure for a new digital application process for Canadian passports. The AWS system was the foundation for this project, which required various types of connectivity between platforms for security, identity verification using AI, saving applications, training an image recognition system, etc.

We encountered some latency issues with AWS, with respect to loading of services that were called automatically. However, we worked directly with Amazon engineers to address these problems. As a senior solutions architect, I was responsible for ensuring that dev team members were able to continue working without being blocked by such issues.

NoSQL

In the past 5 years, I’ve used NoSQL in projects where we needed an alternative to traditional relational database management systems. This has been useful in cases where the data being stored is in a format that is not easily accommodated by a relational database (e.g. blobs, images, etc.). For example, in the digital passport project for Immigration Canada, we used NoSQL to store the photographs required for training the image recognition system.

However, it’s worth noting that NoSQL is not always appropriate. In particular, when we’re accessing small blocks of data that can be stored in tables, a relational database (using SQL) is still best. For Simulation Magic, we use a combination of both, since our Big Data requires blobs of public and private data.

This data must first be filtered and then converted into relational data that can be treated by calculation engines used for Analytics and Simulations.
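
As a hedged illustration (not the actual project code), storing an image blob in a document database might look something like this, using the MongoDB .NET driver as a representative NoSQL tool; the host, names and file are placeholders:

    using System;
    using System.IO;
    using MongoDB.Bson;
    using MongoDB.Driver;

    public class PhotoStoreDemo
    {
        public static void Main()
        {
            // Placeholder host, database, collection and file names.
            var client = new MongoClient("mongodb://localhost:27017");
            var photos = client.GetDatabase("training").GetCollection<BsonDocument>("photos");

            byte[] imageBytes = File.ReadAllBytes("applicant-photo.jpg");

            // The binary payload and its metadata live together in a single document,
            // with no need to force the image into a relational schema.
            photos.InsertOne(new BsonDocument
            {
                { "applicantId", "A-12345" },
                { "capturedUtc", new BsonDateTime(DateTime.UtcNow) },
                { "photo", new BsonBinaryData(imageBytes) }
            });
        }
    }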

Education and learning software

I’ve been involved with education-oriented software since the early days of my career as a software engineer. More recently, I’ve been designing and developing Simulation Magic, the AI-based system referred to above. This is a joint project with SOLSER Information Technology, which is based in Mexico.

The system uses Analytics and Simulations to mitigate risk, by helping people make decisions based on whatever information is available to them. There are many applications for this, such as managing the allocation of funds among various projects that compete for resources and attention.

However, a major application is in education, where simulations can be used to create realistic scenarios that approximate real life situations. For example, this approach can be applied to any scenario where diagnostics are required.

This includes medical diagnoses, decisions by technicians, supply chain management, operations, etc. We are presently developing various versions of the minimum viable product (MVP) that will be used by beta users in real-life environments, where they need to simulate and analyze the decisions required in their everyday workplace (see also www.SimulationMagic.com).

Mocking

This is an approach that has been an integral part of all of my projects for the past 5 to 10 years. Here is how it’s defined on the Stack Overflow web site:

"Mocking is primarily used in unit testing. An object under test may have dependencies on other (complex) objects. To isolate the behaviour of the object you want to test you replace the other objects by mocks that simulate the behaviour of the real objects. This is useful if the real objects are impractical to incorporate into the unit test. In short, mocking is creating objects that simulate the behaviour of real objects."   [1]

------------------------------------------------

References 
1. See https://stackoverflow.com/questions/2665812/what-is-mocking