Leveraging OpenShift or Kubernetes for automated performance tests (part 1) – RHD Blog


This is the first in a series of three articles based on a session I held at Red Hat Tech Exchange EMEA. In this first article, I present the rationale and approach for leveraging Red Hat OpenShift or Kubernetes for automated performance testing, give an overview of the setup, and discuss points that are worth considering when executing and analyzing performance tests. I will also say a few words about performance tuning.

In the second article, we will look at building an observability stack, which, beyond the support it provides in production, can be leveraged during performance tests. Open source projects like Prometheus, Jaeger, Elasticsearch, and Grafana will be used for that purpose. The third article will present the details for building an environment for performance testing and automating the execution with JMeter and Jenkins.

However, this brings a couple of challenges. With monthly, weekly, or daily releases, it is critical to avoid breaking things when releasing code to production. Tests are a major aspect for building confidence in the code, but they have traditionally required weeks to months of effort, which is unsustainable with the pace of releases we are now talking about. Automation is becoming critical.

Functional tests have historically had a fairly good level of automation through the use of unit tests that run when an application gets built. That is not the case with nonfunctional integration and performance tests. Even though this article focuses on performance aspects, the approach and setup can be reused for nonfunctional and integration aspects.

I decided to use asynchronous communication because that is something I see increasingly with customers who are embracing microservices and event-driven design. It is also interesting because most JMeter and OpenTracing examples and documentation are focused on synchronous calls. Bringing asynchronicity into the picture makes it slightly more complex.

An important characteristic and the main benefit of container images is that they are immutable. This drastically simplifies the release process and decreases the level of risk inherent to it. Regarding tests, immutability also makes it easy to guarantee that what has been validated is what is promoted to the next environment: from integration to UAT to staging to production, for instance.

For this to work, it is important to externalize environment-specific aspects such as the database, message broker addresses, or credentials from the application and the container image. In Kubernetes and OpenShift (Red Hat’s enterprise distribution of Kubernetes), this can easily be done by using configMaps and secrets. Both work in a similar way, but secrets have additional restrictions for guaranteeing the confidentiality of sensitive information. Data in configMaps and secrets can be injected into a running container either as environment variables or as files mounted into the container file system.
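As a minimal sketch of how an application might consume such externalized settings, the Python helper below looks a value up first in the environment and then in a directory where a configMap or secret is assumed to be mounted, with each key appearing as one file. The function name `read_setting`, the mount path `/etc/app-config`, and the key `BROKER_HOST` are illustrative assumptions and not part of the demo.

```python
import os
from pathlib import Path


def read_setting(name, mount_dir="/etc/app-config", default=None, env=None):
    """Return a setting from an environment variable, else from a mounted file.

    mount_dir is a hypothetical path where a configMap or secret is mounted;
    each key of the configMap is expected to be a file named after the key.
    """
    env = os.environ if env is None else env
    if name in env:
        return env[name]
    candidate = Path(mount_dir) / name
    if candidate.is_file():
        # Mounted values often end with a newline; strip surrounding whitespace.
        return candidate.read_text().strip()
    return default


if __name__ == "__main__":
    # BROKER_HOST is a made-up key used only for illustration.
    print(read_setting("BROKER_HOST", default="localhost"))
```

Because the image only ever asks for a key by name, the same image can run unchanged in integration, UAT, and production; only the injected values differ.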

But what about application tuning parameters? By that, I mean things like the number of message consumers, the size of connection pools, etc. These parameters can have a big impact on the behavior of the application in production, which argues for baking them into the immutable container image. Remember: we want to promote exactly what we have tested. On the other hand, not being able to modify them during the performance tests (we would need to re-create the image) may slow down the tests or reduce the breadth of what can be run in a given period of time.

• The second option is more pragmatic. It consists of externalizing these parameters into configMaps as well. To mitigate the risk of releasing something different from what has been tested, the source of the configMaps should be recorded in a version control system such as git or in a CMDB and tagged for each release. I recommend not keeping these files directly with the code. Having them in a different repository helps with “promoting” the configuration in a similar way as we would promote our code, without the need to create a new code release when only the configuration has changed. Having separate repositories for each environment (integration, UAT, production) allows us to have a clear picture of what version is running in each environment and to easily promote the code from one to another.

Running automated performance tests is a great thing, but for them to bring their full value we need to understand how the application behaves when put under load. Leveraging the observability features built for production readiness is a straightforward way of getting this insight: identifying bottlenecks, error states, resource consumption under load, etc. Three pillars can be used for that:

Applications rely on an infrastructure to fulfill usual functions such as message brokering here. Load balancing or state persistence are other examples. The challenge introduced by these systems is that they are often shared. The results of performance tests may hence be influenced by external factors. File system reads and writes and network communication may also be influenced. There are, however, a few mitigation strategies:

Regarding this last point, I like the approach taken by the message broker used in this demo. EnMasse, created with OpenShift in mind from day one, can spawn a new, dedicated broker on demand. We can have it provisioned for the test run and decommissioned afterward. No other application is using the broker, which provides isolation, and the decommissioning after test runs keeps resource reservation to a minimum. Monitoring the broker will also provide confidence that it is not a limiting factor with respect to performance under load.

Test automation

Many tools are available for supporting test automation, such as JMeter, Gatling, Locust, Grinder, and Tsung. They provide a robust, scalable, and flexible way to produce test loads. Message templates, test data sets, or load injection patterns can easily be configured. An aspect that I like in JMeter is the possibility to design and experiment with tests using its UI and then run them from the command line, which is a must for scheduled tests with higher load. The UI also helps when we need to interact with less-technical staff in the design phase or have them change and refine the test cases once the technical aspects have been settled.

Observability is also relevant for JMeter itself, to make sure that it is neither under resource constraints nor a limiting factor for performing the tests; in short, it ensures that the thermometer is not broken. It also helps with measuring performance at boundaries. When you use OpenTracing, it is not enough to know how long the application took to process a message; it is also important to know how long the message waited in the queue before being picked up. Instrumenting JMeter can provide a better approximation of that.

Jenkins

It may also provide a high-level view of the test results: pass or fail. It also integrates nicely with JMeter to have a quick view of trends. By adding the capability to build from scratch and decommission at the end of the tests, Jenkins provides confidence that what is tested is what was intended and tracked in the version control system.

Performance tests often require a significant amount of resources because it is best to run the tests in an environment that replicates production. By leveraging OpenShift and Jenkins pipelines, it is possible to create in minutes an environment for the time the tests are run and to decommission it right after. By doing that, we don’t need to mobilize the resources for longer than required, which may mean significant savings in energy and costs.

Repeatability

Being able to see the impact of code or configuration changes on performance allows us to understand the trade-offs made by a design or an implementation decision and to react quickly when we don’t feel comfortable with the implications. The delete and re-create approach offers a clean starting point for comparing apples to apples between runs. Moreover, it provides confidence that what is being tested is also what is available and tagged in repositories (source code, configuration, and container registry).

With a highly dynamic platform such as OpenShift, it is important to make sure that the same amount of resources can get mobilized during runs in order to be able to compare them. Therefore, we need to configure the deployments with CPU/memory requests that are equal to the CPU/memory limits. We don’t want to allow any fluctuation of resources based on the load (by other applications) of the nodes where the component instances are running. This differs from what we may have in production, where we may want to mobilize as many resources as are available.
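This requirement can be checked mechanically before a run. The sketch below assumes the container section of a deployment has already been loaded into a Python dictionary (the sample values and the function name `has_fixed_resources` are hypothetical) and verifies that CPU and memory requests are set and equal to their limits. It compares the quantity strings literally, so `500m` and `0.5` would be treated as different; that is a deliberate simplification.

```python
def has_fixed_resources(container):
    """True when CPU and memory requests are both set and equal to the limits."""
    resources = container.get("resources", {})
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    return all(
        key in requests and requests[key] == limits.get(key)
        for key in ("cpu", "memory")
    )


# Hypothetical container specs, shaped like the "containers" entries
# of a Kubernetes deployment.
pinned = {
    "name": "consumer",
    "resources": {
        "requests": {"cpu": "500m", "memory": "512Mi"},
        "limits": {"cpu": "500m", "memory": "512Mi"},
    },
}
burstable = {
    "name": "consumer",
    "resources": {
        "requests": {"cpu": "250m", "memory": "512Mi"},
        "limits": {"cpu": "1", "memory": "512Mi"},
    },
}

if __name__ == "__main__":
    print(has_fixed_resources(pinned))     # requests equal limits
    print(has_fixed_resources(burstable))  # CPU may fluctuate with node load
```

A check like this could run as an early pipeline step so that a test run is rejected when a deployment would be allowed to burst.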

Latency and throughput are often significantly affected by the data being processed. It is important to have a data set representative of the data in production that can be reused between runs. In contrast to functional testing, it is necessary to account not only for the diversity of production data but also for how often specific kinds of data occur. It is best to use a real production data set, anonymized if the data is sensitive.

It is important to know the injection pattern of data in production for creating significant test cases because latency and, to some extent, throughput are affected by it. Is our application processing a batch or streams of messages? Are there strong variations during a day, a minute, or a second? 1.8 million messages per hour is not the same as 30,000 messages per minute or 500 per second. A uniform distribution cannot be taken for granted. I have seen systems that were performing quite well with 1.8 million messages per hour uniformly distributed, but the SLA was missed for 90% of the messages in real life due to batch injection. It only took 20 minutes to inject the messages, and this was happening every hour.
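The arithmetic behind that example is worth spelling out. The short sketch below (names are illustrative) compares the rate the application has to sustain when 1.8 million messages are spread uniformly over an hour with the rate during the 20-minute injection window.

```python
def rate_per_second(messages, window_seconds):
    """Average number of messages per second over the given window."""
    return messages / window_seconds


HOURLY_VOLUME = 1_800_000

# Uniform distribution over the full hour.
uniform_rate = rate_per_second(HOURLY_VOLUME, 60 * 60)

# Same volume injected as a batch within 20 minutes.
burst_rate = rate_per_second(HOURLY_VOLUME, 20 * 60)

if __name__ == "__main__":
    print(f"uniform: {uniform_rate:.0f} msg/s")  # uniform: 500 msg/s
    print(f"burst:   {burst_rate:.0f} msg/s")    # burst:   1500 msg/s
```

During the injection window the application faces three times the uniform rate, followed by 40 minutes of idle time, so a test with a uniform load at 500 messages per second says little about the real behavior.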

To get the real picture of how an application is performing, it is important to pay attention to the measurement points. When you are looking at applications using brokerage, the time at which the message is enqueued is more relevant than the time at which it is dequeued by the consumer (although both provide valuable information). When the component is not able to cope with the load, the time spent by messages waiting to be processed is affected the most. The above diagrams give an idea of the usual pattern: how the time spent in “enqueued” and in “read and processing” evolves when components get overwhelmed.
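A small simulation makes the effect of the measurement point visible. The sketch below is a toy model under stated assumptions: a single consumer, a fixed processing time, messages arriving at a fixed interval, and times expressed in milliseconds. The class and function names are hypothetical and not taken from the demo.

```python
from dataclasses import dataclass


@dataclass
class MessageTiming:
    enqueued_at: int   # producer put the message on the queue
    dequeued_at: int   # consumer picked the message up
    processed_at: int  # consumer finished processing

    @property
    def queue_wait(self):
        return self.dequeued_at - self.enqueued_at

    @property
    def processing(self):
        return self.processed_at - self.dequeued_at

    @property
    def end_to_end(self):
        return self.processed_at - self.enqueued_at


def simulate(arrival_interval_ms, service_time_ms, count):
    """Single consumer handling messages one at a time, in arrival order."""
    timings = []
    consumer_free_at = 0
    for i in range(count):
        enqueued = i * arrival_interval_ms
        dequeued = max(enqueued, consumer_free_at)
        processed = dequeued + service_time_ms
        consumer_free_at = processed
        timings.append(MessageTiming(enqueued, dequeued, processed))
    return timings


# The consumer needs 300 ms per message but one arrives every 200 ms.
overloaded = simulate(arrival_interval_ms=200, service_time_ms=300, count=5)

if __name__ == "__main__":
    for t in overloaded:
        print(t.queue_wait, t.processing, t.end_to_end)
```

In this model the processing time stays flat at 300 ms while the queue wait grows by 100 ms with each message. Measuring only from the dequeue time would report a perfectly stable component even though end-to-end latency keeps climbing.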

There is also quite a lot of confusion about memory consumption with Java applications. The heap size and its utilization ratio are just part of the picture. Metaspace and thread stacks (the default setting may be up to 1MB per thread) may also take a significant part. On top of that comes the memory utilized at the system level for open files and sockets, for instance. The memory used at the system level is not that easy to account for due to the way the operating system optimizes its use with sharing and caching. When the application runs inside a container on OpenShift, the values reported by cgroups are the ones to monitor.

Coordinated omission

The idea behind coordinated omission is that the response time of the system under test may affect the measurement itself. JMeter is configured with a limited number of threads and also has limited resources. If the call to send messages blocks for a long time, which itself means high latency, JMeter may be prevented from sending the targeted number of messages during the interval; hence, fewer measurements are taken precisely when the system behaves badly. This distorts the latency percentiles and averages that are calculated. This demo aims at validating application (not broker) performance and behavior. There is also the possibility to publish asynchronously and, hence, without blocking. With this in mind, I can make the following assumptions:
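To illustrate the coordinated omission effect itself, here is a toy simulation, not the setup used in the demo. It assumes a single blocking sender that intends to send one message every 100 ms and a system that answers in 10 ms except for one 1,000 ms stall. It then compares latency measured from the actual send time with latency measured from the intended send time, which is one common way to compensate for the effect. All names and numbers are illustrative.

```python
def run_blocking_sender(intended_interval_ms, service_times_ms):
    """Simulate one blocking sender and return (naive, corrected) latencies."""
    naive, corrected = [], []
    sender_free_at = 0
    for i, service_time in enumerate(service_times_ms):
        intended_send = i * intended_interval_ms
        # A blocked sender cannot start on schedule; it starts late instead.
        actual_send = max(intended_send, sender_free_at)
        done = actual_send + service_time
        naive.append(done - actual_send)        # what the tool records
        corrected.append(done - intended_send)  # what a real client would see
        sender_free_at = done
    return naive, corrected


# Second call stalls for a full second; all other calls take 10 ms.
naive, corrected = run_blocking_sender(100, [10, 1000, 10, 10, 10, 10])

if __name__ == "__main__":
    print("naive:    ", naive)
    print("corrected:", corrected)
```

The naive series contains a single bad sample, so a high percentile computed from it looks healthy, whereas the corrected series shows that every message scheduled during and after the stall was late.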

When reporting on test results, it is handy to have two levels: a simple one, which only tells whether the test passed or failed and gives a high-level view of trends, and a second one, which leverages the observability stack and provides valuable information for troubleshooting performance issues or degradation. It is critical to be able to see within a few minutes how the application performed during the tests. If the analysis takes too long, the team won’t look at the results on a regular basis. When we talk about performance, we usually look at three aspects:

In the past, tuning has usually been done for peak load, the time when the application is most challenged to meet its SLA. Startup and initialization were rare events. With the move to disposable containers, the choice may not be that clear anymore. With auto-scaling and cluster rebalancing, containers may get stopped and started more frequently. Being able to auto-scale does not help if our application instance needs minutes to start and we have to respond to the load created by the start of a batch producing thousands of messages per second. Also, the first messages processed after startup may have a higher latency during the warmup phase (JIT compilation and optimization, pool loading, etc.).

• Prefetch strategy: Prefetch is a very useful optimization for throughput. It prevents the application threads from waiting for the message to be fetched from the broker over the network before the message can get processed. With nonexclusive consumers we should, however, make sure that one instance does not starve the pool of waiting messages; otherwise, we may end up with the funny pattern where instance 1 consumes and processes up to, let’s say, 50 messages (the prefetch size) while instance 2 does nothing, followed by instance 2 consuming and processing 50 messages while instance 1 does nothing, and so on.
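The starvation pattern can be reproduced with a deliberately simplified model. The sketch below does not describe how any specific broker dispatches messages; it only assumes that consumers take turns and that each turn hands over up to `prefetch` messages from the waiting backlog. The function name and numbers are illustrative.

```python
def distribute(backlog, consumers, prefetch):
    """Hand out a backlog in prefetch-sized batches, one consumer per turn."""
    assigned = [0] * consumers
    remaining = backlog
    turn = 0
    while remaining > 0:
        batch = min(prefetch, remaining)
        assigned[turn % consumers] += batch
        remaining -= batch
        turn += 1
    return assigned


if __name__ == "__main__":
    # 50 waiting messages, two instances, three prefetch sizes.
    for prefetch in (50, 10, 5):
        print(prefetch, distribute(50, 2, prefetch))
```

With a prefetch of 50, the first instance takes the whole backlog and the second gets nothing; with a prefetch of 5, the work splits evenly. The right value depends on message size, processing time, and network latency, so it is a good candidate for experimentation during performance tests.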