OPERA

Contents

  1. Introduction
  2. Scenarios
  3. Topology
  4. Workloads
  5. Requirements
  6. Results

1. Introduction

Welcome to the Optimization Performance Evaluation and Resource Allocator (OPERA) tool.

OPERA helps you design high-performance distributed applications. Your application is composed of services. Services reside in containers; containers are hosted by computers, or nodes, which communicate through middleware and networks.

What you provide. You provide an XML document written in Performance Extensible Language (pxl); the file has the extension .pxl. The grammar of a pxl document is given in model.dtd, a Document Type Definition.

It is assumed that you know:

  • The usage scenarios of your application. For example, what happens when the end user clicks a link on a web page or submits a form to your web server? Which services are called, and what performance demand does the click or submit place on each service in the system? A scenario is made up of several elements:
    • Services. These are the fine-grained software elements of your application. You can think of services as Java classes, EJBs, Web services, and so on. You add the services to the Services XML element. Services are shared by all scenarios. If a scenario does not use a service, the demand of that service in that scenario should be zero.
    • Resource demands for each service and scenario, that is, how many milliseconds the CPU and the DISK spend processing a request to that service in a specific scenario. You can find this out by using profilers or performance monitors. Once you have this information, you add the demands to the Demand elements of each Scenario XML element.
    • Calls, that is, how the services call each other. You can find this out from the application design documents or by using profilers. Once you have this information, you add the calls to each Scenario XML element.
  • The deployment topology: the nodes and their performance characteristics, the networks and the middleware. Services run in containers and containers run on nodes. Containers also belong to clusters. Once you have this information, you add the Node, Cluster, Container, Network and Middleware XML elements to the Topology element.
  • The workload you expect for your application, i.e. the maximum number of users you expect and the workload mixes of interest. You add this data to the Workloads XML element.
  • The performance requirements of your application, i.e. the response time you expect for each scenario. You add this data to the Requirements XML element.

What you get. After you solve the model, you will get a Result XML document with

  • The configuration that best supports your performance requirements, together with other configurations that outperform your initial configuration
  • The worst response times (with regard to any workload mix) you may encounter for those configurations
  • The maximum possible utilization for each service. This can help you to decide the number of replicas for each service, the number of threads you need or the service activation policy.

The following sections show how to fill the XML input model.

 


2. Scenarios

Scenarios are triggered by user actions and denote traces through the application. They can be derived from UML use cases or class diagrams, or by tracing the application with a profiler. The diagram below shows the UML class diagram of a web application that implements an Internet auction and supports three scenarios:

  • createBid: allows the user to publicize an item for sale and a starting price;
  • find: queries for specific items;
  • makeBid: submits bids for an item.

In any scenario, the Client calls the Proxy, and the Proxy calls the RPCRouter, which calls the Data service through a session bean (EJBItemSession) and one or more entity beans (EJBItem).

To fill in the scenarios, you have to identify the services, the number of calls from service to service in each scenario, and the CPUDemand and DiskDemand of each service per scenario. The scenarios are identified by name, as shown in the examples below.

Services

For each service you add service attributes:

name
A string that denotes the name of the service. No spaces are allowed.
canMigrate
This field is false if the service is anchored to a container and true otherwise. When the service can migrate, OPERA will move the service around to find the best architecture. The services are moved only to containers belonging to the same cluster.
runsInContainer
This is the container name to which the service is allocated initially.

Example

Below is an example of how to define two services:

<Scenarios>
  <Services>
    <Service name="Browser" canMigrate="false" runsInContainer="Client"/>
    <Service name="RPCRouter" canMigrate="false" runsInContainer="WebContainer" />
    ...
  </Services> 
  ...
</Scenarios>

Scenario

For each scenario, you define the attributes:

name
the name that identifies the scenario.
triggeredByService
identifies the service that triggers the scenario.

The Scenario element contains the calls between services that make the scenario. These are described by the Call element with the following attributes:

caller
the service that makes the call.
callee
the service that gets called.
bytesSent
number of bytes sent during a call.
bytesReceived
number of bytes received during a call.
invocations
the number of calls from the caller to callee.
type
a call can be synchronous (and the value "s" is used) or asynchronous (value "a").

Each call makes use of CPU and DISK, as defined by the element Demand with the following attributes:

CPUDemand
a real number that specifies the CPU time (in milliseconds, seconds, etc.) consumed by this service for one request of the scenario.
DiskDemand
a real number that specifies the time required at the DISK to serve one request of the scenario. As with CPUDemand, it can be measured in milliseconds, seconds, etc.

When writing the pxl file, it is important that all measurements use the same time unit (e.g. all demands are expressed in milliseconds). To find the demands, use profilers and trace each individual service. Demands can also be estimates based on your previous work, especially when you are still in the design stage.

Example

<Scenarios>
  <Services>    
    ...
  </Services> 

  <Scenario name="find" triggeredByService="Browser">
    <Call caller="Browser" callee="RPCRouter" invocations="1" type="s"
          bytesSent="514" bytesReceived="575">
      <Demand CPUDemand="8" DiskDemand="2" />
    </Call>

    <Call caller="RPCRouter" callee="EJBItemSession" invocations="1" type="s"
          bytesSent="514" bytesReceived="575">
      <Demand CPUDemand="15" DiskDemand="1" />
    </Call>

    <Call caller="EJBItemSession" callee="EJBItem" invocations="1" type="s"
          bytesSent="10" bytesReceived="10">
      <Demand CPUDemand="9" DiskDemand="1" />
    </Call>

    <Call caller="EJBItem" callee="Data" invocations="1" type="s"
          bytesSent="10" bytesReceived="10">
      <Demand CPUDemand="13" DiskDemand="3" />
    </Call>
  </Scenario>

  ...

</Scenarios>
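
Before solving a model, it can help to check that the Scenarios element is internally consistent. The following is a minimal sketch in Python, not part of OPERA: the function names undeclared_services and total_demand are hypothetical, and the embedded XML is a trimmed copy of the examples above in which only two services are declared. The sketch lists the services used in Call elements that are missing from Services, and sums the demands of one scenario.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the examples above; only two services are declared on purpose.
PXL = """
<Scenarios>
  <Services>
    <Service name="Browser" canMigrate="false" runsInContainer="Client"/>
    <Service name="RPCRouter" canMigrate="false" runsInContainer="WebContainer"/>
  </Services>
  <Scenario name="find" triggeredByService="Browser">
    <Call caller="Browser" callee="RPCRouter" invocations="1" type="s"
          bytesSent="514" bytesReceived="575">
      <Demand CPUDemand="8" DiskDemand="2"/>
    </Call>
    <Call caller="RPCRouter" callee="EJBItemSession" invocations="1" type="s"
          bytesSent="514" bytesReceived="575">
      <Demand CPUDemand="15" DiskDemand="1"/>
    </Call>
  </Scenario>
</Scenarios>
"""


def undeclared_services(scenarios_root):
    """Return the sorted names used in Call elements but not declared in Services."""
    declared = {s.get("name") for s in scenarios_root.findall("./Services/Service")}
    used = set()
    for call in scenarios_root.iter("Call"):
        used.add(call.get("caller"))
        used.add(call.get("callee"))
    return sorted(used - declared)


def total_demand(scenario):
    """Sum CPUDemand and DiskDemand over all Call elements of one Scenario."""
    cpu = disk = 0.0
    for call in scenario.findall("Call"):
        demand = call.find("Demand")
        cpu += float(demand.get("CPUDemand"))
        disk += float(demand.get("DiskDemand"))
    return cpu, disk


root = ET.fromstring(PXL)
missing = undeclared_services(root)
find_cpu, find_disk = total_demand(root.find("Scenario[@name='find']"))
print("undeclared services:", missing)
print("total demand of find:", find_cpu, find_disk)
```

Here the check reports EJBItemSession, because the trimmed Services list omits it. All demands are summed as plain numbers, so they must share one time unit.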

3. Topology

The topology describes the nodes and networks hosting the application, the clusters, and the middleware. For the Auction application above, the topology might look like the following diagram: there are two nodes, ClientNode and WebNode. On WebNode there are three containers: WebServer, AppServer and DataServer. The first two containers belong to the same cluster, WebCluster.

The pxl description of the above diagram is given by this Topology XML element.

<Topology>
  <Node name="ClientNode"  type="client"  CPURatio="1" CPUMultiplicity="2"
        DiskRatio="1" DiskMultiplicity="2"/>
  <Node name="WebNode"  type="server" CPURatio="1" CPUMultiplicity="4"
        DiskRatio="1" DiskMultiplicity="1"/>
  <Node name="DataHost"  type="server" CPURatio="0.1" CPUMultiplicity="4"
        DiskRatio="1" DiskMultiplicity="1"/>

  <Cluster name="WebCluster">
    <Container name="WebServer" runsOnNode="WebNode"
               server="true" multiplicity="50"/>
    <Container name="AppServer" runsOnNode="WebNode"
               server="true" multiplicity="50"/>
  </Cluster>

  <Middleware name="http" fixedOverheadSend="0"
              fixedOverheadReceive="0" overheadPerByteSent="0"/>

  <Network name="Internet" connectsNodes="ClientNode WebNode"
           latency="10"  overheadPerByte="0.01" />
</Topology>

The XML elements (and their attributes) that should appear in a topology are described below:

Node

name
a string that denotes the name of the Node. No spaces are allowed.
CPURatio
A real number that gives the ratio between the CPU speed of the host on which the service demands were measured and the CPU speed of this node. A ratio of 0.5 means that this node's CPU is twice as fast as the CPU on which the measurements were made.
DiskRatio
A real number that gives the ratio between the DISK speed of the host on which the service demands were measured and the DISK speed of this node. A ratio of 0.5 means that this node's DISK is twice as fast as the DISK on which the measurements were made.
type
the type of the Node. A Node can be a client or a server.
CPUMultiplicity
DiskMultiplicity
are used to represent multiple CPUs and DISKs. For example, two identical nodes can be represented as one node with the CPUMultiplicity=2 and DiskMultiplicity=2.
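
The definition of CPURatio and DiskRatio implies a simple scaling of the measured demands. The sketch below, in Python, only illustrates that arithmetic: the function name scaled_demand is hypothetical, and the assumption that the demand on a node equals the measured demand multiplied by the ratio follows from the definition above but is not confirmed as OPERA's internal computation.

```python
def scaled_demand(measured_demand, ratio):
    """Estimate the demand on a node from the demand measured on another host.

    ratio = (speed of the measurement host) / (speed of this node), so a ratio
    of 0.5 (a node twice as fast) halves the time needed. Assumed model only.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return measured_demand * ratio


# A CPUDemand of 8 time units measured on the profiling host:
cpu_on_fast_node = scaled_demand(8, 0.5)  # node twice as fast as the profiling host
cpu_on_same_node = scaled_demand(8, 1)    # node identical to the profiling host
print(cpu_on_fast_node, cpu_on_same_node)
```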

Cluster

name
a string that denotes the name of the Cluster. No spaces are allowed.

A Cluster will have one or more Container elements.

Container

name
a string that denotes the name of the Container. No spaces are allowed.
runsOnNode
the name of the Node on which the Container is running.
server
if the Container is a server, this attribute should be set to true; otherwise it should be false.
multiplicity
the number of threads that run in the Container.

Network

name
a string that denotes the name of the Network. No spaces are allowed.
connectsHosts
a list with the names of the Nodes connected by the network. The names are separated by spaces.
latency
the interval between the time a bit is sent and the time when this bit is received at destination (the time-unit should be the same as in the rest of the document).
overheadPerByte
the time needed to transmit a byte (the time-unit should be the same as in the rest of the document).
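
As a rough illustration of how latency and overheadPerByte combine, the Python sketch below estimates the network time of one message as the latency plus the per-byte time multiplied by the message size. This additive model and the function name network_time are assumptions made for illustration; the document does not state the exact formula OPERA uses. The numbers come from the Internet network and the find scenario in the examples above.

```python
def network_time(num_bytes, latency, overhead_per_byte):
    """Assumed model: fixed latency plus a per-byte transmission time."""
    return latency + num_bytes * overhead_per_byte


# Browser -> RPCRouter call of the find scenario over the Internet network
# (latency="10", overheadPerByte="0.01", bytesSent="514", bytesReceived="575").
request_time = network_time(514, latency=10, overhead_per_byte=0.01)
reply_time = network_time(575, latency=10, overhead_per_byte=0.01)
print(round(request_time, 2), round(reply_time, 2))
```

The results are in the same time unit as latency and overheadPerByte, which is why the whole document must use a single unit.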

Middleware

name
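a string that denotes the name of the Middleware. No spaces are allowed.
fixedOverheadReceive
fixedOverheadSend
the fixed time the middleware adds to each message received or sent, regardless of the message size (the time-unit should be the same as in the rest of the document).
overheadPerByteReceived
overheadPerByteSent
the time the middleware needs to process one byte received or sent (the time-unit should be the same as in the rest of the document).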

4. Workloads

In the Workloads section you define:

kind attribute. Specifies the kind of workload the application is optimized for.
The possible values are:
    HL - High Population Level,
    ML - Medium Population Level and
    LL - Low Population Level.
Currently, only ML is supported.

Users element. The total number of users you want your application to support. It can be any integer greater than 1. This number is used to find the worst response time and the highest utilization across all workload mixes.

Workload mixes define the number of users for each scenario. You can define as many workload mixes as you need. Note that the workload mixes are independent of the Users element described above. A workload mix can treat the system as an open or a closed model. For an open model, set the openModel attribute to "true"; for a closed model, set it to "false". The value of the openModel attribute determines the meaning of the Mix elements.

Mix element. Defines the load of a given scenario through the load attribute. When the system is modeled as an open system (openModel="true"), the load attribute gives the arrival rate for that scenario. When the system is modeled as a closed system (openModel="false"), the load attribute gives the number of users in that scenario. In the latter case, the number of users is complemented by the think times defined below.

Think Times. The think time of each scenario is the time, in milliseconds, that a user is idle between two requests. These values are used together with the Users element defined above and with those workload mixes that refer to closed models.

Example

<Workloads kind="ML">

  <Users>150</Users>

  <WorkloadMixes openModel="false">
    <Mix scenario="find" load="100"/>
    <Mix scenario="makeBid" load="23"/>
    <Mix scenario="createBid" load="27"/>
  </WorkloadMixes>

  <ThinkTimes>
    <ThinkTime scenario="find" time="3000"/>
    <ThinkTime scenario="makeBid" time="3000"/>
    <ThinkTime scenario="createBid" time="3000"/>
  </ThinkTimes>

</Workloads>
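
The rules of this section can be checked mechanically before the model is solved. The following Python sketch is a hypothetical helper, not part of OPERA: check_workloads is an invented name, and the embedded XML is adapted from the example above with kind set to ML, the only value currently supported. It verifies the kind, the Users value, and that every scenario in a closed-model mix has a think time.

```python
import xml.etree.ElementTree as ET

# Adapted from the example above; kind is set to the only supported value.
WORKLOADS = """
<Workloads kind="ML">
  <Users>150</Users>
  <WorkloadMixes openModel="false">
    <Mix scenario="find" load="100"/>
    <Mix scenario="makeBid" load="23"/>
    <Mix scenario="createBid" load="27"/>
  </WorkloadMixes>
  <ThinkTimes>
    <ThinkTime scenario="find" time="3000"/>
    <ThinkTime scenario="makeBid" time="3000"/>
    <ThinkTime scenario="createBid" time="3000"/>
  </ThinkTimes>
</Workloads>
"""


def check_workloads(root):
    """Return a list of human-readable problems found in a Workloads element."""
    problems = []
    if root.get("kind") != "ML":
        problems.append("kind should be ML, the only value currently supported")
    if int(root.findtext("Users")) <= 1:
        problems.append("Users must be an integer greater than 1")
    think = {t.get("scenario") for t in root.findall("./ThinkTimes/ThinkTime")}
    for mixes in root.findall("WorkloadMixes"):
        if mixes.get("openModel") == "false":  # closed model: loads are user counts
            for mix in mixes.findall("Mix"):
                if mix.get("scenario") not in think:
                    problems.append("no ThinkTime for scenario " + mix.get("scenario"))
    return problems


problems = check_workloads(ET.fromstring(WORKLOADS))
print(problems)
```

An empty list means that none of the three checks failed; the sketch does not validate the document against model.dtd.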

5. Requirements

Response time. For each scenario, you specify a lower (minResponseTime) and an upper (maxResponseTime) bound for the targeted response time. The values are real numbers greater than zero. The tool tries to optimize the distribution of the services that can migrate so that the response time of each scenario reaches the lower bound.

Example

<Requirements>
  
  <ResponseTime scenario="find" minResponseTime="1000" maxResponseTime="100000"/>
  <ResponseTime scenario="makeBid" minResponseTime="400" maxResponseTime="100000"/>
  <ResponseTime scenario="createBid" minResponseTime="400" maxResponseTime="100000"/>
  
</Requirements>

6. Results

The results are presented in XML format, and the name of the output file is the name of the input XML file with "Result" appended.

The results contain all the architectures that are better than your initial configuration with regard to the performance requirements. These configurations are found by OPERA while it searches for the best architecture. Below is an example of the results for a recommended architecture.

Example

<Architecture name="0" recommend="true">
  
  ...
  
  <Service MaxUtilization="43.72830303256987"
           runsInContainer="WebServer" name="RPCRouter"/>
  <Service MaxUtilization="36.299644824470704"
           runsInContainer="AppServer" name="EJBItemSession"/>
  
  ...
  
  <Scenario name="find">
    <MaxResponseTime value="2378.3878753673257"/>
    <MinSatisfaction value="0.9860768901478048"/>
  </Scenario>
  
  ...
  
  <Workload users="1">
    
    <Scenario name="find" users="0.009999999776482582">
      <ResponseTime>
        88.90841516804296
      </ResponseTime>
      <Throughput>
        3.2479042660763243E-6
      </Throughput>
    </Scenario>
    
    ...
    
    <Node name="WebHost">
      
      <CPU>
        <Utilization>0.0175581708471691</Utilization>
      </CPU>
      
      <DISK>
        <Utilization>0.001656431211997085</Utilization>
      </DISK>
      
    </Node>
    
  </Workload>
  
  ...
</Architecture>

Service utilization. Replication and activation policies are closely related to the maximum service utilization. The utilization of a process is the sum of the utilizations of its services. In general, processes with a maximum utilization near zero need only be activated on request. Depending on the request arrival distribution, the activation policy should be shared server (one instance of the service is shared by all requests) or server per request (an instance of the service is activated for each request). The latter may be more appropriate if startup overheads are low relative to the resource demands needed to provide the service. Terminating a process permits the allocation of system resources to other low-frequency processes.

In the example above, the MaxUtilization of RPCRouter is 43.7. That means that, on average, WebServer needs 43.7 threads. A safe design is to set the number of threads higher than the value predicted by OPERA. Following the same logic, the number of threads for the EJB container (AppServer) should be greater than the MaxUtilization of the EJBItemSession service, which is 36.3.
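
The sizing rule above can be written down as a small Python sketch. The function name suggested_threads and the 25% headroom are assumptions chosen for illustration; OPERA itself only reports MaxUtilization, and the right safety margin depends on your application.

```python
import math


def suggested_threads(max_utilization, headroom=0.25):
    """Round MaxUtilization up after adding a safety margin (assumed 25%)."""
    return math.ceil(max_utilization * (1 + headroom))


# MaxUtilization values taken from the Result example above.
web_threads = suggested_threads(43.72830303256987)   # RPCRouter in WebServer
app_threads = suggested_threads(36.299644824470704)  # EJBItemSession in AppServer
print(web_threads, app_threads)
```

If a container hosts several services, their MaxUtilization values would be added before applying the margin, in line with the statement that the utilization of a process is the sum of the utilizations of its services.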

Extreme metrics. These metrics give the worst-case values of two measures across all the workload mixes possible for the number of users defined by the Users element. If the total number of users is 100, their distribution across the scenarios can be arbitrary: in our sample, 98 users might be running the first scenario, 1 user the second and 1 user the third. There is an extremely large number of ways in which the users can be distributed across the scenarios. OPERA finds the worst case across all mixes.

  • MaxResponseTime gives the maximum response time per scenario, across all workload mixes.

  • MinSatisfaction is a measure between 0 and 1 that gives the degree to which the required response time of each scenario is fulfilled. 1 means that the response time of the scenario is less than or equal to the lower response time limit set in the requirements. 0 means that the response time of the scenario is beyond the upper limit set in the requirements. The optimization algorithm tries to maximize the minimum satisfaction over all scenarios.
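
The document defines the satisfaction only at its two ends. The Python sketch below assumes a linear interpolation between the lower and the upper limit; this is an assumption, not a documented formula, although it reproduces the MinSatisfaction value of the find scenario in the examples above (minResponseTime="1000", maxResponseTime="100000", MaxResponseTime 2378.39). The function name satisfaction is hypothetical.

```python
def satisfaction(response_time, min_rt, max_rt):
    """Assumed satisfaction: 1 at or below min_rt, 0 at or above max_rt,
    and a straight line in between."""
    if response_time <= min_rt:
        return 1.0
    if response_time >= max_rt:
        return 0.0
    return (max_rt - response_time) / (max_rt - min_rt)


# Values of the find scenario from the Requirements and Result examples.
find_satisfaction = satisfaction(2378.3878753673257, 1000, 100000)
print(round(find_satisfaction, 4))
```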

Workload. For the workloads declared in the Workloads XML element of the pxl file, OPERA predicts the response time (ResponseTime XML element) and the throughput (Throughput XML element) of each scenario. For each workload, the CPU and DISK utilizations of the nodes are also reported.