Autoscaling in OpenStack Using Heat and Ceilometer, part 2

Posted: January 2, 2015 in Cloud Computing, Cool Projects, Heat, OpenStack, Software

After a long delay (I was moving into a new house and work keeps me very busy), here is the second part of my post on creating scale-out workloads in OpenStack using Heat and Ceilometer.  In part one, we broke down the different parts of the Heat template that we will be using in this part of the post.  We also covered how I had the images and software repos configured to support the WordPress website the template will be deploying.  In this part, we will deploy the application, or stack as it is called in OpenStack lingo, and look at different ways to monitor it to see what is going on.  In case you need the template, here is a copy that you can use:

heat_template_version: 2013-05-23

description: >
  HOT template to deploy two servers (database and WordPress web server) into an
  existing neutron tenant network and create a load balancer.  The load balancer
  will be assigned a floating IP address and balance traffic to the available
  web servers.  Depending on the CPU utilization, the number of web servers online
  will be scaled up or down.

parameter_groups:
- label: configuration_data
  description: These items pertain to the configuration of the instances.
  parameters:
  - key_name
  - image
  - flavor
- label: database_parameters
  description: These values are used to configure the database.
  parameters:
  - db_name
  - db_username
  - db_password
  - db_root_password
- label: network_parameters
  description: These values are used to configure the load balancer and the instances.
  parameters:
  - public_net_id
  - private_net_id
  - private_subnet_id

parameters:
  key_name:
    type: string
    description: Name of keypair to assign to servers
  image:
    type: string
    description: Name of image to use for servers
    default: rhel6.5-x86_64
    constraints:
    - allowed_values: [ rhel6.5-x86_64, rhel7-x86_64 ]
      description: Image ID must be either rhel6.5-x86_64 or rhel7-x86_64
  flavor:
    type: string
    description: Flavor to use for servers
    default: m1.small
    constraints:
    - allowed_values: [ scale, m1.small, m1.medium ]
  db_name:
    type: string
    description: WordPress database name
    default: wordpress
    constraints:
    - length: { min: 1, max: 64 }
      description: db_name must be between 1 and 64 characters
    - allowed_pattern: '[a-zA-Z][a-zA-Z0-9]*'
      description: >
        db_name must begin with a letter and contain only alphanumeric
        characters
  db_username:
    type: string
    description: The WordPress database admin account username
    default: admin
    hidden: true
    constraints:
    - length: { min: 1, max: 16 }
      description: db_username must be between 1 and 16 characters
    - allowed_pattern: '[a-zA-Z][a-zA-Z0-9]*'
      description: >
        db_username must begin with a letter and contain only alphanumeric
        characters
  db_password:
    type: string
    description: The WordPress database admin account password
    default: admin
    hidden: true
    constraints:
    - length: { min: 1, max: 41 }
      description: db_password must be between 1 and 41 characters
    - allowed_pattern: '[a-zA-Z0-9]*'
      description: db_password must contain only alphanumeric characters
  db_root_password:
    type: string
    description: Root password for MySQL
    default: admin
    hidden: true
    constraints:
    - length: { min: 1, max: 41 }
      description: db_root_password must be between 1 and 41 characters
    - allowed_pattern: '[a-zA-Z0-9]*'
      description: db_root_password must contain only alphanumeric characters
  public_net_id:
    type: string
    description: ID of public network from which floating IP addresses will be allocated
    default: 57177826-f330-4aae-b5ad-822356d3a906
  private_net_id:
    type: string
    description: ID of private network into which servers get deployed
    default: 967938a8-78ae-49b3-bef6-e293e9a6751d
  private_subnet_id:
    type: string
    description: ID of private subnet into which servers get deployed
    default: b7cb50e6-b650-46ce-a891-66b858761a2d

resources:
  wp_dbserver:
    type: OS::Nova::Server
    properties:
      name: wp_dbserver
      image: { get_param: image }
      flavor: { get_param: flavor }
      key_name: { get_param: key_name }
      networks:
        - port: { get_resource: wp_dbserver_port }
      user_data:
        str_replace:
          template: |
            #!/bin/bash -v

            lokkit --port=3306:tcp

            cat << EOF > /etc/yum.repos.d/local.repo
            [wordpress]
            name=WordPress for Enterprise Linux 6
            baseurl=http://192.168.0.1/repos/wordpress_rhel6
            enabled=1
            gpgcheck=0
            EOF

            yum -y install mysql mysql-server
            chkconfig mysqld on
            service mysqld start

            # Setup MySQL root password and create a user
            mysqladmin -u root password db_rootpassword
            cat << EOF | mysql -u root --password=db_rootpassword
            CREATE DATABASE db_name;
            GRANT ALL PRIVILEGES ON db_name.* TO "db_user"@"%" IDENTIFIED BY "db_password";
            FLUSH PRIVILEGES;
            EXIT
            EOF
          params:
            db_rootpassword: { get_param: db_root_password }
            db_name: { get_param: db_name }
            db_user: { get_param: db_username }
            db_password: { get_param: db_password }

  web_server_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 3
      resource:
        type: http://192.168.0.1/lb_server.yaml
        properties:
          flavor: {get_param: flavor}
          image: {get_param: image}
          key_name: {get_param: key_name}
          pool_id: {get_resource: pool}
          metadata: {"metering.stack": {get_param: "OS::stack_id"}}
          user_data:
            str_replace:
              template: |
                #!/bin/bash -v

                lokkit --service=http                

                cat << EOF > /etc/yum.repos.d/local.repo
                [wordpress]
                name=WordPress for Enterprise Linux 6
                baseurl=http://192.168.0.1/repos/wordpress_rhel6
                enabled=1
                gpgcheck=0
                EOF

                yum -y install httpd wordpress
                setsebool -P httpd_can_network_connect=1
                chkconfig httpd on
                service httpd start

                sed -i "/Deny from All/d" /etc/httpd/conf.d/wordpress.conf
                sed -i "s/Require local/Require all granted/" /etc/httpd/conf.d/wordpress.conf
                sed -i s/database_name_here/db_name/ /etc/wordpress/wp-config.php
                sed -i s/username_here/db_user/ /etc/wordpress/wp-config.php
                sed -i s/password_here/db_password/ /etc/wordpress/wp-config.php
                sed -i s/localhost/db_address/ /etc/wordpress/wp-config.php

                service httpd restart
              params:
                db_name: { get_param: db_name }
                db_user: { get_param: db_username }
                db_password: { get_param: db_password }
                db_address: { get_attr: [ wp_dbserver, first_address ] }

  wp_dbserver_port:
    type: OS::Neutron::Port
    properties:
      network_id: { get_param: private_net_id }
      fixed_ips:
        - subnet_id: { get_param: private_subnet_id }

  web_server_scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: web_server_group}
      cooldown: 60
      scaling_adjustment: 1

  web_server_scaledown_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: web_server_group}
      cooldown: 60
      scaling_adjustment: -1

  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-up if the average CPU > 50% for 1 minute
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 50
      alarm_actions:
        - {get_attr: [web_server_scaleup_policy, alarm_url]}
      matching_metadata: {'metadata.user_metadata.stack': {get_param: "OS::stack_id"}}
      comparison_operator: gt

  cpu_alarm_low:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale-down if the average CPU < 15% for 10 minutes
      meter_name: cpu_util
      statistic: avg
      period: 600
      evaluation_periods: 1
      threshold: 15
      alarm_actions:
        - {get_attr: [web_server_scaledown_policy, alarm_url]}
      matching_metadata: {'metadata.user_metadata.stack': {get_param: "OS::stack_id"}}
      comparison_operator: lt

  lb_vip_port:
    type: OS::Neutron::Port
    properties:
      network_id: { get_param: private_net_id }
      fixed_ips:
        - subnet_id: { get_param: private_subnet_id }

  lb_vip_floating_ip:
    type: OS::Neutron::FloatingIP
    properties:
      floating_network_id: { get_param: public_net_id }
      port_id: { get_resource: lb_vip_port }

  lb_pool_vip:
    type: OS::Neutron::FloatingIPAssociation
    properties:
      floatingip_id: { get_resource: lb_vip_floating_ip }
      port_id: { 'Fn::Select': ['port_id', {get_attr: [pool, vip]}]}

  monitor:
    type: OS::Neutron::HealthMonitor
    properties:
      type: TCP
      delay: 3
      max_retries: 5
      timeout: 5

  pool:
    type: OS::Neutron::Pool
    properties:
      protocol: HTTP
      monitors: [{get_resource: monitor}]
      subnet_id: {get_param: private_subnet_id}
      lb_method: ROUND_ROBIN
      vip:
        protocol_port: 80
        ## session_persistence:
        ##   type: SOURCE_IP

  lb:
    type: OS::Neutron::LoadBalancer
    properties:
      protocol_port: 80
      pool_id: {get_resource: pool}

outputs:
  WebsiteURL:
    description: URL of the WordPress website
    value:
      str_replace:
        template: http://host/wordpress
        params:
          host: { get_attr: [lb_vip_floating_ip, floating_ip_address] }
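
Before launching anything, it is worth letting Heat check the template for syntax problems. A minimal sanity check from the command line looks something like this (wordpress-autoscale.yaml is simply what I am assuming you saved the template as; substitute your own file name):

# Ask Heat to validate the template without creating anything
heat template-validate -f wordpress-autoscale.yaml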

Logging Into OpenStack and Creating a Stack

Once you have the template, images, and software repos configured and available, it is time to log into OpenStack.  I will again be using the Red Hat version of OpenStack (full disclosure: I work for them) to run through this demo.  Once you are logged in, navigate to the “Stacks” menu item under Orchestration.

stacks

To create our application stack, click the Launch Stack button located in the upper right-hand corner.  A new pop-up window will appear.  Select “File” for Template Source and browse to the location where the template is saved.  We do not have an environment file for this stack, so it is safe to ignore the last two fields.  When done, click Next to continue.

stack-inputs

The next screen asks for the values that were defined in the parameters section of the template file.  Several of them are already filled in, such as the network and image names/UUIDs, but they might need adjusting for your environment.  Network information can be gathered from the OpenStack Neutron API by running the command “neutron net-list” as shown below.  Don’t forget to source your OpenStack user environment variables.  Also, be sure that you do not confuse the network ID (leftmost column) with the subnet ID (rightmost column).

neutron-net-list
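
For reference, the commands behind that screenshot look roughly like this (keystonerc_admin is how my Red Hat installation names the credentials file, so treat that name as an assumption):

# Load the OpenStack credentials into the shell environment
source ~/keystonerc_admin

# List the networks; the id column holds the network IDs and the subnets
# column holds the subnet IDs used for private_subnet_id
neutron net-list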

Once the form is filled out with the remaining values, such as the stack name, passwords, and SSH key name, click the Launch button at the bottom of the form to launch the application.  If everything went well, you will be greeted with a message stating that the stack was launched successfully, and your stack will be listed with a status of “In Progress”.

launch_in_progress
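
If you prefer the command line to Horizon, the same stack can be launched with the heat client.  This is only a sketch: the stack name, template file name, and key pair name below are assumptions from my environment, and any parameter not passed with -P falls back to the default defined in the template.

# Launch the stack from the CLI instead of the Horizon form
heat stack-create wordpress -f wordpress-autoscale.yaml \
  -P "key_name=mykey;image=rhel6.5-x86_64;flavor=m1.small"

# Watch the build progress
heat stack-list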

Navigating the OpenStack interface to the Instances menu item will show that two instances are being created.  One is for the database, which will not be scaled, and the other is for the web server, which is configured in the Heat template to scale automatically when the CPU gets busy.  The web server has a unique name assigned to it, but if you look closely, you can see that it is part of the WordPress application that we are deploying.

wp-instances
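
The same view is available from the nova client (assuming the stack was named wordpress, the scaled web server picks up a generated name that includes the stack name):

# Both the database server and the first web server should be listed here
nova list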

In addition to the instances, Heat also orchestrated the creation of a load balancer so that, as the application scales, requests can be balanced across all of the web servers Heat has created.  You can click through the different sections of the Load Balancers area of the Horizon interface to get an understanding of what was created and configured to support the application.

lb
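
The load balancer pieces can also be inspected with the neutron LBaaS v1 commands, which is the API this template is driving:

# Pool, VIP, members, and health monitor created by the stack
neutron lb-pool-list
neutron lb-vip-list
neutron lb-member-list
neutron lb-healthmonitor-list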

Going back to the Stacks menu item, click on the name of the WordPress stack that we just deployed.  Each of the bubbles in the topology view represents one of the resources listed in the template.  The Resources tab provides the same information in a different format, hyperlinked to pages that contain additional information about each resource.  Notice the alarms that are listed…we will talk more about them shortly.

stack-info
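
The heat client exposes the same resource view, which is handy when a resource fails and you want to find out why (I am again assuming the stack was named wordpress):

# Every resource from the template, with its type and current status
heat resource-list wordpress

# The event stream is the first place to look when something does not come up cleanly
heat event-list wordpress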

For now, click on the link for the WordPress website.  This link was generated in the output section of the Heat template and gives us easy access to the application that was created.  If everything goes well, you should see the WordPress configuration screen shown below.  The IP address used to access the page is the VIP (virtual IP) of the load balancer that we configured.

wp-screen

Causing a Load and Watching Alarms

Earlier I mentioned the alarms that were created when the application was deployed.  Let’s take a look at those.  To do so, we can use the Ceilometer API to get information on the alarm configurations.  As you can see below, the command “ceilometer alarm-list” displays quite a bit of information about each alarm, including its current state.  The state can be OK, Alarm, or Insufficient Data.

alarmstate-1
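
You can also pull the raw meter that the alarms evaluate.  The cpu_util meter is gathered per instance, and the 60-second period below matches the scale-up alarm defined in the template:

# Alarm state will be ok, alarm, or insufficient data
ceilometer alarm-list

# CPU utilization statistics in 60-second buckets
ceilometer statistics -m cpu_util -p 60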

Since we just deployed the application, our alarm is displaying the Insufficient Data state.  Let’s do something about that…

One easy way to artificially increase the load on a server is to generate lots of random numbers.  To do this, I am going to log into my WordPress web server instance and run the command “dd if=/dev/urandom of=/dev/null &” about 6-7 times.  This should significantly increase the CPU load, which is what Ceilometer is monitoring based on the alarm definitions we included in our Heat template.

dd-running
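
A small loop makes this less tedious.  It is the same dd trick, just wrapped in a for loop with a matching cleanup command for later:

# Start six dd processes to drive CPU utilization up
for i in $(seq 1 6); do
    dd if=/dev/urandom of=/dev/null &
done

# When you are done testing, stop them again
pkill -f 'dd if=/dev/urandom'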

You can continue to monitor the output of the “ceilometer alarm-list” command and watch for the alarm to switch states (it can take a few minutes), or you can tail the Ceilometer logs to keep an eye on what Ceilometer is doing.  Every 2 minutes, Ceilometer gathers information about the performance of the virtual machine instances.  After a few minutes, you will see a log entry stating that the alarm has triggered the scale-up policy of our Heat template.  If you were monitoring the alarm list, you would see the “cpu_alarm_high” alarm go into the alarm state.

alarm
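
Two easy ways to watch this happen from the controller node (the log path is where my Red Hat installation keeps it, so treat it as an assumption):

# Re-run alarm-list every 30 seconds and watch the state column
watch -n 30 ceilometer alarm-list

# Follow the alarm evaluator as it checks the cpu_util statistics
tail -f /var/log/ceilometer/alarm-evaluator.log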

This will cause OpenStack to create another web server instance and add it to the load balancer that we explored earlier.

two-instances
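
You can confirm that the new instance was also registered with the load balancer; a second member should appear in the pool alongside the original web server:

# The new web server shows up here once it has been built
nova list

# ...and it is added as a pool member shortly afterwards
neutron lb-member-list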

At this point, I stopped the dd commands that were running in the first web server instance.  If you let them run for a while longer, more web servers will be created, up to the maximum of three set in the template.  The nice thing about waiting for more instances to be created is that you can then watch the application scale back down.  The scale-down policy is in the Heat template, and I will leave that for you to explore if you want to see it in action.

You can go back and check your WordPress configuration page; it should still be working as well.  To show that the load balancer is doing its job, let’s open a couple of console windows to the web servers and refresh the page a few times.  What you should see is that some of the requests are handled by one server and the other requests are handled by the other web server instance.  See below for how it looks with my two instances.  Notice that the requests were made at the same time, proving that OpenStack is balancing the load, albeit without any kind of session persistence, since our Heat template did not enable it.

balanced2

balanced1

Conclusion

I hope this helps you understand some of the power of OpenStack and how Heat and Ceilometer work together to scale workloads on demand.  Where to go from here?  Try creating your own scale-out application, or see if you can get another template from GitHub working in your OpenStack environment.  Maybe you already have an application that you can modify to deploy with a Heat template instead of an image template, scaling out as required to meet demand.  Good luck!

Comments
  1. sara says:

    Thank you very much for this complete tutorial (parts 1 & 2). I have one question that I hope you can answer. Are the Ceilometer alarms fixed for each VM, a group of VMs, or the tenant? I need to scale each VM on the basis of CPU, RAM, and network workloads, and I need the autoscaling to be done independently for each VM, even if they belong to the same stack/tenant.

    • tedbrunell says:

      The alarms are associated with a scale-up/scale-down policy, and the policy is associated with a group of VMs defined by the OS::Heat::AutoScalingGroup resource.

      I’ve never tried scaling more than one type of VM in a stack before (maybe I should), but it makes sense that if you have more than one type of VM in your stack that needs to scale, you would need double the number of alarms and policies, with each one associated with the appropriate parts of the stack. Another approach might be to put each VM in the same autoscaling group as a different resource and scale both that way. The issue there is that they would scale together and not independently. Maybe that is what you are after, though?

      Thanks for your feedback!

  2. OpenstackUser says:

    Excellent guide! You have helped me a lot. I have created the autoscaling group, and scale up/down works fine! However, the health monitor changes the status of my servers to inactive. It takes a lot of time (5 minutes) for them to be ready, and I guess the requests expire. I have set the delay and timeout values of the health monitor really high (6000), and yet the health monitor assumes that the instances are inactive. Is there any way to declare a dependency for the health monitor so that it starts checking their responses after they are ready?

    • tedbrunell says:

      Thanks for the feedback on the post. The health monitor should only display inactive for instances that are not yet fully running. Are you seeing the monitor turn currently active servers to inactive? What type of monitor are you using? Can you reply with your health monitor configuration block?

      • OpenstackUser says:

        When feeding the stack to the engine, I have a health monitor sending HTTP GET requests. While the first instance is still booting, the health monitor evaluation fails, so the monitor changes its status to inactive. After the instance boots, the monitor still believes the instance is inactive.
        I also get the message from haproxy that no backend server is available.
        However, the instance is fully functioning. If I send a request to its floating IP, I get a response.
        Here is the monitor configuration:

        monitor:
          type: OS::Neutron::HealthMonitor
          properties:
            type: HTTP
            delay: 6000
            max_retries: 10
            timeout: 6000
            http_method: GET
            url_path: /myapp

        With a ridiculously high delay and timeout.

      • tedbrunell says:

        Your delay is incredibly high. Delay is the amount of time between checks, so setting it to 6000 seconds (100 minutes) creates a very long gap between checks and may be the cause of the lag between the server coming online and the health monitor marking it active. I would try a lower value, something in the 3-10 second range, and see how that goes.

  3. OpenstackUser says:

    It worked… It seems that the first time I tried it, it didn’t work, so I thought that the delay was in seconds… Damn, I don’t know how I managed to get stuck on this.
    Thanks a lot for the replies!

  4. ichi-the-one says:

    Excellent tutorial. I found it very useful, as it helped me understand everything written in the Heat template. So thank you very much.

    Actually, I have one request. Could you please provide the steps you took to set up your environment? I mean setting up the web server to host the repo and lb_server.yaml. As I’m a newbie to all this, it would help me a lot in getting this demo working, and I’m sure there are many people like me who would appreciate it as much as I do.
