Monday, July 29, 2013

Using CloudMonkey to Automate CloudStack Operations

Background:


The CloudStack GUI does not suit repetitive tasks.  There is no macro mechanism in the GUI that lets an admin record and replay long workflows.  Multi-step tasks such as setting up a new zone or registering a template must be done by hand, which is error-prone.

Developers can automate CloudStack workflows with the CloudMonkey tool.  CloudMonkey provides a means of making CloudStack API calls from the command line, and thus from a script.

Problem:


The GUI does not tell you which API calls and parameters it is using for a task.  This makes it difficult to reproduce the same functionality in a CloudMonkey script.

Solution:


Parse the management server log file to see the sequence of commands executed during a GUI task.  Once the commands and parameters are known, reconstruct the steps in CloudMonkey.

Parse the CloudStack log file:

The management server logs the beginning and end of all API calls in a log file.  In the case of a development system, the log file is usually the file vmops.log in the root of the source tree. 

Use grep to obtain a list of API call log entries:

 grep 'command=' vmops.log > all_api_logs.txt 

The result is quite raw and requires additional cleanup.  E.g.:
  root@mgmtserver:~/github/cshv3# grep 'command=' vmops_createtmplt_sh_problem.log > all_api_calls.txt   
  root@mgmtserver:~/github/cshv3# cat all_api_calls.txt   
  ...   
  2013-07-17 08:59:50,522 DEBUG [cloud.api.ApiServlet] (343904103@qtp-1389504071-7:null) ===START=== 10.70.176.29 -- GET command=listCapabilities&response=json&sessionkey=null&_=1374047990517   
  2013-07-17 08:59:50,540 DEBUG [cloud.api.ApiServlet] (343904103@qtp-1389504071-7:null) ===END=== 10.70.176.29 -- GET command=listCapabilities&response=json&sessionkey=null&_=1374047990517   
  ...   

Next, remove uninteresting log entries using sed:

 sed -e '/^.*command=log/d; /^.*===END===/d; /^.*command=queryAsyncJobResult/d' all_api_logs.txt > ./reqd_api_logs.txt
How does this work?

Using the -e parameter, we pass sed a list of commands separated by semicolons.  The meaning of each command is as follows:

/^.*command=log/d deletes login and logout commands.

/^.*===END===/d removes the second log message for a call, which is made at the end of the API call.

/^.*command=queryAsyncJobResult/d removes the polling commands that the GUI uses to determine whether an asynchronous command has completed.  We will use CloudMonkey in blocking mode, which means it will make the queryAsyncJobResult calls for us.
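As a rough illustration of the two filtering steps, the sketch below runs them on a tiny sample log.  The two listCapabilities lines are taken from the excerpt above; the login, queryAsyncJobResult and non-API lines, as well as the scratch directory, are made up for this example.

```shell
# Build a small sample log in a scratch directory (paths are illustrative).
workdir=$(mktemp -d)
cat > "$workdir/vmops.log" <<'EOF'
2013-07-17 08:59:50,522 DEBUG [cloud.api.ApiServlet] (343904103@qtp-1389504071-7:null) ===START=== 10.70.176.29 -- GET command=listCapabilities&response=json&sessionkey=null&_=1374047990517
2013-07-17 08:59:50,540 DEBUG [cloud.api.ApiServlet] (343904103@qtp-1389504071-7:null) ===END=== 10.70.176.29 -- GET command=listCapabilities&response=json&sessionkey=null&_=1374047990517
2013-07-17 09:00:01,000 DEBUG [cloud.api.ApiServlet] (hypothetical-thread:null) ===START=== 10.70.176.29 -- POST command=login
2013-07-17 09:00:02,000 DEBUG [cloud.api.ApiServlet] (hypothetical-thread:null) ===START=== 10.70.176.29 -- GET command=queryAsyncJobResult&jobId=hypothetical-job-id&response=json
2013-07-17 09:00:03,000 DEBUG [hypothetical.Component] a line that is not an API call
EOF

# Step 1: keep only the API call log entries.
grep 'command=' "$workdir/vmops.log" > "$workdir/all_api_logs.txt"

# Step 2: drop login/logout, ===END=== and polling entries.
sed -e '/^.*command=log/d; /^.*===END===/d; /^.*command=queryAsyncJobResult/d' \
    "$workdir/all_api_logs.txt" > "$workdir/reqd_api_logs.txt"

all_count=$(grep -c '' "$workdir/all_api_logs.txt")
kept_count=$(grep -c '' "$workdir/reqd_api_logs.txt")
kept_lines=$(cat "$workdir/reqd_api_logs.txt")
echo "API log lines: $all_count, kept after filtering: $kept_count"

rm -rf "$workdir"
```

Only the ===START=== entry for listCapabilities survives.  Note that the first pattern deletes any command whose name begins with "log", not just login and logout.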

Next, convert log entries to commands:

 sed -e 's/^.*command=//; s/&/ /g; s/_=.*//; s/sessionkey=[^ ]*//; s/response=[^ ]*//' ./reqd_api_logs.txt > ./encoded_api_calls.txt

How does this work?

s/^.*command=// removes from start of line to and including "command=".  We want everything after command=, because that is the actual command.

s/&/ /g replaces the '&' used to separate arguments in the API call with a space.  It's more readable, and CloudMonkey wants us to separate parameters with a space.

s/_=.*// removes the 'cache buster' that prevents network infrastructure from responding to the HTTP request with a cached result.

s/sessionkey=[^ ]*// removes the session key.  CloudMonkey uses API keys.  Besides, the sessionkey will have expired by now!

s/response=[^ ]*// removes the response encoding parameter from the request.  CloudMonkey will insert a suitable version of this parameter automatically.
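To see the effect of these substitutions on a single entry, here is a small sketch.  The function name log_to_call is hypothetical, and the sample line is assembled for illustration from the log prefix shown earlier and the createPhysicalNetwork parameters used later in this post; it is not a real log entry.

```shell
# Convert logged request lines (read from stdin) into bare API calls.
log_to_call() {
  sed -e 's/^.*command=//; s/&/ /g; s/_=.*//; s/sessionkey=[^ ]*//; s/response=[^ ]*//'
}

# Hypothetical log line built from pieces that appear elsewhere in this post.
sample='2013-07-17 08:59:50,522 DEBUG [cloud.api.ApiServlet] (343904103@qtp-1389504071-7:null) ===START=== 10.70.176.29 -- GET command=createPhysicalNetwork&zoneid=28444ba3-1405-4872-b23c-015cf5116415&name=Physical%20Network%201&isolationmethods=VLAN&response=json&sessionkey=null&_=1374047990517'

call=$(printf '%s\n' "$sample" | log_to_call)

# Brackets make any leftover trailing spaces visible.
printf '[%s]\n' "$call"
```

The call comes out with a few trailing spaces where the removed parameters used to be.  The quoting step that follows collapses them into one.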

Next, enclose parameter values in single and double quotes:

 sed -e 's/ \+/ /g; s/=/='"'"'"/g; s/ /"'"'"' /g; s/"'"'"'//' ./encoded_api_calls.txt > delimited_encoded_api_calls.txt  

We want to put double quotes around parameter values before converting from URL encoding to strings.  This will preserve the whitespace after decoding.  We also add single quotes.  The single quotes prevent the bash shell from removing the double quotes when we put these commands in a script.
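The difference the quotes make can be checked directly in the shell.  In this sketch, show_args is a hypothetical helper that reports how many arguments it received and what the first one looked like.

```shell
# Print the number of arguments received and the first argument as received.
show_args() { printf '%s|%s\n' "$#" "$1"; }

both_quotes=$(show_args name='"Physical Network 1"')  # single quotes around double quotes
double_only=$(show_args name="Physical Network 1")    # double quotes only
no_quotes=$(show_args name=Physical Network 1)        # no quotes at all

printf '%s\n' "$both_quotes" "$double_only" "$no_quotes"
# prints:
# 1|name="Physical Network 1"
# 1|name=Physical Network 1
# 3|name=Physical
```

With both kinds of quotes the value arrives as one argument and still carries its double quotes.  With double quotes only, bash strips them, and with no quotes the value is split at each space.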

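The sed commands look complex because bash cannot hold a single quote inside a single-quoted string.  Each literal single quote is therefore written as '"'"': close the single-quoted string, add a double-quoted single quote, then reopen the single-quoted string.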

s/ \+/ /g converts one or more spaces to a single space.

s/=/='"'"'"/g converts equals (=) to equals, single quote, double quote ( ='" )

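s/ /"'"'"' /g converts each space to double quote, single quote, space ( "'  ).  The last parameter is closed only if the line ends with a space, which is normally the case because the parameters removed earlier leave trailing spaces behind.

s/"'"'"'// removes the first double quote, single quote pair, which the previous command left after the command name.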
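To see what sed actually receives once bash has processed the quotes, the script can be stored in a variable and printed.  This is only a sketch: the variable names are made up, and \+ is a GNU sed extension, so it assumes GNU sed.

```shell
# The sed script exactly as typed on the command line.
script='s/ \+/ /g; s/=/='"'"'"/g; s/ /"'"'"' /g; s/"'"'"'//'

# What sed sees after bash has joined the quoted pieces.
printf '%s\n' "$script"

# Apply it to the sample call from this post (note the trailing spaces).
quoted=$(printf '%s\n' 'createPhysicalNetwork zoneid=28444ba3-1405-4872-b23c-015cf5116415 name=Physical%20Network%201 isolationmethods=VLAN  ' | sed -e "$script")
printf '[%s]\n' "$quoted"
```

The first printf shows that each '"'"' sequence collapses to a single quote character before sed runs.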

Using the command above,
 createPhysicalNetwork zoneid=28444ba3-1405-4872-b23c-015cf5116415 name=Physical%20Network%201 isolationmethods=VLAN  

has all parameters enclosed in '" ... "', e.g.
 createPhysicalNetwork zoneid='"28444ba3-1405-4872-b23c-015cf5116415"' name='"Physical%20Network%201"' isolationmethods='"VLAN"'  

If you don't need the single quotes, just use the command below to insert your quotes.
 sed -e 's/ \+/ /g; s/=/="/g; s/ /" /g; s/"//' ./encoded_api_calls.txt > delimited_encoded_api_calls.txt


Finally, remove URL encoding from the parameters:

The parameters for our commands are URL encoded.  E.g.
 root@mgmtserver:~/github/cshv3# cat delimited_encoded_api_calls.txt  
 ...  
 addImageStore name="AWS+S3" provider="S3" details%5B0%5D.key="accesskey" details%5B0%5D.value="my_access_key" 
 details%5B1%5D.key="secretkey" details%5B1%5D.value="my_secret_key" details%5B2%5D.key="bucket" details%5B2%5D.value="cshv3eu" details%5B3%5D.key="usehttps"   
 details%5B3%5D.value="true" details%5B4%5D.key="endpoint" details%5B4%5D.value="s3.amazonaws.com"  
 ...  

You can decode them with the following (source):
 sed -e 's/+/ /g; s/\%0[dD]//g' delimited_encoded_api_calls.txt | awk '/%/{while(match($0,/\%[0-9a-fA-F][0-9a-fA-F]/)){$0=substr($0,1,RSTART-1)sprintf("%c",0+("0x"substr($0,RSTART+1,2)))substr($0,RSTART+3);}}{print}' > decoded_api_calls.txt   

This restores whitespace and punctuation.  E.g.
 root@mgmtserver:~/github/cshv3# cat decoded_api_calls.txt  
 ...  
 addImageStore name="AWS S3" provider="S3" details[0].key="accesskey" details[0].value="my_access_key" 
 details[1].key="secretkey" details[1].value="my_secret_key" details[2].key="bucket" details[2].value="cshv3eu" details[3].key="usehttps"   
 details[3].value="true" details[4].key="endpoint" details[4].value="s3.amazonaws.com"  
 ...  
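Whether the "0x" string-to-number conversion in the one-liner works depends on the awk implementation; as far as I know, gawk only treats such strings as hexadecimal when run with --non-decimal-data.  The sketch below is an alternative that avoids the conversion by looking up each hex digit explicitly.  The function name urldecode is my own, and the code has only been reasoned through for ASCII characters.

```shell
# Decode URL-encoded text read from stdin, using POSIX awk features only.
urldecode() {
  awk '
    # Value of one hex digit: position in the digit string, minus one.
    function hexval(c) { return index("0123456789abcdef", tolower(c)) - 1 }
    {
      gsub(/\+/, " ")        # "+" encodes a space
      gsub(/%0[dD]/, "")     # drop encoded carriage returns
      out = ""
      while (match($0, /%[0-9a-fA-F][0-9a-fA-F]/)) {
        code = hexval(substr($0, RSTART + 1, 1)) * 16 + hexval(substr($0, RSTART + 2, 1))
        # Move decoded text into out so it is never scanned a second time.
        out = out substr($0, 1, RSTART - 1) sprintf("%c", code)
        $0 = substr($0, RSTART + 3)
      }
      print out $0
    }'
}

decoded=$(printf '%s\n' 'addImageStore name="AWS+S3" provider="S3" details%5B0%5D.key="accesskey"' | urldecode)
printf '%s\n' "$decoded"
```

Usage would be along the lines of urldecode < delimited_encoded_api_calls.txt > decoded_api_calls.txt.  Because decoded text is set aside as it is produced, an encoded percent sign such as %2541 decodes to %41 rather than being decoded twice.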

Set up CloudMonkey:


Install CloudMonkey

Be careful not to use an out-of-date, community-maintained package.  The version of CloudMonkey being installed is listed at install time.  E.g.

 root@mgmtserver:~/github/cshv3# apt-get install python-pip  
 Reading package lists... Done  
 Building dependency tree  
 ...  
 root@mgmtserver:~/github/cshv3# pip install cloudmonkey  
 Downloading/unpacking cloudmonkey  
  Downloading cloudmonkey-4.1.0-1.tar.gz (60Kb): 60Kb downloaded  
  Running setup.py egg_info for package cloudmonkey  
 root@mgmtserver:~/github/cshv3# which cloudmonkey  
 /usr/local/bin/cloudmonkey  

If you are a developer, use the instructions on the CloudMonkey wiki to build the latest version.  E.g.
 root@mgmtserver:~/github/cshv3# cd tools/cli  
 root@mgmtserver:~/github/cshv3/tools/cli# mvn clean install -P developer  
 [INFO] Scanning for projects...  
 [INFO]  
 [INFO] ------------------------------------------------------------------------  
 [INFO] Building Apache CloudStack cloudmonkey cli 4.2.0-SNAPSHOT  
 [INFO] ------------------------------------------------------------------------  
 [INFO]  
 ...  
 [INFO] --- maven-install-plugin:2.3.1:install (default-install) @ cloud-cli ---  
 [INFO] Installing /root/github/cshv3/tools/cli/pom.xml to /root/.m2/repository/org/apache/cloudstack/cloud-cli/4.2.0-SNAPSHOT/cloud-cli-4.2.0-SNAPSHOT.pom  
 [INFO] ------------------------------------------------------------------------  
 [INFO] BUILD SUCCESS  
 [INFO] ------------------------------------------------------------------------  
 [INFO] Total time: 5.190s  
 [INFO] Finished at: Mon Jul 22 22:33:01 BST 2013  
 [INFO] Final Memory: 16M/154M  
 [INFO] ------------------------------------------------------------------------  
 root@mgmtserver:~/github/cshv3/tools/cli# python setup.py build  
 running build  
 ...  
 writing manifest file 'cloudmonkey.egg-info/SOURCES.txt'  
 root@mgmtserver:~/github/cshv3/tools/cli# python setup.py install  
 running install  
 ...  
 Finished processing dependencies for cloudmonkey==4.2.0-0  
 root@mgmtserver:~/github/cshv3/tools/cli# which cloudmonkey  
 /usr/local/bin/cloudmonkey  


Configure CloudMonkey


As a minimum, CloudMonkey needs the URL for the management server and API keys to authenticate requests to the server. API keys are different from your password / username.  How to obtain API keys is described at 9:07 in this YouTube CloudMonkey overview by DIYCloudComputing.

Also, set CloudMonkey to use JSON output.  The alternative is difficult to parse.

Finally, use sync to tell CloudMonkey to discover the latest API.

These values can be set at the command line.  E.g.
 cloudmonkey set apikey WsiG7tva38gJpl082mBRQEnAic9g_BW15fK5aB4W3ak9GBoBeg0iOz9iGAIJ7eSnHecS1ONffEygi2xTkP4QOw   
 cloudmonkey set secretkey _Ov8DMed8WMWMscWaWX6cCHzF7kWCQU2SVwbQJo4ujL2-ocLdvkC5Mwe0XlrSDZ12ha52ieAtYOJj6viA1SFhQ   
 cloudmonkey set display json   
 cloudmonkey sync  

Now CloudMonkey can make API calls.  E.g.
 root@mgmtserver:~/github/cshv3# cloudmonkey list users  
 {  
  "count": 1,  
  "user": [  
   {  
    "account": "admin",  
    "accountid": "12a8380c-f2e3-11e2-b495-00155db1030e",  
    "accounttype": 1,  
    "apikey": "WsiG7tva38gJpl082mBRQEnAic9g_BW15fK5aB4W3ak9GBoBeg0iOz9iGAIJ7eSnHecS1ONffEygi2xTkP4QOw",  
    "created": "2013-07-22T17:26:25+0100",  
    "domain": "ROOT",  
    "domainid": "12a7d75c-f2e3-11e2-b495-00155db1030e",  
    "email": "admin@mailprovider.com",  
    "firstname": "Admin",  
    "id": "12a8686b-f2e3-11e2-b495-00155db1030e",  
    "iscallerchilddomain": false,  
    "isdefault": true,  
    "lastname": "User",  
    "secretkey": "_Ov8DMed8WMWMscWaWX6cCHzF7kWCQU2SVwbQJo4ujL2-ocLdvkC5Mwe0XlrSDZ12ha52ieAtYOJj6viA1SFhQ",  
    "state": "enabled",  
    "username": "admin"  
   }  
  ]  
 }  

Recreate GUI Commands in CloudMonkey


The parsed log file contains a list of API calls.  Pick out the ones you want to use.  I placed them in a file called myscript.

To make CloudMonkey API calls from the command line, simply prefix each API call with cloudmonkey api.  To save time, you can prepend it to every command using sed:
 sed 's/^/cloudmonkey api /' myscript > myscript2  

The results of one API call will provide the parameters for the next, so we want to be able to capture the results of our CloudMonkey calls.

Simply enclose your commands in backticks (reverse single quotes) and assign the result to a bash variable.  To save time, use sed:
 sed -e 's/^/apiresult=`/; s/$/`/' myscript2 > myscript3  

CloudMonkey's strict case sensitivity rules prevent it from using the log file input as is: it expects all parameter keys to be in lower case.  E.g. the addTrafficType command appears in the log with the parameter trafficType, whereas CloudMonkey expects traffictype (all lower case).

Thus, given a file with the API calls below:
 createZone networktype='"Advanced"' securitygroupenabled='"false"' guestcidraddress='"10.1.1.0/24"' name='"HybridZone"' localstorageenabled='"true"' dns1='"4.4.4.4"' internaldns1='"10.70.176.118"' internaldns2='"10.70.160.66"'  
 createPhysicalNetwork zoneid='"28444ba3-1405-4872-b23c-015cf5116415"' name='"Physical Network 1"' isolationmethods='"VLAN"'  

We would get this result:
 apiresult=`cloudmonkey api createphysicalnetwork zoneid='"28444ba3-1405-4872-b23c-015cf5116415"' name='"Physical Network 1"' isolationmethods='"VLAN"' `  
 apiresult=`cloudmonkey api addtrafficType physicalnetworkid='"8ae03f63-efe9-46ea-9c31-f35164ef3dfc"' traffictype='"Management"' `  
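One way to lower-case the parameter keys automatically is sketched below.  It is meant to run on encoded_api_calls.txt, that is, before the quoting and decoding steps, while every parameter is still a single key=value word without spaces.  The function name lowercase_keys and the sample line are assumptions for illustration; the command name itself is left untouched.

```shell
# Lower-case the key of every key=value word, leaving values and the
# command name (the first word) unchanged.  Reads from stdin.
lowercase_keys() {
  awk '{
    for (i = 2; i <= NF; i++) {
      eq = index($i, "=")                 # position of the first "="
      if (eq > 0)
        $i = tolower(substr($i, 1, eq - 1)) substr($i, eq)
    }
    print
  }'
}

fixed=$(printf '%s\n' 'addTrafficType physicalnetworkid=8ae03f63-efe9-46ea-9c31-f35164ef3dfc trafficType=Management' | lowercase_keys)
printf '%s\n' "$fixed"
```

Splitting at the first equals sign keeps values that themselves contain an equals sign intact.  As a side effect, awk rejoins the words with single spaces.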


Extract results as required


The variable apiresult includes a lot of information not useful for subsequent calls.  E.g.
 root@mgmtserver:~/github/cshv3# apiresult=`cloudmonkey api createZone networktype="Advanced" securitygroupenabled="false" guestcidraddress="10.1.1.0/24" name="HybridZoneA" localstorageenabled="true" dns1="4.4.4.4" internaldns1="10.70.176.118" internaldns2="10.70.160.66"`  
 root@mgmtserver:~/github/cshv3# echo $apiresult  
 { "zone": { "allocationstate": "Disabled", "dhcpprovider": "VirtualRouter", "dns1": "4.4.4.4", "guestcidraddress": "10.1.1.0/24", "id": "2347b5c8-378c-4a7e-9977-818bbba4f7ff", "internaldns1": "10.70.176.118", "internaldns2": "10.70.160.66", "localstorageenabled": true, "name": "HybridZoneA", "networktype": "Advanced", "securitygroupsenabled": false, "zonetoken": "b957e317-d661-30dd-a412-1f76f2736412" } }  

Usually, you will have to add code to extract specific parameters from the result.  For instance, here we extract the identifier of a newly created zone for a createPhysicalNetwork call:
 root@mgmtserver:~/github/cshv3# zoneid=`echo $apiresult | sed -e 's/^.*"id": //; s/,.*$//'`  
 root@mgmtserver:~/github/cshv3# echo $zoneid  
 "2347b5c8-378c-4a7e-9977-818bbba4f7ff"  
 root@mgmtserver:~/github/cshv3# apiresult=`cloudmonkey api createPhysicalNetwork zoneid=$zoneid name='Physical Network 1' isolationmethods='VLAN'`  
In a script, we can then use $zoneid as the value of a parameter.
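If the same extraction is needed for several keys, it can be wrapped in a small helper.  The sketch below uses a hypothetical function json_string_field that also strips the surrounding double quotes.  It only handles string values in simple responses like the one shown above, and it returns the last match when a key occurs more than once; it is not a real JSON parser.

```shell
# Print the string value of key $1 from JSON read on stdin.
json_string_field() {
  # Flatten to one line, then keep only the text between the value's quotes.
  tr '\n' ' ' | sed -n 's/.*"'"$1"'": *"\([^"]*\)".*/\1/p'
}

# The createZone response shown earlier in this post.
apiresult='{ "zone": { "allocationstate": "Disabled", "dhcpprovider": "VirtualRouter", "dns1": "4.4.4.4", "guestcidraddress": "10.1.1.0/24", "id": "2347b5c8-378c-4a7e-9977-818bbba4f7ff", "internaldns1": "10.70.176.118", "internaldns2": "10.70.160.66", "localstorageenabled": true, "name": "HybridZoneA", "networktype": "Advanced", "securitygroupsenabled": false, "zonetoken": "b957e317-d661-30dd-a412-1f76f2736412" } }'

zoneid=$(printf '%s' "$apiresult" | json_string_field id)
zonename=$(printf '%s' "$apiresult" | json_string_field name)
printf '%s %s\n' "$zoneid" "$zonename"
```

Because the value comes back without quotes, it is safest to quote the variable where it is used, e.g. zoneid="$zoneid".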

Difficulties with this Approach:


CloudStack does not log the parameters of POST requests.  Commands such as addHost are recorded as received, but their parameters are not.  You have to refer to the developer's guide to work them out.  This is down to a lack of explicit support for logging incoming commands in CloudStack.

Final Remarks:


Parsing the GUI commands out of the log file is quite complex.  It would be a lot easier if the management server logged API calls in plain text rather than as URL encoded strings, and if commands sent by HTTP POST had their parameters clearly logged.

Parsing JSON encoded text is poorly supported in bash.  CloudMonkey's 'filter' option would avoid this issue if it were available with the api command.  filter tells CloudMonkey to return only the values of a list of keys.    If the filter were available, code to parse the apiresult would not be required. 

CloudMonkey cannot be used with a clean deployment, because CloudStack initially has no API keys.  This issue could be avoided if username / password could be used to authenticate API calls.  Username / password authentication is used for login by the GUI and by tools such as the CloudStack.NET SDK (see the relevant Login method).

Fortunately, developers can disable database encryption and add API keys to the admin user before starting CloudStack.  To disable database encryption, set db.cloud.encryption.type=none in your db.properties file.  This is done automatically by the Maven project that runs Jetty.  E.g.
 root@mgmtserver:~/github/cshv3# grep -R "db\.cloud\.encrypt" * --include=db.properties  
 client/target/generated-webapp/WEB-INF/classes/db.properties:db.cloud.encryption.type=none  
 client/target/generated-webapp/WEB-INF/classes/db.properties:db.cloud.encrypt.secret=  

Next, set the desired API keys in the user table.  E.g.
 mysql --user=root --password="" cloud -e "update user set secret_key='_Ov8DMed8WMWMscWaWX6cCHzF7kWCQU2SVwbQJo4ujL2-ocLdvkC5Mwe0XlrSDZ12ha52ieAtYOJj6viA1SFhQ',api_key='WsiG7tva38gJpl082mBRQEnAic9g_BW15fK5aB4W3ak9GBoBeg0iOz9iGAIJ7eSnHecS1ONffEygi2xTkP4QOw' where id=2;"  
