Oh my god. It's full of code!

Posts tagged “node”

Create an Alexa skill in Node.Js and host it on Heroku

Ok, here we go.

So for the last couple of weeks I’ve been totally enamored with working with Alexa. It’s really fun to have a programming project that can actually ‘interact’ with the real world. Until now I’ve just been using my hacked-together web server with If This Then That commands to trigger events. After posting some of my work on Reddit, I got some encouragement to try and develop an actual Alexa skill, instead of having to piggyback off IFTTT. With some sample code, an awesome guy willing to help a bit along the way, and more than a little caffeine, I decided to give it a shot.

The Approach: Since I already have a decent handle on Node.Js and I was informed there are libraries for working with Alexa, I decided Node would be my language of choice for accomplishing this. I’ll be using alexa-app-server and alexa-app Node.Js libraries. I’ll be using 2 github repos, and hosting the end result on Heroku.

How it’s done: I’ll be honest, this is a reasonably complex project with about a million steps, so I’ll do my best to outline them all, but forgive me if I gloss over a few things. We will be hosting our server code in github, creating a repo for the server, making a sub module for the skill, and deploying it all to Heroku. Let’s get started. First off, go and get yourself a github account and a Heroku account. If you haven’t used git yet, get that installed. Also install the heroku toolbelt (which may come with git, I can’t quite remember). Of course you’ll also need node.js and Node Package Manager (NPM), which odds are you already have.

If you don’t want to create all the code and such from scratch and want to just start with a functioning app, feel free to clone my test app https://github.com/Kenji776/AlexaServer1.git

Create a new directory for your project on your local machine. Then head on over to github and create yourself a new project/repo. Call it AlexaServer or something similar. Go ahead and make it a public repo. Do not initialize it with a readme at this time. This is the repo where the core server code will live. It’s important to think of the server code as a separate component from each individual skill; they are distinct things. You should see this screen.

repo1

Open a command prompt and change to the directory you created for your project. Enter the commands shown in the first section for creating a new repository. Once those commands are entered, if you refresh the screen you should see your readme file there with the contents shown like this.

github1

Okay, now we are ready to get the Alexa-Server app, https://www.npmjs.com/package/alexa-app-server is what you are looking for. In your project directory type in

“npm install alexa-app-server”

This will take a few moments but should complete without any problems. In your project folder you’ll want to create a folder called apps. This is where each individual skill will live. We will cover that later. Now you’ll need to create your actual server file. Create a file called server.js. Put this in there.

'use strict';

var AlexaAppServer = require( 'alexa-app-server' );

var server = new AlexaAppServer( {
	httpsEnabled: false,
	port: process.env.PORT || 80
} );

server.start();

Pretty simple code overall. The weird bit of code in the port section is for Heroku (they give your app a port to use when hosted). If not on Heroku then it will default to using port 80. Now you need to create your Procfile. This is going to tell Heroku what to do when it tries to run your program. It should be a file named Procfile in the same directory with no file extension. The contents of which are simply

“web: node server.js”

without quotes. We will also want to create a package.json file. So again in your project directory run the command

npm init

This will run a script and it will ask you a few questions. Answer them and your package.json file should get generated. Go ahead and push this all into Github using the following command sequence.

git add .
git commit -m "added server.js and Procfile, along with alexa-app-server-dependency"
git push origin master

If you view your Github repo online you should see all your files there. It should look something like this.

first repo push

 

You can see all of our files got pushed in. Now with the server setup, it’s time to create our skill. Create another GitHub repo. Call this whatever you like, hopefully something descriptive of the skill you are making. In your command prompt get into the apps directory within your main project. In there create another folder with a name same or similar to your new GitHub repo. Follow the same steps as before to initialize the repo and do the initial commit/push. Now we are going to indicate to GitHub that the apps/test-skill folder is actually a sub module so any dependencies and such will be maintained within itself and not within the project root. To do this navigate to the root project folder and enter.

git submodule add https://github.com/Kenji776/AlexaTestSkill.git apps/test-skill

Replacing the github project url with the one for your skill, and the apps/test-skill with apps/whatever-your-skill-is-named. Now Git knows that this folder is a submodule, but NPM doesn’t know that. If you try and install anything using NPM for this skill it’s going to toss it into the root directory of the project. So we generate a package.json file for this skill and then NPM knows that this skill is a stand alone thing. So run

npm init

Again and go through all the questions again. This should generate a package.json file for your skill. Now we are ready to install the actual alexa-app package. So run…

npm install alexa-app --save

and you should see that the skill now has its own node_modules folder, in which is contained the alexa-app dependency. Because of the --save flag, NPM also records the new dependency in the skill’s package.json for you, so there is no need to regenerate that file by hand. Now it’s time to make our skill DO something. In your skill folder create a file called index.js. Just to get started as a ‘hello world’ kind of app, plug this into that file.

'use strict';
module.change_code = 1;

var alexa = require( 'alexa-app' );
var app = new alexa.app( 'test-skill' );


app.launch( function( request, response ) {
	response.say( 'Welcome to your test skill' ).reprompt( 'Way to go. You got it to run. Bad ass.' ).shouldEndSession( false );
} );


app.error = function( exception, request, response ) {
	console.log(exception);
	console.log(request);
	console.log(response);
	response.say( 'Sorry an error occurred ' + exception.message);
};

app.intent('sayNumber',
  {
    "slots":{"number":"NUMBER"}
	,"utterances":[ 
		"say the number {1-100|number}",
		"give me the number {1-100|number}",
		"tell me the number {1-100|number}",
		"I want to hear you say the number {1-100|number}"]
  },
  function(request,response) {
    var number = request.slot('number');
    response.say("You asked for the number "+number);
  }
);

module.exports = app;

Now, if you aren’t familiar with how Alexa works, I’ll try and break it down for you real quick to explain what’s happening with this code. Every action a skill can perform is called an intent. It is called this because ideally there are many ways a person might trigger that function. They might say “say the number {number}” or they might say “give me the number {number}” or many other variations, but they all have the same intent. So hence the name. Your intent should account for most of the common ways a user might try to invoke your functions. You do this by creating utterances. Each utterance represents something a user might say to make that function run. Slots are variables. You define the potential variables using slots, then use them in your utterances. There are different types of data a slot can contain, or you can create your own. Check out https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/alexa-skills-kit-interaction-model-reference for more information on slot types. I’m still figuring them out myself a bit. So the intent is triggered by saying one of the utterances. The slot is populated with the number the person says, and after reading the variable from the request, it is read back to the user.
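To make the intent and slot idea a bit more concrete, here is a rough sketch of the JSON that the Alexa service posts to a skill when someone says "say the number 10", and how a slot value can be pulled out of it. The alexa-app library hides all of this behind request.slot() and response.say(); the readSlot and buildSpeechResponse helpers below are made-up names for illustration, not part of the library, and the sample request is trimmed to the fields that matter here.

```javascript
'use strict';

// A trimmed-down IntentRequest body, as the Alexa service would POST it
// to the skill endpoint. Real requests carry more fields (session, timestamps).
var sampleRequest = {
  version: '1.0',
  request: {
    type: 'IntentRequest',
    intent: {
      name: 'sayNumber',
      slots: {
        number: { name: 'number', value: '10' }
      }
    }
  }
};

// Hypothetical helper: roughly what request.slot('number') does for you.
// Returns null when the slot is missing or was not filled.
function readSlot(body, slotName) {
  var intent = body.request && body.request.intent;
  var slot = intent && intent.slots && intent.slots[slotName];
  return slot && slot.value !== undefined ? slot.value : null;
}

// Hypothetical helper: a plain-text response in the shape Alexa expects back.
function buildSpeechResponse(text, endSession) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: text },
      shouldEndSession: endSession
    }
  };
}

var number = readSlot(sampleRequest, 'number');
var reply = buildSpeechResponse('You asked for the number ' + number, true);
console.log(JSON.stringify(reply, null, 2));
```

Note that slot values arrive as strings, so anything numeric needs converting before doing math with it.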

So now, believe it or not, your skill is usable. You can test it by starting your server. Again in your command line shell within the root project directory type

node server.js

A console window should show up that looks like this.

skill running

Now in your browser you should be able to bring up an emulator/debugger page by heading to

http://yourserver/alexa/your-skill-name       (EX: http://localhost/alexa/test-skill)

You should get a page that looks like this.

emulator

Holy cow, we did it! We have a functioning skill. The only problem now is that there is no way for Alexa to ‘get to it’. As in, Alexa doesn’t know about this skill, so there is no way for echo to route requests to it (also, I find it mildly creepy that when referring to Alexa, I keep accidentally wanting to type ‘her’. Ugh). So now we have to put this skill somewhere where it’s accessible. That is where Heroku is going to come in. Sure we could host this on our local machine, setup port forwarding, handle the SSL certs ourselves, but who wants to have their skill always running on their local box? I’ll do a bit on how to do that later since it requires creating a self signed SSL certificate using openSSL and such, but for now I’m going to focus on getting this sucker into a cloud host.

Oh by the way, once everything is working smoothly, you should commit and push it all into github. Remember, you’ll have to do a separate commit and push for your server, and for the skill submodule since they are officially different things as far as github is concerned, even though the skill is a sub directory of your server. Also, I’m still learning exactly how submodules work, but if for some reason it doesn’t seem like your submodule is updating properly you can navigate to the project root folder and try this series of commands.

git submodule foreach git pull origin master
git add .
git commit -m "updated submodule"
git push origin master

with some luck that should do it. You’ll know it’s all working right when you open up your server repo in github, click the apps folder and you see a greyish folder icon for the name of your skill. Clicking it should show you the skill with all the files in there. Like this.

submodule

Now, on to Heroku. This part is pretty easy comparatively. Head on over to Heroku and create a new app. As usual, name it whatever you like, preferably something descriptive. Next you’ll be in the deploy area. You’ll be able to specify where the code comes from. Since we are smart and we used github, you can actually just link this heroku project straight to your github project.

heroku connect

Just like that! However, this automatic integration does not seem to include sub modules, so we will have to add our skill submodule from the command line. So again navigate to your project folder from the shell, and run this command to link your directory with the heroku project.

heroku git:remote -a alexatestapp

Of course replacing alexatestapp with whatever the name of your heroku app is. You should get back a response saying that the heroku remote has been added. Now we need to push it all into Heroku (which in retrospect may have made the above linking done from the website un-needed, but whatever. It doesn’t seem to hurt anything). So now run

git push heroku master

You should be treated with a whole bunch of console output about compressing, creating runtimes, resolving dependencies, blah blah blah. Now with some luck your app should work via your hosted Heroku app. Just like when hosted locally we can access the emulator/debugger by going to

https://alexatestapp.herokuapp.com/alexa/test-skill

Replacing the domain with your heroku app, and the skill with whatever your skill is named. If all has gone well it should operate just like it did on your local machine. We are now finally ready to let Amazon and Echo know about our new skill. So head on over to https://developer.amazon.com/edw/home.html#/skills/list (you’ll need to sign up for a developer account if you do not have one) and click the add new skill button.

app create 1

On the next page it’s going to ask you for your intent schema. This is just a JSON representation of your intents and their slots. Thankfully that handy debug page generates that for you. So it’s just copy paste the intent schema from the debug page into that box. It will ask you about custom slot types, and odds are if you are making a more complicated app you are going to want to make your own, but really all that is is creating a name for your data type, common values for it (no it doesn’t have to be every value) and saving it. Spend some time reading the docs on custom slot types if you need to. It’s also going to want sample utterances, which again that debug page generates for you, so again, just copy paste. You might get a few odd errors here about formatting, or wrong slot types. Just do your best to work through them.
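For reference, the intent schema that box wants is a small JSON document listing each intent and its slots. The sketch below shows how the sayNumber definition from earlier maps onto that shape. buildIntentSchema is a made-up helper for illustration and is not how alexa-app generates it internally, so treat the debug page output as the source of truth.

```javascript
'use strict';

// Intent definitions in the same shape passed to app.intent() earlier.
var intents = {
  sayNumber: { slots: { number: 'NUMBER' } }
};

// Hypothetical helper: turn those definitions into intent schema JSON.
function buildIntentSchema(definitions) {
  return {
    intents: Object.keys(definitions).map(function (name) {
      var slots = definitions[name].slots || {};
      return {
        intent: name,
        slots: Object.keys(slots).map(function (slotName) {
          return { name: slotName, type: slots[slotName] };
        })
      };
    })
  };
}

var schema = buildIntentSchema(intents);
console.log(JSON.stringify(schema, null, 2));
```

An intent with no slots simply ends up with an empty slots list.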

model

 

Next up is setting up the HTTPS. Thankfully Heroku handles that for us, so we don’t have to worry about it! Just select

“My development endpoint is a subdomain of a domain that has a wildcard certificate from a certificate authority”

Next up is account linking. I haven’t figured that one out yet, still working on it, so for now we just have to say no. Next up is the test screen. If everything has gone well up to this point, it should work just peachy for ya.

testsay

Next you’ll be asked to input some meta data about your app. Description, keywords, logo, icon, etc. Just fill it out and hit next. Finally you’ll be asked about privacy info. For now you can just say no to everything, and come back to update that when you’ve got your million dollar app ready to go. Hit save and you should be just about ready to test through your Echo. If you open your Alexa app on your phone and look for your skill by name, you should find it and it should be automatically enabled. You should now be able to go to your echo, use the wake word, speak the invocation name and ask for a number. If all has gone well you’ll hear the wonderful sound of Alexa reading back your number to you. For example

“Alexa Ask Test Get My Number”

Alexa should respond with what you defined as the app start message in your skill. You can now say

“Say the number 10”

Which should then be repeated back to you. Congrats. You just created a cloud hosted, Node.Js Alexa Skill that is now ready for you to make do something awesome. Get out there and blow my mind. Good luck!


Making Amazon Echo/Alexa order me a pizza

UPDATE! I’ve started a github project for my server software. Still very alpha but you can check it out here: https://github.com/Kenji776/AlexaHomeHub

If you follow my blog, you might have caught my post yesterday about how I bought an Amazon Echo device, and have begun creating my own custom actions for it. I started small, simply making it call out to my web server by using If This Then That (IFTTT – a website that allows you to connect and integrate different services). I got it to connect to my door lock and unlock service I had written, and even got it to chromecast specific pre-setup videos to one of my TVs using a command line tool. Feeling somewhat confident I decided it was time to take on something a little more in depth, but oh so worth it. I was going to make Alexa order me a pizza.

If you are an Echo/Alexa user you might know that there is already support for ordering a pizza, but only from Dominos. It uses some system of having a saved order, then tweeting a specific bit of info at the Dominos twitter account that is tied to your order, which then places it. This has a few drawbacks. The primary one being that it orders you a Dominos pizza (sorry guys, in all fairness Dominos has gotten a lot better in recent years). Also it requires twitter integration, and as far as I know only supports one order that you have saved (I could be wrong). Using a saved order was a good idea, as it streamlines and simplifies the ordering process quite nicely. I wanted to do something like this, but instead I wanted Sarpinos pizza, and I wanted to be able to pick from several different pre-created orders. Using my knowledge of browser automation that I picked up from my door lock/unlock project and my existing web server, I got to work.

First off, I had to figure out all the things that needed to happen. Off the bat I knew I’d be using their online ordering interface. They don’t have an API, so I knew I’d have to be automating browser interactions. Next I had to break down the process of ordering the pizza online step by step, all the HTML elements involved and how to interact with them. Then I would be able to automate those interactions using the Selenium library. So I went through the process like a normal person and created this list. At each step I inspected the HTML elements involved and recorded them so I could figure out how to identify them and interact with them later. I created an order and saved it as a favorite so next time I’d come back in I would be prompted if I’d like to order that again. From there I was able to create the following list of things I knew needed to get done.

Order Steps:

1) Invoke: https://order.gosarpinos.com/Login/

2) Wait For Load

3) On load populate credential fields:
	- <input class="text-box single-line valid" data-val="true" data-val-required="Email is required" id="Email" name="Email" type="email" value="" >
	- <input class="text-box single-line password valid" data-val="true" data-val-required="Password is required" id="Password" name="Password" type="password" value="" >

4) Click Login button
	- <input type="submit" value="Login" class="ui-button ui-widget ui-state-default ui-corner-all" role="button" aria-disabled="false">
	
5) Wait For Page Load

6) Find button with provided favourite id (428388)
	- <button class="wcFavoriteSelectFavoriteButton ui-button ui-widget ui-state-default ui-corner-all ui-button-text-only" data-favorite-id="428388" id="wiSelectFavorite428388" type="button" role="button" aria-disabled="false">

7) Wait For Delivery Popup to load
	- 

 

A fair number of steps, but none of them super complicated. I knew I’d have to learn a bit more about Selenium as this interaction was definitely more complicated than the door lock code, and that one was already seemingly over complicated. Thankfully I did, and found out that in my previous attempt I had been unknowingly mixing synchronous and async styles, hence the perceived complexity (I thought driver.wait() was an async method and you put everything that depended on it inside. Turns out it behaves synchronously: the commands after it don’t run until the condition inside is true. No wonder it was acting a little funny). I also knew that since I was going to be passing in a fair amount of data (username, password, order id, credit card info, etc) I should probably define an object which would have all the required properties, then just pass JSON into my web service that mirrored that object. This is what I came up with.

{
"action":"pizza",
"username":"xxxxxxxx@xxxx.com",
"password":"xxxxx",
"orderId":"428388",
"ccNumber":"xxxxxxxxxxxxxxxxxx",
"ccExpMonth":"03",
"ccExpYear":"2018",
"ccv":"xxxxx",
"tip":"5.00",
"ccZip":"xxxxx"
}

 

Obviously the sensitive values are blacked out, but you get the gist. My webserver is already primed to look for post requests that have a JSON payload. The ‘router’ code looks at the ‘action’ attribute to figure out what function to send the payload to. I created a new ‘pizza’ action and related function. Here is that function.

function orderPizza(orderObject,callback)
{
	var orderResult = new Object();
	
	//create instance of selenium web driver
	var driver = new webdriver.Builder().
	withCapabilities(webdriver.Capabilities.chrome()).
	build();

		
	//request the login page
	driver.get('https://order.gosarpinos.com/Login');
	
	driver.findElement(webdriver.By.name('Email')).sendKeys(orderObject.username);
	driver.findElement(webdriver.By.name('Password')).sendKeys(orderObject.password);
	
	//find and click submit button
	driver.findElement(webdriver.By.css("input[type='submit']")).submit();

	console.log('Logged in as: ' + orderObject.username);

	//wait for order page to load
	driver.wait(function() {
		return driver.getTitle();
	},5000);
	
	console.log('Attempting To Choose Favorite Order With Id: ' + orderObject.orderId);
		
	//have to wait until the proper order button appears since it's in a dialog. If it isn't found after 5 seconds, fail. Otherwise click the corresponding order button
	//the favorite order button has an attribute 'type' of 'button' and a 'data-favorite-id' attribute with the id of that order
	
	driver.wait(function () {
		return driver.findElement(webdriver.By.css("div[aria-describedby='wiFavoriteListDialog']")).isDisplayed();
	}, 5000);
	
	
	driver.wait(function () {
		return driver.findElement(webdriver.By.css("button[data-favorite-id='"+orderObject.orderId+"']")).isDisplayed();
	}, 5000);
	
	driver.findElement(webdriver.By.css("button[data-favorite-id='"+orderObject.orderId+"']")).click();
			
	//after the button above is clicked, that dialog closes and another one opens. This one asks the user to select delivery or pickup. We want delivery
	//the delivery button has an attribute with a 'data-type' of 'WBD' and an attribute 'role' of 'button'
	driver.wait(function () {
		return driver.findElement(webdriver.By.css("button[data-type='WBD'][role='button']")).isDisplayed();
	}, 5000);
	
	driver.findElement(webdriver.By.css("button[data-type='WBD'][role='button']")).click();
	
	//ensure the checkout button is visible
	driver.wait(function () {
		return driver.findElement(webdriver.By.css("button[id='wiLayoutColumnGuestcheckCheckoutBottom'][role='button']")).isDisplayed();
	}, 5000);
	
	//hacky method to ensure that the modal dialog should now be gone and we can click the checkout button
	driver.sleep(2000);
	
	driver.findElement(webdriver.By.css("button[id='wiLayoutColumnGuestcheckCheckoutBottom'][role='button']")).click(); 
	
	//then the browser will move to the order screen. Once it loads we have to populate the order field data

	//wait for payment page to load
	driver.wait(function() {
		return driver.getTitle();
	},5000);	

	//wait until the pay by credit card button shows up.
	driver.wait(function () {
		return driver.findElement(webdriver.By.id("wiCheckoutPaymentCreditCard")).isDisplayed();
	}, 5000);

	//check the pay by credit card radio button
	driver.findElement(webdriver.By.id("wiCheckoutPaymentCreditCard")).click(); 
	
	//wait until the credit card number box shows up
	driver.wait(function () {
		return driver.findElement(webdriver.By.id("Payment_CCNumber")).isDisplayed();
	}, 5000);
	
	//populate the form fields
	driver.findElement(webdriver.By.name('Payment_CCNumber')).sendKeys(orderObject.ccNumber);

	//stupid jQuery ui selects are impossible to set with normal selenium since the original select is hidden. So use an execute script to set em.
	driver.executeScript("$('#Payment_ExpMonth').val("+orderObject.ccExpMonth+");");

	driver.executeScript("$('#Payment_ExpYear').val("+orderObject.ccExpYear+");");
		
	driver.findElement(webdriver.By.name('Payment_CVV')).sendKeys(orderObject.ccv);
	//driver.findElement(webdriver.By.name('Payment_CCTip')).sendKeys(parseFloat(orderObject.tip));
	driver.findElement(webdriver.By.name('Payment_AVSZip')).sendKeys(orderObject.ccZip);
	
	driver.executeScript("$('#Payment_CCTip').val("+parseFloat(orderObject.tip)+");");
	
	driver.findElement(webdriver.By.id('wiCheckoutButtonNext')).click();

	//wait for confirmation page to load.
	driver.wait(function() {
		return driver.getTitle();
	},5000);

	
	driver.findElement(webdriver.By.id("wiPlaceOrderNow")).click();	

	driver.wait(function() {
		return driver.getTitle();
	},5000).then(function(){
		console.log('Ordering complete!');
		orderResult.success = true;
		orderResult.message = 'Order Placed Successfully';
		callback(orderResult);			
	});
	

}

 

Now if that code seems a little dense or confusing, don’t feel bad. It took me several hours of trial and error to figure it out, especially when it came to setting the select list values, and getting the script to wait while various elements were created and destroyed by the page. Selenium has this super fun behavior where if you ever try and reference an element that doesn’t exist, the whole script goes down in flames. In response to that I made my code very ‘defensive’, frequently checking that required elements exist before attempting to interact with them.
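The wait-then-click pattern repeats many times in orderPizza, so it could be folded into one helper. The sketch below is written in the explicit promise style that newer selenium-webdriver releases expect, where each call is awaited, rather than the queued style used above. waitAndClick is a hypothetical helper name, and the driver is passed in so the function does not care how it was built. It leans on findElements (plural), which resolves to an empty list instead of throwing when nothing matches.

```javascript
'use strict';

// Hypothetical helper: wait until an element matching `locator` exists and
// is visible, then click it. Rejects if that does not happen within timeoutMs.
async function waitAndClick(driver, locator, timeoutMs) {
  await driver.wait(async function () {
    // An empty list means "not there yet", so keep polling.
    var matches = await driver.findElements(locator);
    return matches.length > 0 && (await matches[0].isDisplayed());
  }, timeoutMs);

  // The element is known to exist now, so findElement will not blow up.
  var element = await driver.findElement(locator);
  await element.click();
}
```

With something like this, the favorite order step could shrink to a single call along the lines of waitAndClick(driver, webdriver.By.css("button[data-favorite-id='" + orderObject.orderId + "']"), 5000).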

With the script created and integrated into my web server ‘router’ I was ready to get IFTTT to invoke it. Once again it was as simple as creating a new recipe with Alexa as the If and the Maker make a web request feature as the do.

pizza 1

pizza 2

You can see that with the combination of specific phrases and the fact that you can have multiple saved orders, it would be easy to setup many different possibilities. My roommate is even going to create his own IFTTT account and link it to my Alexa. Then he can create his own orders, specify his own credit card information in the JSON payload, and order whenever he wants using the same device but have his own information. The next step I think is to encrypt the JSON payloads which contain the credit card info and then decrypt them when they arrive at my server. That way I’m not storing my CC info in plain text anywhere, which right now is a bit of a concern. This was mostly just proof of concept stuff last night, but I was too excited not to share this as soon as I could, so some of the ‘polish’ features are missing but overall I think it’s a damn good start. Now if you’ll excuse me, I’m going to get myself a pizza.

Update: Adding encryption was pretty easy. First I got some encrypt and decrypt functions set up. Like this.

// Nodejs encryption with CTR
var crypto = require('crypto'),
    algorithm = 'aes-256-ctr',
    password = 'xxxxxxxxxxxxxxxxxxxxxxxxxx';

function encrypt(text){
  var cipher = crypto.createCipher(algorithm,password)
  var crypted = cipher.update(text,'utf8','hex')
  crypted += cipher.final('hex');
  return crypted;
}
 
function decrypt(text){
  var decipher = crypto.createDecipher(algorithm,password)
  var dec = decipher.update(text,'hex','utf8')
  dec += decipher.final('utf8');
  return dec;
}

Then update my incoming data object so that the encrypted data was in its own property so I could still tell what kind of request it was without having to decrypt the payload first (since my webserver supports other, unencrypted calls).

"action":"pizza",
"data":"f613f8f479bad299bdfedf [rest of encrypted string omitted]",
"encrypted":true

 

Then just had to change my ‘router’ to decrypt the incoming data if encryption was detected.

else if(action == 'pizza')
{
	//pizza request contains encrypted info. Decrypt and send to function
	var pizzaRequestData = new Object();
	
	if(parsedContent.encrypted)
	{
		console.log('Encrypted Payload Detected. Decrypting Contained Data');
		
		pizzaRequestData = JSON.parse(decrypt(parsedContent.data));
		
		console.log('Decryption complete');

		responseObject.message = 'Ordering Pizza!';
		
		console.log(responseObject.message);
		//because order pizza is async the result data comes in a callback
		orderPizza(pizzaRequestData,function(data){
			responseObject.pizzaRequest = data;
			console.log(data);
			sendResponse(response,responseObject);
			return;
		});
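	}
	else
	{
		console.log('Un-encrypted pizza order detected. Skipping');
		//still answer the caller so the request does not hang
		responseObject.message = 'Un-encrypted pizza order detected. Skipping';
		sendResponse(response,responseObject);
	}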
}
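The branch above is one arm of an if/else chain keyed on the action attribute. As the number of actions grows, the same dispatch can be done with a lookup table. This is only a sketch of that idea: createRouter and the stand-in pizza handler are hypothetical names, and the handlers are assumed to take (payload, callback) the way orderPizza does.

```javascript
'use strict';

// Hypothetical router factory: `handlers` maps an action name to a function
// taking (payload, callback), the same calling style as orderPizza.
function createRouter(handlers) {
  return function route(parsedContent, callback) {
    var action = parsedContent.action;
    // hasOwnProperty guards against inherited names such as "toString".
    if (!Object.prototype.hasOwnProperty.call(handlers, action)) {
      callback({ success: false, message: 'No method defined with name: ' + action });
      return;
    }
    handlers[action](parsedContent, callback);
  };
}

// Example wiring with a stand-in handler instead of the real orderPizza.
var route = createRouter({
  pizza: function (payload, callback) {
    callback({ success: true, message: 'Would order favorite ' + payload.orderId });
  }
});

route({ action: 'pizza', orderId: '428388' }, function (result) {
  console.log(result.message);
});
```

Adding a new action then means adding one entry to the table rather than another else-if block.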

After that I just had to use the encrypt function to generate an encrypted version of my pizza request data, update the IFTTT recipe with the new request and that’s it! Now my CC information is safely encrypted and I don’t really have to worry about it getting intercepted. Yay security.
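One caveat on the encrypt and decrypt functions above: crypto.createCipher is deprecated in newer Node.js releases, and because it derives both the key and the IV from the password alone, every message encrypted in CTR mode with the same password reuses the same keystream. A sketch of the same idea using crypto.createCipheriv with a random salt and IV per message is below. The salt:iv:ciphertext layout and the scrypt key derivation are choices made for this example, not anything the original server used, and CTR mode on its own still does not detect tampering.

```javascript
'use strict';
var crypto = require('crypto');

var ALGORITHM = 'aes-256-ctr';

// Derive a 32 byte key from the password; the salt is stored with the message.
function deriveKey(password, salt) {
  return crypto.scryptSync(password, salt, 32);
}

// Returns "salt:iv:ciphertext", each part hex encoded.
function encrypt(text, password) {
  var salt = crypto.randomBytes(16);
  var iv = crypto.randomBytes(16); // fresh IV for every message
  var cipher = crypto.createCipheriv(ALGORITHM, deriveKey(password, salt), iv);
  var encrypted = Buffer.concat([cipher.update(text, 'utf8'), cipher.final()]);
  return [salt, iv, encrypted].map(function (part) {
    return part.toString('hex');
  }).join(':');
}

// Reverses encrypt(): split the parts, re-derive the key, decrypt.
function decrypt(payload, password) {
  var parts = payload.split(':').map(function (part) {
    return Buffer.from(part, 'hex');
  });
  var decipher = crypto.createDecipheriv(ALGORITHM, deriveKey(password, parts[0]), parts[1]);
  return Buffer.concat([decipher.update(parts[2]), decipher.final()]).toString('utf8');
}
```

Because the salt and IV are random, encrypting the same order twice produces two different strings, and both decrypt back to the same JSON.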


Even more Automation with Tasker and Geofencing

So if you saw my last blog, you know I spent some time writing a small application that would log into my home security provider’s website, then lock and unlock my door on a timed schedule. If you haven’t read it, check it out at https://iwritecrappycode.wordpress.com/2015/10/21/automating-things-that-should-already-be-automated-with-selenium-and-node-js to get some context. While that was great and all, I figured I could do more with it. I mean timed schedules are great, but I do occasionally leave the house so you know what would be really cool? If my door automatically locked when I left, and unlocked itself again when I returned home. I figured it should be a reasonably simple process to make my script respond to remote commands, and trigger my phone to send such commands when leaving and entering an area. Turns out I was mostly right. At its core it’s pretty easy, but there are a lot of moving parts and places to make mistakes, which I did make plenty of (it’s crazy how often as a programmer I type the wrong numbers in somewhere).

So obviously the first part of this challenge was going to be to setup my script to respond somehow to external requests. Being written in Node.js it made sense that it respond to web requests. I figured the requester should provide the username and password to login to the security website with, as well as a desired action of either lock or unlock. I decided to make it respond to POST requests just to reduce the possibility of some web spider hitting it and by some fluke making it do something (I also put the service on a different port, but we’ll cover that later). So the service listens for post requests with some JSON encoded data in the request body and then makes the request to alarm.com. The code for that method is as follows.

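var http = require('http');

//port to listen on. Placeholder value (assumed), use whatever custom port you forward to this machine
var PORT = 8080;

//create a server
var server = http.createServer( function(request, response) {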

    console.log('Request received from ' + request.connection.remoteAddress);

	var responseObject = new Object();
	responseObject.message = 'Waiting';
	responseObject.status = 'OK';

	//listen for POST requests
    if (request.method == 'POST') {

		console.log("POST Request Received");

		//when we get the data from the post request
        request.on('data', function (data)
		{
			try
			{
				console.log('Raw Data: ' + data);
				
				//parse content in the body of the request. It should be JSON
				var parsedContent = JSON.parse(data);
				
				//read some variables from the parsed json
				var action = parsedContent['action'];
				var password = parsedContent['password'];
				var username = parsedContent['username'];

				responseObject.action = action;


				if(action == 'lock')
				{
					responseObject.message = 'Sent Lock Request';
					console.log('Locking Door!');
					
					//because toggledoor is async the result data comes in a callback
					toggleDoor(username,password,true,function(data){
						responseObject.lockRequestData = data;
						sendResponse(response,responseObject);
						return;
					});

				}
				else if(action == 'unlock')
				{
					responseObject.message = 'Sent Unlock Request';
					console.log('Unlocking Door!');
					//because toggledoor is async the result data comes in a callback
					toggleDoor(username,password,false,function(data){
						responseObject.lockRequestData = data;
						sendResponse(response,responseObject);
						return;
					});

				}
				else
				{
					console.log('Invalid Post Action ' + action);
					responseObject.message = 'No method defined with name: ' + action;
					sendResponse(response,responseObject);
				}
			}
			catch(exception)
			{
				responseObject.message = 'Error: ' + exception.message;
				console.log(exception);
				sendResponse(response,responseObject);
			}
		});
    }
	else
	{
		console.log('Non post request. Ignoring.');
		responseObject.message = 'Please use post request';
		sendResponse(response,responseObject);
	}


}).listen(PORT, function(){
    //Callback triggered when server is successfully listening. Hurray!
    console.log("Server listening on: http://localhost:%s", PORT);
});
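One thing worth knowing about the handler above: Node can deliver a request body in more than one 'data' event, so parsing inside the 'data' callback only works when the whole JSON payload arrives in a single chunk (which is likely for tiny payloads like this one). A minimal sketch of a more defensive approach is below. The helper name readJsonBody is hypothetical and not part of my script; it just collects every chunk and parses once the 'end' event fires.

```javascript
// Hypothetical helper (not in the original script): gather all body chunks,
// then parse the JSON once the request has finished arriving.
function readJsonBody(request, callback) {
    var chunks = [];

    // 'data' may fire several times for one request
    request.on('data', function (chunk) {
        chunks.push(Buffer.from(chunk));
    });

    // 'end' fires once, after the last chunk
    request.on('end', function () {
        var parsed;
        try {
            parsed = JSON.parse(Buffer.concat(chunks).toString('utf8'));
        } catch (exception) {
            callback(exception);
            return;
        }
        callback(null, parsed);
    });
}
```

Inside the server callback this would be used as readJsonBody(request, function(error, parsedContent) { ... }), with the lock/unlock branching moved into that callback.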

So now with a server that listens, I need to allow requests from the outside world to come in and actually get to my server. This is where my old days of networking came in handy. First I decided I’d like to have a domain name to reach my server instead of just an IP address (maybe I can set up dynamic DNS later so if my home IP address changes, my registrar is automatically updated with the new one and will change my zone file records). I know this is more of a programming blog, but just a basic bit of networking is required. First I headed over to GoDaddy and grabbed myself a cheap domain name. Then I changed the DNS settings to point the @ record at my home IP address.

Simple DNS Config

Now going to my domain would send traffic to my home, but of course it would be stopped at my firewall. So I had to set up port forwarding in my router. This is where I was able to change from using the web traffic port 80 to my custom port to help obscure the fact that I am running a web server (which also meant changing the listening port in the Node script). Of course I decided to turn off DHCP on the machine hosting this and give it a static IP address. I would have used a reservation but my crappy router doesn’t have that option. Lame.

port config

After fiddling with the settings for a bit, my service was now responding to outside traffic. The only bummer part is that my domain name does not work within my network because again my router is crappy and doesn’t support NAT loopback, so it doesn’t know how to route the request to the right machine internally. I could modify my hosts file or set up my own DNS server but it’s not worth it seeing as this service is only useful when I’m outside my own network anyway.
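For checking the service from outside the network, something like the following rough sketch could send the same kind of POST the server expects (a JSON body with action, username and password). The host name, port, and both function names here are placeholders I made up for illustration, so swap in your own domain and forwarded port.

```javascript
var http = require('http');

// Placeholder values, not my real setup: use your own domain and forwarded port.
var SERVICE_HOST = 'example.com';
var SERVICE_PORT = 8080;

// Build the request options and JSON body for a lock or unlock call.
function buildDoorRequest(action, username, password) {
    var body = JSON.stringify({
        action: action,
        username: username,
        password: password
    });

    return {
        options: {
            hostname: SERVICE_HOST,
            port: SERVICE_PORT,
            path: '/',
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Content-Length': Buffer.byteLength(body)
            }
        },
        body: body
    };
}

// Send the request and hand the raw response text to the callback.
function sendDoorRequest(action, username, password, callback) {
    var prepared = buildDoorRequest(action, username, password);

    var clientRequest = http.request(prepared.options, function (response) {
        var text = '';
        response.setEncoding('utf8');
        response.on('data', function (chunk) { text += chunk; });
        response.on('end', function () { callback(null, text); });
    });

    clientRequest.on('error', callback);
    clientRequest.end(prepared.body);
}
```

Calling sendDoorRequest('lock', 'myUser', 'myPassword', function(error, text) { console.log(error || text); }) from a machine outside the home network should print whatever JSON the server sends back.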

Now the trickiest part: I needed to find a way to make my phone detect when I entered or left a given geographic region and, when it did, send the specially crafted POST request to my server. I knew of two options off the top of my head that I had heard of but never played with. The first one I tried was a service called ‘If this then that’ or IFTTT for short. While the website and app are very sleek and they make configuration of these rules very easy, there was one problem. It just didn’t work. No matter what I tried or what recipes I configured, nothing worked. Not even when triggered manually would it send the POST request. After playing with that for a bit, I decided to give up and give the other app I had heard of a shot. Tasker.

So if IFTTT is the Mac of task automation (sleek, easy, unable to do the most basic things), then Tasker is Linux. It’s extremely powerful, flexible, a bit difficult to understand and not much to look at. It does however have all the features I needed to finish my project. I ended up buying both Tasker (it’s like 3 bucks) and a plugin for it called AutoLocation. You see, Tasker by itself is fairly powerful but it allows for additional plugins to perform other actions and gather other kinds of data. AutoLocation allowed me to easily configure a ‘geofence’, basically a geographic barrier that triggers actions when you pass through it. So I configured my geofence and then imported that config data into Tasker. Then it was simply a matter of creating two profiles. One for entering the area and one for leaving it.

GeoFence Tasker Profiles

I also added a simple vibrate rule so that my phone will buzz when either of the rules trigger so I know the command was sent. Later that night when I headed off to the store, that little buzz was the sweet vibration of success. I may or may not have yelled for joy in my car and frightened other motorists. I hope perhaps this post might inspire you to create your own crazy automation service. With the combination of Selenium, Node, Tasker, and a bit of networking know-how it’s possible to create all kinds of cool things. If you’d like to download the source code for my auto lock program you can grab it below (I don’t really feel like making a git project for something so small. Also I realize it’s not the best quality code in the world, it was meant to be a simple script not a portfolio demo).

Download AutoLock Source

Till next time!
-Kenji


Automating Things That Should Already Be Automated With Selenium and Node.js

So I had invited my parents over a while back, and it came up that I don’t often lock my front door. Of course being good parents they chided me about it, saying that I should really do so. I really have no excuse because I even have an app that allows me to do it remotely (yay home automation) but to be honest I’m just forgetful when it comes to things like that. However I decided to heed their warning and do something about it. I decided if I was going to get diligent about locking my door, I couldn’t be the one in charge of actually doing it, I’d have to make a computer do it. The problem is that while my home security provider does offer a web app, a phone app for locking the door, and a very basic ‘rule’ system (arm the panel when the door is locked, vice versa), there are no time based controls, so I’d still have to actually do something. Totally unacceptable.

After some investigation I found, as I had figured, that my security provider does not offer any sort of API. Nor would it be easy to try and replicate the POST request that is sent from the app to trigger the lock door command, due to the numerous session variables, cookies and other things unique to each login session. Nope, if I was going to automate this it seemed like I’d actually have to interact with the browser, as much as the thought displeased me (I’m all for dirty hacks, but c’mon). At first I looked at Python for a solution, but as seems to often be the case with Python, every discussion of a solution was disjointed with no clear path and generally unsatisfactory (sorry Python). Instead I turned to Node to see what potential solutions awaited me there. After a bit of looking around I found Selenium for Node. While its obvious and stated focus was on automated web app testing, not intentional browser automation scripts, I could see no reason it wouldn’t work.

Quickly I spun up a new Node project and used NPM to grab the Selenium package (even after many uses NPM still feels like some kind of magic after manually handling javascript libraries for so long). I followed the guide to getting the Selenium web drivers to work, which at first seemed a bit odd, having to install an executable on my system for a javascript package, but it makes perfect sense in retrospect. After finding a basic Selenium tutorial I was ready to attempt to get my script to log in to alarm.com’s web page. First I had to get the names of the inputs I wanted Selenium to fill. Of course Chrome makes this easy, just right click, inspect element and snag the names of the inputs.

Finding the required name property is easy.

 

Then simply tell Selenium to populate the boxes and click the login button.

 

var driver = new webdriver.Builder().
withCapabilities(webdriver.Capabilities.firefox()).
build();
   
driver.get('https://www.alarm.com/login?m=no_session&ReturnUrl=/web/Automation/Locks.aspx');
driver.findElement(webdriver.By.name('ctl00$ContentPlaceHolder1$loginform$txtUserName')).sendKeys(username);
driver.findElement(webdriver.By.name('txtPassword')).sendKeys(password);
driver.findElement(webdriver.By.name('ctl00$ContentPlaceHolder1$loginform$signInButton')).click();

Thankfully, just through dumb luck, my session had timed out while I was creating this and I found out the login page accepts a return URL parameter that it directs the browser to after a successful login. So now, if the login goes smoothly, the browser should be at the screen where I can control the locks. A different button is used to lock or unlock the door and only one is visible at a time. So writing my function in such a way that it accepts a boolean ‘lock’ param (where false means unlock) and fails if it’s not able to find the matching button is a safe way to ensure I don’t unlock the door when I mean to lock it and vice versa.

driver.wait(function() {
	return driver.getTitle().then(function(title) {
		console.log('Toggling Door Status');
		if(lock)
		{
			driver.findElement(webdriver.By.name('ctl00$phBody$summaryRepeater$ctl00$lockButton')).click();
			lockResult.message = 'Lock request sent!';
		}
		else
		{
			driver.findElement(webdriver.By.name('ctl00$phBody$summaryRepeater$ctl00$unlockButton')).click();	
			lockResult.message = 'UnLock request sent!';					
		}
		lockResult.success = true;
		

		driver.quit();

		return lockResult;
	});
}, 1000);

 

I don’t know Selenium super well yet, but it seems that after the login button is clicked the driver waits until it can retrieve the title of the page (which is an easy way to tell the page has at least somewhat loaded) and then runs the inner logic for clicking the proper button. Honestly I’m not totally sure, but it works and that’s the important thing 😛

I was actually a bit shocked when my little function worked. Calling it with the proper username and password actually made the door lock a few moments later, much to my dog’s surprise as he sat napping in the living room (the little motor on that lock is kind of loud). Now the next piece of this puzzle was to invoke that function on a timer system. Locking the door at say, 11:00pm and unlocking at 8:00am. This turned out to require only your regular everyday javascript, nothing fancy.

var lockHour = 23;
var unlockHour = 8;
var beenLockedToday = false;
var beenUnlockedToday = false;

var date = new Date();
var current_hour = date.getHours(); 
console.log('Checking current hour for lock status checks. Current hour is ' + current_hour  + ' Will automatically lock at ' + lockHour + ' and unlock at ' + unlockHour + ' listening on port ' + port);
	
function monitorLoop() {
	
	
	date = new Date();
	current_hour = date.getHours();       
	
	console.log('Checking local hour. It is ' + current_hour);
	
	if(current_hour >= lockHour && !beenLockedToday) 
	{
		console.log('Lock hour hit or passed and door has not been locked. Locking!!');
		toggleDoor(alarm_username,alarm_password,true);		
		beenLockedToday = true;
	} 
	if(current_hour == unlockHour && !beenUnlockedToday) 
	{
		console.log('Un-Lock hour hit!');
		toggleDoor(alarm_username,alarm_password,false);		
		beenUnlockedToday = true;
	}
	if(current_hour == 0)
	{
		console.log('Resetting lock status variables');
		beenLockedToday = false;
		beenUnlockedToday = false;		
	}

	setTimeout(monitorLoop,600000);

}

monitorLoop();

It’s just a function that reschedules itself via setTimeout every ten minutes (600000 milliseconds). It checks the current hour against my two predefined lock and unlock times. I used a couple of variables to track whether the door has already been locked or unlocked, so even with a short interval on the event loop it’s not attempting to lock/unlock the door every few minutes and wearing out the batteries on the motor. Obviously omitted from this are my alarm_username and alarm_password variables, stored higher up in the script. With this event loop and the Selenium automation function I now have one less thing to worry about. Now if I could just find a Node.Js host that supported Selenium (damn you Heroku). So if anything, I’d say this is the takeaway: Don’t do manually what you can automate, and browser automation with Selenium is crazy easy. So easy that when it all worked I was almost disappointed that it seemed like I didn’t do anything.
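If you want to poke at the lock/unlock decision logic without waiting around for the clock (or actually firing up Selenium), the same checks can be pulled out into a small pure function. This is just a sketch that mirrors the three if statements in the loop; the function name decideDoorActions and the shape of the state object are made up for illustration and are not in my actual script.

```javascript
// Hypothetical refactor of the checks inside monitorLoop.
// state is { beenLockedToday: boolean, beenUnlockedToday: boolean }.
// Returns the actions to perform plus the updated state, without side effects.
function decideDoorActions(currentHour, lockHour, unlockHour, state) {
    var next = {
        beenLockedToday: state.beenLockedToday,
        beenUnlockedToday: state.beenUnlockedToday
    };
    var actions = [];

    // lock once the lock hour is reached or passed, but only once per day
    if (currentHour >= lockHour && !next.beenLockedToday) {
        actions.push('lock');
        next.beenLockedToday = true;
    }

    // unlock only during the unlock hour, and only once per day
    if (currentHour === unlockHour && !next.beenUnlockedToday) {
        actions.push('unlock');
        next.beenUnlockedToday = true;
    }

    // midnight resets both flags for the new day
    if (currentHour === 0) {
        next.beenLockedToday = false;
        next.beenUnlockedToday = false;
    }

    return { actions: actions, state: next };
}
```

The loop would then call toggleDoor once for each entry in actions and keep the returned state around for the next pass.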

Glorious event loop in action

Till next time!
-Kenji

Also, be sure to check out the addendum to this project in my next blog post where I added automatic operations with a geofence. https://iwritecrappycode.wordpress.com/2015/10/21/automating-things-that-should-already-be-automated-with-selenium-and-node-js/