Using Redis and RedisToGo to store Node.js sessions on Heroku

Storing data in the session state is a little bit naughty as HTTP should remain stateless, but there is a trade-off. In this case I wanted to test an assumption around a potential feature for Mayday. I felt it was more important to release than to over-architect the entire solution with the possibility of dropping the feature the next day.

With Express on Node.js, session state is stored in memory by default, meaning any restart or new deployment will delete the data.

Enter Redis

Redis is an ultra-fast, open source, advanced key-value store. To make it even easier, RedisToGo offer a hosted version with a Heroku plugin. After adding the plugin, an account will be created with a database URL provided as a configuration variable.

$ heroku config
REDISTOGO_URL   => redis://redistogo:[email protected]:9712/
NODE_ENV        => production

To use this as your session store you will need to configure the middleware by defining a RedisStore from the connect-redis npm package.

The require statements should look like this:

var express = require('express');
var RedisStore = require('connect-redis')(express);

var url = require('url');

For development you will want to use your local server.
app.configure('development', function(){
  app.use(express.session({
    secret: "password",
    store: new RedisStore({
      host: "127.0.0.1",
      port: "6379",
      db: "name_of_my_local_db"
    })
  }));
});


For production you should use the RedisToGo URL provided. 
app.configure('production', function(){
  var redisUrl = url.parse(process.env.REDISTOGO_URL);
  var redisAuth = redisUrl.auth.split(':');

  app.use(express.session({
    secret: "password",
    store: new RedisStore({
      host: redisUrl.hostname,
      port: redisUrl.port,
      db: redisAuth[0],
      pass: redisAuth[1]
    })
  }));
});

Node.js and Connect-Redis will do the rest for you.

The key will be the session id for the user with the value being a JSON serialised object of req.session.
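
To see what the production block pulls out of the connection string, here is a minimal sketch using Node's built-in url module. The URL below is made up (the real one comes from REDISTOGO_URL), and parseRedisUrl is a hypothetical helper name, not part of connect-redis.

```javascript
var url = require('url');

// Hypothetical URL in the same shape as REDISTOGO_URL; host and password are invented.
var exampleUrl = 'redis://redistogo:s3cret@example.redistogo.com:9712/';

// Hypothetical helper: split the URL into the options passed to RedisStore above.
function parseRedisUrl(value) {
  var parsed = url.parse(value);
  // auth has the form "user:password"; fall back to ":" if it is missing.
  var auth = (parsed.auth || ':').split(':');
  return {
    host: parsed.hostname,
    port: parsed.port,   // note: url.parse returns the port as a string
    db: auth[0],
    pass: auth[1]
  };
}

var settings = parseRedisUrl(exampleUrl);
console.log(settings);
```

The part before the colon in the auth section ends up as db and the part after it as pass, mirroring redisAuth[0] and redisAuth[1] in the configuration above.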

In place editing of a file with sed

When writing a file modification script I find it annoying (read: boring) to write the logic for saving the file under a different name, moving it over the original and then cleaning up. I recently had to do exactly this, so I decided to look at what other options were available, one of which was sed.

The following command will replace the word testdomain with proddomain.

$ echo 'www.testdomain.com' > test.txt
$ cat test.txt
www.testdomain.com

$ sed -i '' s/testdomain/proddomain/ test.txt
$ cat test.txt
www.proddomain.com

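The key to how this works is the -i '' argument of the sed command. -i takes the extension to use for a backup copy of the original; passing an empty string means no backup is written and the file itself is modified. Note that this is the BSD/macOS form; GNU sed expects the extension attached directly to the flag (for example -i.bak), or -i on its own for no backup.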
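
For comparison, the manual steps that sed saves you from (write under a different name, then move over the original) might look roughly like this in Node. This is only a sketch: replaceInPlace is a hypothetical helper, and the file name in the temp directory is invented for the demo.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: replace the first match on each line,
// which is what sed's s/pattern/replacement/ does without the g flag.
function replaceInPlace(file, pattern, replacement) {
  var original = fs.readFileSync(file, 'utf8');
  var updated = original.split('\n').map(function (line) {
    return line.replace(pattern, replacement);
  }).join('\n');

  // Write under a different name, then move it over the original.
  var temp = file + '.tmp';
  fs.writeFileSync(temp, updated);
  fs.renameSync(temp, file);
}

// Demo against a throwaway file in the system temp directory.
var file = path.join(os.tmpdir(), 'sed-sketch-test.txt');
fs.writeFileSync(file, 'www.testdomain.com\n');
replaceInPlace(file, 'testdomain', 'proddomain');

var contents = fs.readFileSync(file, 'utf8');
console.log(contents);
```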

Getting node-compress to work on Node 0.6

I have a lot of respect for Node, but sometimes certain npm packages become out of sync with the latest version and break. Today, that npm package was node-compress.

While the package appeared to build, attempting to require the module resulted in an error.

$ node
> var c = require('compress')
Error: Cannot find module 'compress'

After a quick look around on GitHub I found a fork with the fix: https://github.com/elliotttf/node-compress/

In your package.json, simply reference the tarball, clean out the node_modules directory, install and everything should work again.

package.json

    , "compress": "https://github.com/elliotttf/node-compress/tarball/1edaa48bf33f7c836f1e275691e1d8645f0a71c3"

A JavaScript equivalent to Ruby’s respond_to?

While working on the new version of Mayday, I wanted to show a message if no data was returned from Google Analytics. To add to the complexity, I wanted to be able to override the default message on a per page basis.

I’m already using the ICanHazJS client-side templating engine, which has a method for each template block on a page. However, if the page doesn’t have the block then the method won’t exist and an error will be thrown.

What I needed was functionality similar to Ruby’s respond_to? method. With this method I can ask the object if it will respond to the method call. For example:

Object.respond_to? 'test' #=> false
Object.respond_to? 'to_s' #=> true

Luckily this is just as easy to do in JavaScript using ‘in’

> 'test' in Object
false
> 'toString' in Object
true

This allowed me to write the following:

var e = $('#data');
if ('nodata' in ich)
   e.append(ich.nodata());
else
   e.append('No Data Found');



If a nodata ICanHaz template block appears on the page then it will be rendered, otherwise it will fall back to the default. Problem solved.


Update:

Problem almost solved. As pointed out on Twitter by @nmosafi and @theprogrammer, just using ‘in’ alone is not enough. For example:

> Object.test = "test"
"test"
> 'test' in Object
true
> Object.test()
TypeError: Property 'test' of object function Object() { [native code] } is not a function



What you need to do is ensure that the property is also a function. Something I had assumed previously.


> 'test' in Object && typeof(Object.test) == "function"
false

> 'toString' in Object && typeof(Object.toString) == "function"
true
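
The check can be wrapped in a small helper so it reads more like the Ruby version. This is a sketch only: respondTo is a name I have made up, and the ich object below is a plain stand-in for the real ICanHaz object.

```javascript
// Hypothetical helper: true only if obj has a callable property called name.
function respondTo(obj, name) {
  return obj != null && typeof obj[name] === 'function';
}

// Stand-in for the ICanHaz object; the real one is built from the page's templates.
var ich = {
  nodata: function () { return '<p>Nothing to show</p>'; },
  title: 'not a function'
};

var results = {
  nodata: respondTo(ich, 'nodata'),     // a function, so it responds
  title: respondTo(ich, 'title'),       // present, but a string
  missing: respondTo(ich, 'missing'),   // not present at all
  inherited: respondTo(ich, 'toString') // inherited methods count too
};
console.log(results);
```

The typeof check covers the missing-property case on its own, because a property that does not exist evaluates to undefined. Unlike ‘in’, it also does not throw when given a primitive such as a string.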

Caching with Memcached, Express and NodeJS

In a previous post I discussed how you can use Redis to store NodeJS session data. While in many cases Redis is a better product than Memcached, Memcached has one big advantage: least-recently-used (LRU) eviction. When the cache is full, the entries that have gone longest without being used are removed to make room for new data.
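
To make the idea of LRU eviction concrete, here is a toy cache in plain JavaScript. It only illustrates the policy; it is not how Memcached is implemented internally, and LruCache is a name invented for this sketch.

```javascript
// Toy LRU cache. A Map iterates in insertion order, so the first key
// is always the least recently used one.
function LruCache(limit) {
  this.limit = limit;
  this.items = new Map();
}

LruCache.prototype.get = function (key) {
  if (!this.items.has(key)) return undefined;
  var value = this.items.get(key);
  // Re-insert so this key becomes the most recently used.
  this.items.delete(key);
  this.items.set(key, value);
  return value;
};

LruCache.prototype.set = function (key, value) {
  this.items.delete(key);
  this.items.set(key, value);
  if (this.items.size > this.limit) {
    // Over capacity: evict the least recently used entry.
    var oldest = this.items.keys().next().value;
    this.items.delete(oldest);
  }
};

var lru = new LruCache(2);
lru.set('a', 1);
lru.set('b', 2);
lru.get('a');     // 'a' is now more recently used than 'b'
lru.set('c', 3);  // the cache is full, so 'b' is evicted
console.log(Array.from(lru.items.keys()));
```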

With NodeJS, there is a great npm package called memcache.

To access the package I created a simple wrapper called cache.js. As with all programming languages, some abstraction can be a good thing.

var memcache = require('memcache');

var client = undefined;
exports.connect = function() {
  client = new memcache.Client();
  client.connect();
  return client;
};

exports.get = function(key, callback) {
  client.get(key, callback);
};
exports.set = function(key, value) {
  client.set(key, value, function(error, result){}, 10);
};

The implementation is then very simple:
1) Call cache.connect(); after app.listen(3000);
2) Within your route, you provide the key and have the result value returned in the callback. If the item is undefined then perform the required logic, set the value and continue.

app.get('/', function(req, res){
  cache.get('key', function(error, result){
    console.log(result);

    if(result == null) {
      result = microtime();
      cache.set('key', result);
    }

    res.render('index', {title: "Hello Express at " + result});
  });
});

Simple. Dirty.

Sadly, every time I looked at the above code I felt dirty.

So I wondered: how else could you do this?

One approach I thought could be interesting is to take advantage of the middleware pipeline nature of Express and NodeJS. Now the logic and responsibility is separated into their own functions.

function getCache(req, res, next) {
  cache.get(req.route.path, function(error, result){
    req.params.cachedItem = result;
    next();
  });
}

function processIndexIfNotInCache(req, res, next) {
  if(req.params.cachedItem == undefined) {
    req.params.cachedItem = microtime();
    cache.set(req.route.path, req.params.cachedItem);
  }
  next();
}

app.get('/', getCache, processIndexIfNotInCache, function(req, res){
  res.render('index', {title: "Hello Express at " + req.params.cachedItem});
});

However, while getCache is generic and reusable, processIndexIfNotInCache is a bit of a mouthful and you’ll soon start seeing repeating logic.

Dirty in a different way

While we’ve improved the code in some aspects, it’s still not great. In my final attempt, I was left with two generic methods and a hash which pointed to where the cacheable logic lived.

function processCache(req, res, next) {
  cache.get(req.route.path, function(error, result){
    if(result == undefined)
      result = executeAndStore(req.route.path);

    req.params.cachedItem = result;
    next();
  });
}

function executeAndStore(path) {
  var result = routes[path]();
  cache.set(path, result);
  return result;
}

var routes = {'/': microtime };
app.get('/', processCache, function(req, res){
  res.render('index', {title: "Hello Express at " + req.params.cachedItem});
});

There are still a few refactorings to go, but we have started separating the logic of caching from the logic of the method and the actual rendering itself.
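
One possible next refactoring, offered as a sketch and not as what the sample repository does, is a middleware factory that closes over the function producing the value, so the routes hash is no longer needed. The cached name is hypothetical, the cache object is an in-memory stand-in for cache.js so the example runs without a Memcached server, and the request object is hand-rolled in place of Express.

```javascript
// In-memory stand-in for cache.js (assumption: same get/set shape as the wrapper).
var store = {};
var cache = {
  get: function (key, callback) { callback(null, store[key]); },
  set: function (key, value) { store[key] = value; }
};

// Hypothetical middleware factory: compute produces the value for a route.
function cached(compute) {
  return function (req, res, next) {
    var key = req.route.path;
    cache.get(key, function (error, result) {
      if (result == null) {
        // Cache miss: run the route's own logic and remember the result.
        result = compute(req);
        cache.set(key, result);
      }
      req.cachedItem = result;
      next();
    });
  };
}

// Exercise the middleware twice with a fake request to show the second hit is cached.
var calls = 0;
var middleware = cached(function () { calls += 1; return 'computed value'; });
var req = { route: { path: '/' } };

middleware(req, {}, function () {
  middleware(req, {}, function () {
    console.log(req.cachedItem, 'computed', calls, 'time(s)');
  });
});
```

In the Express app this would be wired up as app.get('/', cached(microtime), ...), with the final handler reading req.cachedItem. Storing the value directly on req instead of req.params is a choice made for this sketch.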

A sample, along with many others, can be found at https://github.com/BenHall/memcached_express

I would love to see other examples of how people would solve this – ping me @Ben_Hall

Remove X-Powered-By for Express and NodeJS

When responding to a web request it’s common for servers to tell the client various bits of information. The one they enjoy most is some promotion around the name and version “powering” the site. Sadly, hackers also love this as it gives them more information for an attack vector.

By default, ExpressJS with NodeJS will return an X-Powered-By header.

$ curl -I 0.0.0.0:3000/
HTTP/1.1 302 Moved Temporarily
X-Powered-By: Express

I wasn’t overly impressed by this but it’s easy to remove. In your application configuration, at the top, add a new middleware function which removes the header.

  app.configure(function(){
      app.use(function (req, res, next) {
        res.removeHeader("X-Powered-By");
        next();
      });

      app.set('views', __dirname + '/views');
      app.set('view engine', 'jade');
      app.use(express.bodyParser());
      app.use(express.methodOverride());
      app.use(express.cookieParser());
      app.use(express.static(__dirname + '/static'));
  });

Simple.

Finding out where that console.log output is coming from

While trying to solve a problem, we have all sometimes added a few console.log statements to help us gain additional understanding. While it’s far from the most effective way of debugging, it’s more annoying when those statements appear in your production logs. That was my scenario today.

Thankfully, with a simple grep command I was able to identify all of my “debugging” statements and the lines they occurred on.

$ grep -nr console * | grep -v "node_modules/*" | grep -v "static/"

endpoints/browser_stats.js:35:        if(error) { console.log(error); }
endpoints/browser_stats.js:51:        if(error) { console.log(error); }
endpoints/browser_stats.js:67:        if(error) { console.log(error); }
endpoints/events.js:16:    console.log(eventsUrl);
endpoints/events.js:17:    console.log(hitsUrl);
endpoints/events.js:20:        if(error) { console.log(error); }
endpoints/events.js:27:            if(error) { console.log(error); }
passport.js:23:      console.log(“Auth”);

In this case, I filtered out any matches from node_modules and static javascript files. Keep in mind that in some cases it’s a node module causing the output.

A simple command to help keep your production logs clean and readable.

Javascript WTF: The Date object

Almost every time I have to deal with the Date object in Javascript I hit the same WTF.

The documentation is clear but the API has some strange aspects that personally I don’t think reflect the real world and how we naturally handle dates.

For example:
      getDay() returns the day of the week (0-6) leaving getDate() to return 1-31. Personally, I would have imagined getDate() to return DMY with getDay() returning 1-31.
      getMonth() returns 0-11. Classic computer science with starting the count at 0 while the Gregorian calendar is 1-12. This is also inconsistent with getDate() which starts at 1.
      getYear() returns the year (usually 2-3 digits). For example, 2012 is obviously 112. You need to use getFullYear() to return 2012. The logic about how the method came to 112 is found at https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Date/getYear
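
The methods above can be seen side by side with a fixed date. The date itself is just an example I picked: 15 January 2012, which fell on a Sunday.

```javascript
// Month is 0-based in the constructor too, so 0 means January.
var d = new Date(2012, 0, 15);

var parts = {
  day: d.getDay(),           // day of the week, 0 = Sunday
  date: d.getDate(),         // day of the month, 1-31
  month: d.getMonth(),       // 0-11, so January is 0
  year: d.getYear(),         // years since 1900
  fullYear: d.getFullYear()  // the four-digit year
};
console.log(parts);

// A human-readable month therefore needs the + 1 every time.
var readable = parts.date + '/' + (parts.month + 1) + '/' + parts.fullYear;
console.log(readable);
```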

JavaScript 1.0 – we love you and you have left an impact in many ways.