Installing Hadoop 0.21.0 on Windows – Spaces in username gotcha

A quick side note about installing Hadoop on Windows. My Windows username is “Ben Hall”, and Hadoop uses it in two places: as ${user.name} in its configuration and as $USER in its shell scripts.

Although Hadoop is cross-platform, it doesn’t expect spaces in path names, so a space in the username produces errors that look unrelated to the real cause.

I hit the first one while formatting the NameNode, which creates the DFS directory.

11/01/16 17:54:04 WARN common.Util: Path file:///tmp/hadoop-Ben Hall/dfs/name should be specified as a URI in 
configuration files. Please update hdfs configuration.
11/01/16 17:54:04 ERROR common.Util: Error while processing URI: file:///tmp/hadoop-Ben Hall/dfs/name
java.io.IOException: The filename, directory name, or volume label syntax is incorrect
        at java.io.WinNTFileSystem.canonicalize0(Native Method)
        at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:396)
        at java.io.File.getCanonicalPath(File.java:559)
        at java.io.File.getCanonicalFile(File.java:583)
        at org.apache.hadoop.hdfs.server.common.Util.fileAsURI(Util.java:78)
        at org.apache.hadoop.hdfs.server.common.Util.stringAsURI(Util.java:65)
        at org.apache.hadoop.hdfs.server.common.Util.stringCollectionAsURIs(Util.java:91)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getStorageDirs(FSNamesystem.java:378)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNamespaceDirs(FSNamesystem.java:349)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1223)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1348)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1368)

Not the most helpful error message. After some pondering and a look through the Hadoop code base, I realised it was caused by the space in my username: hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}, which expanded to /tmp/hadoop-Ben Hall. To solve the problem, I overrode the default temporary directory with a path that contains no space. Within core-site.xml, I added the following property node:


  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-BenHall</value>
  </property>

This allowed me to proceed until I hit the following error when starting the nodes:

C:\hadoop\hadoop-0.21.0/bin/hadoop-daemon.sh: line 111: [: /tmp/hadoop-Ben: binary operator expected
C:\hadoop\hadoop-0.21.0/bin/hadoop-daemon.sh: line 67: [: Hall-namenode-BigBlue7.out: integer expression expected
starting namenode, logging to C:\hadoop\hadoop-0.21.0\logs/hadoop-Ben
C:\hadoop\hadoop-0.21.0/bin/hadoop-daemon.sh: line 127: $pid: ambiguous redirect
localhost: /cygdrive/c/hadoop/hadoop-0.21.0/bin/hadoop-daemon.sh: line 111: [: /tmp/hadoop-Ben: binary operator expected

Again, not an ideal error message. Looking inside hadoop-daemon.sh, I found that $USER is used to build the log output path. Judging by the messages, the resulting path is expanded without quotes, so the space splits it into two words (one ending in “hadoop-Ben”, the other starting with “Hall-”), and the shell's tests and redirects choke on the extra word. To fix it, I replaced the $USER variable with the hard-coded value BENH.
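To see why a space causes this kind of failure, here is a minimal sketch that reproduces the problem outside Hadoop. The variable names and the path are illustrative assumptions and are not taken from hadoop-daemon.sh itself.

```shell
#!/usr/bin/env bash
# Illustrative only: USER_NAME stands in for $USER, and the path is made up.
USER_NAME="Ben Hall"
log_dir="/tmp/hadoop-$USER_NAME"

# Unquoted: the shell splits the value at the space, so `[` receives
# "-d", "/tmp/hadoop-Ben" and "Hall" and reports a syntax error.
unquoted_status=0
[ -d $log_dir ] 2>/dev/null || unquoted_status=$?

# Quoted: `[` receives a single path argument and simply reports whether
# the directory exists (status 0) or not (status 1).
quoted_status=0
[ -d "$log_dir" ] || quoted_status=$?

echo "unquoted test exit status: $unquoted_status"
echo "quoted test exit status:   $quoted_status"
```

An exit status of 2 or more from the unquoted test means the test itself was malformed, not that the directory is missing. Quoting every expansion in the script would be the more general fix, but replacing the value with one that has no space is the smaller change.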

If you would prefer not to hard-code the value then you could use ‘sed’ to remove spaces dynamically as described here – http://mydebian.blogdns.org/?p=132
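As a rough sketch of the sed approach, something like the following would work. The function name strip_spaces, the variable SAFE_USER and the example log path are all hypothetical; hadoop-daemon.sh does not define them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove every space from the value passed in.
strip_spaces() {
  printf '%s' "$1" | sed 's/ //g'
}

# SAFE_USER is an assumed name; in a real script the argument would be "$USER".
SAFE_USER=$(strip_spaces "Ben Hall")

# Build an example path from the cleaned value (the path layout is made up).
log="/tmp/hadoop-$SAFE_USER-namenode.out"
echo "$log"
```

In bash specifically, the parameter expansion ${USER// /} gives the same result without calling sed.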

With both fixes in place, Hadoop ran without any errors.

One thought on “Installing Hadoop 0.21.0 on Windows – Spaces in username gotcha”

  1. One million thanks. I was feeling sheepish as I was able to set it up on one of my machines but not on another. It turned out my profile name, which is picked up as $USER, had a space in it, as described in your post.
