Parsing XML using IronRuby

Today I have been looking at how you can use IronRuby to communicate with the various REST APIs floating around the web. Most of the services, such as Flickr or Twitter, allow you to select the format of the response. While JSON (JavaScript Object Notation) seems to be the most common format, I decided to start with XML, simply because I already know how to parse it.

Initially, I wanted to use a Ruby library to parse the XML. However, I found that REXML, which ships with Ruby, is not yet supported by IronRuby, so I had to take a different approach. I decided to use the System.Xml namespace as my base, then create a wrapper and monkey patch the CLR objects to produce a cleaner, more flexible API.

When creating the wrapper, the first task is to define all of the required references. Generally with IronRuby, if you want to do .NET interop you will need to reference mscorlib and System. In this case, I’ve also referenced the System.Xml assembly. The include statement is similar to a using directive in C#. Within Ruby, include is how you do ‘mixins’: it makes a module’s functionality accessible from the current module, as if the two were combined, which is a very powerful technique. In this case, it allows us to access CLR objects without needing to specify the namespace.

require 'mscorlib'
require 'System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'
include System
require 'System.Xml, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'
include System::Xml

The next task is to create a new Ruby class. This class has an initialize method which takes raw XML as a string. Under the covers, it creates an instance of System.Xml.XmlDocument and loads the XML.

class Document
  def initialize(xml)
    @document = XmlDocument.new
    @document.load_xml xml
  end
end

Once we have created the document, we need a way to access the XML elements. Within C#, you would use the SelectNodes method, which returns an XmlNodeList, then iterate over this collection to access each XmlNode and, through it, your data. Life in IronRuby is a little different. I found that when iterating over the XmlNodeList, I was getting XmlElement objects, one for each node. I also wanted to provide a more ‘ruby-like’ way to access the elements.

The method I created takes two arguments: the first is the XPath query, the second is a block, a piece of code which I want to be executed for each element. Within my code, I can iterate over all the elements, passing control back to the block with the element as a parameter for processing.

def elements(xpath, &b)
  ele = @document.select_nodes xpath
  ele.each do |e|
    b.call e
  end
end
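
The block-passing pattern itself can be tried without IronRuby. The following is a minimal plain-Ruby sketch: NodeList and each_element are hypothetical names that are not part of the wrapper, and an Array stands in for the XmlNodeList returned by select_nodes.

```ruby
# Plain-Ruby stand-in: an Array plays the role of the XmlNodeList
# (no CLR types are involved, so this runs on any Ruby).
class NodeList
  def initialize(nodes)
    @nodes = nodes
  end

  # Explicit block parameter, as in the wrapper: the block arrives as a Proc.
  def elements(&block)
    @nodes.each { |node| block.call(node) }
  end

  # Equivalent form using yield. Without a block it returns an Enumerator.
  def each_element
    return @nodes.each unless block_given?
    @nodes.each { |node| yield node }
  end
end

list = NodeList.new(%w[Jim Bob])
list.elements { |name| puts name }
list.each_element { |name| puts name.upcase }
```

Both forms hand each item to the caller's block. The explicit &block form is handy when the block needs to be stored or passed on, while yield is the shorter choice when it is only called in place.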

Within the block, I can place the code required to process that section of the XML. However, I still need a way to access the data of the elements. Because the above code yields XmlElement objects, I wanted to monkey patch the class to include a few custom methods. This is amazingly simple within IronRuby: you define a class with the same name and define your additional methods.

class XmlElement
  def get(name)
    select_single_node name
  end
  def to_s
    inner_text.to_s
  end
end

I also included an additional method called node(), which works the same way as elements above but lets me return sub-elements from an XmlElement object.
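
Since node() is not listed in the post, here is one way it could look. This is only a sketch under assumptions: FakeElement is a hypothetical plain-Ruby stand-in for System.Xml.XmlElement (so the code runs without IronRuby), and the body of node() is a guess based on how the method is used, not the code from the download.

```ruby
# Hypothetical stand-in for XmlElement. Children are kept in a Hash
# keyed by element name instead of being found by a real XPath query.
class FakeElement
  attr_reader :inner_text

  def initialize(inner_text = '', children = {})
    @inner_text = inner_text
    @children = children
  end

  # Stand-ins for the CLR methods the wrapper relies on.
  def select_single_node(name)
    Array(@children[name]).first
  end

  def select_nodes(name)
    Array(@children[name])
  end
end

# Reopen the class to add the wrapper methods, as the post does for XmlElement.
class FakeElement
  def get(name)
    select_single_node name
  end

  # Assumed shape of node(): pass every matching sub-element to the block.
  def node(xpath, &block)
    select_nodes(xpath).each { |child| block.call(child) }
  end

  def to_s
    inner_text.to_s
  end
end

dob = FakeElement.new('', { 'year' => FakeElement.new('1933') })
person = FakeElement.new('', { 'first' => FakeElement.new('Jim'), 'dob' => dob })

puts person.get('first')
person.node('dob') { |d| puts d.get('year') }
```

The second class FakeElement block adds to the existing class rather than replacing it, which is the same mechanism the monkey patch on XmlElement uses.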

Finally, I saved this in a file called xml.rb. The filename is used by consumers within the require statement.

With this in place, I can use the wrapper to process xml.

# Include the wrapper
require 'xml'

# Create the document
@document = Document.new('Jim193312BobSmith')

# Access root/name elements
@document.elements('root/name') do |e|
   # Output the contents of the element named first
   puts e.get('first')
   # Access the element named dob, then output the value of year.
   e.node('dob') {|y| puts y.get('year')}
end

When I execute this block of code, Jim, 1933 and Bob are printed to the console.

>ir processXml.rb
Jim
1933
Bob

While the wrapper isn’t very advanced, it’s a very quick and easy way to start working with XML from IronRuby.

A big thank you to Ivan Porto Carrero, who pointed me in the right direction on how to accept blocks within methods. Before that, I had to do this:

@document.elements('x').collect {|e| puts e.get('first')}

Not much difference, but enough to make an impact.

Feel free to download the wrapper and the sample.  For future reference, I’ve uploaded the code to the MSDN Code Gallery, which I will update if I release a new version.


Microsoft.Scripting.SyntaxErrorException was unhandled – Unicode support and IronRuby

I just ran head first into a wall I wasn’t expecting. I was attempting to host the DLR within a C# application (a topic for another post). The IronRuby code I wanted to load was in an external file and was simply:

def helloWorld()
   puts 'Hello World'
end

However, when I attempted to load the file, I was greeted with an unhandled exception.

Microsoft.Scripting.SyntaxErrorException was unhandled
  Message="Invalid character 'ï' in expression"
  Source="Microsoft.Scripting"
  Column=2
  ErrorCode=4112
  Line=1
  SourceCode="def helloWorld()\r\n   puts 'Hello World'\r\nend"
  SourcePath="C:/Users/Administrator/Documents/Visual Studio 10/Projects/IronRubyHost/bin/Debug\helloWorld.rb"
  StackTrace:
       at Microsoft.Scripting.ErrorSink.Add(SourceUnit source, String message, SourceSpan span, Int32 errorCode, Severity severity) in E:\IronRuby\r156\src\Microsoft.Scripting\ErrorSink.cs:line 34
       at Microsoft.Scripting.ErrorCounter.Add(SourceUnit source, String message, SourceSpan span, Int32 errorCode, Severity severity) in E:\IronRuby\r156\src\Microsoft.Scripting\ErrorSink.cs:line 92
       at IronRuby.Compiler.Tokenizer.Report(String message, Int32 errorCode, SourceSpan location, Severity severity) in E:\IronRuby\r156\src\ironruby\Compiler\Parser\Tokenizer.cs:line 430

I was surprised to see the additional characters being added to the start of my SourceCode. After a bit of searching, I found this:

http://www.mail-archive.com/[email protected]/msg02137.html

It turns out that Ruby 1.8 (which IronRuby is targeting) doesn’t have support for Unicode source files. When you create a text file using Visual Studio, it is saved as UTF-8 with a byte order mark (the bytes EF BB BF) at the start of the file, and the ‘ï’ in the error message is the first of those bytes read as a single character.

The quickest solution appears to be to avoid creating Ruby files in Visual Studio, at least until the editor has IronRuby integration, and to use another editor instead, for example GVim or IronEditor, which is shortly going to get a facelift.
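
If a file has already been saved with the marker, another option is to strip it before the source reaches the interpreter. The following is a rough sketch in plain Ruby, assuming a current MRI Ruby rather than IronRuby. The names UTF8_BOM and strip_utf8_bom are made up for this example.

```ruby
# The UTF-8 byte order mark as three raw bytes.
UTF8_BOM = "\xEF\xBB\xBF".b

# Returns a copy of the source with a leading UTF-8 BOM removed, if present.
def strip_utf8_bom(source)
  bytes = source.b                       # work on a binary copy
  bytes = bytes.byteslice(3..-1) if bytes.start_with?(UTF8_BOM)
  bytes.force_encoding(Encoding::UTF_8)  # hand the text back as UTF-8
end

# Simulate what the editor wrote: BOM first, then the script.
saved = UTF8_BOM + "def helloWorld()\r\n   puts 'Hello World'\r\nend".b
clean = strip_utf8_bom(saved)

puts clean.start_with?('def helloWorld')
```

On a real file the input would come from File.binread(path). MRI can also skip the marker while reading, for example File.read(path, mode: 'r:BOM|UTF-8').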


IronRuby with StreamReader and StreamWriter

While playing around with IronRuby, I wanted to access a stream, and so I needed to create a StreamReader. Generally this is a very simple task, but no matter what I did, I was getting an error message.

The code looked like this: readStream = System::IO::StreamReader.new(receiveStream). However, when executed I always got the following error.

E:\IronRuby\src\IronRuby.Libraries\Builtins\ModuleOps.cs:721:in `const_missing': uninitialized constant System::IO::StreamReader (NameError)

NameError means IronRuby cannot find the StreamReader class, which is a bit of a problem. I thought I had included all the required references: I was referencing System.dll using the correct require statement.

require 'System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'

After firing up Reflector, it turns out StreamReader is actually in the mscorlib assembly, not System.dll. This was a simple fix: I added an additional require statement for mscorlib (require 'mscorlib') and I could happily access the StreamReader object.

In future, I’ll remember to always reference mscorlib and System.dll when doing .NET interop, as it will just make life easier.


RubyGems, IronRuby and System::Net namespace

Recently I encountered an interesting gotcha with IronRuby. I wanted to use a rubygem together with the WebRequest CLR object from the System.Net namespace. I had my require and include statements set as shown below.

require 'rubygems'
# Additional require statements
require 'mscorlib'
require 'System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'
include System

With this in place, I could happily access my gems and CLR objects. However, when I attempted to access WebRequest, it failed. The code was correct, as I had it working in a different sample.

request = Net::WebRequest.Create(@url)

However, whenever I tried this in the current sample I was getting a NameError, meaning IronRuby was unable to find the class.

E:\IronRuby\src\IronRuby.Libraries\Builtins\ModuleOps.cs:721:in `const_missing': uninitialized constant Net::WebRequest (NameError)
        from :0

After a little trial and error, I found that the reason it wasn’t working was that I had included RubyGems, which gave me access to Ruby’s standard library. Within the standard library, there is a namespace called Net which has all of the standard objects you would expect for dealing with network communication (Net::HTTP, Net::Telnet etc). As a result, Ruby and .NET clashed, with Ruby taking priority, meaning I couldn’t access Net::WebRequest.

The workaround is to be more explicit about what you want. You either need to include the System.Net namespace (include System::Net), or add System when accessing the object (System::Net::WebRequest). You will then be able to access the object as normal.
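
The clash can be reproduced in plain Ruby, without any CLR types. This is only an analogy under assumptions: Sandbox, its inner modules and the create method are hypothetical stand-ins, and the sketch shows Ruby’s ordinary constant lookup rather than IronRuby’s actual namespace resolution.

```ruby
module Sandbox
  # Stand-in for the CLR System namespace.
  module System
    module Net
      class WebRequest
        def self.create(url)
          "request for #{url}"
        end
      end
    end
  end

  # Stand-in for Ruby's own Net module (net/http and friends).
  module Net
  end

  include System

  # The short name finds Sandbox::Net first, which has no WebRequest.
  def self.short_lookup
    Net::WebRequest
  rescue NameError => e
    e.message
  end

  # The fully qualified name is unambiguous.
  def self.qualified_lookup
    System::Net::WebRequest
  end
end

puts Sandbox.short_lookup
puts Sandbox.qualified_lookup.create('http://example.com')
```

A constant defined directly in the current scope wins over one that is only reachable through an included module, which is why the short name resolves to the wrong Net here.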


Learning a new language? Write some tests…

The excellent Pragmatic Programmer book suggests that you should learn a new language every year, and this is something I strongly agree with. Learning a new language does not mean C# 4.0 (when you already know 2.0 and 3.0), or how to create Silverlight applications. Instead, make the effort to learn a language with a different mindset and approach to what you’re used to, and actively engage in that community. Coming from a Java/C# background, Ruby was an eye-opener for me and my approach to software development. The priorities and principles are different. For example, Ruby places much more emphasis on elegance, solving problems and testability, where software is treated as an art form while still being pragmatic in approach: it’s never perfect and can always be improved at some point. This is also communicated effectively throughout the community, from Ruby gurus such as Dave Thomas and Chad Fowler to the developers writing applications on a day-to-day basis. As a result of learning more about the Ruby language, I feel I can approach a solution in a more open-minded way.

How should you learn a new language?

This is something which I have been thinking about for a while: what is the most effective way to learn a new programming language? There are a couple of approaches you could take, but this is my approach (if you have your own suggestion, please leave it as a comment).

1) Buy a book, but not just any book. Buy the best book. Online materials are great, but I still find the most effective way to learn something is to have a physical book by my side. However, be sure to pick your book wisely. I recommend you research the influential people within the community and either buy their book, or the book they recommend. The wrong book could take you down completely the wrong path.

2) Start writing tests

Once you have picked your language and book of choice, you need to start grokking the concepts. After trying a couple of different techniques, I’ve found the best way to learn a new language is to actually write tests. This isn’t as crazy as it might sound. The test defines the expected outcome of your sample and gives you something to aim for. This helps focus your mind on the task in hand, while giving you a clear signal as to when you are complete: the test will pass.

For example, with C#, if I wanted to know how to write a line of text to a file I might write the following test with the implementation.  

using System.IO;
using Xunit;

public class IO_Examples
{
    [Fact]
    public void Write_Hello_World_To_The_File_HelloWorldTxt()
    {
        MyFileAccess.Write("Hello World", "HelloWorld.txt");

        Assert.True(File.ReadAllText("HelloWorld.txt").Contains("Hello World"));
    }
}

public class MyFileAccess
{
    public static void Write(string s, string file)
    {
        // The using block disposes the writer even if WriteLine throws.
        using (StreamWriter streamWriter = new StreamWriter(file))
        {
            streamWriter.WriteLine(s);
        }
    }
}
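
The same exercise works when the language being learned is Ruby. Below is a minimal sketch, assuming plain Ruby with no test framework, so a bare raise stands in for the assertion. MyFileAccess.write and the test method are hypothetical counterparts of the C# sample, and a temporary directory keeps the file out of the working directory.

```ruby
require 'tmpdir'

# Hypothetical Ruby counterpart of the C# MyFileAccess class.
module MyFileAccess
  def self.write(text, file)
    # The block form closes the file even if the write raises.
    File.open(file, 'w') { |f| f.puts(text) }
  end
end

# The "test": state the expected outcome, then make it pass.
def test_write_hello_world_to_the_file
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'HelloWorld.txt')
    MyFileAccess.write('Hello World', path)
    unless File.read(path).include?('Hello World')
      raise 'expected the file to contain Hello World'
    end
  end
  true
end

puts 'passed' if test_write_hello_world_to_the_file
```

In practice a test framework would replace the bare raise, but the shape of the exercise is the same: the check describes the goal, and the implementation is whatever you are trying to learn.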

I have also found tests to be a much more natural starting point for interacting with a language. With C#, my starting point was a command line application. However, this wasn’t the most effective way of learning, as I was constantly commenting out calls to various methods to execute the correct block of code. Had I used a test framework and a test runner as my starting point, I would have been able to run the samples more quickly and effectively while still keeping everything readable.

However, if you’re anything like me, while learning you will end up going off on a tangent or being distracted mid-task. I can recall too many occasions where I have been deep in the middle of learning the inner workings of the underlying code, only to read a blog post which takes me in a different direction, and then completely forgetting where I was. By having a test, or a series of tests, guiding me I am able to quickly get back on track by seeing which tests are currently failing. Because the tests describe my aim, I have a much better chance of remembering what I was actually doing.

Finally, once you start moving onto real applications and solving real problems using the language, you will be able to look back and refer to the tests as a reference regarding everything you learned.  If you keep your tests in a source control repository, you will even be able to see how you adapted over time. This could be extremely useful a year down the line where you want a very quick refresher.

3) Solve a problem!

Once you have an understanding of the language, with a series of tests describing how to do various tasks, the next thing to do is solve a problem you, or someone else, is having. While this isn’t always possible, it’s a great motivator to finish the job. I enjoy automation and improving productivity (it allows me to spend more time on Twitter), and I find it really interesting to see how problems can be solved more effectively using tools and technologies. If a new language can help me with that, then all the better. Not only will I learn the language in a more ‘real world’ context, but it will be helping me in the future.


Downloading IronRuby from GitHub

This week the IronRuby team moved their source repository over to GitHub, with the layout now reflecting the structure they maintain internally. I just wanted to cover how I downloaded and built IronRuby on a Windows Vista machine (Linux/Mac users will need to build via Mono). The first task is to download the code, for which you will need a Git client. At the moment, the best client I have found is msysgit.

msysgit installs both a GUI and a command line. To download the source code I simply used the command line, executing the following command within the directory where I wanted the code to be stored, which for me was E:\IronRubyGit.

git clone git://github.com/ironruby/ironruby.git

The next task is to build the project. If you have MRI (Matz’s Ruby Interpreter) installed, you can use the rake compile command, which will compile everything for you. If you don’t have MRI installed, you can build the Ruby.sln file using Visual Studio 2008 or the C# compiler (csc).

E:\IronRubyGit\ironruby\merlin\main\Languages\Ruby>rake compile
(in E:/IronRubyGit/ironruby/merlin/main/Languages/Ruby)
-------------------------------------------------------------------------------
dlr_core
-------------------------------------------------------------------------------
rake aborted!
No such file or directory - e:\ironrubygit\ironruby\merlin\main\languages\ruby\src\microsoft.scripting.core

(See full trace by running task with --trace)

However, I was greeted with an error message saying it was unable to find a file or directory. Luckily, the IronRuby mailing list is a great resource, and it turns out I needed to set the MERLIN_ROOT environment variable.

E:\IronRubyGit\ironruby\merlin\main\Languages\Ruby>set MERLIN_ROOT=e:\ironrubygit\ironruby\merlin\main

As a side note, merlin was the original codename for the IronPython team, which now includes IronRuby and the DLR (see John Lam’s post)

After solving this, I hit another error regarding tf.exe, the Team Foundation Source Control command line, being missing.

Cannot find tf.exe on system path.

The build script (rake) checks that everything is configured correctly for IronRuby development. However, tf.exe isn’t required for building; it is only needed internally within the team. The check for tf.exe is made within E:\IronRubyGit\ironruby\merlin\main\Languages\Ruby\rake\misc.rake. Search for the string and simply remove it from the array.

commands += ['tf.exe', 'svn.exe'] if IronRuby.is_merlin?

Run rake compile again and IronRuby should happily build, with the assemblies being written to ‘E:\IronRubyGit\ironruby\merlin\main\bin\debug’.

However, after launching ir.exe (the IronRuby REPL), I was unable to reference rubygems. After a bit of investigating, it turns out the $LOAD_PATH entries were incorrect. These paths are set within ir.exe.config. All you need to do is replace the following part of the path, External\Languages\Ruby\Ruby-1.8.6\lib\ruby, with external\languages\ruby\redist-libs\ruby. It’s used three times, and after replacing the values I could happily use IronRuby.

Hope this helps!


Announcing IronEditor – An Editor for IronRuby, IronPython and other DLR languages

For a while now I have been working on an application called IronEditor. This is a simple application designed to make it easier to pick up and start coding against the DLR based languages. By taking advantage of the DLR’s Hosting API, the application can execute code for any language built on top of the DLR platform.

The project is hosted at CodePlex, along with all of the source code.

Download: http://www.codeplex.com/IronEditor

Build: 1.0.0.46

Out of the box, the application works with IronRuby and IronPython, however one of the main aims of the application is to allow other languages to be easily embedded into the application.

The reason I decided to build this is that Visual Studio integration for the languages is a long way off, and playing around and writing code in the languages is painful via the provided console applications. As such, the aim of the application is to provide a very lightweight way to edit and execute code, which is great while learning the languages and giving demos (I used this application for my NxtGenUG Oxford DLR session).

One of the items I’m really pleased about is the fact that the application works on Mono (Tested only on Ubuntu 8.04 and Mono 1.9.1), something which will definitely not be possible with the Visual Studio integration.

To run the application, you will need to ensure you have Mono installed on your machine. Download the application and extract the zip into a directory. Then enter the command:

mono IronEditor.exe

You will then have the same application, with the same binaries, working on Mono. The only difference is that syntax highlighting for IronPython doesn’t work at the moment.

IronEditor running on Mono 

I admit, at the moment the application is very basic. However, over the next few weeks and months I will build new features into the application to make it easier to start playing around with the DLR languages.

Executing IronRuby and IronPython Code

1) Start the application

2) File > New Project

3) From the drop down, select your language


4) Type some code (print ‘Hello World’ is always good)

5) Press the green arrow, or hit F5. The code is executed and the output is written to the Output window.

Very quick and easy I think!

There are some very big limitations and bugs within the application, but I’m going for the ‘Release Early, Release Often’ approach. Various items could be improved, for example Ruby doesn’t have any syntax highlighting but this will come in time. There are some other much larger features I want to implement, keep an eye on the blog for more information as and when. Over the next few weeks I will also be blogging about the implementation of IronEditor and how it uses the DLR Hosting API. 

Any feedback is most welcome!

NOTE: As I mentioned, this is a very early release and it hasn’t had a great deal of testing. Please don’t use it on your production code base just yet! I wanted to get a release out for some initial feedback. If it causes everything to go wrong, I’m very sorry!

Download: http://www.codeplex.com/IronEditor

Build: 1.0.0.46