Parsing XML using IronRuby

Today I have been looking at how you can use IronRuby to communicate with the various REST APIs floating around the web. Most of the services, such as Flickr or Twitter, allow you to select the format of the response. While JSON (JavaScript Object Notation) seems to be the most common format, I decided to start with XML, simply because I know how to parse XML.

Initially, I wanted to use a Ruby library to parse the XML; however, I found that REXML, which comes with Ruby, is not yet supported by IronRuby. As such, I had to take a different approach. I decided to use the System.Xml namespace as my base, and then create a wrapper and monkey patch the CLR objects to produce a cleaner, more flexible API.

When creating the wrapper, the first task is to define all of the required references. Generally with IronRuby, if you want to do .NET interop you will need to reference mscorlib and System. In this case, I’ve also referenced the System.Xml assembly. The include statement is similar to a using directive in C#. Within Ruby, include is how you do ‘mixins’: it makes the functionality of a module accessible from the current module, as if the two were combined, which is a very powerful technique. In this case, it allows us to access CLR objects without needing to specify the namespace.

require 'mscorlib'
require 'System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'
include System
require 'System.Xml, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'
include System::Xml

The next task is to create a new Ruby class. This class has an initialize method which takes in raw XML as a string. Under the covers, it creates an instance of System.Xml.XmlDocument and loads the XML.

class Document
  def initialize(xml)
    @document = XmlDocument.new
    @document.load_xml xml
  end
end

Once we have created the document, we need a way to access the XML elements. Within C#, you would use the SelectNodes method, which returns an XmlNodeList; you would then iterate over this collection to access the XmlNodes and, through them, your data. Life in IronRuby is a little different. I found that when iterating over the XmlNodeList, I was getting an XmlElement object for each of the nodes. I also wanted to provide a more ‘Ruby-like’ way to access the elements.

The method I created has two arguments, one is the xpath query, the second being a Block, a piece of code which I want to be executed for each element. Within my code, I can iterate over all the elements, passing control back to the block with the element as a parameter for processing.

def elements(xpath, &b)
  ele = @document.select_nodes xpath
  ele.each do |e|
    b.call e
  end
end

Within the block, I can place the code required to process that section of the XML. However, I still need a way to access the data of the elements. Because the above code yields XmlElement objects, I wanted to monkey patch the class to include a few custom methods. This is amazingly simple within IronRuby: you define a class with the same name and define your additional methods.

class XmlElement
  def get(name)
    select_single_node name
  end
  def to_s
    inner_text.to_s
  end
end

I also include an additional method called node() which is the same as above, but allows me to return sub-elements from an XmlElement object.
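
The post does not show node(), so the following is only a sketch of how the reopened class might fit together. It runs on plain Ruby: FakeElement is a hypothetical stand-in for the CLR XmlElement, and its select_nodes matches direct child names only rather than real XPath.

```ruby
# Hypothetical stand-in for a CLR XmlElement so the sketch runs on plain Ruby.
class FakeElement
  attr_reader :name, :inner_text, :children

  def initialize(name, inner_text = '', children = [])
    @name = name
    @inner_text = inner_text
    @children = children
  end

  # Mimics select_nodes for a single child name only (no real XPath).
  def select_nodes(child_name)
    children.select { |child| child.name == child_name }
  end
end

# Reopen the class later to add convenience methods (monkey patching).
class FakeElement
  # Returns the first matching child, or nil when there is none.
  def get(child_name)
    select_nodes(child_name).first
  end

  # Passes each matching child to the block, mirroring elements() on Document.
  def node(child_name, &block)
    select_nodes(child_name).each(&block)
  end

  def to_s
    inner_text.to_s
  end
end

year = FakeElement.new('year', '1933')
dob = FakeElement.new('dob', '', [year])
person = FakeElement.new('name', '', [FakeElement.new('first', 'Jim'), dob])

puts person.get('first')
person.node('dob') { |d| puts d.get('year') }
```

Under IronRuby the idea would be the same, with the added methods sitting inside class XmlElement and the node lookups supplied by the CLR.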

Finally, I saved this in a file called xml.rb. The filename is used by consumers within the require statement.

With this in place, I can use the wrapper to process xml.

# Include the wrapper
require 'xml'

# Create the document
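@document = Document.new('<root><name><first>Jim</first><dob><year>1933</year><month>12</month></dob></name><name><first>Bob</first><last>Smith</last></name></root>')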

# Access root/name elements
@document.elements('root/name') do |e|
   # Output the contents of the element named first
   puts e.get('first')
   # Access the element named dob, then output the value of year.
   e.node('dob') {|y| puts y.get('year')}
end

When I execute this block of code, Jim, 1933, Bob is printed to the console.

>ir processXml.rb
Jim 
1933
Bob

While the wrapper isn’t very advanced, it’s a very quick and easy way to start working with xml from IronRuby.

A big thank you to Ivan Porto Carrero, who pointed me in the right direction on how to accept blocks within methods. Before that, I had to do this:

@document.elements('x').collect {|e| puts e.get('first')}

Not much difference, but enough to make an impact.
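
As a rough illustration of the two calling styles, here is a plain-Ruby sketch of a method that yields to a block when one is given and otherwise returns an Enumerator that can be chained. Listing and its string items are hypothetical stand-ins, not part of the wrapper.

```ruby
# Hypothetical stand-in for the wrapper: items are plain strings, not XML nodes.
class Listing
  def initialize(items)
    @items = items
  end

  # Yields each matching item when a block is given; otherwise returns an
  # Enumerator so callers can chain collect, select and friends.
  def elements(prefix)
    return enum_for(:elements, prefix) unless block_given?

    @items.each { |item| yield item if item.start_with?(prefix) }
  end
end

listing = Listing.new(%w[name:Jim name:Bob dob:1933])

# Block style: the method hands each match to the caller's block.
listing.elements('name') { |item| puts item }

# Chained style: no block given, so the Enumerator is used instead.
upper = listing.elements('name').collect { |item| item.upcase }
puts upper.inspect
```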

Feel free to download the wrapper and the sample.  For future reference, I’ve uploaded the code to the MSDN Code Gallery, which I will update if I release a new version.

OrcsWeb – Creating subdomains using ISAPI Rewrite

Over Christmas I moved my blog over to OrcsWeb. I had been meaning to move hosting provider for a while but never got around to it. In order to make the move as transparent as possible, I needed to create a subdomain of benhall.me.uk. By default, OrcsWeb does not offer support for subdomains: you can map additional domains onto the account as long as you stay within your account limits, but you can only have one site (in terms of IIS).

However, hats off to the web team (especially Desirée) at OrcsWeb for helping me set up my subdomain. They pointed me towards ISAPI Rewrite v3, which is set up and ready to go for each site they host. Yesterday, Steve Andrews tweeted about how to do this, so I decided to share the config.

With v3 of ISAPI Rewrite, you simply create a file called ‘.htaccess’ in the root of your IIS site. This contains all of your ‘rules’ about how to handle requests.

This is the .htaccess file for my website.

RewriteEngine on

#Redirect rss.xml to feedburner
RewriteCond %{HTTP:Host} blog\.benhall\.me\.uk$
RewriteRule ^(.*)rss\.xml$ http://feeds.feedburner.com/BenHall [R=301,NC]

#Fix missing trailing slash char on folders
RewriteRule ^([^.?]+[^.?/])$ $1/ [R,L]

# this rule directs blog.benhall.me.uk to Sites/blog.benhall.me.uk
RewriteCond %{HTTP:Host} ^(?:blog\.)?benhall\.me\.uk$
RewriteRule ^(.*)$ Sites/blog.benhall.me.uk/$1

The most important section is the last block. It says that if the request’s HTTP Host header is blog.benhall.me.uk (the blog. prefix is optional in the pattern, so benhall.me.uk matches too), the request is rewritten to return the content from /Sites/blog.benhall.me.uk/, with the rest of the requested path appended.

Now, when you visit blog.benhall.me.uk/index.html, the file is actually returned from /Sites/blog.benhall.me.uk/index.html.

However, for a long time I had a problem getting this to work as I expected. Because I wanted a seamless experience, I didn’t want /Sites/blog… appearing in the browser’s address bar. It turns out the problem was that I was including the full absolute URL to the resource in the rule instead of the relative path. Having the IP in the rule (RewriteRule ^(.*)$ http://0.0.0.0/Sites/blog.benhall.me.uk/$1 ) caused a 301 redirect, which was visible to the browser. With a relative rule, the rewrite happens under the covers, without the user ever being aware of the change.
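
To make the effect of that last rule concrete, here is a small Ruby sketch that mimics it outside of IIS. It only models the host check and the relative rewrite; rewrite_path is a hypothetical helper, not part of ISAPI Rewrite, and the redirect and trailing-slash rules are ignored.

```ruby
# Returns the internal path a request would be served from, or the original
# path when the host does not match. `rewrite_path` is a hypothetical helper.
def rewrite_path(host, path)
  # Same host pattern as the final RewriteCond in the .htaccess file.
  host_pattern = /^(?:blog\.)?benhall\.me\.uk$/
  return path unless host =~ host_pattern

  # The rule is relative, so the requested path is appended to the site folder.
  relative = path.sub(%r{\A/}, '')
  "/Sites/blog.benhall.me.uk/#{relative}"
end

puts rewrite_path('blog.benhall.me.uk', '/index.html')
puts rewrite_path('example.org', '/index.html')
```

The browser only ever sees the path it asked for; the mapping to the Sites folder stays on the server.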

Microsoft.Scripting.SyntaxErrorException was unhandled – Unicode support and IronRuby

I just ran head first into a wall I wasn’t expecting. I was attempting to host the DLR within a C# application (a topic for another post). The IronRuby code I wanted to load was in an external file and was simply:

def helloWorld()
   puts 'Hello World'
end

However, when I attempted to load the file, I was greeted with an unhandled exception.

Microsoft.Scripting.SyntaxErrorException was unhandled
  Message="Invalid character 'ï' in expression"
  Source="Microsoft.Scripting"
  Column=2
  ErrorCode=4112
  Line=1
  SourceCode="def helloWorld()\r\n   puts 'Hello World'\r\nend"
  SourcePath="C:/Users/Administrator/Documents/Visual Studio 10/Projects/IronRubyHost/bin/Debug\helloWorld.rb"
  StackTrace:
       at Microsoft.Scripting.ErrorSink.Add(SourceUnit source, String message, SourceSpan span, Int32 errorCode, Severity severity) in E:\IronRuby\r156\src\Microsoft.Scripting\ErrorSink.cs:line 34
       at Microsoft.Scripting.ErrorCounter.Add(SourceUnit source, String message, SourceSpan span, Int32 errorCode, Severity severity) in E:\IronRuby\r156\src\Microsoft.Scripting\ErrorSink.cs:line 92
       at IronRuby.Compiler.Tokenizer.Report(String message, Int32 errorCode, SourceSpan location, Severity severity) in E:\IronRuby\r156\src\ironruby\Compiler\Parser\Tokenizer.cs:line 430

I was surprised to see the additional characters being added to the start of my SourceCode. After a bit of searching, I found this:

http://www.mail-archive.com/[email protected]/msg02137.html

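It turns out that Ruby 1.8 (which IronRuby is targeting) doesn’t support Unicode source files. When you create a text file using Visual Studio, it is automatically saved as UTF-8 with a signature (a byte order mark) at the start of the file, and the ‘ï’ in the error message is the first byte of that signature.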

The quickest solution appears to be not to use Visual Studio to create Ruby files, at least until the editor has IronRuby integration, and to use another editor instead, for example GVim or IronEditor, which is shortly going to get a facelift.
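
If a file has already been saved with the signature, another option is to strip it afterwards. The following is a minimal sketch in plain (modern) Ruby, assuming the extra bytes are the standard UTF-8 byte order mark EF BB BF; strip_utf8_bom is a hypothetical helper name, and the temporary file only exists to make the example self-contained.

```ruby
require 'tempfile'

# Removes a leading UTF-8 byte order mark (bytes EF BB BF) from the file at
# `path`. Returns true when a mark was found and stripped, false otherwise.
# `strip_utf8_bom` is a hypothetical helper name.
def strip_utf8_bom(path)
  bom = "\xEF\xBB\xBF".b
  bytes = File.binread(path)
  return false unless bytes.start_with?(bom)

  File.binwrite(path, bytes.byteslice(bom.bytesize, bytes.bytesize - bom.bytesize))
  true
end

stripped = nil
remaining = nil

# Simulate a file saved with the signature, then clean it up.
Tempfile.create(['helloWorld', '.rb']) do |file|
  file.binmode
  file.write("\xEF\xBB\xBF".b + "def helloWorld()\n  puts 'Hello World'\nend\n")
  file.close

  stripped = strip_utf8_bom(file.path)
  remaining = File.binread(file.path)
end

puts stripped
puts remaining
```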

IronRuby with StreamReader and StreamWriter

While playing around with IronRuby, I wanted to access a stream, so I needed to create a StreamReader. This is generally a very simple task; however, no matter what I did, I was getting an error message.

The code looked like this: readStream = System::IO::StreamReader.new(receiveStream). However, when executed I always got this error:

E:\IronRuby\src\IronRuby.Libraries\Builtins\ModuleOps.cs:721:in `const_missing': uninitialized constant System::IO::StreamReader (NameError)

NameError means IronRuby cannot find the StreamReader class, which is a bit of a problem. I thought I had included all the required references: I had access to System.dll using the correct require statement.

require 'System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'

After firing up Reflector, it turns out StreamReader is actually in the mscorlib assembly, not System.dll. This was a simple fix: I added an additional require statement for mscorlib (require 'mscorlib') and I could happily access the StreamReader class.

In future, I’ll remember to always reference both mscorlib and System.dll when doing .NET interop; it will just make life easier.

RubyGems, IronRuby and System::Net namespace

Recently I encountered an interesting gotcha with IronRuby. I wanted to use a rubygem together with the WebRequest CLR object from the System.Net namespace. I had my require and include statements set as shown below.

require 'rubygems'
# Additional require statements
require 'mscorlib'
require 'System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'
include System

With this in place, I could happily access my gems and CLR objects. However, when I attempted to access WebRequest, it failed. The code was correct, as I had it working in a different sample.

request = Net::WebRequest.Create(@url)

However, whenever I tried this in the current sample I got a NameError, meaning IronRuby was unable to find the class.

E:\IronRuby\src\IronRuby.Libraries\Builtins\ModuleOps.cs:721:in `const_missing': uninitialized constant Net::WebRequest (NameError)
        from :0

After a little trial and error, I found that it wasn’t working because I had included RubyGems, which gave me access to Ruby’s standard library. Within the standard library there is a namespace called Net, which has all of the standard classes you would expect for dealing with network communication (Net::HTTP, Net::Telnet etc). As a result, Ruby and .NET clashed, with Ruby taking priority, meaning I couldn’t access Net::WebRequest.

The workaround is to be more explicit about what you want. You either need to include the System.Net namespace (include System::Net) and refer to WebRequest directly, or add the System prefix when accessing the class (System::Net::WebRequest). You will then be able to access the class as normal.
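
The clash can be reproduced in plain Ruby, which I find helps to explain it. This is only an analogy: the System module below is a hand-written stand-in for the CLR namespace, and the empty top-level Net module plays the part of the one from Ruby’s standard library.

```ruby
# Stand-in for the CLR namespace; not the real System.
module System
  module Net
    class WebRequest; end
  end
end

# Stand-in for the top-level Net module from Ruby's standard library.
module Net
end

# Same effect as writing `include System` at the top of a script.
Object.include(System)

# The top-level Net is found first, and it has no WebRequest.
short_name_error =
  begin
    Net::WebRequest
    nil
  rescue NameError => e
    e.message
  end
puts short_name_error

# Workaround 1: spell out the full name.
puts System::Net::WebRequest

# Workaround 2: include the inner namespace and use the bare class name.
Object.include(System::Net)
puts WebRequest
```

A constant defined directly at the top level wins over one that only arrives through an included module, which is why the short name stops working once another Net is loaded.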

SQL Doc 2.0 – Customizing the Word cover page

In my previous posts I discussed how you could create printable documentation from SQL Doc. In this post, I want to discuss how you can customize the front cover of the documentation. By default, the output of the Word cover will look like this.

image

Within the SQL Doc installation directory (C:\Program Files\Red Gate\SQL Doc 2), there is a style directory. This contains the different artefacts used when generating the documentation. Within the Word folder there is a file called DefaultCoverPage.doc. This is just a standard Word document, with placeholder fields used to insert the appropriate information during generation. You can customize the cover as you see fit, for example by including your own information or logo. Below, I’ve moved the Red Gate logo, added an additional title, changed the size and colour of the title, and moved the created date and time to the footer of the page.

image

The next time a Word document is generated on that machine the new cover page will be used.

image

A ‘word’ of warning for users on Vista: you will need to run Word as Administrator, or save the customized cover page to a different path and copy the file back, as Program Files is a restricted location which requires elevated permissions to write to.

I would also recommend taking a copy of the original cover in case you want to revert. If you do accidentally change the cover page without taking a backup, delete the file DefaultCoverPage.doc and repair the installation via the Control Panel. This will restore the application to the original install files.

image

SQL Doc 2.0 – Output as PDF or XPS

In my previous post I explained how you can use SQL Doc 2.0 to document ASPNetDB and produce a Microsoft Word document. One huge advantage of having it in Microsoft Word is that it can be exported to a number of different formats very easily – for example PDF or XPS.

To do this from Word 2007, the first step is to download the Word 2007 Save as PDF or XPS add-in from http://www.microsoft.com/downloads/details.aspx?FamilyID=4d951911-3e7e-4ae6-b059-a2e79ed87041&displaylang=en . After installing this you will have a new option within your Save As menu.

image

After producing your database documentation as a Word document, simply select this option to save the document. It will ask you where you wish to save the document and the format it should be saved as. You can select between PDF and Microsoft’s XPS Document; both work perfectly, it’s just personal choice.

image

Once you have selected your location, the document will open. This is how the PDF document looks, which you can download from here.

image

This is how the XPS document looks within IE8, which you can download here.

image

The document looks exactly the same as the original Word output; however, it can now be shared in different formats.

However, if you don’t have Office 2007, all is not lost. There is a free download available called CutePDF Writer where you can print any document or file as a PDF – very useful to have installed.

image

By selecting this as your printer, it will create a PDF document for you, which you can then use as you wish. For those of you with the .NET 3.0 framework installed, there is an XPS Document Writer installed, which is very similar to CutePDF. This will create XPS documents in exactly the same way.

image

A 14 day free trial of SQL Doc can be downloaded from http://www.red-gate.com/Products/SQL_Doc

Red Gate SQL Doc 2.0 – Producing printable database documentation

image

Last week I successfully released my first project! The project was a major release of SQL Doc from Red Gate which allows you to easily document your database schema. While maybe not as popular as SQL Compare, it solves a problem many people experience very effectively.

With SQL Doc 2.0, we had very clear aims and goals for what we wanted to achieve. We used a variety of different sources, such as support calls and forum posts, to identify the core features people were asking for. This provided us with three main features:

  • Printable documentation
  • An improved description editor
  • Usable links within the preview panel

With 2.0, we implemented the above features along with a series of small, yet important, changes to improve the overall experience for the user.

Documenting ASPNetDB

With ASP.NET, there is a database called ASPNetDB. By default this is where various settings and data are stored for the built-in ASP.NET providers, such as Membership and Personalization. While most of the time you don’t need to interact with the database directly, the times you do are extremely difficult due to the lack of documentation and the fact that SQL Server Management Studio isn’t designed for gaining a high-level picture of the database schema. This is one example of where SQL Doc can help.

Upon connecting to your server, you can select the database(s) you wish to document. Here I’ve just selected aspnetdb.

image

Once you have selected all your databases, click Generate Documentation… SQL Doc can output the documentation in three different formats: CHM, HTML and now Microsoft Word (.doc). The type of documentation you require depends on your use case. We feel SQL Doc provides you with excellent support; however, if you feel we are missing something, let us know.

image

To understand ASPNetDB, I want to print off the database schema, as I always find reading this easier on paper than online. As such, I select Document as my output. After a few moments, Word will open and I will have my documentation produced.

image

We can now navigate and read the document using Word, or print it off. The documentation itself contains all the same information it would have within CHM or HTML; it’s just now in a printable format.

image

You can download this sample document here.

We also include the functionality to filter down the different objects. For example, I might only want to include tables and stored procedures within the documentation. Using the treeview down the side, we can unselect the objects and categories we do not wish to be included.

By default, we include the SQL creation script for each database object. While this is very interesting, certain users requested the ability to exclude it, which we have added. Within the Edit Project dialog there is a check box to include the SQL creation script; untick this and it will no longer be included.

image

The documentation now only includes the relevant sections and is shorter due to the lack of SQL scripts. Feel free to take a look here.

You can download a 14 day free trial from our website:

http://www.red-gate.com/products/sql_doc

Learning a new language? Write some tests…

The excellent Pragmatic Programmer book suggests that you should learn a new language every year, and this is something I strongly agree with. Learning a new language does not mean C# 4.0 (when you already know 2.0 and 3.0), or how to create Silverlight applications. Instead, make the effort to learn a language with a different mindset and approach to what you’re used to, and actively engage in that community. Coming from a Java/C# background, Ruby was an eye-opener for me and my approach to software development. The priorities and principles are different. For example, Ruby puts much more emphasis on elegance, solving problems and testability; software is treated as an art form, while the community remains pragmatic in its approach: it’s never perfect and can be improved at some point. This is also being effectively communicated throughout the community, from Ruby gurus such as Dave Thomas and Chad Fowler to the developers writing applications on a day-to-day basis. As a result of learning more about the Ruby language, I feel I can approach a solution in a more open-minded way.

How should you learn a new language?

This is something which I have been thinking about for a while: what is the most effective way to learn a new programming language? There are a couple of approaches you could take, but this is my approach (if you have your own suggestion, please leave it as a comment).

1) Buy a book, but not just any book: the best book. Online materials are great, but personally I still find the most effective way to learn something is to have a physical book by my side. However, be sure to pick your book wisely. I recommend you research the influential people within the community and either buy their book, or the book they recommend. The wrong book could take you down completely the wrong path.

2) Start writing tests

Once you have picked your language and book of choice, you need to start grokking the concepts. After trying a couple of different techniques, I’ve found the best way to learn a new language is to actually write tests. This isn’t as crazy as it might sound: the test will define your expected outcome from your sample and give you something to aim for. This will help focus your mind on the task in hand, while giving you a clear signal as to when you are complete, because the test will pass.

For example, with C#, if I wanted to know how to write a line of text to a file I might write the following test with the implementation.  

using System.IO;
using Xunit;

public class IO_Examples
{
    [Fact]
    public void Write_Hello_World_To_The_File_HelloWorldTxt()
    {
        MyFileAccess.Write("Hello World", "HelloWorld.txt");

        Assert.True(File.ReadAllText("HelloWorld.txt").Contains("Hello World"));
    }
}

public class MyFileAccess
{
    public static void Write(string s, string file)
    {
        // Dispose the writer even if WriteLine throws, so the file is always closed.
        using (StreamWriter streamWriter = new StreamWriter(file))
        {
            streamWriter.WriteLine(s);
        }
    }
}
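
Since Ruby is the language that prompted this post, here is roughly how the same learning test might look there. It is only a sketch: MyFileAccess mirrors the C# sample, write_hello_world_to_file? is a hypothetical name, and a plain true/false check stands in for a real test framework such as Minitest.

```ruby
require 'tmpdir'

# Hypothetical Ruby counterpart of MyFileAccess from the C# sample.
module MyFileAccess
  def self.write(text, path)
    # The block form closes the file automatically.
    File.open(path, 'w') { |file| file.puts(text) }
  end
end

# The "test": state the expected outcome, then make the implementation pass.
# Uses a temporary directory so nothing is left behind afterwards.
def write_hello_world_to_file?
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'HelloWorld.txt')
    MyFileAccess.write('Hello World', path)
    File.read(path).include?('Hello World')
  end
end

passed = write_hello_world_to_file?
puts(passed ? 'pass' : 'fail')
```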

I have also found tests to be a much more natural starting point for interacting with a language. With C#, my starting point was a command line application; however, this wasn’t the most effective way of learning, as I was constantly commenting out calls to various methods to execute the correct block of code. Had I used a test framework and a test runner as my starting point, I would have been able to run the samples more quickly and effectively while still keeping everything readable.

However, if you’re anything like me, while learning you will end up going off on a tangent or being distracted mid-task. I can recall too many occasions where I have been deep in the middle of learning the inner workings of the underlying code, only to read a blog post which takes me in a different direction and then completely forget where I was. By having a test, or a series of tests, guiding me, I am able to quickly get back on track by seeing which tests are currently failing. Because the tests describe my aim, I have a much better chance of remembering what I was actually doing.

Finally, once you start moving onto real applications and solving real problems using the language, you will be able to look back and refer to the tests as a reference regarding everything you learned.  If you keep your tests in a source control repository, you will even be able to see how you adapted over time. This could be extremely useful a year down the line where you want a very quick refresher.

3) Solve a problem!

Once you have an understanding of the language, with a series of tests describing how to do various different tasks, the next thing to do is solve a problem you, or someone else, is experiencing. While this isn’t always possible, it’s a great motivator to finish the job. I enjoy automation and improving productivity (it allows me to spend more time on Twitter), and I find it really interesting to see how problems can be solved in a more effective fashion using tools and technologies. If a new language can help me with that, then all the better. Not only will I learn the language in a more ‘real world’ context, but it will be helping me in the future.

Downloading IronRuby from GitHub

This week the IronRuby team moved their source repository over to GitHub, with the layout now reflecting the structure they maintain internally. I just wanted to cover how I downloaded and built IronRuby on a Windows Vista machine; Linux/Mac users will need to build via Mono. The first task is to download the code, so you will need a git client. At the moment the best client I have found is msysgit.

msysgit installs both a GUI and a command line. To download the source code I simply used the command line, executing the following command within the directory where I wanted the code to be stored; for me this was E:\IronRubyGit.

git clone git://github.com/ironruby/ironruby.git

The next task is to build the project. If you have MRI (Matz’s Ruby Interpreter) you can use the rake compile command, which will compile everything for you. If you don’t have MRI installed, you can build the Ruby.sln file using Visual Studio 2008 or the C# compiler (csc).

E:\IronRubyGit\ironruby\merlin\main\Languages\Ruby>rake compile
(in E:/IronRubyGit/ironruby/merlin/main/Languages/Ruby)
-------------------------------------------------------------------------------
dlr_core
-------------------------------------------------------------------------------
rake aborted!
No such file or directory - e:\ironrubygit\ironruby\merlin\main\languages\ruby\src\microsoft.scripting.core

(See full trace by running task with --trace)

However, I was greeted with an error message saying it was unable to find a file or directory. Luckily, the IronRuby mailing list is a great resource, and it turns out I needed to set the MERLIN_ROOT environment variable.

E:\IronRubyGit\ironruby\merlin\main\Languages\Ruby>set MERLIN_ROOT=e:\ironrubygit\ironruby\merlin\main

As a side note, Merlin was the original codename for the IronPython team, which now includes IronRuby and the DLR (see John Lam’s post).

After solving this, I hit another error regarding tf.exe, the Team Foundation Source Control command line, being missing.

Cannot find tf.exe on system path.

The build script (rake) checks that everything is configured correctly for IronRuby development. However, tf isn’t required for building and is actually only needed internally within the team. The check for tf.exe is made within E:\IronRubyGit\ironruby\merlin\main\Languages\Ruby\rake\misc.rake; search for the string and simply remove it from the array.

commands += ['tf.exe', 'svn.exe'] if IronRuby.is_merlin?

Run rake compile again and IronRuby should happily build, with the assemblies being output to ‘E:\IronRubyGit\ironruby\merlin\main\bin\debug’.

However, after launching ir.exe (the IronRuby REPL), I was unable to reference rubygems. After a bit of investigating, it turns out the $LOAD_PATH entries were incorrect. These paths are set within ir.exe.config; all you need to do is replace the following part of the path, External\Languages\Ruby\Ruby-1.8.6\lib\ruby, with external\languages\ruby\redist-libs\ruby. It’s used three times. After replacing the values I could happily use IronRuby.

Hope this helps!
