Skip to main content

Building a Simple Data Definition DSL with Boo

In my daily work, I find myself writing a lot of one-off SQL scripts to handle various data issues. In particular, I've had to tweak existing sets of metadata with scripts that look something like the following:

DECLARE     @SomeId INT
DECLARE     @SomeOtherId INT

SELECT      @SomeId = SomeTableId 
FROM        SomeTable 
WHERE       SomeName = 'Something'

SELECT      @SomeOtherId = SomeOtherTableid 
FROM        SomeOtherTable 
WHERE       SomeOtherName = 'SomethingElse'

IF NOT EXISTS(
    SELECT  1 
    FROM    SomeMetadataTable 
    WHERE   Name = 'MetadataName')
BEGIN
    INSERT INTO SomeMetadataTable
        (Name, Property1, Property2)
    VALUES 
        ('MetadataName', @SomeId, @SomeOtherId)
END
ELSE
BEGIN
    UPDATE  SomeMetadataTable
    SET     Property1 = @SomeId
        ,   Property2 = @SomeOtherId
    WHERE   Name = 'MetadataName'
END

Just writing a few of these gets to be tedious, and master scripts to populate entire tables are large, difficult to maintain blobs of SQL.

At the end of last year, I nabbed myself some copies (digital and physical) of Ayende's excellent DSLs in Boo. Having thought before that I might use some of the tools in the SQL Server Modeling space to build a DSL to simplify the scripting of such metadata, I decided to see how easily something workable could be built with Boo. There were a few stumbling blocks along way, mostly regarding my inexperience with Boo and my inability to download and compile Rhino DSL (I'm not used to git, and a directoy with a space in the name wasted some of my time). Once I was past those problems, I found the going very easy.

I wrote this exploratory example before the NFL championship games this year, so I chose to create a language that might describe the attributes of a football team. Here is the final script I ended with:

define "New York Jets":
    defense "Superb"
    pass_offense "Game Management", with_qb_type("Rookie")
    run_offense "Power"

define "Minnesota Vikings":
    defense "Solid"
    pass_offense "High-Powered", with_qb_type("Aging Veteran")
    run_offense "Dynamic"

define "Indianapolis Colts":
    defense "Speedy"
    pass_offense "High-Octane", with_qb_type("Elite Veteran")
    run_offense "Anemic"

define "New Orleans Saints":
    defense "Opportunistic"
    pass_offense "Explosive", with_qb_type("Rising Star")
    run_offense "Serviceable"

The easiest way to define 'keywords' like define and defense is through meta methods. Meta methods process AST nodes and return AST nodes during compilation. The meta method for define looks like this:

[Meta]
public static def define(expression as StringLiteralExpression, block as BlockExpression):	
    return [|
        _currentTeam.Name = $(expression.Value)	
        DefineTeam($block)
    |]

I'm using quasi-quotation here, which basically means that Boo will turn that block of code that I'm returning into an appropriate AST node. So, in this case, when the compiler encounters 'define', it will take the following string and the block that follows that and transform it into a call that sets a property (Name) on a class variable (_currentTeam) with the string, followed by a call to a class method (DefineTeam) that takes the block as an argument. Yeahhhhhhhhhhh, that's confusing. It takes a little bit of time to get your head around exactly what is going on, but once you do, you can crank out meta methods like mad.
So where did that field and method come from? I'm using Ayende's Rhino DSL library to create an 'implicit base class' for the script. The meta methods ultimately call fields and methods that are defined on this base class from which the script inherits. In this case, I'm doing something that might be unorthodox or bad practice by combining all the team definitions together and storing them in a list. In most examples I've seen, the team definitions would be in four different script. For this spike, I went with something quick and dirty. The SQL I'm generating isn't being written to file anywhere or executed; I'm just kicking it out to the console. Here's the script's base class in its entirety:

import System
import Boo.Lang.Compiler.Ast

public abstract class TeamDefinitionScript:

protected currentNumber as int

public def constructor():
    CurrentTeam = TeamModel()

[Property(Teams)]
protected _teams as List[of TeamModel] = List[of TeamModel]()

[Property(CurrentTeam)]
protected _currentTeam as TeamModel

abstract def Prepare():
    pass

protected def DefineTeam(block as callable):
    block()
    Teams.Add(CurrentTeam)
    currentNumber++

    print "DECLARE @DefenseId${currentNumber} INT"
    print "SELECT  @DefenseId${currentNumber} = DefenseId FROM Defense WHERE DefenseTypeName = '${CurrentTeam.Defense}'"
    print ""
    print "DECLARE @RunOffenseId${currentNumber} INT"
    print "SELECT  @RunOffenseId${currentNumber} = RunOffenseId FROM RunOffense WHERE RunOffenseTypeName = '${CurrentTeam.RunOffense}'"
    print ""
    print "DECLARE @PassOffenseId${currentNumber} INT"
    print "DECLARE @QbTypeId${currentNumber}	  INT"

    print "SELECT  @QbTypeId${currentNumber} = QbTypeId FROM QbType WHERE QbTypeName = '${CurrentTeam.QbType}'"
    print "SELECT  @PassOffenseId${currentNumber} = PassOffenseId FROM PassOffense WHERE PassOffenseTypeName = '${CurrentTeam.PassOffense}' AND QbTypeId = @QbTypeId"
    print ""
    print "IF NOT EXISTS (SELECT 1 FROM Teams WHERE TeamName = '${CurrentTeam.Name}')"
    print "BEGIN"
    print "    INSERT INTO Teams (TeamName, DefenseId, RunOffenseId, PassOffenseId)"
    print "        VALUES        (${CurrentTeam.Name}, @DefenseId${currentNumber}, @RunOffenseId${currentNumber}, @PassOffenseId${currentNumber})"
    print "END"
    print "ELSE"
    print "BEGIN"
    print "    UPDATE Teams"
    print "    SET    DefenseId = @DefenseId${currentNumber}"
    print "    ,      RunOffenseId = @RunOffenseId${currentNumber}"
    print "    ,      PassOffenseId = @PassOffenseId${currentNumber}"
    print "    WHERE  TeamName = '${CurrentTeam.Name}'"
    print "END"
    print ""

[Meta]
public static def define(expression as StringLiteralExpression, block as BlockExpression):	
    return [|
        _currentTeam.Name = $(expression.Value)	
        DefineTeam($block)
    |]

[Meta]
public static def defense(expression as StringLiteralExpression):
    return [|
        _currentTeam.Defense = $(expression.Value)	
    |]

[Meta]
public static def pass_offense(expression as StringLiteralExpression, withQbType as MethodInvocationExpression):	
    return [|
        _currentTeam.PassOffense = $(expression.Value)	
        $withQbType
    |]

[Meta]
public static def run_offense(expression as StringLiteralExpression):
    return [|
        _currentTeam.RunOffense = $(expression.Value)	
    |]

[Meta]
public static def with_qb_type(qbType as StringLiteralExpression):
    return [|
        _currentTeam.QbType = $(qbType.Value)
    |]

(NOTE: Whitespace is significant by default in Boo, and I've tinkered with it a bit here, so this wouldn't actually compile.)

That's really all there is to it. This class contains the bulk of the work. The meta methods are static, and don't need to be members on the base class, but it's a little easier to follow the flow of method calls this way. Now all I need to do is wire up the implicit base class and run the script. Rhino DSL does this for me:

import System
import System.IO
import Rhino.DSL

factory = DslFactory()
factory.BaseDirectory = System.AppDomain.CurrentDomain.BaseDirectory
factory.Register[of TeamDefinitionScript](TeamDefinitionDslEngine())

scriptPath = Path.Combine(factory.BaseDirectory, "Scripts\\TeamDefinitions.boo")

script = factory.Create[of TeamDefinitionScript](scriptPath)
script.Prepare()

(NOTE: This script runner console is a different assembly than the one with the script base class.)

I run it, and this is the output (for the last declaration only):

DECLARE @DefenseId4 INT
SELECT  @DefenseId4 = DefenseId FROM Defense WHERE DefenseTypeName = 'Opportunistic'
DECLARE @RunOffenseId4 INT
SELECT  @RunOffenseId4 = RunOffenseId FROM RunOffense WHERE RunOffenseTypeName = 'Serviceable'
DECLARE @PassOffenseId4 INT
DECLARE @QbTypeId4        INT
SELECT  @QbTypeId4 = QbTypeId FROM QbType WHERE QbTypeName = 'Rising Star'
SELECT  @PassOffenseId4 = PassOffenseId FROM PassOffense WHERE PassOffenseTypeName = 'Explosive' AND QbTypeId = @QbTypeId

IF NOT EXISTS (SELECT 1 FROM Teams WHERE TeamName = 'New Orleans Saints')
BEGIN
INSERT INTO Teams (TeamName, DefenseId, RunOffenseId, PassOffenseId)
VALUES        (New Orleans Saints, @DefenseId4, @RunOffenseId4, @PassOffenseId4)END
ELSE
BEGIN
UPDATE Teams
SET    DefenseId = @DefenseId4
,      RunOffenseId = @RunOffenseId4
,      PassOffenseId = @PassOffenseId4
WHERE  TeamName = 'New Orleans Saints'
END

Pretty simple. Boo makes everything so easy, you can concentrate on developing syntax and language features instead of hacking out code.

Comments

Popular posts from this blog

Migrating Hg Repos with hg-fast-export and Windows Subsystem for Linux

Introduction I prefer Mercurial (hg) to git . I don’t really have any reason for this preference - they both do the same thing, and the user experience for 90% of the use cases is the same. It probably comes from the conditions of the DVCS landscape when I started using these systems. Some of this may have been perception only, but it looked like this: GitHub didn’t have free private repos BitBucket did have free private repos BitBucket was very hg-friendly Joel Spolsky had an amazing tutorial that served as both a how-to for hg as well as a general intro to DVCS hg was much more Windows-friendly than git Since hg was written in python, I felt like extending it would be easier than doing so for git if I ever needed to (admittedly, this is a pretty ridiculous reason) hg felt like a more unified, “coherent” system than the very linux-y feeling git and its extensions (also pretty ridiculous) Where they differed, I liked the verbs hg used better than git’s counterparts ...

Enabling Globalization Invariant Mode for .NET Core App on Raspberry Pi Running LibreElec

I had an app I wanted to run on my Raspberry Pi 3 running LibreElec . In LibreElec you can install the dotnet core 2.2 runtime as an addon, and in Visual Studio you can compile for ARM processors with ‘Target Runtime’ set to ‘linux-arm’ in the publish profile. So, I published to a folder from VS using that profile, and I copied the output over to my RPi which had the dotnet runtime installed. I did a simple dotnet Whatever.dll to run the app (actually in this case, it was /storage/.kodi/addons/tools.dotnet-runtime/bin/dotnet Whatever.dll because of the way the addon is installed) and was met with this error: FailFast: Couldn't find a valid ICU package installed on the system. Set the configuration flag System.Globalization.Invariant to true if you want to run with no globalization support. at System.Environment.FailFast(System.String) at System.Globalization.GlobalizationMode.GetGlobalizationInvariantMode() at System.Globalization.GlobalizationMode..cctor() at Syste...

Stubbing Static Methods with PostSharp

TypeMock uses the Profiler API to allow mocking, stubbing, etc. of classes used by code under test. It has the ability to handle sealed classes, static classes, non-virtual methods, and other troublesome-yet-oft-encountered scenarios in the world of unit testing. Other frameworks rely on proxies to intercept method calls, limiting them to be able to only fake virtual, abstract, and interface members. They also rely on dependecy injection to place the proxies as the concrete implementation of calls to the abstracted interface members. Anyone working with a legacy codebase is bound to run into static method calls (especially in the data access layer), dependencies on concrete types with non-virtual methods, and sealed class dependencies (HttpContext anyone?). The only way to unit test this without refactoring is with TypeMock. I've never used TypeMock, and I'm sure it's a great product, but it's not free. I decided to spike some code to see if I could solve the prob...