Rabih Kodeih's

FINQ: A ‘LINQ to Objects’ implementation in Actionscript 3

LINQ is a technology which enables developers to efficiently write queries against ubiquitous data sources in a SQL-like declarative style. It’s an impressive technology, especially to developers who aren’t acquainted with functional programming techniques. Wouldn’t it be nice to have something like LINQ implemented in Actionscript 3?

At first, I started experimenting a bit with “select” and “where” statements, then “join” and “groupBy” and not before long, FINQ was born.

What is FINQ and how to use it?

FINQ (Flex Integrated Query) is an implementation of LINQ to Objects (LINQ is basically made up of LINQ to Objects and LINQ to SQL). It is entirely written in AS3 and as such can be very easily integrated with any Flex or Flash program. Download the source code here. To use FINQ in your code or Flex project, just unpack the downloaded file and place the ‘finq’ folder inside the ‘src’ folder of your project then simply import the ‘finq’ package in your code.

As an example of how FINQ code looks like, consider the following example which outputs the number of occurrences for each element in a collection of integers:

var data:IEnumerable = new FinqObj([1 , 1 , 1 , 2 , 2 , 2 , 3 , 3 , 4 , 4 , 4]);
data.groupBy(
   function (key) { return key; },
   function (value) { return 1; },
   function (key, result) { return { number:key, count:result.count() }; }
).print(["number", "count"]);

Later in this post we will come across a tutorial showing numerous code examples.

How was FINQ implemented?

Truth be told, when I finished writing the code I realized that it was really mostly syntactic sugar on top of AS3’s arrays. AS3 is a powerful language, it supports lexical closures, OOP constructs, dynamic programming, associative arrays and prototypal inheritance all in one place! These things make writing the implementation a relatively straightforward task. Most of the functions found in FINQ where implemented using more or less AS3’s built in collections and functional support. I will hopefully try in future posts to elaborate more on this topic.

How is FINQ designed and organized?

The design is very simple. It consists of one interface: IEnumerable and one class: FinqObj. IEnumerable defines all the extension methods that normally ship with LINQ plus some few others of my own to mimic the versatility of AS3’s arrays. FinqObj is a class which extends AS3’s Array class and implements IEnumerable. The new method definitions added to IEnumerable are:

  • forEach
  • getElementKeys
  • getElementKeysDescending
  • nonPrimitives
  • pop
  • primitives
  • print
  • push
  • toArrayCollection
  • toFinqObj

All IEnumerable method definitions were implemented except for the following:

  • asQueryable
  • longCount
  • toList
  • toLookup
  • selectMany

It is worth noting that I could have added the extension methods using Array’s prototype object but chose not to because Flex builder 3’s IntelliSense doesn’t know as yet what to do with them, besides I wanted to explicitly introduce the IEnumerable interface.

So why didn’t I do a ‘FINQ to SQL’?

Well, after digging a bit into the inner workings of LINQ, I discovered that LINQ to SQL could not work without so called expression trees. In a .NET language complier, when LINQ query code is parsed, it is converted into an intermediate form known as an expression tree. At run time, when the query is about to execute, its corresponding expression tree is used to construct the SQL query string that is passed to the backing database engine for execution. Thus in order to emulate LINQ to SQL, we have to at least provide some sort of mechanism to parse FINQ queries into expression trees. That seems to be a lot of work.

This might be done in Flex builder 3. The idea is to run some kind of a script or a program (possibly written in Java) prior to compiling the AS3 source code, which generates a text file (in a pre-build step), and save the file into the source code project folder.
This file would contain all the textual code definitions of all the FINQ queries that are found in the source during the pre-build step (the file itself can be deployed as an asset in the complied swf file). At runtime, before a certain FINQ query is executed, its related code text is read from the embedded text file and then parsed whereby the expression tree is generated. To perform this last step, we could use something like tamarin or this cool AS3 eval complier. I am not sure if there is a better idea, but this seems to be at least doable.

FINQ tutorial

Let’s start by defining our IEnumerable collection object FinqObj, and adding some elements to it:

var data:IEnumerable = new FinqObj();
data.push("orange");
data.push("banana");
data.push("apple");
data.push("grapes");
data.push("mango");

Alternatively, we can write:

var data:IEnumerable = new FinqObj(["orange", "banana", "apple", "grapes", "mango"]);

To access the stored items, we can simply iterate through data as if it were a normal AS3 array:

for each (var e:String in data) {
   trace(e)
}
// OUTPUTS:
// orange
// banana
// apple
// grapes
// mango

A little handy function is print() which returns a string output of the data and also prints the data on the Flex builder debug console:

data.print();
// OUTPUTS:
// orange
// banana
// apple
// grapes
// mango

Let’s now perform some transformations on data using select():

data.select(function(x) { return x.length; }).print();
// OUTPUTS:
// 6
// 6
// 5
// 6
// 5

data.select(function(x) { return { type:x, length:x.length }; }).print();
// OUTPUTS:
// length:6 type:orange
// length:6 type:banana
// length:5 type:apple
// length:6 type:grapes
// length:5 type:mango

In both cases, the select statement applies the inner anonymous function (referred to as closure from now on) to every element of data resulting in a new IEnumerable collection. In each closure above, the variable x represents an individual element of data. Most FINQ functions (at least those who return IEnumerable) are non-destructive to the original data meaning that the output is a completely new object.

The output of the second select statement is a bit odd, length was printed before type, this is because the order of <key, value> pairs is not guaranteed in AS3 objects and associative arrays. To straighten this out, we could use:

data.select(function(x) { return { type:x, length:x.length }; })
.print(["type", "length"]);
// OUTPUTS:
// type:orange length:6
// type:banana length:6
// type:apple length:5
// type:grapes length:6
// type:mango length:5

A nice feature of select() is that you can get any <key, value> pair in the element object using a notation similar to the above print() statement:

var people:IEnumerable = new FinqObj([
   {name:"Allen Frances", age:11, canCode:false},
   {name:"Burke Madison", age:50, canCode:true},
   {name:"David Charles", age:33, canCode:true},
   {name:"Connor Morgan", age:50, canCode:false},
   {name:"Everett Frank", age:16, canCode:true},
]);
people.select("name", "canCode").print();
//people.where(function(
// OUTPUTS:
// canCode:false name:Allen Frances
// canCode:true name:Burke Madison
// canCode:true name:David Charles
// canCode:false name:Connor Morgan
// canCode:true name:Everett Frank

which is also equivalent to:

people.select(function(x) { return { name:x.name, canCode:x.canCode }; })
.print();
// OUTPUTS:
// canCode:false name:Allen Frances
// canCode:true name:Burke Madison
// canCode:true name:David Charles
// canCode:false name:Connor Morgan
// canCode:true name:Everett Frank

You can also retrieve an array of keys for collections containing elements which are either Arrays or Objects, and iterate through it:

var result:IEnumerable = data.select(function(x) { return { type:x, length:x.length }; });
var keys:Array = result.getElementKeys();
for each (var key:String in keys) {
   var element:Object = result[0];
   trace(element[key]);
}
// OUTPUTS:
// 6
// orange

Suppose now we want to find out who can code among people:

people.where(function(x) { return x.canCode == true; })
.print();
// OUTPUTS:
// age:50 canCode:true name:Burke Madison
// age:33 canCode:true name:David Charles
// age:16 canCode:true name:Everett Frank

Just like select(), where() also returns an IEnumerable collection. In general, all the functions with a return value type of IEnumerable can be chained together to construct more sophisticated queries:

people.where(function(x) { return x.canCode == true; })
.select("name")
.print();
// OUTPUTS:
// name:Burke Madison
// name:David Charles
// name:Everett Frank

Sometimes it is useful to defer the execution of a query. This can be easily achieved by wrapping the query with a closure as shown in the following snippet:

// query defined here:
var whoCanCode = (function(data:IEnumerable) {
   return function() {
      return data.where(function(x) { return x.canCode == true; }).select("name");
   }
})(people);
//.
//.
// the original data source is assigned a new value here:
people = new FinqObj([1 ,2 ,3]);;
//.
//.
// query executed here:
whoCanCode().print();
// OUTPUTS:
// name:Burke Madison
// name:David Charles
// name:Everett Frank

So far so good.

But before we proceed any further, I want to introduce a useful little naming convention regarding argument closures. If you look at the method signatures in IEnumerable, you will notice that most closure argument names end with one of the following strings:

  • selector
  • aggregator
  • comparer
  • equlalityComparer
  • predicate

The convention is that those closures are considered to have types according to:

  • selector
  • aggregator
  • comparer
  • equlalityComparer
  • predicate

= function (x)
= function (x, y)
= function (x, y)
= function (x, y)
= function (x)

returns dynamic type
returns dynamic type
returns int within {-1, 0, +1}
returns Boolean
returns Boolean

For example, the orderBy() function takes a keySelector and an optional keyComparer. The keySelector specifies which key the ordering should be done relative to and the keyComparer specifies how two keys should be compared relative to each other. According to our naming convention, keySelector’s name ends with “selector” and therefore it is a function which takes one argument (of any type) and returns a dynamic type (in the example below, it returns a String). Likewise, keyComparer’s name ends with “comparer” and as such it is a function which takes two arguments (also of any type) and returns an integer. Checkout the following example for an illustration:

people.orderBy(
   function(x) { return x.name; }, // keySelector
   function(x:String, y:String) { // keyComparer
      if (x.toUpperCase() > y.toUpperCase()) return +1;
      if (x.toUpperCase() == y.toUpperCase()) return 0;
      return -1;
   }
).select("name").print();
// OUTPUTS:
// name:Allen Frances
// name:Burke Madison
// name:Connor Morgan
// name:David Charles
// name:Everett Frank
 

To be more concise, I will refer from now on to closure types by their ending string (a keySelector is a selector, a keyComparer is a comparer, etc…).

If you look more closely at the signature of orderBy(), you’ll notice oddly that it specifies keySelector as a dynamic type (indicated by using a wildcard) instead of a Function. When a wildcard is used to specify the argument type, it means that the argument can be either a closure or a string specifying the the element key.

A comparer usually returns and integer value of –1, 0 or +1 but it is annoying to write tests (as in the above code) each time we want to supply a comparer. That’s why I’ve introduced the option of a comparer returning a boolean (ex: return { x < y; }), the FinqObj implementation is smart enough to interpret the difference. Thus our previous orderBy() example can been written more succinctly as follows:

people.orderBy("name", function(x, y) { return (x.toUpperCase() > y.toUpperCase()); })
.select("name").print();
// OUTPUTS:
// name:Allen Frances
// name:Burke Madison
// name:Connor Morgan
// name:David Charles
// name:Everett Frank

Just like in a phone book, we can nest ordering operations using thenBy():

people.orderBy("canCode").thenBy("name")
.select("canCode", "name").print();
// OUTPUTS:
// canCode:false name:Allen Frances
// canCode:false name:Connor Morgan
// canCode:true name:Burke Madison
// canCode:true name:David Charles
// canCode:true name:Everett Frank

One of the most powerful commands in FINQ is groupBy(). Say we want to group people in two separate lists according to who can and who can’t code:

var result:IEnumerable = people.groupBy("canCode");
for each (var element:* in result) {
   var key:String = element.key;
   var group:IEnumerable = element.value;
   trace("canCode : " + key);
   group.print();
   trace();
}
// OUTPUTS:
// canCode : false
// age:11 canCode:false name:Allen Frances
// age:50 canCode:false name:Connor Morgan
//
// canCode : true
// age:50 canCode:true name:Burke Madison
// age:33 canCode:true name:David Charles
// age:16 canCode:true name:Everett Frank

When the query is executed, groupBy() applies its input keySelector argument to every element in people and creates a <key, value> pair accordingly. The key part in the pair holds the result of keySelector and the value part of the pair holds an IEnumerable list of elements having all the same key as computed by keySelector for that pair.

groupBy() is a peculiar command, it can be used to implement complicated queries. For example, suppose we need the names of people sorted in three distinct age groups:

var result:IEnumerable = people.groupBy(
   function(x) { // keySelector
      if ( x.age > 0 && x.age <= 20 ) return "adolescent";
      if ( x.age > 20 && x.age <= 35 ) return "young";
      if ( x.age > 35 ) return "veteran";
   },
   "name" // elementSelector
);
for each (var element:* in result) {
   trace("Age group : " + element.key);
   (element.value as IEnumerable).print();
   trace();
}
// OUTPUTS:
// Age group : adolescent
// Allen Frances
// Everett Frank
//
// Age group : veteran
// Burke Madison
// Connor Morgan
//
// Age group : young
// David Charles

Here we have used an elementSelector as well. When the list of results is constructed for each key, elementSelector is applied to every element (that matches the key) prior to its addition to the list.

Now we come to the venerable join(). Consider the classical example of customers and orders:

var customers:IEnumerable = new FinqObj([
   {id:1, name:"Gotts"},
   {id:2, name:"Valdes"},
   {id:3, name:"Gauwin"},
   {id:4, name:"Deane"},
   {id:5, name:"Zeeman"}
]);
var orders:IEnumerable = new FinqObj([
   {id:1, description:"Order 1"},
   {id:1, description:"Order 2"},
   {id:4, description:"Order 3"},
   {id:4, description:"Order 4"},
   {id:5, description:"Order 5"}
]);
customers.join( // outer
   orders, // inner
   "id", // outerKeySelector
   "id", // innerKeySelector
   function(c, o) { // resultAggregator
      return { customerName:c.name, order:o.description };
   }
).print();
// OUTPUTS:
// customerName:Gotts order:Order 1
// customerName:Gotts order:Order 2
// customerName:Deane order:Order 3
// customerName:Deane order:Order 4
// customerName:Zeeman order:Order 5

This examples deserves some explanation:

inner is the IEnumerable collection to join with outer. The two selectors outerKeySelector and innerKeySelector are used to get the corresponding keys from outer and inner to be matched in the join operation. The final aggregator closure resultAggregator is simply applied to each matching element pair (from inner and outer) in order to compute the resulting element.

Suppose now we wanted to group our last join result by customerName. We can of course append a simple groupBy() query to join(), but there is shorter way to do it using groupJoin(). The signature of groupJoin() is exactly the same as the signature of join() and operates in almost the same way. The only difference is that instead of joining outer and inner by directly matching keys, first innerKeySelector is used in a groupBy() like operation on inner then the result is matched with outer using outerKeySelector. Check out the following code which demonstrates the concept:

customers.groupJoin( // outer
orders, // inner
"id", // outerKeySelector
"id", // innerKeySelector
function(c, g) { // resultAggregator
   return {
      c
ustomerName:c.name,
      orders:g.select
(function(order) { return order.description; })
   }
;
}).print();
// OUTPUTS:
// customerName:Gotts orders:Order 1,Order 2
// customerName:Valdes orders:
// customerName:Gauwin orders:
// customerName:Deane orders:Order 3,Order 4
// customerName:Zeeman orders:Order 5

There are still other FINQ functions such as union(), intersection(), max(), min() and many more. All of those are fairly similar to their LINQ counterparts and relatively easy to understand.

Some caveats

There are a few things native to LINQ which aren’t easily portable to FINQ: type inference and IntelliSense support (this requires changes at the compiler and IDE levels), lambda expressions (admittedly lambda expressions allow more succinct queries but with closures they are not absolutely necessary), and as mentioned at the beginning of this post, expression trees.

Conclusion

I hope you find FINQ useful. You are welcomed to post any comment.

6 comments:

Anonymous said...

Kudos to you Rabih, this is a fascinating take on LINQ. Well done.

Justin

kuimoani said...

thanks good information

Anonymous said...

Hello.
Looks interesting, I wonder whether it can be used in Pure ActionScript projects (No Flex)

Anonymous said...

This is amazing work! Can't believe I haven't seen it before. I'll have a blast picking this apart tonight!

Anonymous said...

Hi there,
thanks a lot for sharing this! A very handy tool. How about putting it somewhere like Sourceforge or Google Code and creating its own website so that it can be found even better by the flex community? I think finq is a must for every efficient developer...
Stefan

Rabih Kodeih said...

Thanks Stefan,

Good idea about putting the code on a public repo. Please feel free to do whatever you think is best for the flex community.

Post a Comment