First tentative steps for Linq-2-NHibernate

It's not much. But it does work. Don't get too excited (again - I keep saying that!), but this is now going right through the execution pipeline within NH, so this did hit the DB, and did come back with the correct answer. Of course, there aren't any Assert()s or any fancy stuff like that. It's not *really* a test :).

Now just to do the translation properly... And write some real tests...

[Test]
public void Test()
{
   ISession session = OpenSession();
   var x = from a in session.Query<Animal>() where a.Legs == 8 select a;
   Console.WriteLine(x.ToList().Count());
   session.Close();
}

BTW, ignore the superfluous .ToList() in there, it's just 'cos my translator doesn't know how to handle Count(), so I'm forcing execution and doing the Count() client-side.
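
To make the difference concrete, here is a minimal sketch of the two forms. It is not the provider's code: the Animal class is a hypothetical stand-in for the mapped entity, and an in-memory IQueryable takes the place of session.Query<Animal>(), so nothing here touches NHibernate or a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity: the post does not show the real Animal mapping.
public class Animal
{
    public string Name { get; set; }
    public int Legs { get; set; }
}

public static class CountDemo
{
    // Assumption: an in-memory IQueryable stands in for session.Query<Animal>().
    public static IQueryable<Animal> Animals()
    {
        return new List<Animal>
        {
            new Animal { Name = "Spider", Legs = 8 },
            new Animal { Name = "Dog", Legs = 4 },
            new Animal { Name = "Octopus", Legs = 8 },
        }.AsQueryable();
    }

    // The form used in the post: ToList() forces execution, every matching
    // object is materialised, and the count is taken on the client.
    public static int ClientSideCount()
    {
        var x = from a in Animals() where a.Legs == 8 select a;
        return x.ToList().Count;
    }

    // Count() is appended to the expression tree instead, so a provider that
    // understands it is free to translate the whole query in one go.
    public static int ProviderSideCount()
    {
        var x = from a in Animals() where a.Legs == 8 select a;
        return x.Count();
    }

    public static void Main()
    {
        Console.WriteLine(ClientSideCount());   // 2
        Console.WriteLine(ProviderSideCount()); // 2
    }
}
```

Both calls give the same answer; the only difference is where the counting happens, which is why the ToList() is a workaround rather than something to copy.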

Comments

# re: First tentative steps for Linq-2-NHibernate
Hi Steve

are you going to make an attempt at creating a full LINQ provider from scratch?

We've talked about this before, and I'm wondering whether you're still considering using re-linq. Have you given our re-linq-based LINQ to NH prototype a look yet?

We created that back in early March, so it's still creating HQL strings instead of the AST, but changing that should be a pretty minor task compared to the effort of creating a full-blown LINQ provider.

Also, please check out my last blog entry about re-linq.

We'd be happy to help out further, but so far, nobody seems to be too interested in asking.

- Stefan
Left by Stefan Wenig on 4/28/2009 10:20 AM
# re: First tentative steps for Linq-2-NHibernate
Hi Stefan,

So far, it's my own Linq provider. But that side of things is really just a proof of concept to get everything hooked in with the rest of the NH source. My provider currently runs to less than 200 loc, and can only handle really simple stuff like the query above. In fact, it's quite possible that it can *only* handle the query above, since that's my only "test" case :)

My next job is to get a full suite of tests sorted, and then to return to the provider side of things to look at how to best implement it. I have got the current re-linq source from your svn repo, and read the article that Fabian posted a couple of days ago, so don't worry - I've not forgotten about you guys :)
Left by sstrong on 4/28/2009 10:43 AM
# re: First tentative steps for Linq-2-NHibernate
Steve,

no problem, pls just be aware that in order to deliver the missing pieces in re-linq in time, we'd rather talk sooner than later about how we're going to do that.

(We have a stake in this too: we're thinking about moving our own ORM, re-store, on top of NH. But we absolutely need LINQ queries for NH then. Knowing where the LINQ story for NH is heading is quite essential for our planning.)

- Stefan
Left by Stefan Wenig on 4/28/2009 11:12 AM
# re: First tentative steps for Linq-2-NHibernate
The main problem with linq providers is not the part where you emit SQL, it's the part where you try to link everything together, like which part of the join does this memberreference refer to? (so you can properly alias the thing) etc. etc.

The more you can off-load that to already written code, the better. I'm not sure if the AST pipeline already takes care of this (i.e. wrapping things in derived tables, optimizing them etc.). If not, you're in for a hell of a time ;)

I might sound like some weirdo, but don't focus too much on getting enough tests, you'll never have enough. I've over 1500 nasty linq queries, but it's not enough to make things bugfree unfortunately. MS has over a million. It's more important to get a solid basis and cover all the cases with respect to converting method calls to known IQueryable / IEnumerable extension methods into known constructs. Matt Warren's public code is a good start. That code doesn't contain code to understand the nastier parts of linq queries though, but it would save you some time and get you started with the groundwork like a generic expression visitor etc. Also check my articles on writing a linq provider for further info on how to avoid pitfalls ;)
Left by Frans Bouma on 4/28/2009 12:16 PM
# re: First tentative steps for Linq-2-NHibernate
Frans, without knowing, you've already been called as witness #1 ;-)

www.re-motion.org/.../...vider-infrastructure.aspx (you're in that paper too)
Left by Stefan Wenig on 4/28/2009 4:25 PM
# re: First tentative steps for Linq-2-NHibernate
Hi Frans,

Thanks for the comment - I followed both your and Matt's posts on the subject when they were underway, and understand (to some extent at least) some of the complexities involved.

Fortunately, the AST part takes care of the heavy lifting around the mapping side, so all(!) I need to concentrate on is effectively a Linq to HQL translator.

As far as tests go, I'm not planning on trying to get a complete test suite - I understand that is impractical. Instead, I'm aiming initially for 100 or so common forms of query to get things started. MS's 1 million is quite clearly out of my reach :)

Cheers,

Steve
Left by sstrong on 4/28/2009 4:50 PM
# re: First tentative steps for Linq-2-NHibernate
@Stefan: Cool :) I've read the paper, it's a novel idea. The main problem I see is that the approach to reverse engineer towards 'from clause' and the like isn't going to work:
from c in md.Customer.Where(x=>x.Country=="Germany")
join o in md....
select c.Orders;

here, the 'Where' is inside one side of the join. How will you reverse engineer this? Or better: why would you want to do that? It's easier to convert the various elements to known expressions, like a method call to Where to a WhereExpression, and a call to SelectMany to a JoinExpression with a cross-join hint, and a call to Join to a JoinExpression with an Inner join hint. Though there are other things: aliases. One of the biggest problems is to keep aliases correct. As an aliased fragment (e.g. an entity reference) might be tucked away behind a property of an anonymous type, you've to keep track that the member reference of that property thus refers to the aliased set it represents, however, not always (due to scopes, a subquery encloses the aliases inside itself!).

Another big problem is that you have to convert tree fragments because they have no 1:1 mapping on an o/r mapper query system, e.g. groupby and contains. This often leads to folding of query fragments into derived tables (as aggregates on a groupby have to be in the select of the groupby in sql but not in linq!). The cherry on the pie is of course nested from clauses/joins with defaultifempty: references to tree parts which are somewhere else, and perhaps already handled.

It might be a good starting point to get things up and running though. The approach to first recognize things into recognizable expressions and then parse those is also the approach I use. Though the problems are way bigger than that, I wish it were that simple. :(

@Steve: Sounds good :) I wrote my linq provider as a converter to our own query api (which is built with objects which mimic sqlfragments) so basically took the same road although not an AST target (which might be easier for you). However there are many tiny details which unfortunately make this whole linq provider thing a big pain to get right.

I use these phases:
- preprocessor, which first finds in-memory elements and compiles them and includes referenced expression trees into the main tree, then traverses the tree once to convert all cryptic crap into expressions which are easy to recognize (like a call to Where to a WhereExpression). It assigns aliases to everything it recognizes and tracks those in a tracker, together with which memberinfos are in which anonymous type and which alias they did receive
- aliasscoper. this is a visitor which collects scopes: subqueries get their own scope for aliases. every expression which is a set gets its scope object with the aliases it contains.
- entity field finder. This converts memberaccess etc. into elements they represent.
- query builder. the biggest phase, where all elements recognized are combined into a query at the level of the o/r mapper.
- post-build phase, where aliases which have to be rewritten (defaultifempty mess) are corrected and trees which can be optimized are optimized if possible.
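
As a rough illustration of only the recognition step in the first phase, the sketch below walks an expression tree and records which System.Linq.Queryable operators it contains. It is not anyone's actual provider code: the Customer and Order classes, the join key and the QueryOperatorCollector name are all made up for the example, and a real preprocessor would build its own node types (such as the WhereExpression mentioned above) rather than collect names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entities, for illustration only.
public class Customer { public int Id { get; set; } public string Country { get; set; } }
public class Order { public int Id { get; set; } public int CustomerId { get; set; } }

// Walks an expression tree and records each System.Linq.Queryable operator it meets.
public class QueryOperatorCollector : ExpressionVisitor
{
    public List<string> Operators { get; } = new List<string>();

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        // Visit the arguments first: the query source is the first argument,
        // so inner operators are recorded before the call that wraps them.
        Expression visited = base.VisitMethodCall(node);

        if (node.Method.DeclaringType == typeof(Queryable))
            Operators.Add(node.Method.Name);

        return visited;
    }
}

public static class RecognitionDemo
{
    public static List<string> OperatorsOf(IQueryable query)
    {
        var collector = new QueryOperatorCollector();
        collector.Visit(query.Expression);
        return collector.Operators;
    }

    // A query with a Where tucked inside one side of a join.
    // The join key (c.Id / o.CustomerId) is an assumption.
    public static IQueryable<int> SampleQuery()
    {
        var customers = new List<Customer>().AsQueryable();
        var orders = new List<Order>().AsQueryable();

        return from c in customers.Where(x => x.Country == "Germany")
               join o in orders on c.Id equals o.CustomerId
               select o.Id;
    }

    public static void Main()
    {
        // Prints: Where, Join
        Console.WriteLine(string.Join(", ", OperatorsOf(SampleQuery())));
    }
}
```

Recognising the calls is the easy part; alias tracking, scoping and the rewriting described in the later phases are where the real effort goes, and none of that is attempted here.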

HTH
Left by Frans Bouma on 4/29/2009 12:20 PM
# Is re-linq's transformation model powerful enough?
Is re-linq's transformation model powerful enough?
Left by re-motion Team Blog on 4/29/2009 2:42 PM
# re: First tentative steps for Linq-2-NHibernate
Frans,

I tried to type a short answer, but it quickly became a post of its own:

www.re-motion.org/.../41.aspx

(the trackback link above is broken, still fighting with subtext...)

- Stefan

PS: thanks for all those posts on LINQ. we did not take this exact route, but they are still helpful. you know, the way you're helping out projects competing with LLBLGen ... you really should be doing OSS ;-)
Left by Stefan Wenig on 4/29/2009 6:44 PM
# re: First tentative steps for Linq-2-NHibernate
@Stefan: (I'm replying here as the discussion takes place here): the problem is mainly that the user of the linq provider can and WILL write any wicked construct out there and will demand that the provider works. The point of:
from c in md.Customer.Where(x=>x.Country=="Germany")
join o in md....
select c.Orders;

vs.
from c in (from x in md.Customer where x.Country=="Germany" select x)
join o in md....
select c.Orders;

vs.
from c in md.Customer
join o in md....
where c.Country=="Germany"
select c.Orders;

is that each of these queries has to be parseable and result in a query which works. I.o.w.: if you get the first query to handle, without a select in the expression tree, you can do two things: insert the select manually (as a side of a join is always a separate scope in this case) or extract the where from the join side. There are some complex mixed join queries possible which demand the latter, i.e. extraction of the where from the join side, IF there's no select (as in the second version). However, that won't lead to the most optimal query (as that would be: leaving the where and adding it to the ON clause, at least on some databases and situations).
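
For reference, the three spellings can be run side by side against in-memory data. This is only a sketch under stated assumptions: the Customer and Order classes, the join key and the sample rows are invented (the original join condition is elided above), the projection is simplified to the order id, and LINQ to Objects is used in place of a real provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities and data, for illustration only.
public class Customer { public int Id { get; set; } public string Country { get; set; } }
public class Order { public int Id { get; set; } public int CustomerId { get; set; } }

public static class EquivalentQueries
{
    static readonly List<Customer> Customers = new List<Customer>
    {
        new Customer { Id = 1, Country = "Germany" },
        new Customer { Id = 2, Country = "France" },
        new Customer { Id = 3, Country = "Germany" },
    };

    static readonly List<Order> Orders = new List<Order>
    {
        new Order { Id = 10, CustomerId = 1 },
        new Order { Id = 11, CustomerId = 2 },
        new Order { Id = 12, CustomerId = 3 },
        new Order { Id = 13, CustomerId = 1 },
    };

    // Form 1: Where() called directly on one side of the join.
    public static List<int> WhereOnJoinSource()
    {
        return (from c in Customers.Where(x => x.Country == "Germany")
                join o in Orders on c.Id equals o.CustomerId
                select o.Id).ToList();
    }

    // Form 2: a nested query expression as the join source.
    public static List<int> NestedQueryAsSource()
    {
        return (from c in (from x in Customers where x.Country == "Germany" select x)
                join o in Orders on c.Id equals o.CustomerId
                select o.Id).ToList();
    }

    // Form 3: the where clause placed after the join.
    public static List<int> WhereAfterJoin()
    {
        return (from c in Customers
                join o in Orders on c.Id equals o.CustomerId
                where c.Country == "Germany"
                select o.Id).ToList();
    }

    public static void Main()
    {
        // Each line prints: 10, 13, 12
        Console.WriteLine(string.Join(", ", WhereOnJoinSource()));
        Console.WriteLine(string.Join(", ", NestedQueryAsSource()));
        Console.WriteLine(string.Join(", ", WhereAfterJoin()));
    }
}
```

All three return the same rows here, which is exactly the expectation a user brings to a database-backed provider, whatever shape the expression tree arrives in.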

Yes those posts... I thought it would be a week or 2 and then it would be done... it turned out to be 8-9 months full time work... I'll never estimate how long a project takes ever again ;)

about helping other projects... well, LLBLGen Pro v3 will support nhibernate and as it contains a view designer over sets of related entities, I would really like the supported frameworks to be able to handle that through linq as well. So I don't mind if nhibernate gets more followers due to its linq provider getting more mature, that won't harm us in the end at all ;). In fact, we're still thinking about releasing our linq provider for nhibernate too (although not free).
Left by Frans Bouma on 5/1/2009 11:05 AM
# re: First tentative steps for Linq-2-NHibernate
Frans,

we've put several months of full-time work into re-linq already and there is still a lot of stuff to do. You're preaching to the choir! I just believe that the necessary effort per provider can be reduced significantly using re-linq, so this should really be a community effort.

I think there is a lot to say for the ease of use coming from an AST model that meets the expectations of developers, as opposed to complex libraries that just help the provider developer to handle the complexity of an otherwise unmitigated Queryable AST. (We've debated long whether this is a leaky abstraction, and we've come to the conclusion that for all practical purposes, it is not.)

The possibility to build reusable transformations that individual providers _can_ apply on the generic model should make this effort scale even better. In any case, they will get semantically identical queries, only presented in a way that suits their output format best. I think your join-example shows this quite well.

Also, from an intuitive query model comes the possibility to easily create and modify queries programmatically. Applications can use this to create dynamic queries. Infrastructure code can use it to provide generic capabilities: we've used it for eager fetching (that was almost a breeze using re-linq). Another thing that would work nicely is transparent versioning in the DB (using version number or valid from/valid until columns): take a version-unaware query and add conditions to filter the result set for a specific version. That's pretty easy too if you're working with a simple query model.

There might be some corner cases that will be cumbersome to handle (though I'd really like to see those). But most LINQ expressions I've seen so far work really well in our model (assuming some hypothetical transformations we have not implemented yet).
Even if there _should_ remain some strange cases that are awkward to handle using re-linq, I think the advantages of that model will more than compensate for that.

But still, if you can think of a query that really sucks if it has to be transformed back to query expression syntax, or that would need so much specific transformation smartness that it's unlikely to ever be implemented or be worth the processing time it requires, let's see it! We'd rather know the limits of our approach sooner than later.
Left by Stefan Wenig on 5/2/2009 11:01 AM
# re: First tentative steps for Linq-2-NHibernate
Frans,

one more thing: databases should not make a difference between a clause in a where condition and a clause in the join expression if they are logically identical. I assume that you're referring to databases with weak query optimization where the exact way you state a query has massive implications on the performance of that query? But here we're talking about very specific optimizations for specific back-ends, right?

In that case, would you say that handling the complexity of the original (Queryable-based) LINQ query in the provider is easier than to extract that optimization from an AST similar to query expressions?

I know this is a hard question to answer out of the blue, but I'd really like to get your opinion, even if it's just an educated guess. In my experience, optimization is both easier and faster when you're operating on data structures that are easy to grok. We've looked at several possible future optimizations and always came to the conclusion that re-linq makes these optimization opportunities easier to detect and implement.
Left by Stefan Wenig on 5/2/2009 11:13 AM
# re: First tentative steps for Linq-2-NHibernate
PS: my last question refers to:
> (as that would be: leaving the where and add it to the ON clause, at least on some databases and situations)
Left by Stefan Wenig on 5/2/2009 11:15 AM
# re: First tentative steps for Linq-2-NHibernate
@Stefan: You get problems in areas where for example 'where' clauses and 'orderby' clauses have to be removed from sides of a join and have to be moved to another place inside the tree. It also makes little sense as the query can always be constructed with extension method calls (or parts of it!) which thus begs the question, why not simply implement a provider which can understand the extension methods and go from there, as the conversion back to native query syntax doesn't always work? (not every queryable extension method overload is expressible in C# syntax and VB.NET has different language elements, e.g. groupby looks way different in vb.net)

Take for example this query (adventure works, query doesn't make sense, it's about the construct)
var q = from customer in metaData.Customer
from ca in customer.CustomerAddressCollection
where ca.AddressId != null
from soh in customer.SalesOrderHeaderCollection
where soh.SalesOrderId == null
select customer;

or for example:
var q = from customer in metaData.Customer
where customer.CustomerId < 100
from soh in customer.SalesOrderHeaderCollection.DefaultIfEmpty()
select new { customer.CustomerId, soh.SalesOrderId };

in that last one, the where clause has to be moved as the rest forms a single set of joins and if you leave the where inside the query, one side of the join will become a derived table.

You have to be able to support joins between derived tables, joins between table/view and derived table over an expression other than a simple field compare. This is important in the case of LEFT joins. For inner joins, it's not a problem indeed. It's the same problem as with SELECT * FROM A, B, C WHERE A.F=B.F(+) ... in oracle, this might cause edge-case problems where not all rows are returned as expected.

A community project for a common linq provider... I'm not sure that will work. The problem is that having 'linq' support for all cases is a big asset to have for a data-access technology. I.o.w.: if you don't have a full linq provider now or soon, chances are people are going to look elsewhere. As the time spent on writing these kinds of layers is extreme, you can't expect that people donate time for free as the amount of time is gigantic.

Take linq to nhibernate for example. Red Hat owns nhibernate, but it doesn't spend a dime on linq support, 3rd parties have to do that. If 3rd parties lose interest (and why shouldn't they, it IS a lot of time and money) nhibernate will lose traction and in the end the momentum they have now. As I said earlier, we (solutions design) will support nhibernate in our next designer and it is healthy for us if it stays so we're looking into supporting linq to nhibernate IF that's necessary as it will benefit us too, at least that's the expectation. If it turns out not a lot of nhibernate users use our designer, why would we bother keeping it alive? I know that doesn't sound very generous, but it's just business, similar to what Red Hat is doing. Personally I find it rather sad that they don't care about nhibernate at all, as they refuse to fund linq development it seems.
Left by Frans Bouma on 5/9/2009 11:06 AM
# re: First tentative steps for Linq-2-NHibernate
Is the code repository available to track changes/progress?
Left by Matt on 5/11/2009 9:26 PM
# re: First tentative steps for Linq-2-NHibernate
Hi!

Ayende mentioned a year ago that there were two active NhLINQ efforts. One was a "production ready" version that was incomplete, the other was still in dev but aimed to be a lot more complete.

Steve, is this implementation aiming to handle all the bells and whistles (and craziness) LINQ offers?

Left by Vijay Santhanam on 5/26/2009 5:14 PM
# re: First tentative steps for Linq-2-NHibernate
Hi Vijay,

I don't know if we'll be able to handle all the bells & whistles - I don't know of any Linq provider (other than Linq to Objects) that can do everything. However, we are aiming to be much closer to "complete" than the current Linq provider.

Cheers,

Steve
Left by sstrong on 5/26/2009 9:21 PM
# LINQ: Subqueries in
LINQ: Subqueries in
Left by Fabian's Mix on 5/28/2009 9:30 AM
# re: First tentative steps for Linq-2-NHibernate
@Frans: sorry for not replying any sooner, I missed your comment (I guess it would be easier to discuss this on our blog)

I'm not particularly scared of the examples you provide. There are two problems with cases like this:
- identifying those cases you want to support
- building the code to support them

While I agree that the first point can be a problem of its own, just because of the sheer number of possible cases that you need to create explicit support for, the bigger problem is the second one. This is just hard to do based on LINQ ASTs. But these cases are quite easy to handle using re-linq's QueryModel AST.

Extensibility happens mostly within existing clauses and projections, and re-linq supports those. Or are you talking about adding new extension methods on the same level as Where() and Select()? We're going to support those too in the next few weeks.

We will see how it works out, no point in trying to reach an agreement in a theoretic discussion.

As for the shared community provider: this is not really a problem. Here's an example: The other day Steve asked about the join transformation I talked about here and on our blog. Steve said he'd probably need this because HQL has no support for sub-queries in from/select/join clauses. So should we not provide this, he can just contribute that transformation to re-linq, to our contrib-repository, or, if all else fails, just add it to the LINQ2HQL code. It's quite easy.

As far as I know, nobody "owns" NH. JBoss supported it by employing the past lead developer, but that's history, and they didn't own NH even then.

NH is a pretty complete ORM, LINQ is really the most glaring gap here. And I'm quite confident that we will be able to fill this gap pretty soon.

Stefan
Left by Stefan Wenig on 5/29/2009 4:08 PM
# re: First tentative steps for Linq-2-NHibernate
Latest NH 2.1, NHL 1.0, session does not have a .Query<T>().

Either post all your code, or don't post at all. Is there some LAW that says NH guys can't post working code?
Left by WhereIsTheCode on 9/13/2009 9:30 PM
# re: First tentative steps for Linq-2-NHibernate
WhereIsTheCode,

The current Linq to NHibernate provider that ships with NH 2.1 is a different project to what I'm working on. It has been around for some time and is based on the Criteria API. As such, it does what it does really well, but has limits.

The provider that I'm working on is based on the new HQL parser, and should provide close to all the functionality available through HQL (indeed, more functionality in some scenarios). It is aimed to be released as part of NH 3.0. Currently, it exists in the Trunk of the NH SVN repository.

And no, I'm not aware of any law saying that NH guys can't post working code. I am, however, aware of lots of reasons why client code written against one API won't compile / run as expected against some other API :)
Left by sstrong on 9/14/2009 9:56 AM
# re: First tentative steps for Linq-2-NHibernate
Hi Steve,

I'd love to hear more about the status of the new HQL-based provider you've got in the NH trunk. It looks like there are a fair number of test cases (in LinqQuerySamples.cs). Would you say it's stable enough for us to start playing with in some software we're developing to be released first quarter of next year? I haven't seen any mention of the new provider code since it was committed, except for your last comment here. Do you have any general comments about it for us eager NHibernate users?

Chris
Left by Chris Collier on 9/14/2009 5:18 PM
# re: First tentative steps for Linq-2-NHibernate
Chris,

There's a post due in the next day or two which should hopefully help you with your decisions...
Left by sstrong on 9/14/2009 9:15 PM
