Wednesday, March 30, 2011

Linq-to-Sql: recursively get children

I have a Comment table which has a CommentID and a ParentCommentID. I am trying to get a list of all children of the Comment. This is what I have so far, I haven't tested it yet.

private List<int> searchedCommentIDs = new List<int>();
// searchedCommentIDs is a list of already yielded comments stored
// so that malformed data does not result in an infinite loop.
public IEnumerable<Comment> GetReplies(int commentID) {
    var db = new DataClassesDataContext();
    var replies = db.Comments
        .Where(c => c.ParentCommentID == commentID 
            && !searchedCommentIDs.Contains(commentID));
    foreach (Comment reply in replies) {
        searchedCommentIDs.Add(CommentID);
        yield return reply;
        // yield return GetReplies(reply.CommentID)); // type mis-match.
        foreach (Comment replyReply in GetReplies(reply.CommentID)) {
            yield return replyReply;
        }
    }
}

2 questions:

  1. Is there any obvious way to improve this? (Besides maybe creating a view in sql with a CTE.)
  2. How come I can't yield a IEnumerable <Comment> to an IEnumerable <Comment>, only Comment itself?
  3. Is there anyway to use SelectMany in this situation?
From stackoverflow
  • I'd probably use either a UDF/CTE, or (for very deep structures) a stored procedure that does the same manually.

    Note that if you can change the schema, you can pre-index such recursive structures into an indexed/ranged tree that lets you do a single BETWEEN query - but the maintenance of the tree is expensive (i.e. query becomes cheap, but insert/update/delete become expensive, or you need a delayed scheduled task).


    Re 2 - you can only yield the type specified in the enumeration (the T in IEnumerable<T> / IEnumerator<T>).

    You could yield an IEnumerable<Comment> if the method returned IEnumerable<IEnumerable<Comment>> - does that make sense?

    Improvements:

    • perhaps a udf (to keep composability, rather than a stored procedure) that uses the CTE recursion approach
    • use using, since DataContext is IDisposable...

    so:

    using(var db = new MyDataContext() ) { /* existing code */ }
    
    • LoadWith is worth a try, but I'm not sure I'd be hopeful...
    • the list of searched ids is risky as a field - I guess you're OK as long as you don't call it twice... personally, I'd use an argument on a private backing method... (i.e. pass the list between recursive calls, but not on the public API)
    Shawn Simon : yea it makes perfect sense im just not sure why they wouldnt let you either return an ienumerable of a type or the type tiself
    Marc Gravell : You can **return** an IEnumerable of the type. You can **yield return** a T.
    Shawn Simon : right i completley understand but i think you should be able to do either
    Marc Gravell : If it was only the generic (typed) versions in existence, I expect it could... the problem is that it would be very ambiguous for the non-generic versions... you can `yield` any object, so it wouldn't know whether that represented 1 item or a sequence. Personally, I'm happy the simple way it is now.
    Shawn Simon : great point. didnt think of that
  • Version 2: Who knows if this shit works??

    public static IEnumerable<Comment> GetReplies(int commentID) {
        List<int> searchedCommentIDs = new List<int>();
        return GetReplyIterator(commentID, searchedCommentIDs);
    }
    private static IEnumerable<Comment> GetReplyIterator(int commentID, List<int> searchedCommentIDs) {
        searchedCommentIDs.Add(commentID);
        var db = new DataClassesDataContext();
        var replies = db.Comments
           .Where(c => c.ParentCommentID == commentID
               && !searchedCommentIDs.Contains(commentID));
        return replies.Union(replies
            .SelectMany(r => GetReplyIterator(r.CommentID, searchedCommentIDs)));
    }
    

0 comments:

Post a Comment