Writing readable LINQ code in C#

Most software developers agree that writing self-explanatory code is a required skill, but everyone seems to have a different view on how simple code should be. In this post, I share how I write readable LINQ code.

Updates

  • 2019-03-29: ToList just to execute ForEach is inefficient. Replaced with regular foreach.

Motivation

Too often, I come across LINQ code with:

  • Inconsistent indentation
  • Inconsistent line breaks
  • Long lambda expressions
  • Anonymous methods

I would like to share my style of writing to make LINQ a lot more readable.

Solution

LINQ method chain

Take a look at the LINQ method chain below. It fits on screen without scrolling, which lets us focus on the overall LINQ logic rather than the deeper details. Do you not find it easy to read despite the absence of comments?

var anonymousStudents = classes
    .SelectMany(c => c.Students)
    .Where(IsFromEuropeOrAsia)
    .Where(IsGradeAScore)
    .OrderByDescending(s => s.Score)
    .Take(numberOfTopStudentsScoreThatEducationMinistryRequire)
    .Select(AnonymizeStudent);

foreach (var anonymousStudent in anonymousStudents)
    ShareAnonymousStudentsScoreWithEducationMinistry(anonymousStudent);
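
On the foreach above: calling ToList only to reach List<T>.ForEach builds an intermediate list that is thrown away afterwards. A minimal sketch of the two forms, using a hypothetical Iteration class and plain integer scores rather than the post's model:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the post's code.
static class Iteration
{
    // Allocates an intermediate List<int> only to call List<T>.ForEach on it.
    public static int SumViaToListForEach(IEnumerable<int> scores)
    {
        var total = 0;
        scores.ToList().ForEach(score => total += score);
        return total;
    }

    // Iterates the sequence directly; no intermediate list is created.
    public static int SumViaForeach(IEnumerable<int> scores)
    {
        var total = 0;
        foreach (var score in scores)
            total += score;
        return total;
    }
}
```

Both methods return the same total; the second simply skips the extra allocation.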

The model classes are at the end of the post.

Breakdown

When using SelectMany, I usually do not bother creating a method for it unless it needs to traverse multiple levels deep. Likewise with OrderBy statements.

.SelectMany(c => c.Students)
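
When the traversal does go more than one level deep, a named method keeps the chain flat. A minimal sketch, assuming a hypothetical School type (not part of this post's model) that holds a list of classes; SchoolQueries, StudentsOf and AllStudentNames are also made-up names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down model for illustration; School is a hypothetical addition.
class Student
{
    public string Name { get; set; }
}

class Class
{
    public IList<Student> Students { get; set; } = new List<Student>();
}

class School
{
    public IList<Class> Classes { get; set; } = new List<Class>();
}

static class SchoolQueries
{
    // Hides the school -> classes -> students traversal behind one name.
    public static IEnumerable<Student> StudentsOf(School school)
    {
        return school.Classes.SelectMany(c => c.Students);
    }

    public static List<string> AllStudentNames(IEnumerable<School> schools)
    {
        return schools
            .SelectMany(StudentsOf)
            .Select(s => s.Name)
            .ToList();
    }
}
```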

For Where clauses, I almost always create a method, especially if the expression body starts to have composite conditions.

private bool IsFromEuropeOrAsia(Student student)
{
    return (_countriesInEurope.ContainsKey(student.Country)
        || _countriesInAsia.ContainsKey(student.Country));
}

As for when to split multiple conditions into different Where clauses, I split them per field that I am checking. For example, a Student’s Country and Score are two different fields, therefore I tend to check them separately. The method name is self-explanatory, without the need to spam comments. Besides that, the method should not do more than what its name says.

private bool IsGradeAScore(Student student)
{
    return student.Score >= GradeAScore;
}
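
To see what the split buys, here is a rough side-by-side of one combined lambda against one Where per field. This is a sketch under assumptions: the GradeAScore value of 80, the country entries, the HashSet lookups (standing in for the dictionaries used above) and the StudentFilters class are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Student
{
    public string Name { get; set; }
    public string Country { get; set; }
    public int Score { get; set; }
}

static class StudentFilters
{
    // Assumed value; the post does not say which score counts as grade A.
    private const int GradeAScore = 80;

    // Placeholder lookups with a couple of sample entries each.
    private static readonly HashSet<string> CountriesInEurope =
        new HashSet<string> { "France", "Germany" };
    private static readonly HashSet<string> CountriesInAsia =
        new HashSet<string> { "Japan", "Malaysia" };

    public static bool IsFromEuropeOrAsia(Student student)
    {
        return CountriesInEurope.Contains(student.Country)
            || CountriesInAsia.Contains(student.Country);
    }

    public static bool IsGradeAScore(Student student)
    {
        return student.Score >= GradeAScore;
    }

    // Everything in one lambda: the reader has to untangle both fields at once.
    public static List<Student> FilterWithOneLambda(IEnumerable<Student> students)
    {
        return students
            .Where(s => (CountriesInEurope.Contains(s.Country)
                    || CountriesInAsia.Contains(s.Country))
                && s.Score >= GradeAScore)
            .ToList();
    }

    // One Where per field: each line states a single, named rule.
    public static List<Student> FilterWithNamedPredicates(IEnumerable<Student> students)
    {
        return students
            .Where(IsFromEuropeOrAsia)
            .Where(IsGradeAScore)
            .ToList();
    }
}
```

Both methods select the same students; only the second can be read without parsing the conditions.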

I always give a number meaning by creating a constant for it, even if the number is used just once. This conveys the purpose of the number to the reader, rather than leaving an arbitrary number unexplained. Too often, I read code that just calls, e.g., Thread.Sleep(1000) or uses some other arbitrary number without explaining why that number was selected.

.Take(numberOfTopStudentsScoreThatEducationMinistryRequire)
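
A small sketch of the same idea outside the chain. The NamedNumbers class, the constant names and their values are hypothetical; the point is only that the name records why the number was chosen.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical helper for illustration; names and values are assumptions.
static class NamedNumbers
{
    private const int NumberOfTopScoresToReport = 3;
    private const int MillisecondsToWaitBeforeRetry = 1000;

    public static List<int> TopScores(IEnumerable<int> scores)
    {
        return scores
            .OrderByDescending(score => score)
            .Take(NumberOfTopScoresToReport)
            .ToList();
    }

    public static void WaitBeforeRetry()
    {
        // Reads as intent, unlike a bare Thread.Sleep(1000).
        Thread.Sleep(MillisecondsToWaitBeforeRetry);
    }
}
```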

Select almost always requires multiple lines and should always be extracted as a method. On a separate matter, I write code in American English and non-code text in British English. Given that the framework libraries are in American English, mixing the British Colour with the American Color serves only to confuse readers, so it is best to stick to American English for code.

private NamelessStudent AnonymizeStudent(Student student)
{
    return new NamelessStudent
    {
        Score = student.Score,
        YearOfBirth = student.DateOfBirth.Year
    };
}

Model classes

In case the model classes used are not already obvious, here they are:

class Class
{
    public IList<Student> Students { get; set; } = new List<Student>();
}

class Student
{
    public string Name { get; set; }
    public string Country { get; set; }
    public DateTime DateOfBirth { get; set; }
    public int Score { get; set; }
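}

// Shape inferred from the object initializer in AnonymizeStudent.
class NamelessStudent
{
    public int Score { get; set; }
    public int YearOfBirth { get; set; }
}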
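
For anyone who wants to run the chain end to end, here is one way the pieces might fit together. Treat it as a sketch under stated assumptions: the Report class, the constant values, the sample countries and the HashSet lookups are made up, and NamelessStudent is inferred from the initializer in AnonymizeStudent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Class
{
    public IList<Student> Students { get; set; } = new List<Student>();
}

class Student
{
    public string Name { get; set; }
    public string Country { get; set; }
    public DateTime DateOfBirth { get; set; }
    public int Score { get; set; }
}

// Inferred from AnonymizeStudent; the post does not list this class.
class NamelessStudent
{
    public int Score { get; set; }
    public int YearOfBirth { get; set; }
}

// Hypothetical wrapper so the chain can be called from anywhere.
static class Report
{
    // Assumed values for illustration.
    private const int GradeAScore = 80;
    private const int NumberOfTopStudentsToShare = 2;

    // Placeholder lookups standing in for the post's country dictionaries.
    private static readonly HashSet<string> CountriesInEurope =
        new HashSet<string> { "France", "Germany" };
    private static readonly HashSet<string> CountriesInAsia =
        new HashSet<string> { "Japan", "Malaysia" };

    private static bool IsFromEuropeOrAsia(Student student)
    {
        return CountriesInEurope.Contains(student.Country)
            || CountriesInAsia.Contains(student.Country);
    }

    private static bool IsGradeAScore(Student student)
    {
        return student.Score >= GradeAScore;
    }

    private static NamelessStudent AnonymizeStudent(Student student)
    {
        return new NamelessStudent
        {
            Score = student.Score,
            YearOfBirth = student.DateOfBirth.Year
        };
    }

    public static List<NamelessStudent> TopAnonymousStudents(IEnumerable<Class> classes)
    {
        return classes
            .SelectMany(c => c.Students)
            .Where(IsFromEuropeOrAsia)
            .Where(IsGradeAScore)
            .OrderByDescending(s => s.Score)
            .Take(NumberOfTopStudentsToShare)
            .Select(AnonymizeStudent)
            .ToList();
    }
}
```

Calling Report.TopAnonymousStudents with a list of Class objects returns at most two NamelessStudent items, highest score first, with names and countries stripped.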

Conclusion

LINQ can be easy to read if you code it to be.