Linq知识集二（翻译）

2022-03-28

字数统计: 2.4k字 | 阅读时长≈ 11分

原文起始连接：https://dotnettutorials.net/course/linq/

说明：翻译接上篇，上一篇翻译到了LINQ中的操作符，所谓操作符就是LINQ中的一系列扩展方法，用来对数据操作的一系列方法（如查询、过滤、排序等）。我们已经介绍了查询的Select方法，接下来继续。

LINQ中过滤操作符

过滤操作符，顾名思义就是对数据进行符合条件过滤。LINQ中过滤方法有两个：

Where
OfType

Where过滤

where通常需要至少一个条件，通过谓词来具体化条件。条件可以是一下操作符

1	==, >=, <=, &&, \|\|, >, <, etc.

从Where方法的签名可以看出，它是作为IEnumerable的扩展方法来实现的。

1
2

public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate);
public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, int, bool> predicate);

断言

断言是用来测试每个数据是否符合条件的方法，看下面例子

List<int> intList = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
           //Method Syntax
           IEnumerable<int> filteredData = intList.Where(num => num > 5);
           //Query Syntax
           IEnumerable<int> filteredResult = from num in intList
                                             where num > 5
                                             select num;
           
           foreach (int number in filteredData)
           {
               Console.WriteLine(number);
           }
           Console.ReadKey();

num => num > 5 这个匿名函数就是一个断言，Where对数据源中的每一项都执行该方法。

where第一个重载方法断言参数Func<int, bool> preidcate期望是一个整型的输入参数，返回一个布尔类型值。这是一个泛型代理，需要一个或多个输入参数，以及一个输出参数。最后一个参数作为输出参数，且是强制必须要的，输出参数可选。上面的lambada表达式即为传给Func的参数，可以重写为如下形式。

Func<int, bool> preidicate = i => i > 5;
//当然也可以把这个匿名方法写成一个有名字的方法
//public static bool CheckNumber(int number)
//{
//    if(number > 5)
//        return true;
//    else
//        return false;
//}
IEnumerable<int> filterData = intList.Where(predicate);
//IEnumerable<int> filterData = intList.Where(num => CheckNumber(num));

where第二个重载方法，断言方法的int型参数表示数据源元素的索引位置。参考下面例子

List<int> intList = new List<int>{1,2,3,4,5,6,7,8,9};
var oddNumberWithIndex = intList.Select((num, index) => 
                    new {
                        Numbers = num,
                        IndexPosition = index
                    }).Where(x => x.Numbers % 2 != 0)
                    .Select(data => 
                    new {
                        Number = data.Numbers,
                        IndexPostion = data.IndexPostion
                    });
foreach(var item in oddNumberWithIndex)
{
      Console.WriteLine($"IndexPosition :{item.IndexPosition} , Value : {item.Number}");
}

OfType操作符

OfType用于过滤数据源中指定类型的数据，实际上也可以通过where来判断数据源中数据的类型，来获得指定数据。看下面的例子

List<object> datasource = new List<object>()
{"Tom", "Mary", 1, 2, "Price", 40, 20, 10};
//通过OfType的泛型方法可以直接获取到整型数组
List<int> intData = datasource.OfType<int>().ToList();
//采用查询语法where条件来获取指定类型数据
var strData = from name in datasource
                where name is string
                select name;

Set操作符

set操作符用于根据数据源元素的显隐来产生结果集，意味着这些操作有可能针对单个数据源或多个数据源，输出数据中这些数据有可能出现有可能不出现。主要有以下方法：

Distinct : 用于查询无重复数据的方法
Except：用于查询在某一集合里的且不在另一集合内的数据(1.Except(2)在1不在2, 2.Except(1)在2不在1)
Intersect ：用于查询多个数据源相交的集合数据（在1且在2）
Union ：用于查询具有相同结构的数据源数据的集合（1和2一起）

4个操作结果如下图

set操作符结果

LINQ Distinct

Distinct的两个重载签名如下：

1
2

public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source);
public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer);

举两个例子说明Distinct的用法

 List<int> intCollection = new List<int>(){1,2,3,2,3,4,4,5,6,3,4,5};
 //查询数组中去除重复项后的数据
var MS = intCollection.Distinct();
foreach (var item in MS)
{
    Console.WriteLine(item);
}

如果数据源是复杂类型的列表，当使用Distinct的默认Comparer时，只会判断两个引用对象是否相同，而不会判断对象中每个属性值是否相等。

List<Student> students = new List<Student>()
            {
                new Student {ID = 101, Name = "Preety" },
                new Student {ID = 102, Name = "Sambit" },
                new Student {ID = 103, Name = "Hina"},
                new Student {ID = 104, Name = "Anurag"},
                new Student {ID = 102, Name = "Sambit"},
                new Student {ID = 103, Name = "Hina"},
                new Student {ID = 101, Name = "Preety" },
            };
//Using Method Syntax
var MS = students
        .Distinct().ToList();
// var MS = students
//         .Select(stu => new {
//             ID = stu.ID,
//             Name = stu.Name
//         })
//         .Distinct().ToList();
foreach (var item in QS)
{
    Console.WriteLine($"ID : {item.ID} , Name : {item.Name} ");
}

运行上面代码，获得输出全部学生ID、姓名，原因上面已经解释。如何解决这个问题，可以通过以下三种方法来实现

定义一个类，实现IEqualityComparer接口，然后将这个类对象传递给Distinct作为参数
重写Student类的Equals和GetHashCode方法
使用匿名类，如上面示例代码中Select中创建匿名类

在Student类中实现IEquatable接口

//方法1
public class StudentComparer : IEqualityComparer<Student>
{
    public bool Equals(Student x, Student y)
    {
        if(object.ReferenceEquals(x, y))
            return true;
        if (object.ReferenceEquals(x,null) || object.ReferenceEquals(y, null))
            return false;
        return x.ID == y.ID && x.Name == y.Name;
    }
    public int GetHashCode(Student obj)
    {
        if (obj == null)
            return 0;
        int IDHashCode = obj.ID.GetHashCode();
        int NameHashCode = obj.Name == null ? 0 : obj.Name.GetHashCode();
        return IDHashCode ^ NameHashCode;
    }
}
//方法2，Student类中重写下面两个方法
public override bool Equals(object obj)
{
    return this.ID == ((Student)obj).ID && this.Name == ((Student)obj).Name;
}

public override int GetHashCode()
{
    int IDHashCode = this.ID.GetHashCode();
    int NameHashCode = this.Name == null ? 0 : this.Name.GetHashCode();
    return IDHashCode ^ NameHashCode;
}

//方法4，Student类继承IEquatable<Student>
public bool Equals(Student other)
{
    if (object.ReferenceEquals(other, null))
    {
        return false;
    }
    if (object.ReferenceEquals(this, other))
    {
        return true;
    }
    return this.ID.Equals(other.ID) && this.Name.Equals(other.Name);
}

IEqualityComparer 和 IEquatable区别

从上面的方法1和方法4示例代码能看出两者区别。
IEqualityComparer<T>接口，是使用第三方类，来实现两个泛型类对象之间的比较。
IEquatable<T>接口，则是使用泛型类与该类自身的新对象进行比较。

LINQ中的Except

首先来看Except的两个方法签名，和Distinct基本一样。只是该操作是针对两个数据源，因此多一个数据参数。同样，第二个方法的签名最后一个参数传入一个IEqualityComparer接口，意味着处理复杂类型数据时，需要自己实现这个接口进行对象间的比较。

1
2

public static IEnumerable<TSource> Except<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second);
public static IEnumerable<TSource> Except<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer);

接下来通过例子说明

List<int> dataSource1 = new List<int>() { 1, 2, 3, 4, 5, 6 };
List<int> dataSource2 = new List<int>() { 1, 3, 5, 8, 9, 10 };
//Method Syntax
var MS = dataSource1.Except(dataSource2).ToList();
var QS = (from num in dataSource1
        select num)
        .Except(dataSource2).ToList();
foreach (var item in QS)
{
    Console.WriteLine(item);
}

下面的代码处理复杂类，如果Student类没有重写Equals方法和GetHasCode方法的话，输出结果将不是我们期望的。参考Distinct方法中给Student重写方法、实现IEqualityCompare接口、匿名类等方法来实现我们期望的结果。

List<Student> AllStudents = new List<Student>()
  {
      new Student {ID = 101, Name = "Preety" },
      new Student {ID = 102, Name = "Sambit" },
      new Student {ID = 103, Name = "Hina"},
      new Student {ID = 104, Name = "Anurag"},
      new Student {ID = 105, Name = "Pranaya"},
      new Student {ID = 106, Name = "Santosh"},
  };
  List<Student> Class6Students = new List<Student>()
  {
      new Student {ID = 102, Name = "Sambit" },
      new Student {ID = 104, Name = "Anurag"},
      new Student {ID = 105, Name = "Pranaya"},
  };
  
  //Method Syntax
  var MS = AllStudents.Except(Class6Students).ToList();
  //Query Syntax
  var QS = (from std in AllStudents
              select std).Except(Class6Students).ToList();
  
  foreach (var student in MS)
  {
      Console.WriteLine($" ID : {student.ID} Name : {student.Name}");
  }

LINQ中的Intersect

首先来看Intersect的两个方法签名，和上面的Except基本一样。所以后面的示例代码以及如何处理复杂类数据元素都不具体提供示例了。

1
2

public static IEnumerable<TSource> Intersect<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second);
public static IEnumerable<TSource> Intersect<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer);

LINQ中的Union

首先来看Union的两个方法签名，和上面的Except基本一样。所以后面的示例代码以及如何处理复杂类数据元素都不具体提供示例了。

1
2

public static IEnumerable<TSource> Union<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second);
public static IEnumerable<TSource> Union<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer);

Concate方法

Concate和Union类似，将两个数据源合并成一个。二者的区别在于：Concate操作，只合并两个数据源，不进行去除重复；而Union操作返回的结果中，是first、second数据源合并后去除重复值的。二者的关系相当于SQL中的UNION 和 UNION ALL。

1	public static IEnumerable<TSource> Concate<TSource>(this IEnumerable<TSouce> first, IEnumerable<TSource> second);

本文作者： 达文西
本文链接： https://edsiongithub.github.io/2022/03/28/0039/
版权声明： 本博客所有文章除特别声明外，均采用 MIT 许可协议。转载请注明出处！