0349. 两个数组的交集 Intersection of Two Arrays

这是一个搜索问题, 这个问题的解法就比较多了.

并行双指针 Parallel Two Pointers

这个也比较符合并行双指针的使用场景, 遍历两个数组或链表, 用它能很快地计算集合的交集和并集.

parallel two-pointers

在使用双指针之前, 要先对数组进行排序, 让重复的元素挨在一起.

这种方法也合适输出结果包含重复元素的问题, 只要不去重即可.

#![allow(unused)]
fn main() {
// 并行双指针
pub fn intersection1(nums1: Vec<i32>, nums2: Vec<i32>) -> Vec<i32> {
    let mut nums1 = nums1;
    let mut nums2 = nums2;
    // 先给数组排序
    nums1.sort();
    nums2.sort();

    let mut index1 = 0;
    let mut index2 = 0;
    let len1 = nums1.len();
    let len2 = nums2.len();
    let mut out = Vec::with_capacity(len1 + len2);

    // 遍历两个数组
    while index1 < len1 && index2 < len2 {
        match nums1[index1].cmp(&nums2[index2]) {
            Ordering::Less => {
                index1 += 1;
            }
            Ordering::Equal => {
                let val = nums1[index1];
                out.push(val);
                // 跳过重复的元素
                while index1 < len1 && nums1[index1] == val {
                    index1 += 1;
                }
                while index2 < len2 && nums2[index2] == val {
                    index2 += 1;
                }
            }
            Ordering::Greater => {
                index2 += 1;
            }
        }
    }
    out
}
}

HashSet 的集合操作

Rust 语言的 HashSet 实现了集合操作, 我们可以先把数组成转 HashSet, 再利用它的 HashSet::intersection() 方法, 求出交集, 最后再把交集转换回数组即可.

整个方法代码量很少, 很简洁, 但性能不是最好的.

这种方法只合适输出结果不包含重复元素的问题, 如果要包含重复元素的话, 可以将 HashSet 换成 HashMap.

#![allow(unused)]
fn main() {
// 使用 HashSet 集合操作
pub fn intersection2(nums1: Vec<i32>, nums2: Vec<i32>) -> Vec<i32> {
    let set1: HashSet<i32> = nums1.into_iter().collect();
    let set2: HashSet<i32> = nums2.into_iter().collect();
    set1.intersection(&set2).copied().collect()
}
}

使用 Bitset

当一个数组是无序的, 而且里面也有重复元素时, 我们可以把它转换 BitSet, 而我们并不关心重复元素时, 可以实现对数组元素的快速查找.

C++ 这样的语言在标准库里自带了 BitSet, 但在 Rust 标准库里却没有, 还好它的实现不算复杂.

我们使用 Vec<bool> 来简化 BitSet 的实现, 性能会差一些.

#![allow(unused)]
fn main() {
#[derive(Debug, Default, Clone, Eq, PartialEq)]
pub struct BitSet {
    bits: Vec<bool>,
}

impl BitSet {
    #[must_use]
    #[inline]
    pub const fn new() -> Self {
        Self { bits: Vec::new() }
    }

    #[must_use]
    #[inline]
    pub fn with_capacity(capacity: usize) -> Self {
        Self {
            bits: Vec::with_capacity(capacity),
        }
    }

    #[inline]
    pub fn set(&mut self, index: usize) {
        if index >= self.bits.len() {
            self.bits.resize(index + 1, false);
        }
        self.bits[index] = true;
    }

    #[inline]
    pub fn unset(&mut self, index: usize) {
        if index < self.bits.len() {
            self.bits[index] = false;
        }
    }

    #[must_use]
    #[inline]
    pub fn is_set(&self, index: usize) -> bool {
        if index < self.bits.len() {
            self.bits[index]
        } else {
            false
        }
    }

    #[inline]
    pub fn to_vec(&self) -> Vec<usize> {
        // TODO(shaohua): Impl Iterator and IntoIter traits
        self.bits
            .iter()
            .enumerate()
            .filter(|(_index, &is_set)| is_set)
            .map(|(index, _is_set)| index)
            .collect()
    }
}

impl FromIterator<usize> for BitSet {
    fn from_iter<T>(iter: T) -> Self
    where
        T: IntoIterator<Item = usize>,
    {
        let iterator = iter.into_iter();
        let capacity = match iterator.size_hint() {
            (_, Some(upper_size)) => upper_size,
            (size, None) => size,
        };
        let mut set = BitSet::with_capacity(capacity);
        for num in iterator {
            set.set(num)
        }
        set
    }
}
}

实现 BitSet 类之后, 就可以用它来存储 nums1 了. 要注意的点有两个:

交集, 元素要在 nums1 和 nums2 中都存在
输出的结果不允许有重复的元素

bitset

这种方法只适合输出结果中不包含重复元素的问题.

#![allow(unused)]
fn main() {
// 优化上面的方法, 只使用一个 bitset
pub fn intersection4(nums1: Vec<i32>, nums2: Vec<i32>) -> Vec<i32> {
    // 将 nums1 转换为 bitset.
    let mut set1: BitSet = nums1.iter().map(|&val| val as usize).collect();
    let mut out = Vec::new();

    // 遍历 nums2
    for &num in &nums2 {
        let num_usize = num as usize;
        // num 在 set1 中也存在
        if set1.is_set(num_usize) {
            out.push(num);
            // 重置 set1 中的值, 因为它已经被插入到了结果中, 不能再重复使用.
            set1.unset(num_usize);
        }
    }
    out
}
}

二分查找法 Binary Search

上面提到了交集的两个特点:

交集, 元素要在 nums1 和 nums2 中都存在
输出的结果不允许有重复的元素

除了使用 HashSet 和 BitSet 之外, 我们也可以在原地给数组排序并去除重复元素. 然后遍历 nums1, 并用二分查找法检查这个元素在 nums2 中是否同样存在.