深入浅出：模拟实现 C++ STL 中的 unordered_map 和 unordered

深入浅出：模拟实现 C++ STL 中的 unordered_map 和 unordered_set

1. 引言

unordered_map 和 unordered_set 是 C++ 标准模板库 (STL) 中非常强大的容器类型，它们基于散列表实现，提供了平均情况下 O(1) 的查找、插入和删除性能。本文将带你深入了解这两个容器的内部工作原理，并通过具体的 C++ 代码实现来模拟 unordered_map 和 unordered_set。

2. 基础知识

2.1 散列表

散列表是一种通过哈希函数将键映射到数组索引的数据结构。它利用哈希函数将键转换为索引，然后通过该索引直接访问数组中的位置。这种方法通常能提供非常快的查找速度。

2.2 哈希函数

哈希函数是将任意长度的输入映射到固定长度输出的过程。一个好的哈希函数应该尽量减少键之间的冲突，并均匀地分布到整个数组范围内。

2.3 负载因子

负载因子是指散列表中元素的数量与桶的数量之比。当负载因子过高时，散列表的性能会下降，因为更多的键会映射到同一个桶上，导致碰撞增多。

3. 模拟实现 `unordered_set`

3.1 数据结构设计

对于 unordered_set，我们使用一个动态数组来存储链表，每个链表代表一个桶。当发生碰撞时，使用链地址法来解决。

template<typename T>
class UnorderedSet {
public:// 构造函数UnorderedSet(size_t initialCapacity = 16, float loadFactor = 0.75f) : capacity(initialCapacity), size(0), loadFactorThreshold(loadFactor * initialCapacity) {table.resize(capacity);}// 插入操作bool insert(const T& value);// 查找操作bool find(const T& value) const;// 删除操作bool erase(const T& value);private:// 重新哈希void rehash();// 计算哈希值size_t hash(const T& value) const;// 动态数组std::vector<std::list<T>> table;// 当前容量size_t capacity;// 当前元素数量size_t size;// 负载因子阈值size_t loadFactorThreshold;
};template<typename T>
size_t UnorderedSet<T>::hash(const T& value) const {// 简单的哈希函数实现static std::hash<T> hasher;return hasher(value) % capacity;
}

3.2 哈希函数

使用 C++ 标准库提供的 std::hash 类型来实现哈希函数。

3.3 碰撞解决策略

使用链地址法来解决碰撞问题。当多个键映射到同一个桶时，这些键会被存储在同一个链表中。

3.4 插入操作

插入操作首先计算哈希值，然后检查是否需要重新哈希，最后将元素插入到相应的链表中。

template<typename T>
bool UnorderedSet<T>::insert(const T& value) {if (size >= loadFactorThreshold) {rehash();}size_t index = hash(value);auto& bucket = table[index];// 检查元素是否已存在if (std::find(bucket.begin(), bucket.end(), value) != bucket.end()) {return false;}bucket.push_back(value);++size;return true;
}

3.5 查找操作

查找操作也使用哈希值定位桶，然后在对应的链表中查找元素。

template<typename T>
bool UnorderedSet<T>::find(const T& value) const {size_t index = hash(value);const auto& bucket = table[index];return std::find(bucket.begin(), bucket.end(), value) != bucket.end();
}

3.6 删除操作

删除操作同样使用哈希值定位桶，然后在对应的链表中删除元素。

template<typename T>
bool UnorderedSet<T>::erase(const T& value) {size_t index = hash(value);auto& bucket = table[index];auto it = std::find(bucket.begin(), bucket.end(), value);if (it == bucket.end()) {return false;}bucket.erase(it);--size;return true;
}

4. 模拟实现 `unordered_map`

4.1 键值对存储

unordered_map 与 unordered_set 类似，但存储的是键值对。

template<typename K, typename V>
class UnorderedMap {
public:// 构造函数UnorderedMap(size_t initialCapacity = 16, float loadFactor = 0.75f) : capacity(initialCapacity), size(0), loadFactorThreshold(loadFactor * initialCapacity) {table.resize(capacity);}// 插入操作std::pair<typename std::list<std::pair<K, V>>::iterator, bool> insert(const std::pair<K, V>& value);// 查找操作typename std::list<std::pair<K, V>>::iterator find(const K& key);// 删除操作bool erase(const K& key);private:// 重新哈希void rehash();// 计算哈希值size_t hash(const K& key) const;// 动态数组std::vector<std::list<std::pair<K, V>>> table;// 当前容量size_t capacity;// 当前元素数量size_t size;// 负载因子阈值size_t loadFactorThreshold;
};template<typename K, typename V>
size_t UnorderedMap<K, V>::hash(const K& key) const {static std::hash<K> hasher;return hasher(key) % capacity;
}

4.2 插入操作

插入操作与 unordered_set 类似，只是插入的是键值对。

template<typename K, typename V>
std::pair<typename std::list<std::pair<K, V>>::iterator, bool> UnorderedMap<K, V>::insert(const std::pair<K, V>& value) {if (size >= loadFactorThreshold) {rehash();}size_t index = hash(value.first);auto& bucket = table[index];auto it = std::find_if(bucket.begin(), bucket.end(),[&value](const std::pair<K, V>& pair) { return pair.first == value.first; });if (it != bucket.end()) {return {it, false};}bucket.push_back(value);++size;return {std::prev(bucket.end()), true};
}

4.3 查找操作

查找操作与 unordered_set 类似，只是返回的是键值对的迭代器。

template<typename K, typename V>
typename std::list<std::pair<K, V>>::iterator UnorderedMap<K, V>::find(const K& key) {size_t index = hash(key);auto& bucket = table[index];return std::find_if(bucket.begin(), bucket.end(),[&key](const std::pair<K, V>& pair) { return pair.first == key; });
}

4.4 删除操作

删除操作与 unordered_set 类似，只是删除的是键值对。

template<typename K, typename V>
bool UnorderedMap<K, V>::erase(const K& key) {size_t index = hash(key);auto& bucket = table[index];auto it = std::find_if(bucket.begin(), bucket.end(),[&key](const std::pair<K, V>& pair) { return pair.first == key; });if (it == bucket.end()) {return false;}bucket.erase(it);--size;return true;
}

5. 代码示例

下面是一个简单的测试示例，展示如何使用模拟实现的 UnorderedSet 和 UnorderedMap。

#include <iostream>
#include "unordered_set.h"
#include "unordered_map.h"int main() {UnorderedSet<int> uset;uset.insert(10);uset.insert(20);uset.insert(30);std::cout << "Contains 20: " << uset.find(20) << std::endl;UnorderedMap<int, std::string> umap;umap.insert({10, "ten"});umap.insert({20, "twenty"});umap.insert({30, "thirty"});auto it = umap.find(20);if (it != umap.end()) {std::cout << "Value for 20: " << it->second << std::endl;}return 0;
}