heap_2">heap算法概述
heap的内部是一个完全二叉树,将极值存放在根节点。这个里的极值可分为最大值、最小值。根据极值意义的不同可分为大堆和小堆。为了方便解释,以下的讲解内容都以大堆(max-heap)的例子讲解。
大顶堆:
根节点(堆顶元素)是所有节点中的最大值(父节点都大于左右子节点)。
小顶堆:
小顶堆中的根节点是所有节点中的最小值(父节点都小于左右子节点)。
如上图所示,A即为这个列表中的极值。在查找极值时极为方便。同时因为是一个完全二叉树,所以除了叶子节点外,其他的节点都存在。因此可以通过索引值,来找到父节点和左右子节点之间的关系。当父节点索引值为i时,左子节点索引值2i+1,右子节点索引值为2i+2。
具体原理可查看此博客:堆排序中 i 位置的节点的子节点位置为 2i+1, 2i+2, 父节点为 (i-1) / 2
heap__12">push_heap 插入元素
数据插入示意图
将要插入的元素,放入容器的尾端。通过调用push_heap()函数,将尾端的元素放入到完全二叉树的合适位置。
//计算元素之间的距离
template<class ForwardIterator>
typename std::iterator_traits<ForwardIterator>::difference_type Distance(ForwardIterator first, ForwardIterator last)
{typename std::iterator_traits<ForwardIterator>::difference_type n = 0;while (first != last){++first;++n;}return n;
}//获取元素类型
template<class Iterator>
inline typename std::iterator_traits<Iterator>::value_type* value_Type(const Iterator&)
{return static_cast<std::iterator_traits<Iterator>::value_type *> (0);
}//获取表示距离的类型
template<class Iterator>
inline typename std::iterator_traits<Iterator>::difference_type* distance_type(const Iterator&)
{return static_cast<std::iterator_traits<Iterator>::difference_type *> (0);
}
template<class RandomAcessIterator, class Distance, class T>
void _push_heap(RandomAcessIterator first, Distance holeIndex, Distance topIndex, T value)
{Distance parent = (holeIndex - 1) / 2; //找到父节点while (holeIndex > topIndex && *(first + parent) < value) //当索引值不为根节点时,或者父节点小于插入空节点时{*(first + holeIndex) = *(first + parent);//父节点的值复制给空节点holeIndex = parent; //父节点索引值复制给空节点parent = (holeIndex-1) / 2; //查找此时空节点的父节点}*(first + holeIndex) = value; //将值复制到此时的空节点上
}template<class RandomAcessIterator, class Distance, class T>
inline void _push_heap_aux(RandomAcessIterator first, RandomAcessIterator last, Distance*, T*)
{int count = last - first;_push_heap(first, Distance((last - first) - 1), Distance(0), T(*(last - 1)));
}template <class RandomAcessIterator>
inline void USD_push_heap(RandomAcessIterator first, RandomAcessIterator last)
{_push_heap_aux(first, last, distance_type(first), value_Type(first));
}
这是是一个插入操作,将数据写入到字符串的尾端。找到尾端所对应的父节点,进行对比。如果大于父节点,两个节点对调。找到根节点或者插入空节点小于父节点时,循环操作结束。此时插入的值被放入合适的位置。元素的插入,可参考示意图的操作。
heap__73">pop_heap 取出根节点元素
调用 函数pop_heap(),会将根节点的值取出来,存在放在容器的尾端。同时会按照完全二叉树的规则重新移动剩余的元素,找出剩余元素中的极值,存放在根节点中,构建出一个新的完全二叉树。此时要取出的元素位于尾部,再通过容器自身的操作,取出尾部的元素。
元素取出示意图
template<class RandomAcessIterator, class Distance, class T>
void _adjust_heap(RandomAcessIterator first, Distance holeIndex, Distance len, T value)
{Distance topIndex = holeIndex; //根节点Distance sencondChild = 2 * holeIndex + 2; //右子节点while (sencondChild < len){if (*(first + sencondChild) < *(first + sencondChild - 1)) //右子节点与左子节点进行比较{sencondChild--; //通过移动,找到左子节点}*(first + holeIndex) = *(first + sencondChild);//将较大的值复制给空值元素所在的位置holeIndex = sencondChild; //空值元素移动到较大值元素位置上sencondChild = 2 * sencondChild + 2; //再次查找对应的右子节点}if (sencondChild == len) //此时右子节点等于树的长度,那么右子树不存在,将左子树元素复制给空值元素{*(first + holeIndex) = *(first + sencondChild-1);holeIndex = sencondChild-1;}_push_heap(first, holeIndex, topIndex,value);//将原二叉树的尾部元素,插入到现在空值元素位置上
}template<class RandomAcessIterator,class T,class Distance>
inline void _pop_heap(RandomAcessIterator first, RandomAcessIterator last, RandomAcessIterator result,T value,Distance *)
{*result = *first;_adjust_heap(first, Distance(0), Distance(last - first), value);
}template<class RandomAcessIterator,class T>
inline void _pop_heap_aux(RandomAcessIterator first, RandomAcessIterator last,T*)
{_pop_heap(first, last-1, last - 1, T(*(last - 1)), distance_type(first));
}template<class RandomAcessIterator>
inline void USD_pop_heap(RandomAcessIterator first, RandomAcessIterator last)
{_pop_heap_aux(first,last,value_Type(first));
}
此算法中的核心逻辑调用_adjust_heap(),首先holeIndex = 0,此时空值元素代表的是根节点,找到根节点对应的右子节点,根节点的右子节点和左子节点进行相比。空值元素与相比后较大值元素进行对换,此时较大值元素就作为了根节点,同时空值元素,移动到了较大值的位置。重复上述操作在继续查找控制元素的右子节点,与它的左子节点进行对比。只到所查值的右子节点索引值大于树的长长度。则停止查找。查找结束后,将原完全二叉树的尾节点,复制到空值所在的位置。操作流程,可参考此函数的示意图。
heap__124">sort_heap 按极值存放元素
将元素按照极值,从尾到前开始存放元素
template<class RandomAcessIterator>
void USD_sort_heap(RandomAcessIterator first, RandomAcessIterator last)
{while (last - first>1){USD_pop_heap(first,last--);}
}
通过使用pop_heap()操作,将结果存放在尾部,同时缩短last的索引值。这样每pop_heap一次,就会构造一个新的二叉树,同时尾部会存放上一个二叉树的根节点。
heap_heap_142">make_heap 将一段现有数据构造成heap
容器中存放一段数据,通过make操作后,会更改容器中元素的位置。从而满足 堆的要求。
template<class RandomAcessIterator,class T,class Distance>
void _make_heap(RandomAcessIterator first, RandomAcessIterator last, T*, Distance*)
{if (last - first<2){return;}Distance len = last - first;Distance parent = (len - 2) / 2; //最后一个父节点while (true){_adjust_heap(first, parent, len, *(first+ parent));if (parent == 0){return;}parent--;}
}template<class RandomAcessIterator>
inline void USD_make_heap(RandomAcessIterator first, RandomAcessIterator last)
{_make_heap( first,last, value_Type(first), distance_type(first));
}
此算法的思路,首先找到最后一个父节点。通过调用_adjust_heap()函数,来调整此时的父节点和左右两个子节点之间的关系。来满足heap的条件。当此节点和其子节点调整结束后,通过移动parent索引值,找到下一个父节点再重复上一个操作。当查找到根节点即索引值为0时,heap构建完成。
程序测试
//heap.h
#pragma once
#include <algorithm>template<class ForwardIterator>
typename std::iterator_traits<ForwardIterator>::difference_type Distance(ForwardIterator first, ForwardIterator last)
{typename std::iterator_traits<ForwardIterator>::difference_type n = 0;while (first != last){++first;++n;}return n;
}template<class Iterator>
inline typename std::iterator_traits<Iterator>::value_type* value_Type(const Iterator&)
{return static_cast<std::iterator_traits<Iterator>::value_type *> (0);
}template<class Iterator>
inline typename std::iterator_traits<Iterator>::difference_type* distance_type(const Iterator&)
{return static_cast<std::iterator_traits<Iterator>::difference_type *> (0);
}template<class RandomAcessIterator, class Distance, class T>
void _push_heap(RandomAcessIterator first, Distance holeIndex, Distance topIndex, T value)
{Distance parent = (holeIndex - 1) / 2;while (holeIndex > topIndex && *(first + parent) < value){*(first + holeIndex) = *(first + parent);holeIndex = parent;parent = (holeIndex-1) / 2;}*(first + holeIndex) = value;
}template<class RandomAcessIterator, class Distance, class T>
inline void _push_heap_aux(RandomAcessIterator first, RandomAcessIterator last, Distance*, T*)
{int count = last - first;_push_heap(first, Distance((last - first) - 1), Distance(0), T(*(last - 1)));
}template <class RandomAcessIterator>
inline void USD_push_heap(RandomAcessIterator first, RandomAcessIterator last)
{_push_heap_aux(first, last, distance_type(first), value_Type(first));
}template<class RandomAcessIterator, class Distance, class T>
void _adjust_heap(RandomAcessIterator first, Distance holeIndex, Distance len, T value)
{Distance topIndex = holeIndex;Distance sencondChild = 2 * holeIndex + 2;while (sencondChild < len){if (*(first + sencondChild) < *(first + sencondChild - 1)){sencondChild--;}*(first + holeIndex) = *(first + sencondChild);holeIndex = sencondChild;sencondChild = 2 * sencondChild + 2;}if (sencondChild == len){*(first + holeIndex) = *(first + sencondChild-1);holeIndex = sencondChild-1;}_push_heap(first, holeIndex, topIndex,value);
}template<class RandomAcessIterator,class T,class Distance>
inline void _pop_heap(RandomAcessIterator first, RandomAcessIterator last, RandomAcessIterator result,T value,Distance *)
{*result = *first;_adjust_heap(first, Distance(0), Distance(last - first), value);
}template<class RandomAcessIterator,class T>
inline void _pop_heap_aux(RandomAcessIterator first, RandomAcessIterator last,T*)
{_pop_heap(first, last-1, last - 1, T(*(last - 1)), distance_type(first));
}template<class RandomAcessIterator>
inline void USD_pop_heap(RandomAcessIterator first, RandomAcessIterator last)
{_pop_heap_aux(first,last,value_Type(first));
}template<class RandomAcessIterator>
void USD_sort_heap(RandomAcessIterator first, RandomAcessIterator last)
{while (last - first>1){USD_pop_heap(first,last--);}
}template<class RandomAcessIterator,class T,class Distance>
void _make_heap(RandomAcessIterator first, RandomAcessIterator last, T*, Distance*)
{if (last - first<2){return;}Distance len = last - first;Distance parent = (len - 2) / 2; //最后一个父节点while (true){_adjust_heap(first, parent, len, *(first+ parent));if (parent == 0){return;}parent--;}
}template<class RandomAcessIterator>
inline void USD_make_heap(RandomAcessIterator first, RandomAcessIterator last)
{_make_heap( first,last, value_Type(first), distance_type(first));
}
#include "heap.h"
#include <vector>int main()
{int a[5] = {1,2,3,4,5};std::vector<int> vecTest(a,a+5);USD_make_heap(vecTest.begin(), vecTest.end()); //5,4,3,1,2 vecTest.push_back(7);USD_push_heap(vecTest.begin(), vecTest.end()); //7,4,5,1,2,3USD_pop_heap(vecTest.begin(), vecTest.end());//5,4,3,1,2,7vecTest.pop_back();//5,4,3,1,2USD_sort_heap(vecTest.begin(), vecTest.end());//1,2,3,4,5return 0;
}
为了方便,上述实现都采用了“<”作为比较操作符。这块比较操作,可封装成仿函数,来满足用户自定义的比较操作。注意:写仿函数时,注意返回的bool值。从而决定构建的是max-heap或者min-heap。
如果想了解,算法封装仿函数操作可参考:STL算法详细解剖——单纯数据处理函数
此文章,列举了相同算法,封装仿函数与不封装仿函数两种形式。未了解过仿函数的读者,可做初步的了解。