Module crater::fheap


A Fibonacci heap is a type of data structure optimized for use as a priority queue. It is commonly useful in pathfinding algorithms like Dijkstra’s and A*.

Below is an in-depth comparison and analysis of various types of heaps; skip to FibHeap if you only want to know how to use this library. Reading the tests is also a great way to find examples. Building your code as tests helps debug Fibonacci heap code in particular, because additional checks are performed under cfg(test) that will abort if the heap breaks (e.g. because a duplicate node was added, a node that wasn’t in the heap was removed, or there is a bug in the library).

Depending on your needs, crate::mmheap::MmHeap may be more useful to you, as may Rust’s built-in std::collections::BinaryHeap. Both of these are binary heaps, but the former offers fast access to both the minimum and the maximum.
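For reference, the standard library heap is used like this (std API only, nothing from this crate); BinaryHeap is a max-heap, so wrapping keys in Reverse gives min-queue behavior:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn main() {
    let mut pq = BinaryHeap::new();
    pq.push(Reverse((3, "c"))); // (priority, payload)
    pq.push(Reverse((1, "a")));
    pq.push(Reverse((2, "b")));
    // Popping yields the smallest priority first thanks to Reverse.
    assert_eq!(pq.pop(), Some(Reverse((1, "a"))));
}
```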

§Comparison of Heaps

§Asymptotic Time

| Operation | Binary | Pairing | Fibonacci | Bucket |
|---|---|---|---|---|
| find-min | O(1) | O(1) | O(1) | O(1) |
| extract-min | O(log n) | O(log n) | O(log n) | O(1) |
| insert | O(log n) | O(1) | O(1) | O(1) |
| decrease-key | O(log n) | O(log n) | O(1) | O(1) |
| increase-key | O(log n) | O(log n) | O(log n) | O(1) |
| meld | O(n) | O(1) | O(1) | O(n) |
Not all of these operations are directly supported. In Fibonacci heaps, increase-key must be implemented by removing and re-inserting an element. In binary heaps, decrease-key/increase-key can be difficult to implement because the elements live in the heap’s internal buffer and there is no direct way to get a reference or index to them. Also, in binary heaps, meld is not directly supported and must be implemented by extending the internal buffer of one heap with that of another and re-heapifying (see the sketch below). Bucket queues are extremely effective when applicable, but they only support a fixed number of priorities, literally having a bucket for each priority, so they are not always suitable.
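To illustrate the meld point for binary heaps, the standard library’s BinaryHeap::append moves the elements of one heap into the other and restores the heap property (std API, not this crate):

```rust
use std::collections::BinaryHeap;

fn main() {
    let mut a: BinaryHeap<i32> = [1, 4, 7].into_iter().collect();
    let mut b: BinaryHeap<i32> = [2, 3, 9].into_iter().collect();
    // Meld by draining `b` into `a`; in the worst case this is O(n) because
    // the combined buffer may need to be re-heapified.
    a.append(&mut b);
    assert!(b.is_empty());
    assert_eq!(a.pop(), Some(9)); // max-heap: largest element comes out first
}
```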

§Linked vs Unlinked Data Structures

Asymptotic time complexity is not the whole story. Fibonacci heaps and pairing heaps are linked data structures, so links to adjacent nodes are stored as pointers, whereas binary heaps have such a simple, rigid structure that they can be stored directly in an array (e.g. the children of the node at index i are at indices 2i + 1 and 2i + 2). Linked data structures can still be stored in an array, but the offsets between nodes are not consistent, so pointers are generally used, which introduces overhead in several ways. As a result, linked data structures often underperform “rigid” data structures even though their asymptotic complexity is better. And even among linked data structures, Fibonacci heaps are considered complex and may perform worse than pairing heaps.
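For example, the index arithmetic of an array-backed binary heap needs no pointers at all (a generic illustration, not this crate’s code):

```rust
// Children and parent of the node stored at index `i` in a flat array.
fn children(i: usize) -> (usize, usize) {
    (2 * i + 1, 2 * i + 2)
}

fn parent(i: usize) -> usize {
    (i - 1) / 2 // only valid for i > 0 (the root has no parent)
}

fn main() {
    assert_eq!(children(0), (1, 2));
    assert_eq!(children(1), (3, 4));
    assert_eq!(parent(4), 1);
}
```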

§Amortized Running Time

The runtimes of Fibonacci heaps are amortized using a potential function, which means that if the number of other calls is far greater than the number of extract-min calls, the extract-min calls may take much longer because they pick up the deferred work. This doesn’t matter in most use cases, since a large part of the heap is generally drained in algorithms like heapsort, A*, etc. When a large part of the heap will not be drained, a minmax heap can be a good choice, since it allows efficiently dropping entries that will never be used. In particular, if the number of entries that will be extracted from the heap is small and constant, a minmax heap will probably be best. If the number of entries that will be extracted is proportional to the number inserted and/or large, a Fibonacci heap may be the best.

§Explanation of Algorithms

There are two operations (extract-min and decrease-key) that are central to Fibonacci heaps; all other operations are trivial. A Fibonacci heap consists of a collection of subheaps, each of which is a tree whose nodes can have any number of children. This is implemented by giving each node 4 pointers: its parent, its first child, its next sibling, and its previous sibling. For the subheap roots, the parent is null and the next/previous siblings are instead other roots. Each node also stores its degree (number of children) and whether or not it has had a child removed.

The heap itself stores a pointer to the minimal root, and the number of elements in the heap.
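A simplified sketch of that layout is shown below; the field names are illustrative and are not the crate’s actual definitions:

```rust
// Illustrative layout only: these names are hypothetical, not the crate's
// actual definitions of RawNode/FibHeap.
#[allow(dead_code)]
struct Node<T> {
    key: T,
    parent: *mut Node<T>,      // null for subheap roots
    first_child: *mut Node<T>, // null for leaves
    next: *mut Node<T>,        // next sibling (next root, for roots)
    prev: *mut Node<T>,        // previous sibling (previous root, for roots)
    degree: usize,             // number of children
    marked: bool,              // lost a child since it last became a child?
}

#[allow(dead_code)]
struct FibHeapSketch<T> {
    min_root: *mut Node<T>, // null when the heap is empty
    len: usize,             // total number of elements
}

fn main() {
    let empty: FibHeapSketch<i32> = FibHeapSketch {
        min_root: std::ptr::null_mut(),
        len: 0,
    };
    assert!(empty.min_root.is_null() && empty.len == 0);
}
```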

This structure is maintained under two invariants. First, each node is less than or equal to all of its children according to the heap comparison function. Second, the total size of a subheap whose root has k children is at least F_(k+2), where F_k is the kth Fibonacci number, with F_1 = F_2 = 1, F_3 = 2, F_4 = 3, etc.
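Concretely, this size bound grows exponentially in the degree, which is what bounds the maximum degree (and the consolidation array below) by O(log n). A small illustrative helper, not part of the crate’s API:

```rust
// Minimum number of nodes in a subheap whose root has `k` children,
// i.e. F_(k+2) with F_1 = F_2 = 1 (illustrative helper, not crate API).
fn min_subheap_size(k: u32) -> u64 {
    let (mut a, mut b) = (1u64, 1u64); // F_1, F_2
    for _ in 0..k {
        let next = a + b;
        a = b;
        b = next;
    }
    b
}

fn main() {
    assert_eq!(min_subheap_size(0), 1); // a lone root
    assert_eq!(min_subheap_size(1), 2);
    assert_eq!(min_subheap_size(4), 8); // F_6 = 8
}
```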

When extract-min is called, first we remove the minimal root. Then we iterate over the roots of all other subheaps, plus the children of the removed root, placing them into an array with O(log n) entries. Each such node goes into the index in the array corresponding to its degree. If there is already a subheap at that index, we instead remove that subheap, merge it with the current one, and keep going. When we merge two subheaps, the one with the larger root is added as a child of the one with the smaller root, so the degree of the latter goes up, and then we try to put it at the next corresponding index in the array, repeating until we reach a degree that isn’t occupied yet.

At the end, this gives us an array of subheaps with distinct degrees. We re-link these into a circular doubly linked list and update the minimal root pointer to point to the minimal one.

If all subheaps obey the size >= F_(degree + 2) invariant before extract-min, they will also obey it after.

If there are two or fewer nodes in the heap before we extract-min, we can optimize by not merging subheaps and just setting the new minimal root to either the one remaining node or null.
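Below is a minimal owned-tree sketch of the consolidation step just described; the tree representation and function names are illustrative, and the real implementation relinks the pointer-based nodes in place:

```rust
// Merge roots of equal degree until all remaining roots have distinct degrees.
struct Tree {
    key: u32,
    children: Vec<Tree>,
}

fn link(mut a: Tree, mut b: Tree) -> Tree {
    // The larger root becomes a child of the smaller root.
    if b.key < a.key {
        std::mem::swap(&mut a, &mut b);
    }
    a.children.push(b);
    a
}

fn consolidate(roots: Vec<Tree>) -> Vec<Tree> {
    let mut by_degree: Vec<Option<Tree>> = Vec::new();
    for mut t in roots {
        loop {
            let d = t.children.len();
            if by_degree.len() <= d {
                by_degree.resize_with(d + 1, || None);
            }
            match by_degree[d].take() {
                Some(other) => t = link(t, other), // same degree: merge and retry
                None => {
                    by_degree[d] = Some(t);
                    break;
                }
            }
        }
    }
    by_degree.into_iter().flatten().collect()
}

fn main() {
    let roots = vec![
        Tree { key: 5, children: vec![] },
        Tree { key: 2, children: vec![] },
        Tree { key: 9, children: vec![] },
        Tree { key: 1, children: vec![] },
    ];
    // Four degree-0 roots collapse into fewer roots with distinct degrees.
    let merged = consolidate(roots);
    assert!(merged.len() < 4);
}
```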

When decrease-key is called, several cases require no restructuring. If the node has a parent and the new key is still greater than or equal to the parent’s key, there is nothing to be done. Likewise, if the node is a root and is either the minimal root or still greater than or equal to the minimal root’s key, there is nothing to be done. If the node is a root other than the minimal root and its new key is now less than the minimal root’s key, we simply update the minimal root pointer.

Finally, if the node is not a root and its key is now less than its parent’s key, remove the node from its sibling list and from its parent’s child list, and add it as a new subheap root. Then consider the parent: if it was not already marked as having had a child removed, mark it and stop (root nodes are never marked, so a root parent is simply left alone). If it was already marked, cut it as well, clearing its mark as it becomes a new subheap root, and repeat with its parent.

This marking is needed to ensure that the size >= F_(degree + 2) invariant does not break.
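Here is an index-based sketch of the cut-and-cascade rule described above (illustrative only; the crate operates on its linked nodes, and a real cut also removes the node from the parent’s child list and decrements the parent’s degree):

```rust
struct Node {
    parent: Option<usize>,
    marked: bool,
}

// Cut `i` from its parent and make it a root, then walk up performing
// cascading cuts: an unmarked parent is marked and the walk stops, while a
// marked parent is cut as well. Roots are never marked.
fn cut_and_cascade(nodes: &mut Vec<Node>, roots: &mut Vec<usize>, mut i: usize) {
    while let Some(p) = nodes[i].parent {
        nodes[i].parent = None;
        nodes[i].marked = false;
        roots.push(i);
        if nodes[p].parent.is_none() {
            break; // parent is a root: roots are never marked
        }
        if !nodes[p].marked {
            nodes[p].marked = true; // first child lost: mark and stop
            break;
        }
        i = p; // parent already lost a child: cut it too
    }
}

fn main() {
    // Chain 0 <- 1 <- 2, where node 1 has already lost a child (marked).
    let mut nodes = vec![
        Node { parent: None, marked: false },
        Node { parent: Some(0), marked: true },
        Node { parent: Some(1), marked: false },
    ];
    let mut roots = vec![0];
    cut_and_cascade(&mut nodes, &mut roots, 2);
    // Node 2 is cut, and its marked parent 1 is cut (and unmarked) as well.
    assert_eq!(roots, vec![0, 2, 1]);
    assert!(!nodes[1].marked);
}
```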

Insert and meld just require adding a new subheap with one element, and stitching together the subheap root linked lists, respectively.
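For meld, stitching two circular doubly linked root lists together is a constant-time splice; an index-based sketch (illustrative, the crate links nodes by pointer):

```rust
struct Ring {
    next: Vec<usize>,
    prev: Vec<usize>,
}

// Join the ring containing `a` with the ring containing `b` in O(1):
// cut each ring open after `a` and after `b`, then cross-link the ends.
fn splice(ring: &mut Ring, a: usize, b: usize) {
    let a_next = ring.next[a];
    let b_next = ring.next[b];
    ring.next[a] = b_next;
    ring.prev[b_next] = a;
    ring.next[b] = a_next;
    ring.prev[a_next] = b;
}

fn main() {
    // Ring one: 0 <-> 1 (circular). Ring two: just node 2.
    let mut ring = Ring {
        next: vec![1, 0, 2],
        prev: vec![1, 0, 2],
    };
    splice(&mut ring, 0, 2);
    // All three nodes are now on a single circular list: 0 -> 2 -> 1 -> 0.
    assert_eq!(ring.next, vec![2, 0, 1]);
    assert_eq!(ring.prev, vec![1, 2, 0]);
}
```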

Find-min just requires inspecting the min_root pointer.

§Further Reading

  • Wikipedia
  • 3b1b-style video by SithDev
  • priority-queue — A very popular priority queue crate. It implements priority queues using a hash table to allow key-indexed node access, and builds the priority queue on top of its internal hash table. It has a good API and is a good choice most of the time. Crater has some advantages over it: our Fibonacci heap allows building priority queues with any backing store (e.g. KD trees, vectors, or simply putting each node in a Box), and references to nodes have generous lifetimes, so storing references instead of keys is possible and makes lookup as fast as possible. However, unless you need these features, priority-queue is a better choice than Crater.

Structs§

  • FibHeap — A Fibonacci heap is a type of data structure optimized for use as a priority queue. It is commonly useful in pathfinding algorithms like Dijkstra’s and A*. See the comparison above for its amortized time complexities.
  • RawNode — The intrusive struct itself that should be embedded in any type that implements Node. See FibHeap for more information.

Traits§

  • Node — Any struct can be used as the Fibonacci heap element type simply by embedding a RawNode in it (or wrapping it in a struct containing a RawNode) and implementing this trait.