Number Theory Utils 0.2.0
|
This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
Dirichlet Hyperbola based functions for computing sums of multiplicative functions at a single value quickly
For a function f defined over the natural numbers, we say f is multiplicative if f(ab) = f(a)f(b) whenever a and b are coprime (have gcd 1). There are many functions in number theory that satisfy this condition, with one of the most well known being euler's phi function.
A brief list of common multiplicative functions is: (see here https://en.wikipedia.org/wiki/Multiplicative_function) The dirichlet convolution identity (explained later): I(n) = 1 if n == 1; 0 otherwise Sometimes written epsilon(n), (very confusingly) u(n), or delta(n, 0) (since it is just a kroneker delta function of course) The constant function: u(n) = 1 Sometimes written 1(n) The identity function: N(n) = n Sometimes written Id(n) or id(n) The power functions: N_k(n) = n**k Sometimes written Id_k(n) The divisor count function: d(n) = #{k | n} (the number of natural numbers including 1 and n which divide n) The generalized divisor functions: d_k(n) = # k-tuples ks of natural numbers such that ks multiplies out to n The divisor power sum functions: sigma_a(n) = sum(k | n, k**a) (note that a can be any real number) The mobius function: mu(n) = 0 if n is not squarefree; (-1)**Omega(n) otherwise, where Omega(n) is the number of prime factors of n with multiplicity Euler's totient function: phi(n) = #{k = 1 ... n : (k coprime n)} Any constant raised to the power of Omega(n) or omega(n) (the number of prime factors of n counted with and without multiplicity respectively) The number of non-isomorphic abelian groups of order n: a(n) The Ramanujan tau function: tau(n) The unitary divisor power sums: sigma_a^*(n) = sum(d | n where (d coprime n/d), d**a) All dirichlet characters, including the legendre symbol (n/p) for fixed p and gcd(n, m) for fixed m
Some of these are fully multiplicative, meaning not only do they satisfy f(ab) = f(a)f(b) for a, b coprime, but they satisfy it for all a, b. In particular, I, u, N, and N_k are fully multiplicative.
We often want to find the sum of a multiplicative function, often denoted by a capital version, for instance phi and Phi, f and F (for a general function, etc). We also often want to be able to analyze multiplicative functions in terms of simpler functions.
Multiplicative functions are fully determined by their values at prime powers, which often leads to efficient sieve based approaches to calculating them. This header provides some functions for doing this, the nut_euler_sieve family of functions, although these are mainly intended as low level subroutines for the upcoming Dirichlet Hyperbola algorithm.
However, if we only want to compute the sum of a multiplicative function up to some value n, sieve based methods will basically be O(nlogn), which seems very bad considering how structured multiplicative functions are.
Indeed, we can often do better, in particular when we want to find the sum H(n) of some multiplicative function h(n) defined as h = f <*> g where <*> denotes Dirichlet convolution. What is Dirichlet convolution? It is an operation for combining functions, similar to ordinary multiplication, division, addition, subtraction, composition, and so on. It is defined as (f <*> g)(n) = sum(k | n, f(k)g(n/k)). This is mostly useful because when f and g are multiplicative f <*> g will be as well, and it lets us build up more complicated functions in terms of simpler ones. (see here https://en.wikipedia.org/wiki/Dirichlet_convolution)
Obviously, not all operations on multiplicative functions will produce multiplicative functions. For example, if f is multiplicative, 2f is not, so multiplicative functions are not closed under scalar multiplication, and thus they are not closed under addition. They are closed under multiplication and dirichlet convolution though.
The Dirichlet Hyperbola algorithm lets us rewrite H(x) = sum(n <= x, h(n)) in terms of F and G as well as f and g, but crucially it lets us evaluate them at fewer values.
In particular, sum(n <= x, (f <*> g)(n)) = sum(n <= x, ab = n, f(a)g(b)) = sum(ab <= x, f(a)g(b)) = sum(a <= A, b <= x/a, f(a)g(b)) + sum(b <= B, a <= x/b, f(a)g(b)) - sum(a <= A, b <= B, f(a)g(b)) = sum(a <= A, f(a)G(x/a)) + sum(b <= B, F(x/b)g(b)) - F(A)G(B) where AB = x. (see here https://gbroxey.github.io/blog/2023/04/30/mult-sum-1.html) (see here https://angyansheng.github.io/blog/dirichlet-hyperbola-method) (you may also find this instructive https://en.wikipedia.org/wiki/Marginal_distribution) To accomplish this rearrangement, we first expanded (f <*> g) using the definition of dirichlet convolution. Then, we interpreted the sum both as an a-indexed sum and as a b-indexed sum. We could find the sum either way, but considering both sets us up to cut the work down quadratically. These two ways of interpreting the sum are EXACTLY EQUIVALENT to the trick of transforming an integral of f(x) in terms of x into an integral of f_inverse(y) in terms of y. For integrals, it is common to just pick whichever interpretation is easier, but for our sum, we benefit from realizing that if we sum f(a)g(b) for (a <= A, b <= x/a) and then separately for (b <= B, a <= x/b), then we double count the rectangular region (a <= A, b <= B). But more importantly, if we pick A = B = sqrt(x), then we've turned one sum over (f <*> g) for n from 1 to x, which we could accomplish in about O(xlogx), into two sums over f*G and F*g for n from 1 to sqrt(x), which we can sometimes accomplish in O(sqrt(x)).
There is a big catch though, we now need to know values of F and G as well as f and g. Wonderfully though, we only need to know F(x/n) and G(x/n) for integral values of n, and floor(x/n) has at most 2sqrt(x) distinct values. That, arguably, is the real magic, since it lets us store only O(sqrt(x)) values of F and G. When we want to compute only H(x), this is fine, but when we want to repeatedly compute the table of values of H(x/n) for integral values of n, it is often better to make A larger than B, eg A = x**(2/3) and B = x**(1/3).
Definition in file dirichlet.h.
Go to the source code of this file.
Data Structures | |
struct | nut_Diri |
Wrapper to hold values of some multiplicative function. More... | |
Functions | |
NUT_ATTR_CONST uint64_t | nut_dirichlet_D (uint64_t max, uint64_t m) |
Compute the sum of the divisor count function d from 1 to max. | |
NUT_ATTR_CONST uint64_t | nut_dirichlet_Sigma (uint64_t max, uint64_t m) |
Compute the sum of the divisor sum function sigma from 1 to max. | |
bool | nut_euler_sieve_conv_u (int64_t n, int64_t m, const int64_t f_vals[static n+1], int64_t f_conv_u_vals[restrict static n+1]) |
Given a table of values of a multiplicative function f, compute (f <*> u)(x) for all x from 1 to n See nut_euler_sieve_conv_u for more info}. | |
bool | nut_euler_sieve_conv_N (int64_t n, int64_t m, const int64_t f_vals[static n+1], int64_t f_conv_N_vals[restrict static n+1]) |
Given a table of values of a multiplicative function f, compute (f <*> N)(x) for all x from 1 to n See nut_euler_sieve_conv_u for more info}. | |
bool | nut_euler_sieve_conv (int64_t n, int64_t m, const int64_t f_vals[static n+1], const int64_t g_vals[static n+1], int64_t f_conv_g_vals[restrict static n+1]) |
Given tables of values of multiplicative functions f and g, compute (f <*> g)(x) for all x from 1 to n This uses Euler's sieve, a variant of the sieve of Eratosthenes that only marks off each multiple once. | |
bool | nut_Diri_init (nut_Diri *self, int64_t x, int64_t y) |
Allocate internal buffers for a diri table self->buf will have f(0) through f(y) at indicies 0 through y, and then f(x/1) through f(x/(yinv - 1)) at indicies y + 1 through y + yinv - 1. | |
void | nut_Diri_copy (nut_Diri *restrict dest, const nut_Diri *restrict src) |
Copy the values from one diri table to another, which must be initialized. | |
void | nut_Diri_destroy (nut_Diri *self) |
Deallocate internal buffers for a diri table. | |
void | nut_Diri_compute_I (nut_Diri *self) |
Just memset's the dense part, then sets index 1 and y + 1 through y + yinv - 1 to 1 (remember the sparse indicies are sums) | |
void | nut_Diri_compute_u (nut_Diri *self, int64_t m) |
Compute the value table for the unit function u(n) = 1 Fills the dense part of the table with 1s, and computes the sparse entries with table[y + k] = U(x/k) = x/k. | |
void | nut_Diri_compute_N (nut_Diri *self, int64_t m) |
Compute the value table for the identity function N(n) = n Fills the dense part of the table with increasing numbers, and computes the sparse entries with table[y + k] = sum_N(v = x/k) = v*(v+1)/2. | |
void | nut_Diri_compute_mertens (nut_Diri *restrict self, int64_t m, const uint8_t mobius[restrict static self->y/4+1]) |
Compute the value table for the mobius function mu(n) (whose sum is called the Mertens function) Requires both an initialized nut_Diri from nut_Diri_init and a packed table of mobius values from nut_sieve_mobius . | |
bool | nut_Diri_compute_dk (nut_Diri *restrict self, uint64_t k, int64_t m, nut_Diri *restrict f_tbl, nut_Diri *restrict g_tbl) |
Compute the value table for the kth generalized divisor function dk(n) dk(n) = da(n) <*> db(n) whenever a + b = k. | |
bool | nut_Diri_compute_Nk (nut_Diri *restrict self, uint64_t k, int64_t m) |
Compute the value table for the kth power function N^k(n) N^k(n) = n^k, so we can find the sums using https://en.wikipedia.org/wiki/Faulhaber%27s_formula (we actually do this by taking a lower triangular matrix of pascal's triangle and inverting it) This is one of the core components needed to compute the Dirichlet sum table for f where f(p) is a polynomial, the other one being https://en.wikipedia.org/wiki/Jordan%27s_totient_function. | |
bool | nut_Diri_compute_J (nut_Diri *restrict self, uint64_t p) |
Compute the value table for the Jacobi symbol jacobi(n/p) Currently, p MUST be and odd prime This is 1 if n is a quadratic residue (perfect square) mod p, -1 if it is not, or 0 if n is a multiple of p. | |
bool | nut_Diri_compute_Chi_balanced (nut_Diri *restrict self, uint64_t n, int64_t m, const int64_t period_tbl[restrict static n]) |
Compute the value table for a given Dirichlet Character Chi, whose total is 0 A dirichlet character is some n-periodic function Chi(x) which is. | |
bool | nut_Diri_compute_pi (nut_Diri *restrict self) |
Compute the value table for the prime counting function, aka prime pi The indicator function for primes is effectively multiplicative, and the prime counting function is its sum. | |
bool | nut_Diri_compute_conv_u (nut_Diri *restrict self, int64_t m, const nut_Diri *restrict f_tbl) |
Compute the value table for h = f <*> u, the dirichlet convolution of f and u (the unit function u(n) = 1), given the value table for f See nut_Diri_compute_conv for details, this is just that function but with several specializations due to u being very simple. | |
bool | nut_Diri_compute_conv_N (nut_Diri *restrict self, int64_t m, const nut_Diri *restrict f_tbl) |
Compute the value table for h = f <*> N, the dirichlet convolution of f and N (the identity function N(n) = n), given the value table for f See nut_Diri_compute_conv for details, this is just that function but with several specializations due to N being very simple. | |
bool | nut_Diri_compute_conv (nut_Diri *restrict self, int64_t m, const nut_Diri *f_tbl, const nut_Diri *g_tbl) |
Compute the value table for h = f <*> g, the dirichlet convolution of f and g, given value tables for the operands self must have been initialized using nut_Diri_init , and in particular the lengths and cutoffs for self, f_tbl, and g_tbl must be set and match This uses a sieve to compute h(1) through h(y), then uses dirichlet's hyperbola formula to compute H(x/1) through H(x/(yinv-1)) If one of the operands is a simple standard multiplicative function like the unit function u or the mobius function mu, then try to use a specialized function from this library and use that instead. | |
bool | nut_Diri_convdiv (nut_Diri *restrict self, int64_t m, const nut_Diri *restrict f_tbl, const nut_Diri *restrict g_tbl) |
Compute the value table for h such that f = g <*> h, aka h = f </> g where the division is in terms of dirichlet convolution self must have been initialized using nut_Diri_init , and in particular the lengths and cutoffs for self, f_tbl, and g_tbl must be set and match This is the inverse of <*>, in the sense that if h = f </> g, then h <*> g = f. | |
NUT_ATTR_CONST uint64_t nut_dirichlet_D | ( | uint64_t | max, |
uint64_t | m | ||
) |
Compute the sum of the divisor count function d from 1 to max.
[in] | max | inclusive upper bound of range to compute sum for |
[in] | m | modulus to reduce result by, or 0 to skip reducing |
NUT_ATTR_CONST uint64_t nut_dirichlet_Sigma | ( | uint64_t | max, |
uint64_t | m | ||
) |
Compute the sum of the divisor sum function sigma from 1 to max.
[in] | max | inclusive upper bound of range to compute sum for |
[in] | m | modulus to reduce result by, or 0 to skip reducing |
bool nut_euler_sieve_conv_u | ( | int64_t | n, |
int64_t | m, | ||
const int64_t | f_vals[static n+1], | ||
int64_t | f_conv_u_vals[restrict static n+1] | ||
) |
Given a table of values of a multiplicative function f, compute (f <*> u)(x) for all x from 1 to n See nut_euler_sieve_conv_u
for more info}.
[in] | n | inclusive upper bound of range to compute f <*> u over |
[in] | m | modulus to reduce results by, or 0 to skip reducing |
[in] | f_vals | table of values for f |
[out] | f_conv_u_vals | table to store values of f <*> u in |
bool nut_euler_sieve_conv_N | ( | int64_t | n, |
int64_t | m, | ||
const int64_t | f_vals[static n+1], | ||
int64_t | f_conv_N_vals[restrict static n+1] | ||
) |
Given a table of values of a multiplicative function f, compute (f <*> N)(x) for all x from 1 to n See nut_euler_sieve_conv_u
for more info}.
[in] | n | inclusive upper bound of range to compute f <*> N over |
[in] | m | modulus to reduce results by, or 0 to skip reducing |
[in] | f_vals | table of values for f |
[out] | f_conv_N_vals | table to store values of f <*> N in |
bool nut_euler_sieve_conv | ( | int64_t | n, |
int64_t | m, | ||
const int64_t | f_vals[static n+1], | ||
const int64_t | g_vals[static n+1], | ||
int64_t | f_conv_g_vals[restrict static n+1] | ||
) |
Given tables of values of multiplicative functions f and g, compute (f <*> g)(x) for all x from 1 to n This uses Euler's sieve, a variant of the sieve of Eratosthenes that only marks off each multiple once.
This generally has worse performance than the sieve of Eratosthenes, but some preliminary tests showed that Eratosthenes is only about 14% faster in release mode than Euler, so currently only Euler's sieve is implemented.
[in] | n | inclusive upper bound of range to compute f <*> g over |
[in] | m | modulus to reduce results by, or 0 to skip reducing |
[in] | f_vals | table of values for f |
[in] | g_vals | table of values for f |
[out] | f_conv_g_vals | table to store values of f <*> g in |
bool nut_Diri_init | ( | nut_Diri * | self, |
int64_t | x, | ||
int64_t | y | ||
) |
Allocate internal buffers for a diri table self->buf will have f(0) through f(y) at indicies 0 through y, and then f(x/1) through f(x/(yinv - 1)) at indicies y + 1 through y + yinv - 1.
The first two indicies, f(0) and f(1), are basically dummies though: f(0) should never be relied on, and f(1) should always be 1
[in] | x | inclusive upper bound of the domain of interest |
[in] | y | inclusive upper bound of the dense portion of the domain of interest. Will be increased to sqrt(x) if needed, so 0 can be used to explicitly signal you want that behavior |
void nut_Diri_compute_I | ( | nut_Diri * | self | ) |
Just memset's the dense part, then sets index 1 and y + 1 through y + yinv - 1 to 1 (remember the sparse indicies are sums)
[in,out] | self | the table to store the result in, and take the bounds from. Must be initialized |
void nut_Diri_compute_u | ( | nut_Diri * | self, |
int64_t | m | ||
) |
Compute the value table for the unit function u(n) = 1 Fills the dense part of the table with 1s, and computes the sparse entries with table[y + k] = U(x/k) = x/k.
[in,out] | self | the table to store the result in, and take the bounds from. Must be initialized |
[in] | m | modulus to reduce results by, or 0 to skip reducing |
void nut_Diri_compute_N | ( | nut_Diri * | self, |
int64_t | m | ||
) |
Compute the value table for the identity function N(n) = n Fills the dense part of the table with increasing numbers, and computes the sparse entries with table[y + k] = sum_N(v = x/k) = v*(v+1)/2.
[in,out] | self | the table to store the result in, and take the bounds from. Must be initialized |
[in] | m | modulus to reduce results by, or 0 to skip reducing |
void nut_Diri_compute_mertens | ( | nut_Diri *restrict | self, |
int64_t | m, | ||
const uint8_t | mobius[restrict static self->y/4+1] | ||
) |
Compute the value table for the mobius function mu(n) (whose sum is called the Mertens function) Requires both an initialized nut_Diri from nut_Diri_init
and a packed table of mobius values from nut_sieve_mobius
.
[in,out] | self | the table to store the result in, and take the bounds from. Must be initialized |
[in] | m | modulus to reduce results by, or 0 to skip reducing |
[in] | mobius | packed table of mobius values, from nut_sieve_mobius (with upper bound self->y). |
bool nut_Diri_compute_dk | ( | nut_Diri *restrict | self, |
uint64_t | k, | ||
int64_t | m, | ||
nut_Diri *restrict | f_tbl, | ||
nut_Diri *restrict | g_tbl | ||
) |
Compute the value table for the kth generalized divisor function dk(n) dk(n) = da(n) <*> db(n) whenever a + b = k.
This function uses binary exponentiation to compute dk in only log(k) convolutions. However, if all dk's are needed, calling nut_Diri_compute_conv_u
is more efficient
[in,out] | self | the table to store the result in, initialized by nut_Diri_init . self->y, self->x, and self->yinv must be set and consistent with the inputs |
[in] | k | k, ie which generalized divisor function to compute. dk is u <*> u <*> ... <*> u with k u's |
[in] | m | modulus to reduce the result by, or 0 to skip reducing |
[in,out] | f_tbl,g_tbl | temporary tables for scratch work, initialized by nut_Diri_init , fields y, x, and yinv must be set Will still have scratch data stored on return |
bool nut_Diri_compute_Nk | ( | nut_Diri *restrict | self, |
uint64_t | k, | ||
int64_t | m | ||
) |
Compute the value table for the kth power function N^k(n) N^k(n) = n^k, so we can find the sums using https://en.wikipedia.org/wiki/Faulhaber%27s_formula (we actually do this by taking a lower triangular matrix of pascal's triangle and inverting it) This is one of the core components needed to compute the Dirichlet sum table for f where f(p) is a polynomial, the other one being https://en.wikipedia.org/wiki/Jordan%27s_totient_function.
[in,out] | self | the table to store the result in, and take the bounds from. Must be initialized |
[in] | k | the power of the power function to compute |
[in] | m | modulus to reduce the result by, or 0 to skip reducing |
bool nut_Diri_compute_J | ( | nut_Diri *restrict | self, |
uint64_t | p | ||
) |
Compute the value table for the Jacobi symbol jacobi(n/p) Currently, p MUST be and odd prime This is 1 if n is a quadratic residue (perfect square) mod p, -1 if it is not, or 0 if n is a multiple of p.
[in,out] | self | the table to store the result in, and take the bounds from. Must be initialized |
[in] | p | prime modulus for jacobi symbol |
bool nut_Diri_compute_Chi_balanced | ( | nut_Diri *restrict | self, |
uint64_t | n, | ||
int64_t | m, | ||
const int64_t | period_tbl[restrict static n] | ||
) |
Compute the value table for a given Dirichlet Character Chi, whose total is 0 A dirichlet character is some n-periodic function Chi(x) which is.
[in,out] | self | the table to store the result in, and take the bounds from. Must be initialized |
[in] | n | period of the dirichlet character |
[in] | m | modulus to reduce the result by, or 0 to skip reducing. Note that the input values must already be reduced |
[in] | period_tbl | table of values of Chi(x) for 0 <= x < n. These values should be reduced mod m and sum to 0 mod m, or if m is 0, some of these values should be 0 |
bool nut_Diri_compute_pi | ( | nut_Diri *restrict | self | ) |
Compute the value table for the prime counting function, aka prime pi The indicator function for primes is effectively multiplicative, and the prime counting function is its sum.
But also, Lucy Hedgehog's prime counting algorithm essentially uses Dirichlet tables already, hence its inclusion in this part of the library
[in,out] | self | the table to store the result in, and take the bounds from. Must be initialized |
Compute the value table for h = f <*> u, the dirichlet convolution of f and u (the unit function u(n) = 1), given the value table for f See nut_Diri_compute_conv
for details, this is just that function but with several specializations due to u being very simple.
[in,out] | self | the table to store the result in, initialized by nut_Diri_init . self->y, self->x, and self->yinv must be set and consistent with the inputs |
[in] | m | modulus to reduce results by, or 0 to skip reducing |
[in] | f_tbl | table for the first operand |
Compute the value table for h = f <*> N, the dirichlet convolution of f and N (the identity function N(n) = n), given the value table for f See nut_Diri_compute_conv
for details, this is just that function but with several specializations due to N being very simple.
[in,out] | self | the table to store the result in, initialized by nut_Diri_init . self->y, self->x, and self->yinv must be set and consistent with the inputs |
[in] | m | modulus to reduce results by, or 0 to skip reducing |
[in] | f_tbl | table for the first operand |
bool nut_Diri_compute_conv | ( | nut_Diri *restrict | self, |
int64_t | m, | ||
const nut_Diri * | f_tbl, | ||
const nut_Diri * | g_tbl | ||
) |
Compute the value table for h = f <*> g, the dirichlet convolution of f and g, given value tables for the operands self must have been initialized using nut_Diri_init
, and in particular the lengths and cutoffs for self, f_tbl, and g_tbl must be set and match This uses a sieve to compute h(1) through h(y), then uses dirichlet's hyperbola formula to compute H(x/1) through H(x/(yinv-1)) If one of the operands is a simple standard multiplicative function like the unit function u or the mobius function mu, then try to use a specialized function from this library and use that instead.
[in,out] | self | the table to store the result in, initialized by nut_Diri_init . self->y, self->x, and self->yinv must be set and consistent with the inputs |
[in] | m | modulus to reduce results by, or 0 to skip reducing |
[in] | f_tbl | table for the first operand (dirichlet convolution is commutative, so order doesn't matter) |
[in] | g_tbl | table for the second operand |
bool nut_Diri_convdiv | ( | nut_Diri *restrict | self, |
int64_t | m, | ||
const nut_Diri *restrict | f_tbl, | ||
const nut_Diri *restrict | g_tbl | ||
) |
Compute the value table for h such that f = g <*> h, aka h = f </> g where the division is in terms of dirichlet convolution self must have been initialized using nut_Diri_init
, and in particular the lengths and cutoffs for self, f_tbl, and g_tbl must be set and match This is the inverse of <*>, in the sense that if h = f </> g, then h <*> g = f.
In fact, this function is essentially implemented by solving this for H(v) and applying the dirichlet hyperbola method (see the code comments for details). Unlike the compute_conv_*
family of functions, this needs to allocate space for scratch work, which can cause it to fail if there is not enough memory. About 16*(self->y + 1) bytes will be allocated. Currently there are not specialized functions for dividing by simple standard functions.
[in,out] | self | the table to store the result in, initialized by nut_Diri_init . self->y, self->x, and self->yinv must be set ant consistent with the inputs |
[in] | m | modulus to reduce results by, or 0 to skip reducing |
[in] | f_tbl | table for the first operand (the "numerator" aka "dividend" in the "division") |
[in] | g_tbl | the table for the second operand (the "denominator" aka "divisor" in the "division") |