Loading...
Searching...
No Matches
segsieves.h File Reference

Detailed Description

Author
hacatu
Version
0.2.0

LICENSE

This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/.

DESCRIPTION

Sieve based functions that work on segments, so that you can use all the cores your little heart desires

Definition in file segsieves.h.

Go to the source code of this file.

Data Structures

struct  nut_Segsieve
 

Functions

bool nut_Segsieve_init (nut_Segsieve *self, uint64_t max, uint64_t preferred_bucket_size)
 Set up the sieving primes header for a segmented sieve.
 
void nut_Segsieve_destroy (nut_Segsieve *self)
 Free the resources held by a segmented sieve header.
 
void nut_Segsieve_factorizations (const nut_Segsieve *restrict self, uint64_t a, uint64_t b, size_t pitch, void *buffer)
 Sieve all factorizations in the range [a, b) using a modified, in place largest factor sieve This uses a pitched array of factorization structs nut_Factors, so it uses a lot of memory per element in the segment, and the segment size should be lowered accordingly.
 
void * nut_Segsieve_factorizations_mkbuffer (const nut_Segsieve *self, size_t *pitch)
 Allocate a buffer for a thread to pass to nut_Segsieve_factorizations on its intervals.
 

Function Documentation

◆ nut_Segsieve_init()

bool nut_Segsieve_init ( nut_Segsieve self,
uint64_t  max,
uint64_t  preferred_bucket_size 
)

Set up the sieving primes header for a segmented sieve.

The user can then divide the range into segments as desired and process them in multiple threads, by calling nut_Segsieve_* functions with [a, b) intervals that partition the range.

Parameters
[out]selfthe segsieve header to initialize. Must be freed with nut_Segsieve_destroy
[in]maxthe inclusive upper bound of the range.
[in]preferred_bucket_sizedesigned to tell algorithms how big buckets should be for optimal cache use etc. Currently not really used. If 0, we just pick sqrt_max, and if nonzero, we forward it to self
Returns
true on success, in which case self contains a list of sieving primes and other information, or false on (allocation) failure

◆ nut_Segsieve_destroy()

void nut_Segsieve_destroy ( nut_Segsieve self)

Free the resources held by a segmented sieve header.

Note that this does not free per-thread work buffers or other resources not directly managed by self.

◆ nut_Segsieve_factorizations()

void nut_Segsieve_factorizations ( const nut_Segsieve *restrict  self,
uint64_t  a,
uint64_t  b,
size_t  pitch,
void *  buffer 
)

Sieve all factorizations in the range [a, b) using a modified, in place largest factor sieve This uses a pitched array of factorization structs nut_Factors, so it uses a lot of memory per element in the segment, and the segment size should be lowered accordingly.

In particular, we use 8 + 16*(omega+1) bytes PER segment entry, where omega is the max number of distinct prime divisors, which can be up to 15, ie up to 264 bytes per entry. So for example, if using a CPU with 512k L1d cache per core and smt, and sieving numbers over 614889782588491410, omega can be 15 so we would use a preferred bucket size of 992. If the upper bound were 10^12 instead, then omega is at most 11, so we use a preferred bucket size of 1310, Or you can just try a larger bucket size, since ~1k buckets are insanely small. My computer also has 16m L2 cache per core, which would lead to a bucket size of 41943 for omega 11 or 31775 for omega 15.

Parameters
[out]bufferthe factorizations are stored here, but for performance reasons, their first prime power will have the form p^1, where p is either the LARGEST prime divisor OR 1, so code consuming this data must be aware of that.

◆ nut_Segsieve_factorizations_mkbuffer()

void * nut_Segsieve_factorizations_mkbuffer ( const nut_Segsieve self,
size_t *  pitch 
)

Allocate a buffer for a thread to pass to nut_Segsieve_factorizations on its intervals.

Parameters
[in,out]pitchif 0, compute max omega for self->max and compute the related pitch, then store it here. if nonzero, we assume you are passing the pitch calculated from a previous call. This pointer must not be null.