1. System Objective
The primary function of the Dual Sieve Analyzer is to compute and analyze the deterministic outcome of a specific, idealized mathematical construct—a “Dual Sieve” acting on a finite set of integer indices k from 1 to K_LIMIT in order to demonstrate the structural necessity of infinite integers k not expressible as |6xy+x+y|. The software provides a high-accuracy empirical benchmark and a theoretical model to precisely quantify the model’s behavior and its deviations from reality.
(Note: This is not a particularly high performance tool and is prone to crashing at large values as it requires significant memory for the Boolean array.)
This project follows as an extension of the 6k+1 and 6k-1 pattern analyzer module.
The standalone system is designed to provide rigorous answers to the following questions for any given finite integer K_LIMIT:
- What is the exact number of integers k where 6k-1 and 6k+1 are prime?
- What is the exact outcome of a theoretical sieving process for k based on modular arithmetic?
- What is the precise, quantitative difference between the empirical count and the theoretical model’s result?
- What is the precise measure of statistical correlation between the two sieving processes within the model?
2. Algorithmic Components
The software is composed of two distinct computational engines: an empirical benchmark calculator and a theoretical model simulator.
2.1. Empirical Benchmark Calculation (find_k_by_full_prime_sieve)
This component’s function is to provide a high-accuracy, empirical benchmark for the number of prime-yielding k values within the specified range.
- Algorithm: A complete Sieve of Eratosthenes.
- Implementation:
- A NumPy boolean array (is_prime) is instantiated with a size of 6 * K_LIMIT + 2 and populated using the Sieve of Eratosthenes algorithm. This array serves as a complete and accurate primality map for the required number range.
- An array of indices (k_values) from 1 to K_LIMIT is generated.
- The exact counts of prime-yielding k are determined through direct, high-performance boolean array indexing against the primality map (e.g., k_minus_mask = is_prime[6 * k_values – 1]).
- The final counts are computed via numpy.sum() on the resulting boolean masks.
- Output: The exact integer counts of k for which 6k-1 is prime, 6k+1 is prime, and both are prime. This output is the ground truth for the finite domain [1, K_LIMIT].
2.2. Theoretical Model Simulation (dual_sieve_k_space)
This component’s function is to compute the deterministic outcome of an idealized dual sieving process.
- Algorithm: A direct sieve on the index set k using modular arithmetic.
- Implementation:
- Two NumPy boolean arrays (k_minus_survivors, k_plus_survivors) of size K_LIMIT + 1 are instantiated and initialized to True.
- A list of sieving primes p is generated up to the required threshold of floor(sqrt(6 * K_LIMIT + 1)).
- For each sieving prime p > 3, two distinct residue classes are calculated:
- r₁ = pow(6, -1, p)
- r₂ = p – r₁
- These residue classes represent the starting points of arithmetic progressions of k values that are guaranteed to produce composites. The model uses vectorized NumPy slicing to mark all elements in these progressions as False in their respective survivor arrays.
- The final set of model survivors for twin prime indices is computed via a logical AND (&) operation between the two survivor arrays.
- Output: The exact set of integer indices k that survive the defined theoretical sieving process.
3. Outputs
The software’s outputs constitute “computational proofs” of the following facts for the specified finite domain [1, K_LIMIT].
3.1. Demonstration of the Deterministic Outcome of a Finite Sieve
The Dual Sieve model is a deterministic algorithm. For any given K_LIMIT, its output is the single, unique, and precisely calculable result of applying the defined set of modular rules. This proves what the outcome of such a finite, idealized sieve must be.
3.2. Demonstration of the Distinctness of Sieving Rules
The model’s operation is a computational demonstration of a mathematical theorem: for any prime modulus p > 3, the modular multiplicative inverse 6⁻¹ (mod p) is unique and is not congruent to -6⁻¹ (mod p). The software’s correct execution for any K_LIMIT verifies this principle for all primes used in its sieve.
3.3. Demonstration of the Existence and Exact Magnitude of Sieve Anomalies
The Full SoE Validation mode provides evidence of the model’s deviation from empirical reality. For any K_LIMIT, the comparison table proves:
- The precise integer difference between the model’s survivor counts and the true prime index counts.
- This difference is the exact, net result of two opposing, verifiable algorithmic behaviors:
- Self-Sieving: The model’s undercount of twin prime indices is a proven consequence of the sieving algorithm eliminating k values that correspond to the sieving primes themselves.
- False Survivors: The model’s overcount in the individual 6k±1 sequences is a proven consequence of the sieve’s inability to eliminate composites whose prime factors all exceed the sqrt threshold.
3.4. Demonstration of Sieve Process Correlation
The model provides a demonstration that the two sieving processes are not statistically independent within the domain [1, K_LIMIT].
- The Naive Prediction calculates the expected number of survivors under a null hypothesis of independence.
- The ratio Prediction / Actual Model Result is a precise, quantitative measurement of the degree to which this null hypothesis is false.
- A ratio greater than 1.0 is a computational proof that the two sieving processes are correlated in such a way that they eliminate candidates more efficiently together than two independent random processes with the same individual densities.
Code:
"""
The Dual Sieve Analyzer: A Computational Tool for Prime Number Theory.
This script provides a computational framework for investigating the distribution
of twin primes by modeling their formation as a deterministic sieving process
in a structured index space ("k-space").
It features two primary modes of operation:
1. A theoretical model that simulates the idealized sieving process.
2. An empirical validation tool that compares the model's output to the
exact, ground-truth prime counts for a given finite range.
The software is designed to provide rigorous, computationally proven outputs
for finite domains and to serve as an educational tool for exploring the
subtleties of sieve theory.
"""
import time
from typing import Dict, List, Any
import matplotlib.pyplot as plt
import numpy as np
# --- Constants for UI and Explanations ---
THEORY_SUMMARY = """
======================================================================
THEORY: MODELING TWIN PRIMES AS A STRUCTURAL NECESSITY IN K-SPACE
======================================================================
This script models the existence of twin primes as a predictable, structural
consequence of how prime numbers are distributed. It does this by analyzing a
"Dual Sieve" acting on the regular, ordered set of indices ("k-space").
This tool provides multiple modes of analysis:
1. The Dual Sieve Model: A theoretical model to demonstrate the mechanics.
2. Full SoE Validation: An empirical tool to compare the model to ground truth.
3. Detailed Theory Explanation: A deep dive into the logical argument.
======================================================================
"""
DETAILED_THEORY_TEXT = """
======================================================================
A DETAILED EXPLANATION OF THE DUAL SIEVE MODEL
======================================================================
--- [ 1. The Core Idea: From Chance to Structure ] ---
The Twin Prime Conjecture proposes that there are infinitely many prime pairs
(p, p+2). Historically, this is viewed as a problem of chance: what is the
probability that two numbers so close together are both prime?
This project reframes the question entirely. Instead of viewing twin primes as
a random accident, it aims to demonstrate their existence as a predictable and
unavoidable consequence of the fundamental structure of the integers.
--- [ 2. The Strategic Shift: N-Space vs. K-Space ] ---
To see this structure, we must shift our perspective.
* N-Space is the familiar number line (1, 2, 3, ...). Primes appear on this
line with chaotic, irregular spacing. Finding patterns here is difficult.
* K-Space is the perfectly regular sequence of indices `k` (1, 2, 3, ...)
used in the formula for twin primes, `(6k-1, 6k+1)`.
By moving our analysis to k-space, we change the problem from finding irregularly
spaced primes to a much simpler question: "Which of these perfectly regular `k`
values have the special property of generating a prime pair?"
--- [ 3. The Mechanism: A Dual Sieve in K-Space ] ---
A `k` value generates a twin prime if and only if BOTH `6k-1` is prime AND `6k+1`
is prime. We can model this as two simultaneous sieving processes acting on the
same set of `k` values:
* Sieve 1: Eliminates all `k` for which `6k-1` is composite.
* Sieve 2: Eliminates all `k` for which `6k+1` is composite.
The `k` values that survive BOTH sieves are the twin prime indices. This script
directly models this "Dual Sieve" to see what properties it has.
--- [ 4. The Mathematical Engine: The Sieving Rules ] ---
The model uses modular arithmetic to determine exactly which `k` values to eliminate.
For any given sieving prime `p > 3`:
* For Sieve 1, we find `k` where `6k - 1` is a multiple of `p`.
This means solving `6k - 1 ≡ 0 (mod p)`, which gives the rule:
Eliminate all `k` that follow the pattern `k ≡ 6⁻¹ (mod p)`.
* For Sieve 2, we find `k` where `6k + 1` is a multiple of `p`.
This means solving `6k + 1 ≡ 0 (mod p)`, which gives the rule:
Eliminate all `k` that follow the pattern `k ≡ -6⁻¹ (mod p)`.
The most important mathematical insight is this: for any prime `p > 3`, the
two solutions `6⁻¹` and `-6⁻¹` are ALWAYS different. They can never be the same.
--- [ 5. The Conclusion: Why the Overlap is a Structural Necessity ] ---
Because the two sieving rules for any prime `p` are always distinct, each prime `p`
removes exactly TWO different residue classes (patterns) of `k`s. This leaves the
other `p-2` residue classes completely untouched by that prime.
Imagine trying to paint a wall with a comb that always has gaps. You can pass over
the wall as many times as you want, but you can never cover the entire surface.
The Dual Sieve is like this comb. The sieving process is structurally incapable
of eliminating every possible `k`. No matter how many primes you use, the sieve
is guaranteed to leave holes. Therefore, an infinite number of `k` values MUST
survive the process.
--- [ 6. The Reality Check: Why the Model Isn't a Perfect Counter ] ---
The Validation mode proves that the theoretical model's counts don't perfectly
match the real-world counts. This is not a flaw; it is a feature that confirms
our understanding of how sieves work. There are two opposing anomalies:
1. Self-Sieving (Causes Undercounting): The model eliminates `k` values that
correspond to the sieving primes themselves. For example, `p=5` is used
to sieve, and in doing so, it eliminates `k=1` (from the pair (5,7)).
2. False Survivors (Causes Overcounting): The model only uses sieving primes
up to `sqrt(N)`. It cannot eliminate `k` values that produce a composite
number whose prime factors are all larger than `sqrt(N)`.
======================================================================
"""
# --- Algorithmic Components ---
def _calculate_ground_truth(k_limit: int) -> Dict[str, int]:
"""
Calculates the exact number of prime-yielding k-values using a full SoE.
This function provides the empirical benchmark ("ground truth") for a given
finite range by generating a complete primality map up to 6*k_limit + 1.
Args:
k_limit: The maximum integer index k to analyze.
Returns:
A dictionary containing the exact counts of k's for '6k-1',
'6k+1', and twin primes.
"""
print(f"\n1. Running Full Sieve of Eratosthenes up to n = {6 * k_limit + 1:,}...")
n_max = 6 * k_limit + 2
is_prime = np.ones(n_max, dtype=bool)
is_prime[0:2] = False
# Standard Sieve of Eratosthenes algorithm
for i in range(2, int(n_max**0.5) + 1):
if is_prime[i]:
is_prime[i*i::i] = False
print("2. Counting actual prime-yielding k values...")
k_values = np.arange(1, k_limit + 1)
# Use the completed primality map to find exact counts via boolean indexing
k_minus_mask = is_prime[6 * k_values - 1]
k_plus_mask = is_prime[6 * k_values + 1]
k_twin_mask = k_minus_mask & k_plus_mask
return {
"count_minus": np.sum(k_minus_mask),
"count_plus": np.sum(k_plus_mask),
"count_twin": np.sum(k_twin_mask),
}
def _simulate_dual_sieve_model(k_limit: int, verbose: bool = True) -> Dict[str, Any]:
"""
Simulates the theoretical Dual Sieve model on the index space k.
This function computes the deterministic result of the idealized sieving
process based on modular arithmetic, without performing a full primality test.
Args:
k_limit: The maximum integer index k to analyze.
verbose: If True, prints progress updates to the console.
Returns:
A dictionary containing the survivor arrays and their integer counts.
"""
if verbose:
print(f"\n--- Setting up Dual Sieve Model for k up to {k_limit:,} ---")
# Initialize survivor arrays; index `i` corresponds to k=`i`
k_minus_survivors = np.ones(k_limit + 1, dtype=bool)
k_plus_survivors = np.ones(k_limit + 1, dtype=bool)
# Determine the required sieving primes
sieve_upper_bound = int((6 * k_limit + 1)**0.5)
if verbose:
print(f"1. Generating sieving primes up to {sieve_upper_bound:,}...")
sieve_primes = _generate_sieving_primes(sieve_upper_bound)
if verbose:
print("\n2. Sieving k-space directly based on modular rules...")
# Main sieving loop
for p in sieve_primes:
if p <= 3:
continue
# Calculate the two distinct sieving rules for prime p
inv6 = pow(6, -1, p)
start_k_minus = inv6
start_k_plus = p - inv6
# Mark non-survivors using efficient NumPy slicing
if start_k_minus <= k_limit:
k_minus_survivors[start_k_minus::p] = False
if start_k_plus <= k_limit:
k_plus_survivors[start_k_plus::p] = False
# The final overlap is the set of survivors from both sieves
k_twin_survivors = k_minus_survivors & k_plus_survivors
return {
"k_minus_survivors": k_minus_survivors,
"k_plus_survivors": k_plus_survivors,
"k_twin_survivors": k_twin_survivors,
"count_minus": np.sum(k_minus_survivors[1:]),
"count_plus": np.sum(k_plus_survivors[1:]),
"count_twin": np.sum(k_twin_survivors[1:]),
}
def _generate_sieving_primes(limit: int) -> List[int]:
"""
Generates a list of prime numbers up to a given limit. Helper function.
"""
is_prime = np.ones(limit + 1, dtype=bool)
is_prime[0:2] = False
for p in range(2, int(limit**0.5) + 1):
if is_prime[p]:
is_prime[p*p::p] = False
return [p for p, is_p in enumerate(is_prime) if is_p]
# --- UI and Controller Functions ---
def _plot_model_densities(results: Dict[str, Any], k_limit: int) -> None:
"""
Generates and displays a plot of the model's cumulative densities.
"""
print("\nGenerating density plot for the Dual Sieve Model...")
k_axis = np.arange(1, k_limit + 1)
# Calculate cumulative counts and densities for each sequence
c_minus = np.cumsum(results["k_minus_survivors"][1:])
c_plus = np.cumsum(results["k_plus_survivors"][1:])
c_twin = np.cumsum(results["k_twin_survivors"][1:])
density_minus = c_minus / k_axis
density_plus = c_plus / k_axis
density_twin = c_twin / k_axis
# Sample data for performance on very large plots
sample_step = max(1, k_limit // 5000)
plt.style.use('seaborn-v0_8-whitegrid')
plt.figure(figsize=(14, 8))
plt.plot(k_axis[::sample_step], density_minus[::sample_step], label='Model: k Density for 6k-1', color='C0')
plt.plot(k_axis[::sample_step], density_plus[::sample_step], label='Model: k Density for 6k+1', color='C1')
plt.plot(k_axis[::sample_step], density_twin[::sample_step], label='Model: k Density for Twin Primes', color='C2')
plt.xscale('log')
plt.title('Comparative Density of k-space Survivors from the Dual Sieve Model')
plt.xlabel('k (Log Scale)')
plt.ylabel('Cumulative Density (Survivors up to k / k)')
plt.legend()
plt.grid(True, which="both", ls="--")
plt.show()
def _run_dual_sieve_model(k_limit: int) -> None:
"""
Orchestrates the execution and output for the theoretical model.
"""
start_time = time.time()
results = _simulate_dual_sieve_model(k_limit)
print(f" Model simulation finished in {time.time() - start_time:.4f} seconds.")
print("\n--- Analysis of Dual Sieve Model Results ---")
count_m = results['count_minus']
count_p = results['count_plus']
count_t = results['count_twin']
density_m = count_m / k_limit
density_p = count_p / k_limit
# Calculate the naive prediction assuming statistical independence
predicted_count_t = (density_m * density_p) * k_limit
print("\n--- Model Survivor Counts ---")
print(f"k's for potential 6k-1 primes: {count_m:,} ({density_m:.4%})")
print(f"k's for potential 6k+1 primes: {count_p:,} ({density_p:.4%})")
print("\n--- Model Overlap Prediction vs. Reality ---")
print(f"Naive Predicted Twin Prime k's: {predicted_count_t:,.0f}")
print(f"Actual Model Twin Prime k's: {count_t:,}")
if count_t > 0:
# This ratio measures the correlation between the sieves
error_ratio = predicted_count_t / count_t
print(f"Prediction/Actual Ratio: {error_ratio:.4f}")
print("(Note: This ratio converges toward 1/C₂ ≈ 1.28, a known constant)")
if input("\nGenerate comparison plot for the model? (Y/N): ").strip().lower() == 'y':
_plot_model_densities(results, k_limit)
def _run_full_soe_validation(k_limit: int) -> None:
"""
Orchestrates the validation run, comparing ground truth to the model.
"""
start_time_soe = time.time()
actual_results = _calculate_ground_truth(k_limit)
print(f" Ground Truth calculation finished in {time.time() - start_time_soe:.4f} seconds.")
start_time_model = time.time()
model_results = _simulate_dual_sieve_model(k_limit, verbose=False)
print(f" Dual Sieve model simulation finished in {time.time() - start_time_model:.4f} seconds.")
_print_comparison_table(k_limit, actual_results, model_results)
def _print_comparison_table(k_limit: int, actuals: Dict[str, int], models: Dict[str, int]) -> None:
"""
Formats and prints the final validation summary table with a clear,
logically consistent explanation of the results.
"""
print("\n" + "="*70)
print(f" COMPARISON SUMMARY FOR K_LIMIT = {k_limit:,}")
print("="*70)
headers = ["Metric", "Actual (SoE)", "Model (Dual Sieve)", "Difference (Act-Mod)"]
print(f"{headers[0]:<25} {headers[1]:>12} {headers[2]:>20} {headers[3]:>21}")
print("-"*70)
metrics = ["k for 6k-1 Primes", "k for 6k+1 Primes", "k for Twin Primes"]
actual_counts = [actuals['count_minus'], actuals['count_plus'], actuals['count_twin']]
model_counts = [models['count_minus'], models['count_plus'], models['count_twin']]
for i in range(3):
diff = actual_counts[i] - model_counts[i]
print(f"{metrics[i]:<25} {actual_counts[i]:>12,} {model_counts[i]:>20,} {diff:>+21,}")
print("="*70)
print("\nExplanation of the Difference (Actual - Model):")
print(" - A POSITIVE difference means the Model UNDERCOUNTED the true value.")
print(" This is primarily caused by 'self-sieving', where the model eliminates")
print(" k-values corresponding to the sieving primes themselves.")
print("\n - A NEGATIVE difference means the Model OVERCOUNTED the true value.")
print(" This is primarily caused by 'false survivors', where the model fails to")
print(" eliminate k-values that produce composites of large primes.")
def main() -> None:
"""Main function to display the menu and run the analyzer."""
print(THEORY_SUMMARY)
while True:
print("\n--- Main Menu ---")
print("1. Run the Dual Sieve Model (Theoretical)")
print("2. Run Full SoE Validation (Ground Truth vs. Model)")
print("3. Explain the Theory in Detail")
print("4. Quit")
choice = input("Enter your choice (1-4): ").strip()
try:
if choice in ('1', '2'):
k_input = input(f"Enter the maximum k for analysis: ")
k_limit = int(k_input)
if k_limit <= 0:
print("Error: Please enter an integer greater than 0.")
continue
if choice == '1':
_run_dual_sieve_model(k_limit)
else: # choice == '2'
_run_full_soe_validation(k_limit)
elif choice == '3':
print(DETAILED_THEORY_TEXT)
input("\nPress Enter to return to the main menu...")
elif choice == '4':
break
else:
print("Invalid choice. Please enter a number from 1 to 4.")
except ValueError:
print("Error: Invalid input. Please enter a valid integer for k.")
except Exception as e:
print(f"\nAn unexpected error occurred: {e}")
print("\nExiting the Dual Sieve Analyzer. Goodbye!")
if __name__ == "__main__":
main()
Example Output:
======================================================================
THEORY: MODELING TWIN PRIMES AS A STRUCTURAL NECESSITY IN K-SPACE
======================================================================
This script models the existence of twin primes as a predictable, structural
consequence of how prime numbers are distributed. It does this by analyzing a
"Dual Sieve" acting on the regular, ordered set of indices ("k-space").
This tool provides multiple modes of analysis:
1. The Dual Sieve Model: A theoretical model to demonstrate the mechanics.
2. Full SoE Validation: An empirical tool to compare the model to ground truth.
3. Detailed Theory Explanation: A deep dive into the logical argument.
======================================================================
--- Main Menu ---
1. Run the Dual Sieve Model (Theoretical)
2. Run Full SoE Validation (Ground Truth vs. Model)
3. Explain the Theory in Detail
4. Quit
Enter your choice (1-4): 1
Enter the maximum k for analysis: 1000000
--- Setting up Dual Sieve Model for k up to 1,000,000 ---
1. Generating sieving primes up to 2,449...
2. Sieving k-space directly based on modular rules...
Model simulation finished in 0.0060 seconds.
--- Analysis of Dual Sieve Model Results ---
--- Model Survivor Counts ---
k's for potential 6k-1 primes: 206,317 (20.6317%)
k's for potential 6k+1 primes: 206,169 (20.6169%)
--- Model Overlap Prediction vs. Reality ---
Naive Predicted Twin Prime k's: 42,536
Actual Model Twin Prime k's: 37,844
Prediction/Actual Ratio: 1.1240
(Note: This ratio converges toward 1/C₂ ≈ 1.28, a known constant)
Generate comparison plot for the model? (Y/N): y
Generating density plot for the Dual Sieve Model...

--- Main Menu ---
1. Run the Dual Sieve Model (Theoretical)
2. Run Full SoE Validation (Ground Truth vs. Model)
3. Explain the Theory in Detail
4. Quit
Enter your choice (1-4): 2
Enter the maximum k for analysis: 1000000
1. Running Full Sieve of Eratosthenes up to n = 6,000,001...
2. Counting actual prime-yielding k values...
Ground Truth calculation finished in 0.0210 seconds.
Dual Sieve model simulation finished in 0.0040 seconds.
======================================================================
COMPARISON SUMMARY FOR K_LIMIT = 1,000,000
======================================================================
Metric Actual (SoE) Model (Dual Sieve) Difference (Act-Mod)
----------------------------------------------------------------------
k for 6k-1 Primes 206,502 206,317 +185
k for 6k+1 Primes 206,345 206,169 +176
k for Twin Primes 37,915 37,844 +71
======================================================================
Explanation of the Difference (Actual - Model):
- A POSITIVE difference means the Model UNDERCOUNTED the true value.
This is primarily caused by 'self-sieving', where the model eliminates
k-values corresponding to the sieving primes themselves.
- A NEGATIVE difference means the Model OVERCOUNTED the true value.
This is primarily caused by 'false survivors', where the model fails to
eliminate k-values that produce composites of large primes.
--- Main Menu ---
1. Run the Dual Sieve Model (Theoretical)
2. Run Full SoE Validation (Ground Truth vs. Model)
3. Explain the Theory in Detail
4. Quit
Enter your choice (1-4):
