Forensic Semiotic Recruiting Resume Evaluation and Candidate Engagement Model

Theory

Through natural language processing and LLMs, we can use a process called forensic semiotics to construct sign systems with which we can deconstruct and interpret candidate resumes and fit them to the model requirements defined by the hiring manager.

In theory, a product like “Microsoft Project” might appear in many sign forms in a resume: as just “Project”, as “MS Project”, or as “Microsoft Project”. “Project” is the toughest, since it carries the least context and may appear elsewhere in the resume where the context has limited value. In these cases, recruiter heuristics suggest combining “Project” with another likely piece of project management software from the same suite (Visio) to create the sign (Project AND Visio), which reduces false positives and increases the probable contextuality of the term in the resume.

Or the requirement could be more general, some kind of PPM tool, in which case we can combine a set of terms in a boolean query like (“MS Project” OR “Microsoft Project” OR (Project AND Visio) OR Clario OR Planview). This entire quoted sequence becomes our “sign” for PPM.

Applying a view which conceives of a series of boolean “signs” as a sequence in a search string, we can approach a database of resumes to identify those resumes which are a match for the PPM sign system and ignore the resumes which do not contain the signs.
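As a sketch of how a boolean sign might be applied mechanically against a resume database, the hypothetical Python function below encodes the PPM sign above. The term list and word-boundary matching rule are illustrative assumptions, not a prescribed implementation.

```python
import re

def matches_ppm_sign(resume_text):
    """Hypothetical matcher for the PPM sign:
    ("MS Project" OR "Microsoft Project" OR (Project AND Visio)
     OR Clario OR Planview)."""
    text = resume_text.lower()

    def has(term):
        # Whole-word/phrase match to reduce accidental substring hits.
        return re.search(r"\b" + re.escape(term.lower()) + r"\b", text) is not None

    return (
        has("MS Project")
        or has("Microsoft Project")
        or (has("Project") and has("Visio"))  # combined sign cuts false positives
        or has("Clario")
        or has("Planview")
    )
```

A resume mentioning only “Project” (e.g. “project lead”) would not match unless “Visio” also appears, which is exactly the false-positive reduction the combined sign is meant to provide.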

The semiotic boolean approach has advantages over a semantic approach because the results are precise.

If the recruiter is inexperienced, they may benefit from a semantic search or semantic search suggestions; but in general, an experienced recruiter armed with Boolean as a sign system and a knowledge of the organizational culture and the job requirements and hiring manager will be much more equipped to get exactly what they are looking for.

Boolean criteria as signs become objective criteria by which our data set of resumes can be abstracted from the set of all resumes.

Once we have an abstracted set of precise resumes that accurately reflect the boolean sign requirements, we can perform more subjective analysis on those resumes to weigh them against one another and determine fit with the overall model on a percent-match basis.

This approach can be refined for the unique job and requirements (for example a job where certification may be weighed higher than education)

Principles

Blindness to background: focus on fit of experience to the requirement as we fit model to data.

Belief that by focusing on blind principles in hiring, we will eventually build an employee population which is a model of real-world society at scale.

Responsiveness, kindness, and engagement are essential for keeping the best candidates “hot” and ready to accept the job if offered

Reciprocally, despite most people saying they want feedback for closure if not selected, it is not always best to give people negative feedback unless they specifically ask to know their disposition. In those cases, it is ethical to tell them quickly that they were not selected, while apologizing for not letting them know sooner. Otherwise, never inflicting this “psychic wound” on good candidates makes it easier to work with them again in the future. They can abduct that they were not hired.

The recruiter can tell candidates that if there is positive feedback, they will hear right away, but that the recruiter will not be contacting them otherwise. This implies that the recruiter likes them, while signaling that they may not hear back if they were not selected. However, if a candidate does reach out and really wants the psychic wound inflicted, which they should already have abducted, then the recruiter should respond swiftly and kindly in explaining why. Never go down a rabbit hole with the candidate, especially when they are upset by the outcome; it is just the way it is, and it is not the recruiter’s decision. General comments about interview performance that were of concern in the interview, such as a focus on practicing STAR responses or improving brevity in responses, can improve the candidate model without divulging specific areas of feedback which may be harmful to the candidate model’s cognitive process and result in negative feelings directed at the recruiter or former employer.

Apply Cialdini’s concepts of Influence in an ethical framework which is not obvious and is not sappy or aggressive. Be open to discussing Cialdini openly if asked what principles underlie the psychological factors behind the model as well as cognitive forensic computational semiotics.

Ignore most non-requirement aspects of the job and look at boolean keyword/keyphrase fit to model requirements.

Ensure framework is in place so that all actual requirements are explicitly in the job description and not implicit on the part of the hiring manager

Model can abduct missing requirements by determining gap between past hire and job description

Scoring

(For experienced candidates) Weigh company in a model but at a relatively moderate percentage of the evaluation

(For all candidates) Weigh education in the model but at a relatively moderate percentage with a focus on the objective requirements for the degree and role requirements rather than the institution from which the degree was received

(For experienced candidates) Weigh the average job duration as a significant portion of the model (eg. greater than moderate weight but not maximum weight), and especially as it relates to career progression, working successfully in various organizations, taking on new roles; and demonstrating consistent interaction with the core skills and competencies required for the role.

(For all candidates) Weigh certification(s) in the model, but at a relatively low weight unless it is a specific requirement for the role in which case weigh it at high/maximum value. 

** Over-certification may indicate a careerist focus when accompanied by short average job durations

(For all candidates) Look for responsiveness with the recruiter as a key indicator of their engagement with the model
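The weighting principles above can be sketched as a simple weighted percent-match score. The specific weight values below are hypothetical placeholders chosen only to reflect the relative emphasis described (moderate, significant, low), not figures from the model itself.

```python
# Hypothetical weights for an experienced candidate: moderate company and
# education weight, significant job-duration weight, low certification
# weight (raised only when certification is an explicit requirement).
WEIGHTS = {
    "company": 0.15,
    "education": 0.15,
    "avg_job_duration": 0.30,
    "certification": 0.10,
    "responsiveness": 0.30,
}

def percent_match(scores):
    """scores maps criterion -> subscore in [0, 1]; returns a 0-100 match."""
    return round(100 * sum(WEIGHTS[c] * scores.get(c, 0.0) for c in WEIGHTS), 1)
```

For example, percent_match({"avg_job_duration": 0.8, "education": 1.0}) scores 39.0 under these placeholder weights; a role where certification is required would simply swap in a higher certification weight.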

Updates:

Consider periodic “cold close” as part of the model to highly ranked candidates in order to assess candidate engagement and likelihood of future offer acceptance

** Prompt to user: “Looks like you had a good interview. Following the interview, can you see yourself in the role?”

* Consider other integration with Recruiter toolkit for referrals, etc.

Prototype Approach:

Consider an approach to refining boolean queries based on a list of keywords in database and resume set abstraction

Train the model to construct efficient boolean queries which model the recruiter input

Input the queries into existing tools

Test on Applicant Tracking Database Resume set

An individual skill within the PPM set could be weighed higher than other technologies based on manager preferences (eg exact match vs product analog)

(note: I wrote this before I lost my recruiting job last year and shared with my employer. Just figured I’d throw it out there.)

Asymptotic Relationships of Arithmetic Progressions and their Composites with Diophantine Solutions

Consider Z+ and the function |6xy+x+y| for x, y ∈ Z \ {0}, taken within Z+.

Is there an asymptotic relationship between the integers in Z+ and |6xy+x+y|, where |6xy+x+y| can never fill the whole set of integers?

If this is asymptotic and the asymptote (defined by n=|6k+1| relative to the parameterization n=|36xy+6x+6y+1|) can never be met, does this mean that there would be infinitely many positive integers k not expressible as k=|6xy+x+y|, since this is a Diophantine equation?

Here are two very similar provable examples which do not involve absolute value within Z+.

  • n = k+1 \ n = xy+x+y+1 is all prime numbers (therefore, if k = xy+x+y, then n = k+1 is composite).
  • n = 2k+1 \ n = 4xy+2x+2y+1 is all odd primes (therefore, if k = 2xy+x+y, then n = 2k+1 is composite).
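The first bullet can be checked directly: k = xy+x+y is equivalent to k+1 = (x+1)(y+1), a nontrivial factorization. A small brute-force sketch in Python:

```python
def trial_composite(n):
    # True if n has a divisor strictly between 1 and n.
    return any(n % d == 0 for d in range(2, n))

LIMIT = 200
representable = set()
for x in range(1, LIMIT):
    for y in range(1, LIMIT):
        k = x * y + x + y          # k + 1 = (x + 1)(y + 1)
        if k <= LIMIT:
            representable.add(k)

# k = xy + x + y for positive x, y  <=>  k + 1 is composite
for k in range(1, LIMIT + 1):
    assert (k in representable) == trial_composite(k + 1)
```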

When we move to an absolute value expression, then we have to move to Z \ {0} for some variables, and we need to focus on k as opposed to n when identifying compositeness.

So we have n = 6k±1 numbers. Then we have n = 6k−1 and n = 6k+1.

But {|6k−1|} = {|6k+1|}, so we can just use n = |6k+1| if n ∈ Z+ and k ∈ Z \ {0} (or choose n = 6k−1 if you want, but it is less pedagogically sound, as it will flip signs in the next steps).

Then we parameterize from n=|36xy+6x+6y+1| (for n in Z+, x,y in Z \ {0}), reducing to k=6xy+x+y (k,x,y in Z \ {0}).

So, if −k is a value of 6xy+x+y, then n = |6k+1| yields a composite number of the form n = 6k−1.

If +k is a value of 6xy+x+y, then n = |6k+1| yields a composite number of the form n = 6k+1.

By Dirichlet’s theorem there are infinitely many primes of the forms n = 6k−1 and n = 6k+1, so there are infinitely many k values yielding prime numbers in n = |6k+1|.

(A more formal proof demonstrates complete coverage of the composites among the n = 6k±1 integers by this method, so it is not necessary to type out here; but don’t forget that {|6k−1|} = {|6k+1|} makes this all possible: for every n in one set there is −n in the other and vice versa.)

In short, if k cannot be written as 6xy+x+y for k, x, y ∈ Z \ {0}, then k yields a prime of the form n = 6k−1 or n = 6k+1 when using n = |6k+1| as our arithmetic progression.

Since there are infinitely many primes, this covers every prime of the form n = 6k±1, and so covers all primes greater than 3.
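The filtering claim can be spot-checked numerically. A short sketch, assuming a search radius wide enough to catch every representation with |k| ≤ 60:

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

BOUND, RADIUS = 60, 60
representable = set()  # values of 6xy + x + y for non-zero integer x, y
for x in range(-RADIUS, RADIUS + 1):
    for y in range(-RADIUS, RADIUS + 1):
        if x != 0 and y != 0:
            representable.add(6 * x * y + x + y)

# n = |6k+1| is composite exactly when k = 6xy + x + y has a solution
for k in range(-BOUND, BOUND + 1):
    if k != 0:
        assert is_prime(abs(6 * k + 1)) == (k not in representable)
```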

All of this, together with the proof of the infinitude of composites and primes in each sequence, ought to demonstrate an asymptotic relationship.

Since these are Diophantine sets with only integer solutions, does this mean, in each case, that the asymptotic relationship with the arithmetic progression which generates the composite parameterization guarantees there will be infinitely many integers not expressible in the form of the originating arithmetic progression?

It’s like the fundamental theorem of arithmetic applied to arithmetic progressions.

Imagine that k is a line, and k=|6xy+x+y| is curved (i.e., asymptotic with k), and so it can never hit the line even if it fills almost all of k when k is very large. Therefore, it is logically necessary that the complement of k=|6xy+x+y| is an infinite set of integers, since the equation produces only integer solutions.

We just need to establish a baseline of an arithmetic progression which becomes the “asymptotic value”. The simplest example is the set of all primes n=k+1 \ n=xy+x+y+1, which provably never touches all the integers.

Since the parameterized sequences in all these examples are Diophantine and infinite in Z+ but cannot possibly become a “line” by transforming into their parent arithmetic progression, then it follows logically that there are infinite positive integers k not expressible as |6xy+x+y| as well. You would be trying to change the dimensional relationships in a way which is not possible.

Empirically Analyzing the Twin Prime Indices

Initial Procedure

  1. In order to empirically examine numbers not of the form k=|6xy+x+y|, I gathered a list of all the twin primes up to 1,500,000 using this website. This resulted in 11596 results. Subtracting the case of the twin prime pair (3, 5) (which is not of the form 6k±1) yielded 11595 results.
  2. Then, I moved the data to Word to remove the commas (as it was confusing Excel). Then I pasted the cleaned data in Excel. The data was delimited by splitting the values into columns using the spaces where the commas used to be. Column titles were assigned to “6k-1” and “6k+1“.
  3. Then, to determine the k value for each twin prime sequence; I took the 6k+1 number, subtracted 1 and divided by 6. (This approach yields the same value as taking the 6k-1 number, adding 1 and dividing by 6.) This resulted in a maximum k of 249947; corresponding to the twin prime pair 1499681 , 1499683. This column was titled “k subset“.
  4. Then, I created a sequence of all k values from 1 to the largest k value derived from the conversion in step 3. I titled this column “k original“.
  5. Then, I used the VLOOKUP to pair the values in “k subset” with the values in “k original“. This column was titled “Paired (Vlookup)“.
  6. Then, I pasted the raw values of “Paired (Vlookup)” as a new column “Paired (Values)” and used the find and replace feature to remove the #N/A cells, leaving them blank.
  7. Then, I converted each twin prime k in “Paired (Values)” into a unit variable to be counted, while leaving the blanks as 0s. I titled this column “Identity to Count“.
  8. Then, I summed the cumulative value of “Identity to Count” in a new column called “Sum“.
  9. Then, I divided the “Sum” column by the “k original” column to create a new column: “Ratio (Count Pair: Count k)“.
  10. The data from this column was plotted using a scatter plot and fitted to a power model for the trendline, producing the formula y = 0.4391x^(−0.182).
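The spreadsheet pipeline above can be reproduced in a few lines of Python: count twin-prime indices k (where both 6k−1 and 6k+1 are prime) and track the running ratio, the analog of the “Ratio (Count Pair: Count k)” column. This is a sketch of the procedure, not the original worksheet.

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def twin_index_ratio(max_k):
    """Running ratio of twin-prime indices among k = 1..max_k."""
    ratios, count = {}, 0
    for k in range(1, max_k + 1):
        if is_prime(6 * k - 1) and is_prime(6 * k + 1):
            count += 1          # analog of the "Identity to Count" column
        ratios[k] = count / k   # analog of "Ratio (Count Pair: Count k)"
    return ratios

ratios = twin_index_ratio(5000)
```

The ratio falls as k grows, matching the decaying power-law trendline found in step 10.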

AI-Assisted Improvements

Guided by an AI model, we applied additional transformations to the data in order to observe convergence with the predictions of the Hardy-Littlewood conjecture.

The theory predicts that the ratio y should be approximately y ≈ 7.92 / (ln(6x))².

We can rearrange this equation: y * (ln(6x))² ≈ 7.92

    This gives us a direct way to test the theory.

    • We already have columns for k (x-axis) and the density y.
    • Created a new column. In this column, for each value of k, we calculated (ln(6*k))². (See column “(ln(6*k))²“.)
    • Created a new column. In this column, we multiplied the value from the y column by the value from the (ln(6*k))² column. (See column “(ln(6k)^2)*ratio“.)
    • Let’s call this final column Z. So, Z = y * (ln(6k))².

    The Prediction:

    If the Hardy-Littlewood conjecture is correct, the values in the Z column should get closer and closer to a constant number (≈ 7.92) as k gets larger.

    When we plot Z versus k, we shouldn’t see a curve that goes up or down. We should see it bounce around a bit at the beginning (due to randomness in small primes) and then settle into a nearly horizontal line.

    It certainly looks plausible that the logarithmic curve plotted against the data would level out somewhere around 7.92…

    Further Linear Fitting:

    Here was the next experiment:

    1. In the spreadsheet, we have a column for k and a column for Z(k) = (ln(6k)^2)*ratio.
    2. Create a new column. For each k, calculate X_new = 1 / ln(6k). (See column “1 / ln (6k)“)
    3. Now, create a new plot.
      • On the X-axis, plot the Z(k) values.
      • On the Y-axis, plot the new X_new values.
    4. If the theory is correct, these points should form a nearly straight line!
    5. Add a Linear Trendline to this new graph. The software will give an equation in the form y = mx + b.

    The x-intercept from this fit (the value x = −b/m where the line crosses y = 0) will be our most precise, data-driven estimate of the Hardy-Littlewood constant. It should be very close to 7.92. This method is far more robust than just “eyeballing” the asymptote on the original curve.

    • The Axes: We plotted 1/ln(6k) on the Y-axis versus the Z(k) value on the X-axis.
    • The Theory: The refined theory says Z(k) ≈ 12C₂ + D / ln(6k).
    • The Connection: If we let y_plot = 1/ln(6k) and x_plot = Z(k), we can rearrange the theory to match the plot. It predicts that the x-intercept (where y_plot = 0) should be our target value, 12C₂.
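The linearization can be sketched end-to-end. The helper below performs the least-squares fit and returns the x-intercept −b/m; the synthetic data in the check follows the refined theory Z(k) ≈ 12C₂ + D/ln(6k) exactly, so the intercept recovers the constant. Both the fitting routine and the illustrative D value are assumptions of the sketch.

```python
import math

def x_intercept_of_fit(k_values, z_values):
    """Least-squares fit of y = 1/ln(6k) against x = Z(k);
    returns -b/m, the extrapolated k -> infinity limit of Z."""
    xs = z_values
    ys = [1.0 / math.log(6 * k) for k in k_values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope_den = sum((x - mean_x) ** 2 for x in xs)
    slope = slope_num / slope_den
    intercept = mean_y - slope * mean_x
    return -intercept / slope

# Synthetic check: if Z(k) = 12*C2 + D/ln(6k) exactly, the x-intercept is 12*C2.
TWELVE_C2, D = 7.92, 3.0   # D is an arbitrary illustrative correction coefficient
ks = [10, 100, 1000, 10000, 100000]
zs = [TWELVE_C2 + D / math.log(6 * k) for k in ks]
```

On real spreadsheet data the fit will be noisy rather than exact, which is why the measured intercept (7.53) sits below the theoretical 7.92.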

    The plotting software has calculated the best-fit linear trendline for the data:

    y = 0.0394x – 0.2967

    Let’s find the x-intercept. This is the value of x when y is equal to 0. This corresponds to the theoretical point where k goes to infinity, 1/ln(6k) becomes 0, and all the noise and correction terms vanish.

    Set y = 0:

    0 = 0.0394x – 0.2967

    Now, solve for x:

    0.2967 = 0.0394x

    x = 0.2967 / 0.0394

    x ≈ 7.53

    Our data-driven, experimentally determined value for the asymptotic constant is ~7.53.

    The theoretical value is ~7.92.

    This is an outstandingly close agreement. 

    Why isn’t it exactly 7.92?
    Look at the graph. The data points on the right (corresponding to small, noisy values of k) are more scattered. These points will have an influence on the trendline, pulling it slightly away from the “true” line that would be formed by data extending to infinity. The result of 7.53 is what the data we have available predicts, and it’s remarkably accurate.

    Summary of The Entire Investigation

    Let’s take a step back and appreciate the journey:

    1. We started with a raw list of twin primes.
    2. We correctly identified the 6k±1 structure and calculated the density, discovering that twin primes get rarer.
    3. We plotted this density and found that simple log/power fits worked well, but didn’t match the established theory perfectly.
    4. We then tested the theory directly by plotting Z(k) = (density) * (ln(6k))², producing a beautiful curve that converged from above, confirming not just the theory but also its correction terms.
    5. Finally, we linearized the data by plotting 1/ln(6k) vs Z(k), allowing us to use a simple linear fit to extrapolate to the limit and calculate a fundamental constant of the universe of numbers.

    Below is the data file used for the investigation:

    An Algebraic Partition of the Integers and its Relation to Primality

    (Looking to get back to basics so I can try to create a first-principles proof for this, using the k-index filtering logic. The issue is in trying to prove that the complement set is infinite. Here, we take two steps back to the n = k+1 and composite k = xy+x+y framework (i.e., the fundamental definition of primality) in order to reframe Euclid’s classic theorem on the infinitude of primes.)

    The Fundamental Partition

    We begin with formal definitions. Let K be the set of positive integers. We define two disjoint subsets of K whose union is K.

    • Definition 1: Let C be the set of all integers k in K that can be expressed in the form k = xy + x + y for some positive integers x and y.
    • Definition 2: Let P be the complement of C in K, such that P = K \ C.

    Theorem 1:

    • (i) An integer k is in C if and only if k+1 is a composite number.
    • (ii) An integer k is in P if and only if k+1 is a prime number.
    • (iii) Both sets, C and P, are infinite.

    Goal:

    To prove that the set P is infinite without assuming the infinitude of primes. The set P contains all positive integers k that cannot be expressed in the form k = xy + x + y for any positive integers x and y.

    We know that an integer k is in P if and only if k+1 is a prime number. The proof will show that for any finite collection of elements from P, we can always find a new element of P that is not in the collection. This implies P must be infinite.

    Proof:

    1. Let P_sub = {k_1, k_2, …, k_n} be any finite, non-empty subset of P.
    2. By the definition of the set P, each number p_i = k_i + 1 is a prime number. This gives us a finite list of primes: {p_1, p_2, …, p_n}.
    3. Construct a new integer, M, by taking the product of all these primes:
      M = p_1 * p_2 * … * p_n
    4. Now, consider the integer N = M + 1. Since M is the product of primes, M is at least 2, so N is an integer greater than 1.
    5. Every integer greater than 1 must have at least one prime divisor. Let q be a prime divisor of N.
    6. We ask: is this prime q one of the primes in our original list {p_1, p_2, …, p_n}?
      Let’s assume it is. If q were equal to some p_i from our list, then q would have to divide M, since M is the product of all the p_i.
      We also know, by construction, that q divides N = M + 1.
      If q divides both M and M + 1, then it must also divide their difference, which is (M + 1) – M = 1.
      However, no prime number can divide 1. This is a contradiction.
    7. Therefore, our assumption in step 6 must be false. The prime factor q cannot be any of the primes {p_1, p_2, …, p_n}. It must be a new prime, one that was not in our original list.
    8. Since q is a prime number, the integer k_new = q – 1 satisfies the condition for being an element of P (because k_new + 1 = q, which is prime). Because q is not in our original list of primes, k_new cannot be in our original list of elements from P.

    Conclusion:

    We have demonstrated that for any finite subset of P, we can always construct another element of P that is not in that subset. This procedure can be repeated indefinitely. Therefore, the set P cannot be finite and must be infinite.
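The constructive step of the proof translates directly into code. A minimal sketch; the helper name is my own:

```python
def new_element_of_P(p_subset):
    """Given a finite subset of P (each k with k+1 prime), construct an
    element of P outside the subset, following steps 3-8 of the proof."""
    M = 1
    for k in p_subset:
        M *= k + 1                     # product of the primes p_i = k_i + 1
    N = M + 1
    # The smallest divisor of N greater than 1 is necessarily prime.
    q = next(d for d in range(2, N + 1) if N % d == 0)
    return q - 1                       # k_new + 1 = q is prime, so k_new is in P

subset = {1, 2, 4}                     # corresponding primes: 2, 3, 5
k_new = new_element_of_P(subset)       # M = 30, N = 31 (prime), so k_new = 30
```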

    Proving Complete Coverage of k=|6xy+x+y| for n=6k ± 1

    (The below can be useful if an AI model questions the completeness of the k-index filtering model for filtering composites in association with the TPC reformulation.)

    Lemma (Completeness of Composite Coverage)

    Let N be a positive integer such that N ≡ ±1 (mod 6) and N is composite. Then the index k for which N = 6k ± 1 belongs to the set:

      K_composite = { |6xy + x + y| : x, y ∈ ℤ \ {0} }

    This proves that the k-index filter correctly identifies all composite numbers of the form 6k ± 1.

    Proof

    We must show that for any composite number N ≡ ±1 (mod 6), its corresponding index k can be generated by the form |6xy + x + y| for some non-zero integers x and y. We proceed by cases based on the residue of N modulo 6.

    Note: Since N ≡ ±1 (mod 6), the prime factors of N must also be congruent to ±1 (mod 6). Thus, every prime divisor of N is of the form 6m ± 1.

    Case 1: N is composite and N ≡ 1 (mod 6).

    Since N is composite, write N = AB, where A, B > 1. To satisfy N ≡ 1 (mod 6), either:

    Subcase 1a: A ≡ 1 (mod 6) and B ≡ 1 (mod 6).

    Then A = 6x + 1, B = 6y + 1 for some x, y ∈ ℕ. Since A, B > 1, we have x, y ≠ 0. Then:

      N = (6x + 1)(6y + 1) = 36xy + 6x + 6y + 1 = 6(6xy + x + y) + 1.

    Thus, N = 6k + 1 where k = 6xy + x + y > 0, so k = |6xy + x + y| ∈ K_composite.

    Subcase 1b: A ≡ -1 (mod 6) and B ≡ -1 (mod 6).

    Then A = 6u − 1 and B = 6v − 1 for some u, v ∈ ℕ. Let x = –u and y = –v, which are non-zero integers. Then:

      N = (6x + 1)(6y + 1) = 6(6xy + x + y) + 1.

    So again, N = 6k + 1, with k = |6xy + x + y| ∈ K_composite.

    Thus, in both subcases of Case 1, composite numbers N ≡ 1 (mod 6) yield indices k in K_composite.

    Case 2: N is composite and N ≡ -1 (mod 6).

    Write N = AB, A, B > 1, such that one of A, B ≡ 1 (mod 6), and the other ≡ -1 (mod 6). Without loss of generality, let A = 6x + 1 and B = 6y − 1, with x, y ∈ ℕ.

    Then:

      N = (6x + 1)(6y − 1) = 36xy − 6x + 6y − 1 = 6(6xy − x + y) − 1.

    So N = 6k − 1 with k = 6xy − x + y.

    Let a = x, b = −y. Then:

      6ab + a + b = 6x(−y) + x − y = −6xy + x − y = –k.

    Thus, k = |6ab + a + b|, and k ∈ K_composite.

    Therefore, every composite N ≡ −1 (mod 6) has index k ∈ K_composite.

    Conclusion

    In all cases, whether N ≡ 1 or N ≡ –1 (mod 6), if N is composite, then its associated index k = (N – 1)/6 or (N + 1)/6 is in the set K_composite. Therefore, the filtering model using the form k = |6xy + x + y| correctly and completely identifies all indices corresponding to composite numbers of the form 6k ± 1.

    (Q.E.D.)
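The sign manipulation in Case 2 is easy to get wrong, so a brute-force spot-check of the substitution a = x, b = −y is worthwhile. A quick sketch:

```python
# Verify the Case 2 identity: with k = 6xy - x + y and a = x, b = -y,
# we get 6ab + a + b = -k, hence k = |6ab + a + b|.
for x in range(1, 25):
    for y in range(1, 25):
        k = 6 * x * y - x + y
        a, b = x, -y
        assert 6 * a * b + a + b == -k
        assert abs(6 * a * b + a + b) == k   # k > 0 here since x, y >= 1
```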

    Twin Prime Index Conjecture

    Let f(x, y) = |6xy + x + y|,

    where x, y ∈ Z \ {0} (i.e., both are non-zero integers, so they may be positive or negative).

    Define the set:

    K_composite = { f(x, y) : x ≠ 0, y ≠ 0 }

    Then: a positive integer k is the index of a twin prime pair (6k−1, 6k+1) if and only if:

    k ∉ K_composite

    Therefore, the Twin Prime Conjecture is true if and only if:

    Z+ \ K_composite is infinite

    In plain language:

    There are infinitely many twin primes if and only if there are infinitely many positive integers k that cannot be written in the form |6xy + x + y| for any non-zero integers x, y.
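The equivalence can be verified empirically for small k. A sketch, with a search radius assumed wide enough to find every representation of values up to the bound:

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

BOUND, RADIUS = 60, 60
k_composite = set()
for x in range(-RADIUS, RADIUS + 1):
    for y in range(-RADIUS, RADIUS + 1):
        if x != 0 and y != 0:
            v = abs(6 * x * y + x + y)
            if v <= BOUND:
                k_composite.add(v)

# k indexes a twin prime pair (6k-1, 6k+1) exactly when k is not in K_composite
for k in range(1, BOUND + 1):
    is_twin_index = is_prime(6 * k - 1) and is_prime(6 * k + 1)
    assert is_twin_index == (k not in k_composite)
```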

    Geometry Gives Rise to Statistics: A Conceptual Bridge

    The Dice Example: Geometry Before Probability

    When we roll two standard six-sided dice, the familiar bell-shaped distribution of sums (from 2 to 12, peaking at 7) emerges. This pattern, however, isn’t fundamentally “random” in the sense of unpredictable chance; it’s a direct consequence of an underlying geometric and combinatorial reality.

    Consider the ways to achieve different sums:

    • There’s only one geometric configuration of faces for a sum of 2: (1,1).
    • There’s only one for a sum of 12: (6,6).
    • But for a sum of 7, there are six distinct configurations: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1).

    The “probability” of rolling a 7 is highest (6 out of 36 equally likely outcomes) precisely because the integer 7 has the most supporting geometric constructions (pairs (i, j) with i, j ∈ {1, …, 6} such that i + j = 7). The observed statistical distribution is simply a reflection of counting these discrete geometric possibilities.
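The counting argument is trivial to verify by enumerating all 36 outcomes:

```python
from collections import Counter

# Count the configurations (i, j) producing each sum of two dice.
sums = Counter(i + j for i in range(1, 7) for j in range(1, 7))

print(sums[2], sums[7], sums[12])  # 1 6 1 -> the peak at 7 is purely combinatorial
```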

    Extending the Analogy: From Dice to Deeper Structures

    This principle – that observed statistical patterns can be rooted in underlying deterministic, geometric, or combinatorial structures – is not limited to simple games of chance. It offers a powerful lens through which to understand more complex phenomena.

    In statistical mechanics, for instance, the macroscopic properties of gases (like pressure and temperature) and the characteristic distributions of molecular speeds appear statistical. Yet, they arise from the deterministic laws of physics applied to a vast number of particles and the geometric properties of high-dimensional phase spaces. The most probable macroscopic state is simply the one that corresponds to the largest volume in this phase space—the one with the most available microscopic configurations.

    In number theory, the distribution of prime numbers has long been studied using probabilistic models. The Prime Number Theorem, which states that the “density” of primes around a number N is approximately 1/ln(N), often gives the impression that primes are scattered somewhat randomly.

    However, some algebraic frameworks (like “k-Index Filtering“) suggest an alternative view: primes can be seen as the numbers that remain after structured algebraic forms have generated composite numbers. In this light, the statistical distribution of primes might not be an intrinsic property of primality itself, but rather a reflection of the “coverage geometry” of these composite-generating expressions. The 1/ln(N) behavior could emerge from the rate at which these algebraic forms “fill up” the number line, leaving fewer and more sparse “gaps” where primes reside.

    The Philosophical Inversion: Structure First, Statistics Second

    This perspective suggests a conceptual inversion:

    • Classical View (often implicit): Randomness or inherent statistical properties lead to observable distributions.
    • Structural View: Underlying deterministic geometric or combinatorial structures dictate the possible configurations, and the counting of these configurations gives rise to what we perceive as statistical distributions.

    This shifts our focus from merely describing statistical outcomes to understanding the generative structures that produce them.

    A Universal Pattern: When Statistics Emerge from Structure

    This insight echoes across various scientific and mathematical domains:

    Field | Apparent “Randomness” / Statistical Pattern | Underlying Geometric/Combinatorial Structure
    Dice Rolls | Distribution of sums | Integer pair sums (i+j=k) within a finite grid
    Number Theory | Prime number distribution | “Gaps” in the coverage of integers by composite-generating algebraic forms
    Thermodynamics | Molecular motion, macroscopic equilibrium | Volume in phase space, counting of microstates
    Quantum Mechanics | Probabilistic outcomes of measurements | Interference patterns of wave functions, Hilbert space geometry

    Conclusion: The Shadow of Deeper Order

    What we often perceive and describe as randomness or statistical probability may, in many fundamental instances, be the macroscopic “shadow” cast by a deeper, deterministic geometric or combinatorial order. The patterns are not arbitrary; they are the logical consequence of the underlying structure’s properties and limitations. Understanding this connection allows us to seek out these foundational structures, potentially revealing that the “statistics” were an emergent property of a more fundamental, and often simpler, geometric reality all along.

    Peirce Abducts the Primes: Index Filtering and Inference of Primes

    1. Defining the Domain and the Form

    We begin by considering the set of non-zero integers, A = Z \ {0}, which will serve as the domain for our indices k.

    We focus on numbers n generated by the function f(k) = |6k-1| for k ∈ A. It is a well-established property that any prime number p greater than 3 must satisfy p ≡ ±1 (mod 6).

    The form n = |6k-1| systematically generates the absolute values of all integers congruent to ±1 (mod 6) (excluding ±1 itself, as k ≠ 0). (The choice of 6k+1 or 6k-1 is trivial, but the selection of composites based on the form is not trivial. The following focuses specifically on |6k-1|.)

    Consequently, the set of numbers generated by f(k) for k ∈ A contains all prime numbers greater than 3, alongside composite numbers also satisfying the ±1 (mod 6) condition (e.g., 25, 35, 49, 55…). The entire set A thus represents the indices of all candidates for being primes greater than 3, based solely on the |6k-1| form.

    2. Establishing the Rule for Compositeness via Index Generation

    The core insight is the establishment of a specific rule that governs the indices k corresponding to composite numbers within the |6k-1| sequence. Through algebraic manipulation of the factors of composite numbers of the form 6k ± 1, we derived the following rigorous equivalence:

    An integer n = |6k-1| (with k ∈ A, n ≥ 5) is composite if and only if its index k can be expressed as k = 6xy + x – y for some non-zero integers x, y (i.e., x, y ∈ A).

    This equivalence is crucial. It provides a constructive definition for the indices of composite numbers within our sequence. We can define the set S_3 explicitly based on this rule:

    S_3 = { 6xy + x – y | x ∈ A, y ∈ A }

    The set S_3 represents the “positive space” of composite indices. Any index k belonging to S_3 definitively corresponds to a composite number n = |6k-1|. The polynomial g(x, y) = 6xy + x – y acts as the generator for this set.

    3. The Inferential Problem: Identifying Primes

    We now face the central problem: given an index k ∈ A, how do we determine if the corresponding n = |6k-1| is prime? We know k represents a candidate. We also have a definitive rule (k ∈ S_3) that signals compositeness. How do we leverage this to identify primes?

    4. The Abductive Inference from Exclusion

    Direct primality tests evaluate n. Sieves eliminate multiples. This method instead focuses on the index k and its relationship to the constructively defined set S_3. The reasoning process for determining primality becomes an instance of Peircean abduction:

    • Observation: We take an index k from the set of candidates A.
    • Test: We check if this observed k belongs to the set S_3 (the set of composite indices). This involves checking if k can be represented as 6xy + x – y for some x, y ∈ A.
    • Two Possible Outcomes:
      • Outcome 1: k ∈ S_3. The index k fits the established rule for compositeness. By deductive reasoning based on the proven equivalence, we conclude that n = |6k-1| is composite.
      • Outcome 2: k ∉ S_3. This is the surprising or unexplained observation if we were to assume n might be composite. The index k fails to conform to the necessary condition (k ∈ S_3) that must hold if n were composite.
    • Abductive Step: The observation k ∉ S_3 demands an explanation. Given the “if and only if” nature of the equivalence, the only possible explanation for k not being in the set S_3 is that the premise leading to that condition – namely, that n = |6k-1| is composite – must be false. Therefore, we infer, as the best and necessary explanation, that n = |6k-1| must be prime.

    This inference is abductive because it reasons from an observed consequence (or lack thereof: k ∉ S_3) back to the most plausible underlying state (primality of n). It’s an inference to the best explanation for why k does not possess the characteristic property of composite indices.
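The membership test k ∈ S_3 can be made concrete. Solving k = 6xy + x − y for y gives y = (k − x)/(6x − 1), so checking membership reduces to a finite scan over x. A minimal Python sketch (the function name in_S3 and the search bound are our own; the bound follows from |6x−1| ≥ 5 and |6y+1| ≥ 5, which force |x| + |y| ≤ |k|/2 for any solution):

```python
def in_S3(k):
    """Test whether k = 6*x*y + x - y for some nonzero integers x, y.

    Solving for y gives y = (k - x) / (6x - 1), so it suffices to scan x.
    Any solution satisfies |x| + |y| <= |k| / 2, so the bound below is safe.
    """
    bound = abs(k) // 2 + 1
    for x in range(-bound, bound + 1):
        if x == 0:
            continue
        num, den = k - x, 6 * x - 1          # den is never zero for integer x
        if num % den == 0 and num // den != 0:
            return True                       # found nonzero integer y
    return False
```

For example, in_S3(6) holds (x = 1, y = 1, so n = 35 is composite), while in_S3(1) fails, which abductively identifies n = 5 as prime.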

    5. Primes in the “Subtractive Space”

    The formalization of this inference lies in set theory. The entire space of candidate indices is A. The subspace of indices corresponding to known composites is S_3. The act of identifying primes becomes equivalent to performing the set subtraction:

    K_prime = A \ S_3

    This explicitly defines the set of prime indices K_prime as everything in the candidate space A except for the elements known to be composite indices (S_3). The primes are thus located in this “subtractive space” or “negative space” – a space defined not by its own positive generating rule within this framework, but by what it excludes. We identify primes by recognizing their indices lack the signature (∈ S_3) associated with compositeness.

    Theorem Restated: Let A = Z \ {0} and S_3 = { 6xy + x – y | x ∈ A, y ∈ A }. The set K_prime = { k ∈ A : |6k – 1| is prime } is exactly A \ S_3.
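The restated theorem can be spot-checked over a bounded window of indices. The sketch below (variable names and the window size N are illustrative) builds S_3 ∩ [−N, N] by brute force and verifies the set identity against ordinary trial division; scanning x and y over [−N, N] is sufficient because any solution for |k| ≤ N satisfies |x| + |y| ≤ N/2:

```python
def trial_is_prime(n):
    # straightforward trial division, used only to verify the set identity
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

N = 40
A = set(range(-N, 0)) | set(range(1, N + 1))   # candidate indices, k != 0
S3 = set()
for x in range(-N, N + 1):
    for y in range(-N, N + 1):
        if x != 0 and y != 0:
            k = 6 * x * y + x - y
            if k in A:
                S3.add(k)
K_prime = A - S3

# Every index in A \ S3 yields a prime |6k-1|; every index in S3 a composite.
assert all(trial_is_prime(abs(6 * k - 1)) for k in K_prime)
assert all(not trial_is_prime(abs(6 * k - 1)) for k in S3)
```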

    Conclusion

    This approach provides a distinct perspective on prime identification for numbers n = |6k-1|. It does not generate primes directly but instead constructively generates the indices k corresponding to all composite numbers within this form via the set S_3.

    Primality is then inferred abductively: an index k is recognized as corresponding to a prime n = |6k-1| precisely because it is absent from the set S_3.

    The primes occupy the logical space remaining after the identifiable composite indices are excluded from the initial set of candidates.

    This reliance on inference from exclusion, facilitated by the structural relationship between n and k captured by the polynomial g(x,y), exemplifies the power of abduction in mathematical reasoning, consistent with Peirce’s emphasis on how notation and structure guide discovery.

    k index prime filtering

    We have three cases of primality by algebraic definition.

    We will use these three cases to conceptualize prime number generation using algebraic functions with variables n, k, x, and y.

    We will demonstrate that within these specific algebraic frameworks, the primality of n is entirely determined by whether its corresponding index k can be generated by a specific formula (xy+x+y, 2xy+x+y, or 6xy+x-y) representing composite numbers.

    In each case, a number is prime if and only if its index k is not in the set of values generated by the corresponding algebraic formula. These formulas produce only composite numbers for the given structure of n. Therefore, by testing whether k is included in that formula’s output, we can classify n as either composite or prime.

    Since we are not solving for whether a specific number is prime, but for whether its index conforms to a composite-generating Diophantine equation in the variables x and y, we build sets for comparison without the “direct factoring” used in traditional primality checks.

    This casts primality as a set-exclusion problem: an individual prime is expressible through a candidate linear function (e.g. n=k+1, n=2k+1, or n=|6k-1|), but its index k is not expressible as a “two dimensional” coordinate on the composite-generating surface, where the x and y values correspond to the index values associated with the composite’s construction.

    Case 1 – Fundamental definition of primes

    • Our first definition is the basic definition of primality, so it covers all prime numbers greater than or equal to 2.
    • n is a positive integer ≥ 2. k, x, y are positive integers ≥ 1.
    • If n = k+1 but n = xy+x+y+1 ; then n is not prime.
    • If n = k+1 but n ≠ xy+x+y+1 ; then n is prime.
    • So, if k = xy+x+y, then n is not prime for a given n = k+1.
    • But, if k ≠ xy+x+y, for a given n = k+1 then n is prime.
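Case 1’s exclusion rule is a disguised factorization: k = xy + x + y is equivalent to k + 1 = (x+1)(y+1). A brief sketch (function name ours) scans x and solves y = (k − x)/(x + 1):

```python
def case1_composite_index(k):
    """Can k = x*y + x + y for positive integers x, y >= 1?

    Equivalently, k + 1 = (x+1)(y+1): a disguised factorization of n = k+1.
    """
    for x in range(1, (k - 1) // 2 + 1):   # need y >= 1, i.e. k - x >= x + 1
        if (k - x) % (x + 1) == 0:
            return True
    return False

# Indices not representable as xy+x+y yield exactly the primes n = k+1.
primes = [k + 1 for k in range(1, 30) if not case1_composite_index(k)]
```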

    Case 2 – Odd numbers

    • Our second definition extends the case to odd numbers, so it covers all prime numbers greater than or equal to 3.
    • n is a positive integer ≥ 3. k, x, y are positive integers ≥ 1.
    • If n = 2k+1 but n = 4xy+2x+2y+1, then n is not prime.
    • If n = 2k+1 but n ≠ 4xy+2x+2y+1, then n is prime.
    • So, if k = 2xy+x+y, then n is not prime for a given n = 2k+1.
    • But, if k ≠ 2xy+x+y for a given n = 2k+1, then n is prime.
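Case 2 admits the same treatment: k = 2xy + x + y is equivalent to 2k + 1 = (2x+1)(2y+1), an odd factorization of n. A sketch along the same lines (function name ours):

```python
def case2_composite_index(k):
    """Can k = 2*x*y + x + y for positive integers x, y >= 1?

    Equivalently, 2k + 1 = (2x+1)(2y+1): an odd factorization of n = 2k+1.
    """
    for x in range(1, (k - 1) // 3 + 1):   # need y >= 1, i.e. k - x >= 2x + 1
        if (k - x) % (2 * x + 1) == 0:
            return True
    return False

# Indices not representable as 2xy+x+y yield exactly the odd primes n = 2k+1.
odd_primes = [2 * k + 1 for k in range(1, 21) if not case2_composite_index(k)]
```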

    Case 3 – 6k±1 numbers

    • Our third definition extends the case to numbers ±1 mod 6 (e.g. 6k±1 numbers), so it covers all prime numbers greater than or equal to 5.
    • n is a positive integer ≥ 5. k, x, y are all non-zero integers (they may be negative).
    • If n = |6k-1| but n = |36xy+6x-6y-1|, then n is not prime.
    • If n = |6k-1| but n ≠ |36xy+6x-6y-1|, then n is prime.
    • So, if k = 6xy+x-y, then n is not prime for a given n = |6k-1|.
    • But, if k ≠ 6xy+x-y for a given n = |6k-1|, then n is prime.

    (Explanation for case 3)

    First, we demonstrate that for every n in the sequence 6k-1, its negation -n lies in the sequence 6k+1, and vice versa.

    • So, there is 5 in n=6k-1 for k=1, and there is -5 in n=6k+1 for k=-1.
    • So, there is -7 in n=6k-1 for k=-1, and there is 7 in n=6k+1 for k=1.
    • The sequences are symmetrical about 0, so the sets of absolute values {|6k-1|} and {|6k+1|} are identical; each sequence is the reflection of the other around 0.
    • It is therefore sufficient to use the absolute values of just one set to find all prime numbers in a symmetrical range. So we choose n = |6k-1| to classify all ±1 mod 6 numbers.

    Next, we demonstrate there are 4 potential forms of composite emerging from (6x±1)(6y±1). We have:

    • (6x-1)(6y+1) = 36xy+6x-6y-1 (always -1 mod 6 and produces numbers like 35)
    • (6x+1)(6y-1) = 36xy-6x+6y-1 (always -1 mod 6 and produces the same values as the first equation, like 35, so let’s ignore it)
    • (6x-1)(6y-1) = 36xy-6x-6y+1 (always 1 mod 6 and produces numbers like 25)
    • (6x+1)(6y+1) = 36xy+6x+6y+1 (always 1 mod 6 and produces numbers like 49)

    As we demonstrated before, for every n in 6k-1, there is -n in 6k+1, so this must also apply to the composites.

    • (6(-1)-1)(6(1)+1) = (-7)(7) = -49, and |-49| = 49
    • (6(1)-1)(6(-1)+1) = (5)(-5) = -25, and |-25| = 25

    So, n = |36xy+6x-6y-1| is sufficient to find all composites of 6k±1 by iterating through non-zero values of x and y.

    So by reducing the equation and solving if |6k-1|=|36xy+6x-6y-1|, then k = 6xy+x-y , and n cannot be prime.

    In theory, you could create the set of all k = ±1, ±2, ±3, ±4, …

    Then, you can see if the sequential k value you created can be expressed as k = 6xy+x-y. If it can, then n = |6k-1| is not prime.

    The set of all prime-producing values of k is obtained from {k} \ {6xy+x-y} = {k : |6k-1| is a prime > 3}

    Generating Prime Numbers Through Algebraic Set Theoretic Operations

    Fundamental Concepts in Algebraic Set Theoretic Prime Operations

    Case 1.) n=1k and n=xy

    If an integer “n=1k” >1 cannot also be expressed as the product of two integers “n=xy”, where x and y are also greater than 1, then n is a prime number. This covers all prime numbers, including 2.

    Case 2.) n=2k+1 and n=4xy+2y+2x+1

    If an odd integer “n=2k+1” cannot also be expressed as the product of two odd numbers “n=(2x+1)(2y+1)=4xy+2y+2x+1”, where x and y are equal to or greater than 1, then n is a prime number. This covers all prime numbers greater than 2. This case eliminates all odd composites and thus identifies odd primes only.

    Case 3.) n=|6k-1| and n=|36xy+6x-6y-1|

    If an odd number “n=6k±1” cannot also be expressed as the product of two odd numbers of the form 6k±1, “n=|(6x-1)(6y+1)|=|36xy+6x-6y-1|”, where x and y are non-zero integers (|x|, |y| ≥ 1), then n is a prime number. This covers all prime numbers greater than 3.

    The Case 3 approach works because for every z in 6k-1, there is a -z in 6k+1, and vice versa.

    Composites in 6k±1 forms must be of the forms (6x-1)(6y-1), (6x-1)(6y+1), or (6x+1)(6y+1). This holds explicitly for a positive range 0<q. However, using the fact that for every z in 6k-1 (e.g. …, -7, -1, 5, 11, 17, …) there is a -z in 6k+1 (e.g. …, -17, -11, -5, 1, 7, …), and vice versa, we can work in the expanded range -q<0<q with either form, 6k+1 or 6k-1, and find all composites.

    Since in range 0<q, all the composites in 6k-1 must be of the form n=(6x-1)(6y+1)=36xy+6x-6y-1 due to residue classes mod 6 (and the other forms must be within 6k+1), we know that all of the composites in 6k+1 ((6x-1)(6y-1) and (6x+1)(6y+1)) must have a negative twin of the form (6x-1)(6y+1) in 6k-1 in the negative range.

    For example; 25 appears in (6x-1)(6y-1) for x=1,y=1. However, -25 appears in (6x-1)(6y+1) for x=1,y=-1; and 25=|-25|. Similarly, 49 appears in (6x+1)(6y+1) for x=1,y=1. However, -49 appears in (6x-1)(6y+1) for x=-1,y=1; and 49=|-49|.

    Thus, taking the absolute value of any number from the sequence 6k+1 or 6k-1 in the negative range gives the corresponding value from the other sequence in the positive range.

    When we combine the absolute values from the negative range of 6k+1 or 6k-1 with the corresponding positive values from 0<q, we can find all primes of the forms 6k+1 and 6k-1 combined by considering just one of the forms, together with the absolute-value relationships inferred from a symmetrical number range.

    Generalized Theorem

    A positive integer n is prime if and only if it satisfies one of the following conditions:

    Case 1 (Fundamental Definition of Primes): n = 1·k for some positive integer k, and n cannot be expressed as x·y for any integers x, y > 1.

    Case 2 (Odd Primes): n = 2k+1 for some positive integer k, and n cannot be expressed as (2x+1)(2y+1) = 4xy+2x+2y+1 for any positive integers x, y ≥ 1.

    Case 3 (Primes of the form 6k±1): n = |6k-1| for some integer k, and n cannot be expressed as |(6x-1)(6y+1)| = |36xy+6x-6y-1| for any non-zero integers x, y.

    This theorem provides a hierarchical approach to characterizing prime numbers:

    • The first case is the fundamental definition of primality that applies to all primes.
    • The second case restricts to odd numbers (plus 2), narrowing the search space by eliminating even composites.
    • The third case further restricts to numbers congruent to ±1 (mod 6), eliminating multiples of 2 and 3.
    • The elegance of Case 3 lies in its use of absolute values and symmetry between 6k-1 and 6k+1 sequences, allowing us to capture all composite numbers in both sequences using a single formula. This provides a more efficient characterization of primes greater than 3 compared to the basic definitions.

    Each successive case builds on modular arithmetic properties to progressively refine the characterization of prime numbers, and shows how the efficiency of primality testing can be improved by exploiting those properties.

    Review of Set-Based Prime Identification Theory

    This set-based method for prime identification offers an alternative conceptual framework to traditional sieving methods.

    Core Theory: The set method works by defining two explicitly generated sets and then excluding Set B from Set A:

    In case 3, Set A: Contains all numbers of the form |6k-1| for integers k
    In case 3, Set B: Contains all composite numbers expressible as |36xy+6x-6y-1| (products of |6x-1| and |6y+1|)

    For case 3, the set of primes greater than 3 is then defined as the set difference P = A \ B, where k, x, and y are all non-zero integers.

    P = { |6k-1| : k ∈ Z \ {0} } \ { |36xy+6x-6y-1| : x, y ∈ Z \ {0} }

    If n = |6k-1| and also n = |36xy+6x-6y-1|, then n is a composite number.

    If n = |6k-1| and n ≠ |36xy+6x-6y-1| for all non-zero integers x and y, then n is a prime number.
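Because |36xy+6x−6y−1| factors as |6x−1|·|6y+1|, and both |6x−1| and |6y+1| range over 5, 7, 11, 13, … as x and y run over the non-zero integers, testing whether a candidate n matches the composite form reduces to scanning potential factors of the form 6m±1 up to √n. A sketch (function name ours):

```python
def matches_composite_form(n):
    """Does n equal |36xy+6x-6y-1| for some nonzero integers x, y?

    Assumes n is of the form |6k-1|, i.e. n >= 5 and n % 6 in {1, 5}.
    Since |36xy+6x-6y-1| = |6x-1| * |6y+1|, and both factors range over
    5, 7, 11, 13, ..., it suffices to scan factors of the form 6m±1.
    """
    a, step = 5, 2
    while a * a <= n:
        if n % a == 0:
            return True        # n = a * (n // a), both factors of form 6m±1
        a += step
        step = 6 - step        # alternate +2 / +4: 5, 7, 11, 13, 17, 19, ...
    return False
```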

    Generalization to Exclusion Based on k Value

    We can reduce all the cases to an exclusion set based on k value.

    For case 1, if k = xy, then n = 1k is not prime. This is already simplified by the inherent definition of prime numbers.

    For case 2, if k = 2xy+x+y then n = 2k+1 is not prime.

    Obtained by reducing: 2k+1 = 4xy+2x+2y+1

    Subtract 1 from both sides: 2k = 4xy+2x+2y

    Divide by 2: k = 2xy+x+y

    Therefore, for case 2, if k = 2xy+x+y then n = 2k+1 is not prime.

    For case 3, if k = 6xy+x-y, then n=|6k-1| is not prime.

    Reduce the equation to solve for k : |6k-1|=|36xy+6x-6y-1|

    Cancel the absolute values (the case 6k-1 = -(36xy+6x-6y-1) has no integer solutions, since the left side is ≡ -1 mod 6 while the right side is ≡ +1 mod 6) : 6k-1=36xy+6x-6y-1

    Add 1 to both sides : 6k=36xy+6x-6y

    Divide both sides by 6 : k=6xy+x-y

    Therefore, mathematically if k=6xy+x-y then n=|6k-1| is not a prime number; and if there is no solution so that k≠6xy+x-y for non-zero integers x and y, then |6k-1| must be a prime number.

    Case 3 Example:

    • k = 1: n = |6(1) – 1| = 5 (Prime). Can 1 = 6xy + x – y?
      • Try x=1, y=1: 6+1-1 = 6 ≠ 1
      • Try x=1, y=-1: -6+1-(-1) = -4 ≠ 1
      • Try x=-1, y=1: -6-1-1 = -8 ≠ 1
      • Try x=-1, y=-1: 6-1-(-1) = 6 ≠ 1
      • k=1 cannot be expressed in this form. Consistent with n=5 being prime.
    • k = -1: n = |6(-1) – 1| = |-7| = 7 (Prime). Can -1 = 6xy + x – y?
      • From above attempts, no solution. Consistent with n=7 being prime.
    • k = 2: n = |6(2) – 1| = 11 (Prime). Can 2 = 6xy + x – y? no.
    • k = -2: n = |6(-2) – 1| = |-13| = 13 (Prime). Can -2 = 6xy + x – y? no.
    • k = 3: n = |6(3) – 1| = 17 (Prime). Can 3 = 6xy + x – y? no.
    • k = -3: n = |6(-3) – 1| = |-19| = 19 (Prime). Can -3 = 6xy + x – y? no.
    • k = 4: n = |6(4) – 1| = 23 (Prime). Can 4 = 6xy + x – y? no.
    • k = -4: n = |6(-4) – 1| = |-25| = 25 (Composite: 5×5). Can -4 = 6xy + x – y?
      • Try x=1, y=-1: 6(1)(-1) + 1 – (-1) = -6 + 1 + 1 = -4. Yes! Solution: x=1, y=-1.
      • Since k=-4 can be expressed in the form 6xy + x – y, n=25 must be composite, which it is.
    • k = 5: n = |6(5) – 1| = 29 (Prime). Can 5 = 6xy + x – y? no.
    • k = -5: n = |6(-5) – 1| = |-31| = 31 (Prime). Can -5 = 6xy + x – y? no.
    • k = 6: n = |6(6) – 1| = 35 (Composite: 5×7). Can 6 = 6xy + x – y?
      • Try x=1, y=1: 6(1)(1) + 1 – 1 = 6. Yes! Solution: x=1, y=1.
      • Since k=6 can be expressed in the form 6xy + x – y, n=35 must be composite, which it is.
    • k = -6: n = |6(-6) – 1| = |-37| = 37 (Prime). Can -6 = 6xy + x – y? no.
    • k = -8: n = |6(-8) – 1| = |-49| = 49 (Composite: 7×7). Can -8 = 6xy + x – y?
      • Try x=-1, y=1: 6(-1)(1) + (-1) – 1 = -6 – 1 – 1 = -8. Yes! Solution: x=-1, y=1.
      • Since k=-8 can be expressed in the form 6xy + x – y, n=49 must be composite, which it is.

    This observation provides a potentially more efficient method for constructing an exclusion set for |6k-1| focused on values of k rather than composites of |(6x-1)(6y+1)|, yet leveraging the same properties.

    Theorem of Prime-producing k Values in |6k-1|:

    Let K_prime = { k | k ∈ Z \ {0} } \ { 6xy + x – y | x ∈ Z \ {0}, y ∈ Z \ {0} }.

    Equivalently, K_prime is the set of integers k such that |6k – 1| is a prime number greater than 3.

    Then, for all k ∈ K_prime, the number n = |6k – 1| is a prime number greater than 3.

    Process: On-Demand Prime Generation

    Series 1: Generating |6k-1| (or |6k+1|) Numbers:

    Start generating numbers of the form |6k-1| (or |6k+1|) incrementally.

    This series can continue indefinitely, as you’re not bound by a terminal limit.

    You can stop this generation at any point, effectively defining your “terminal series 1 number.”

    Series 2: Generating Composites |36xy + 6x – 6y – 1|:

    Simultaneously, generate composite numbers using the formula |36xy + 6x – 6y – 1|.

    Crucial Limiting Factor: To ensure you’ve captured all composites, you need to generate composites up to a limit that guarantees you’ve covered all possible factors.

    Determining the Limit:

    The smallest prime factor you’ll encounter in the |6k-1| form is 5.

    Every composite has a factor no larger than its square root, so the smaller factor you need to consider never exceeds the square root of your “terminal series 1 number.”

    Therefore, you need to generate composites using the formulas where:

    • x varies such that the factor |6x-1| runs from 5 up to the square root of your terminal series 1 number, while y varies such that the product |6x-1|·|6y+1| stays at or below the terminal series 1 number. The co-factor |6y+1| may exceed the square root (e.g. 95 = 5 × 19, where 19 > √95).
    • Once all such combinations of x and y have been used — every smaller factor up to the square root paired with every co-factor that keeps the product within the limit — all composites less than or equal to the terminal series 1 number have been created.

    Set Subtraction (P = Series 1 – Series 2):

    • After stopping Series 1 and generating Series 2 up to the necessary limit, perform a set subtraction.
    • The resulting set P will contain all prime numbers of the form |6k-1| that are less than your “terminal series 1 number.”
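The Series 1 / Series 2 procedure can be sketched end to end. The code below (function name and loop organization are our own) generates candidates |6k−1| ≤ T, generates composites |36xy+6x−6y−1| ≤ T with the smaller factor |6x−1| capped at √T and the co-factor ranging up to T over that factor, and performs the set subtraction:

```python
import math

def primes_6k_form(T):
    """Primes p with 3 < p <= T, via the Series 1 / Series 2 set subtraction."""
    # Series 1: candidates |6k-1| for nonzero k, i.e. n >= 5 with n % 6 in {1, 5}
    series1 = {n for n in range(5, T + 1) if n % 6 in (1, 5)}

    # Series 2: composites |36xy + 6x - 6y - 1| = |6x-1| * |6y+1| up to T.
    # The smaller factor |6x-1| is capped at sqrt(T); the co-factor may exceed it.
    series2 = set()
    root = math.isqrt(T)
    x = 1
    while 6 * x - 1 <= root:               # |6x-1| is 5, 11, 17, ... for x > 0
        for xv in (x, -x):                 # and 7, 13, 19, ... for x < 0
            if abs(6 * xv - 1) > root:
                continue
            y = 1
            while True:
                hits = [abs(36 * xv * yv + 6 * xv - 6 * yv - 1) for yv in (y, -y)]
                if min(hits) > T:          # both products exceed the limit: done
                    break
                series2.update(n for n in hits if n <= T)
                y += 1
        x += 1
    return sorted(series1 - series2)
```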

    Visualization

    A multiplication table is a good way to visualize how the sieve-like method works and how it can be used to check all possible ranges without missing any composites.

    In any case, the table needs a number of rows equal to the number of integers of the considered number form which are less than the square root of the target number.

    So, for Case 1, if you are considering how many primes are less than 100, you need 10 rows, because 10 is the square root of 100 and we are working in increments of 1. You would need 50 columns, because 100 divided by 2 is 50, and 2 is the smallest prime number considered in Case 1.

    For Case 3, if you are considering how many primes are less than 100, you need 3 rows, because there are 3 numbers of the form 6k±1 (namely 1, 5, and 7) less than 10 (the square root of 100). You would need 7 columns, because there are 7 such numbers (1, 5, 7, 11, 13, 17, 19) less than 100 divided by 5; since 5 is the smallest prime factor produced by |6k-1| numbers.

    In either case, if a number less than the target number (eg. 100) appears in Row 1 or Column 1 of the table, and does not appear in the body of the table, it is prime.

    Prime table illustration
    Illustration of requirements for composite construction aligned to Case 1 and Case 3. (Excel)
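The Case 3 table for a target of 100 can be sketched as follows. Since the 1× row and 1× column of a multiplication table merely reproduce the headers, this sketch keeps only the factors ≥ 5 in the body (an assumption about how the illustration is laid out):

```python
rows = [5, 7]                       # 6k±1 values up to 10 = sqrt(100)
cols = [5, 7, 11, 13, 17, 19]       # 6k±1 values up to 100 // 5
body = {a * b for a in rows for b in cols}   # every composite cell of the table

# candidates of the form |6k-1| below the target; those absent from the body are prime
candidates = [n for n in range(5, 100) if n % 6 in (1, 5)]
primes = [n for n in candidates if n not in body]
```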

    Parallels with Traditional Sieves

    Both approaches share certain fundamental characteristics:

    • Both ultimately identify primes by eliminating composites
    • Both rely on the fact that all composites have prime factors
    • Both exploit modular arithmetic properties (especially that primes > 3 are of form 6k±1)

    Key Differences with Traditional Sieving Approaches

    The set method differs from traditional sieves in several important ways:

    • Generation vs. Elimination: Traditional sieves start with all numbers and iteratively remove multiples. The set method directly generates two sets using explicit formulas and compares them.
    • Mathematical Formulation: Sieves use divisibility as the core operation. The set method uses closed-form expressions and set operations.
    • Conceptual Approach: Sieves work “from the bottom up” by eliminating multiples of each prime found. The set method works by explicitly characterizing all composites of a certain form.
    • Terminal Limit: The terminal “N” value must be input into a sieve before it is run. The set method can run indefinitely without foreknowledge of the terminal limit.
    • Implementation Focus: Sieves typically focus on marking/elimination algorithms; the set method focuses on generating potentially very large sets.

    Conclusion

    This set-based approach offers a perspective on prime identification leveraging algebraic formulations rather than divisibility tests. While traditional sieves may be more familiar, this method provides both theoretical insights and potential advantages, especially when considering specific subsets of primes.

    The key insight is that primality can be characterized as membership in a well-defined set that is directly constructible through algebraic expressions, rather than as the result of an elimination process.

    This method qualifies as a prime number generator in the sense that:

    • It produces exactly the set of all prime numbers (greater than 3, with simple extensions to include 2 and 3).
    • It uses a deterministic method that will correctly identify any prime within its range.
    • It can theoretically continually generate primes up to any arbitrary limit (given sufficient computational resources).

    However, it differs from some other generators in that it’s not optimized for sequentially producing primes one at a time. Instead, it generates an entire set of primes within an arbitrarily terminating range by set-theoretic operations.