Event-Based and LHV Simulation of an EPR-B Experiment: EPR-Simple and EPR-Clocked

In this note, I analyze the data generated by M. Fodje's (2013) simulation programs "epr-simple" and "epr-clocked", written in Python and published on Github. Inspection of the program descriptions shows that they make use of the detection loophole and the coincidence loophole, respectively. I evaluate them with appropriately modified Bell-CHSH type inequalities: the Larsson detection-loophole adjusted CHSH, and the Larsson-Gill coincidence-loophole adjusted CHSH (NB: the correctness of the latter is a conjecture; we do not have a proof). The experimental efficiencies turn out to be approximately eta = 81% (close to optimal) and gamma = 55% (far from optimal). The observed values of CHSH are, as they should be, within the appropriately adjusted bounds. Fodje's detection-loophole model turns out to be very close to Pearle's famous 1970 model, which is why its efficiency is close to optimal. The model has the same defect as Pearle's: the joint detection rates exhibit signaling. Fodje's coincidence-loophole model is actually a clever modification of his detection-loophole model. Because of this, however, it cannot attain optimal efficiency.

One can recognise from the program descriptions that the two programs respectively use the detection loophole (Pearle, 1970) and the coincidence loophole (Pascazio, 1986; Larsson and Gill, 2004) to reproduce (to a good approximation) the singlet state correlation function and thereby to violate the CHSH inequality. See the appendix for further information on the models. For some time I suspected that the author was inspired by event-based simulations, based on these loopholes, which at that time had recently been published by Hans de Raedt and his collaborators, and were being discussed in various internet fora. However, according to recent private communication with Michel Fodje, he came up with the ideas behind these models, as well as their particular implementation, completely by himself. He never studied the literature on these loopholes, or even considered them as "loopholes". He simply tried, independently and successfully, to create local realistic event-based simulation models which reproduced most features of past experiments. He was certainly provoked by some remarks of mine. On an internet forum I wrote, and he quoted these words: "It is impossible to write a local realist computer simulation of a clocked experiment with no 'non-detections', and which reliably reproduces the singlet correlations. (By reliably, I mean in the situation that the settings are not in your control but are delivered to you from outside; the number of runs is large; and that this computer program does this not just once in a blue moon, by luck, but most times it is run on different people's computers.)" Unfortunately, I did not explain in an accompanying sentence what I meant precisely by the word "clocked" (the emphasis was very deliberately put there, by me, in my original statement). I could better have used the word "pulsed".
I meant that the time axis is split into a regular sequence of fixed time intervals called time slots, and that in each time slot and in each wing of the experiment only one setting is chosen, at random, at the start of the time slot; then arbitrary algorithms in each wing of the experiment generate one binary outcome from the data which is collected there, and call it the outcome . . . even if there is no detection event there at all. With regularly spaced time slots and pulsed lasers this arrangement is nowadays quite standard.
If there is no detection event at all in a given time slot, the experimenters should use a fixed outcome, or even toss a coin to make one. Physicists (experimental or theoretical) tend to howl with disapproval at this suggestion. But I was serious. In medical statistics, this is called using the "intention to treat" principle, and is a standard way to deal with non-compliance in double-blind randomised clinical trials.
Fodje wrote that his simulations do not use post-selection or the detection loophole, since all emitted particles are detected. He did not refer to Larsson and Gill (2004), and apparently did not know the literature well; hence he did not have any idea what I meant by the word "clocked", though I certainly talked about it again and again on the internet fora which we both frequented. I meant that the experiment should be set up and analysed as a long regular sequence or run of N trials, each trial corresponding to a pair of time slots, one in each wing of the experiment. Per trial, two random binary settings go in and two binary outcomes come out.
The present versions of the programs on Github now include extensive explanation of how the models work, but still no reference at all to the now extensive literature on the two loopholes which Fodje exploits.
I study the experimental efficiency of the two models in the CHSH setting. For the detection loophole, the efficiency η is defined to be the minimum over all setting pairs and over all permutations of the two parties Alice and Bob, of the probability that Party 1 detects a particle given that Party 2 has detected a particle. For the coincidence loophole it is defined in a similar way: the efficiency γ is defined to be the minimum over all setting pairs and over all permutations of the two parties Alice and Bob, of the probability that Party 1 has a detection which is paired to Party 2's detection, given that Party 2 has detected a particle.
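These definitions translate directly into a small computation. The sketch below uses hypothetical counts (illustrative numbers, not output of either simulation program) and computes the efficiency as the minimum, over the four setting pairs and both orderings of the parties, of the conditional detection probability:

```python
# Hypothetical detection counts (illustrative, not simulation output)
# for each of the four CHSH setting pairs:
#   n_both  = trials in which both particles were detected,
#   n_alice = trials in which Alice's particle was detected,
#   n_bob   = trials in which Bob's particle was detected.
counts = {
    ("a1", "b1"): dict(n_both=8100, n_alice=9000, n_bob=9050),
    ("a1", "b2"): dict(n_both=8050, n_alice=9000, n_bob=8900),
    ("a2", "b1"): dict(n_both=8200, n_alice=9100, n_bob=9000),
    ("a2", "b2"): dict(n_both=8000, n_alice=8950, n_bob=9100),
}

def efficiency(counts):
    """Minimum, over setting pairs and both orderings of the parties,
    of P(party 1 detects | party 2 detected)."""
    ratios = []
    for c in counts.values():
        ratios.append(c["n_both"] / c["n_bob"])    # P(Alice detects | Bob detected)
        ratios.append(c["n_both"] / c["n_alice"])  # P(Bob detects | Alice detected)
    return min(ratios)

print(round(efficiency(counts), 3))
```

The same computation serves for γ if the counts are read as "detections paired into a coincidence" versus "detections".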
If either loophole is present in the experiment, then the CHSH inequality is not applicable, or to be more precise, the statement that local hidden variables cannot violate CHSH is not true. I refer the reader to the survey paper Larsson (2014); arXiv eprint http://arxiv.org/abs/1407.0363. One needs to make further (untestable) assumptions such as the "fair sampling" hypothesis in order to deduce impossibility of local hidden variables from violation of CHSH. However, it is not difficult to modify CHSH to take account of the possibly differential "post-selection" of particle pairs which is allowed by these two loopholes. The result is two bounds, replacing the usual bound 2: 4/η − 2 for the detection loophole, and 6/γ − 4 for the coincidence loophole; see Larsson (2014), formula (38) and formulas (50), (51), (52). Note that when η = 1 the detection loophole bound equals the usual CHSH bound 2, but as η decreases from 1, the bound increases above 2, at some point passing the best quantum mechanical prediction 2√2 (the Tsirelson bound) and later even achieving the absolute bound 4. The bound is sharp: one can come up with local hidden variable models which exactly achieve the bound at the given efficiency. In particular, with η = 2/3 the detection loophole bound is 4, saying that it is possible for three of the four CHSH correlations to achieve their natural upper limit +1 and one of them its natural lower limit −1.
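The two adjusted bounds are simple functions of the efficiencies. A quick sketch, with sanity checks against the special values just quoted, and a computation of the efficiency at which the detection-loophole bound crosses the Tsirelson bound:

```python
import math

def detection_bound(eta):
    """Detection-loophole adjusted CHSH bound, 4/eta - 2 (Larsson 2014, formula (38))."""
    return 4 / eta - 2

def coincidence_bound(gamma):
    """Conjectured coincidence-loophole adjusted bound, 6/gamma - 4
    (Larsson 2014, formulas (50)-(52))."""
    return 6 / gamma - 4

# Special values quoted in the text:
assert detection_bound(1.0) == 2.0               # full efficiency: usual CHSH bound
assert abs(detection_bound(2 / 3) - 4.0) < 1e-9  # algebraic maximum of CHSH
assert coincidence_bound(1.0) == 2.0
assert coincidence_bound(3 / 4) == 4.0

# Efficiency at which the detection bound passes the Tsirelson bound 2*sqrt(2):
eta_star = 4 / (2 + 2 * math.sqrt(2))            # solve 4/eta - 2 = 2*sqrt(2)
print(round(eta_star, 4))                        # about 0.8284
```

Note that the observed η ≈ 81% of epr-simple sits just below this crossing point, which is why its adjusted bound lands just above 2√2.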
The coincidence loophole bound is also attainable, and for the same value of the efficiency it is worse (larger). In particular, already with γ = 3/4 one can attain three perfect correlations and one perfect anti-correlation.

epr-simple
The programme epr-simple uses the detection loophole (Pearle, 1970) so as to simulate violation of the CHSH inequality in a local-realistic way. The simulated experiment can be characterised as a pulsed experiment. At each of a long sequence of discrete time moments, two new particles are created at the source, and dispatched to two detectors. At the detectors, time and time again, a new pair of random settings is generated. The two particles are measured according to the settings and the outcome is either +1, −1, or 0; the latter corresponding to "no detection".
If either particle is not detected, the pair is rejected.
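To make the mechanism concrete, here is a minimal Pearle-style sketch of such a pulsed detection-loophole experiment. It is not Fodje's actual code; the particular detection rule (accept when P ≤ |cos(E − setting)|, with P = 2/√V − 1 and V uniform on [1, 4]) is the one identified later in this note:

```python
import numpy as np

rng = np.random.default_rng(1234)  # fixed seed, in the same spirit as the note

def run(alpha, beta, n=100_000):
    """One batch of n pulsed trials of a Pearle-style detection-loophole model.
    Outcomes are +1, -1, or 0, where 0 means "no detection"."""
    E = rng.uniform(0, 2 * np.pi, n)             # shared hidden direction
    P = 2 / np.sqrt(rng.uniform(1, 4, n)) - 1    # shared detection threshold
    cA, cB = np.cos(E - alpha), np.cos(E - beta)
    A = np.where(P <= np.abs(cA), np.sign(cA), 0.0)
    B = np.where(P <= np.abs(cB), -np.sign(cB), 0.0)
    return A.astype(int), B.astype(int)

A, B = run(0.0, np.radians(45))
both = (A != 0) & (B != 0)                       # pairs with two detections
table = {(a, b): int(np.sum((A == a) & (B == b)))  # cross-tabulate, zeros included
         for a in (-1, 0, 1) for b in (-1, 0, 1)}
print("acceptance rate:", both.mean())
print("accepted-pair correlation:", (A[both] * B[both]).mean())
```

The cross-tabulation including the zeros is exactly the "missing data" discussed below.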
epr-simple only outputs some summary statistics for the accepted pairs. I added a few lines of code to the program so that it also outputs the "missing data". Not being an expert in Python programming, I kept my additional code pretty simple.
First of all, I reduced the total number of iterations to 10 million. The original code has 50 million, and this leads to memory problems on the (virtual) Linux Mint system which I use for Python work. Secondly, I added a code line "numpy.random.seed(1234)" in the block of code called "class Simulation(object)", so that identical results are obtained every time I run the code. This means that others should be able to reproduce the numerical results which I analyse here, exactly.
Finally, in the part of the code which outputs the simulation results for a test of the CHSH inequality, I added some lines to preserve the measurement outcomes in the case that either measurement results in "zero", and then to cross-tabulate the results.
By the way, for his test of CHSH, Michel Fodje (thinking of the polarization measurements in quantum optics) took the angles 0 and 45 degrees for Alice's settings, and 22.5 and 67.5 for Bob. I have changed these to 0 and 90 for Alice, and 45 and 135 for Bob, as is appropriate for a spin-half experiment.
The added lines are:

    # Print out extra data
    print nmp, nm0, nmm
    # Print out extra data
    CHSH.append( (Ai*Bj).mean() )
    QM.append( -numpy.cos(numpy.radians(j-i)) )

I ran epr-simple, redirecting the output to a text file called "data.txt". In a text editor, I deleted all but 12 lines of that file: the lines containing the numbers which are read into R, and then printed out by the R code below.

    S               # CHSH
    4 / etamin - 2  # bound
We see that at this experimental efficiency, the local realism bound to CHSH is about 2.9, just above the Tsirelson bound 2√2 ≈ 2.8. The simulation generates results just below the bound: also at about 2.8. The observed value of S, the quantum mechanical prediction 2√2 and the adjusted CHSH bound are all quite close together: the simulation model is pretty close to optimal.

epr-clocked
The program epr-clocked uses the coincidence loophole (Pascazio, 1986). Michel Fodje calls this a "clocked experiment", by which he means that time is continuous and the times of detection of particles are random and unpredictable. (I would have preferred to reserve the word "clocked" as a synonym for "pulsed".) Because Alice and Bob's particles have different, random, delays (influenced by the detector settings which they meet), one cannot identify which particles were originally part of which particle pairs. Moreover, a small number of particles do not get detected at all, compounding this problem.
The experimenter scans through the data looking for detections which are within some short time interval of one another. This is called the detection window. Unpaired detections are discarded.
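A window-based pairing scan of this kind can be sketched as a greedy forward pass over the two time-sorted detection lists (a simple sketch of coincidence selection, not necessarily the algorithm used in analyse.py):

```python
def pair_coincidences(times_a, times_b, window):
    """Pair detections from two time-sorted lists when they fall within
    `window` of one another; unpaired detections are discarded.
    Greedy forward scan: an illustrative sketch, not the analyse.py algorithm."""
    pairs, j = [], 0
    for i, t in enumerate(times_a):
        # Skip B-detections that occur too early to match detection t.
        while j < len(times_b) and times_b[j] < t - window:
            j += 1
        if j < len(times_b) and abs(times_b[j] - t) <= window:
            pairs.append((i, j))
            j += 1  # each B-detection is used at most once
        # Otherwise A-detection i stays unpaired and is discarded.
    return pairs

# A window of 0.034, as in epr-clocked:
print(pair_coincidences([0.0, 1.0, 2.0], [0.01, 1.5, 2.02], 0.034))
```

Greedy matching is not the only choice; different matching rules can pair the same data differently, which is part of what makes the coincidence loophole subtle.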
I ran the program, setting the spin to equal 0.5, and the duration of the experiment to 10 seconds. I let Alice and Bob use the settings for a CHSH experiment: Alice uses angles 0 and 90 degrees, Bob uses angles 45 and 135 degrees. (As in epr-simple, Michel Fodje took the angles corresponding to a polarization experiment instead of a spin experiment.) I set the numpy random seed to the values "1234", "2345", and "3456" prior to running the source program, Alice's station program, and Bob's station program respectively. This should make my results exactly reproducible . . . but it didn't quite, because the 10 second duration of the experiment is ten seconds in "real time". It therefore depends on queries by the program of the actual time in the real world outside the computer, and this process itself can take different lengths of time on each new run. However, the difference between the data obtained in different runs (with the same seed) should be negligible: the total number of particle pairs will vary slightly, but their initial segments should coincide.
In order to get a strictly reproducible simulation, I rewrote one section of the program "source.py". Below is my replacement code. Instead of running for 10 seconds of real time, the code simply generates exactly 200 000 emissions.
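A minimal sketch of such a fixed-count emission loop (hypothetical names, not Fodje's actual source.py code) could look like this:

```python
import numpy as np

# Illustrative sketch only: replace a 10-second wall-clock loop by a loop that
# emits exactly N_EMISSIONS pairs, so that results depend only on the seed.
N_EMISSIONS = 200_000

rng = np.random.RandomState(1234)  # same seeding style as elsewhere in this note

def emit_all(n=N_EMISSIONS):
    """Generate exactly n emission events, one hidden angle per emitted pair."""
    return rng.uniform(0, 2 * np.pi, n)

emissions = emit_all()
print(len(emissions))
```

The point is simply that a deterministic count, unlike a real-time clock, makes every run of the source bit-for-bit reproducible.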

    Npairs <- 27416 + 27512 + 27345 + 27425
    Nsingles <- 199994
    Npairs; Nsingles
    gamma <- Npairs/Nsingles
    gamma
    6 / gamma - 4

The "correct" bound to the coincidence-selected CHSH quantity S is not 2. In fact, we do not know it exactly, but a correct bound is conjectured to be 6/γ − 4 (Larsson, 2014). Here, γ is the effective efficiency of the experiment, measured as the chance that a detected particle on one side of the experiment will be accepted as part of a coincidence pair. Notice that at γ = 1 (full efficiency) the adjusted bound is equal to the usual bound 2, but that as the efficiency decreases from 100% the bound rapidly increases. At γ = 3/4 it reaches the natural maximum of CHSH, 4.
At the observed efficiency of about 55%, the corrected CHSH bound in this experiment is close to 7, far above the natural and absolute bound 4.
Because the efficiency is lower than in epr-simple, while the proper bound (adjusted CHSH) is higher, this experiment is a good deal worse than the previous one in terms of efficiency. It wouldn't be too difficult, at this level of experimental efficiency, to tune parameters of this model so as to get the observed value of CHSH up to its natural maximum of 4.
Incidentally, running epr-clocked many times, I experienced quite a few failures of the program "analyse.py", which is supposed to extract the coincidence pairs from the two data files. It seems that the Larsson algorithm for finding the pairs, which Fodje has adopted for this part of the data analysis, fails in some circumstances. I could not find out the cause of this.
Fortunately it is now rather easy to import Numpy ("numerical python") binary data files into R using the package "RcppCNPy". It should also not be difficult to find a suitable alternative to Larsson's matching algorithm in the computer science literature, probably freely available in C++ libraries. Algorithms written in C++ can often easily be made available in R via "Rcpp". Hence one could replace Michel Fodje's "analyse.py" by one's own data analysis script; this would also allow a "proper" computation of the efficiency γ, taking the minimum over the efficiencies for each setting pair and both permutations of the two parties Alice and Bob.
The appendices below contain mathematical formulas for the two simulation models. I have done my best to extract these faithfully from the original Python code and accompanying explanations by the author Michel Fodje. I have earlier published translations into the R language: http://rpubs.com/gill1109/epr-simple, http://rpubs.com/gill1109/epr-clocked-core, http://rpubs.com/gill1109/epr-clocked-full

Appendix: epr-simple
Here is a little simulation experiment with (my interpretation in R of) epr-simple. I plot the simulated correlation function and also the acceptance rate. With an effective sample size of N ≈ 0.8 × 10^6, the statistical error in the simulated correlation coefficients is roughly of size 0.001, well below the resolution of the graphs plotted below. Thus the small visible deviations from the theoretical negative cosine curve are real. I simplify the model by taking spin = 1/2. Formulas are further simplified by a sign flip of all measurement outcomes, which, by the symmetries of the model, does not change the observed data statistics.
    set.seed(9875) ## For reproducibility. Replace integer seed by your own,
                   ## or delete this line and let your computer dream up one
                   ## for you (it will use system time + process ID).
His special choice has P = 2/√V − 1 ∈ [0, 1] where V is uniform on [1, 4], first expressed in this way by myself in 2014; see http://rpubs.com/gill1109/pearle and Gill (2015). Below, the simulation is modified accordingly: just one line of code is altered. Now the experimental and theoretical curves are indistinguishable.
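This choice is easy to check numerically. Since P ≤ p exactly when V ≥ 4/(1 + p)², and V is uniform on [1, 4], the distribution function of P is F(p) = (4 − 4/(1 + p)²)/3. A quick Monte Carlo sketch confirming this:

```python
import numpy as np

rng = np.random.default_rng(2014)
V = rng.uniform(1, 4, 1_000_000)
P = 2 / np.sqrt(V) - 1   # Pearle threshold: V = 4 gives P = 0, V = 1 gives P = 1

assert P.min() >= 0 and P.max() <= 1   # P lies in [0, 1]

def cdf(p):
    """Closed-form CDF: P <= p iff V >= 4/(1+p)**2, and V is uniform on [1, 4]."""
    return (4 - 4 / (1 + p) ** 2) / 3

# Empirical vs. closed-form CDF at a few points:
for p in (0.25, 0.5, 0.75):
    print(p, round((P <= p).mean(), 4), round(cdf(p), 4))
```

Note F(0) = 0 and F(1) = 1, as required for a distribution supported on [0, 1].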

Appendix: epr-clocked
Here is the core part of (my interpretation in R of) epr-clocked: the emission of a pair of particles, supposing that both are detected and identified as belonging to a pair. However, if the detection times of the two particles are too far apart (larger than the so-called coincidence window), the pair is rejected. This would correspond to running the full epr-clocked model with very low emission rate. I have simplified the model by fixing spin = 1/2. To save computer time and memory, I re-use the hidden variables E ("e") and U from epr-simple.
Again, I plot the simulated correlation function and also the acceptance rate.
At Alice's measurement device, where the setting is α, the measurement outcome is sign(cos(E − α)). At Bob's measurement device, where the setting is β, the measurement outcome is sign(cos(E − β + π)).
Finally, the two detections are accepted as belonging to one particle pair if the difference between their two delay times is less than 0.034; i.e., if they are detected within 0.034 time units of one another.
The full "epr-clocked" model adds, on top of this simple core, some further (relatively small) sources of noise, which serve to smooth out the anomalous spike when the two particles are measured in the same direction. Moreover, in his simulations, Michel Fodje does not measure at fixed directions, but samples measurement directions uniformly at random on the circle. When he computes correlations, he has to bin measurement directions, resulting in another source of noise, again further smoothing out anomalous features in the observed correlation curve.

Appendix: epr-clocked optimized
Inspection of the data-sets generated by the last model turned up something very interesting: the coincidence window is so small that, for most measurement settings, a pair of detections is only accepted as a pair if both particles experience the same, zero, time delay! But this means that the two detections are accepted as a pair if and only if P < 1.28|cos(E − α)| and P < 1.28|cos(E − β)|, because only in this case are both time delays identically equal to zero. Now we are free to divide these inequalities throughout by the constant 1.28 and absorb it into the random variable P, and we are free to pick a different distribution for P. So let us take P the same as in the Pearle model! Let us also reduce the size of the coincidence window to make it almost impossible for particles which experience non-zero time delays to become paired with their partners.
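Putting these observations together gives a very compact core model: accept a pair exactly when P ≤ |cos(E − α)| and P ≤ |cos(E − β)|, with P drawn as in the Pearle model. A sketch of this optimized core (my own condensed reading, not Fodje's code):

```python
import numpy as np

rng = np.random.default_rng(3456)

def clocked_core(alpha, beta, n=200_000):
    """Accept a pair iff both (rescaled) time delays are zero, i.e.
    P <= |cos(E - alpha)| and P <= |cos(E - beta)|, with Pearle's P."""
    E = rng.uniform(0, 2 * np.pi, n)
    P = 2 / np.sqrt(rng.uniform(1, 4, n)) - 1
    accept = (P <= np.abs(np.cos(E - alpha))) & (P <= np.abs(np.cos(E - beta)))
    A = np.sign(np.cos(E - alpha))            # Alice's outcome
    B = np.sign(np.cos(E - beta + np.pi))     # Bob's outcome
    return (A[accept] * B[accept]).mean(), accept.mean()

# If the identification above is right, the accepted-pair correlation should
# track the singlet value -cos(alpha - beta):
for deg in (0, 45, 90, 135):
    corr, rate = clocked_core(0.0, np.radians(deg))
    print(deg, round(corr, 3), round(-np.cos(np.radians(deg)), 3), round(rate, 3))
```

At α = β the accepted outcomes are perfectly anti-correlated by construction, since B = −A whenever both acceptance conditions (which then coincide) hold.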