1. Introduction
Amplified ultrashort laser pulses are in widespread use for the study of numerous phenomena and have provided important insights across many disciplines. Whether a study involves single-shot or multi-shot excitation of an effect, pulse-train stability is essential for optimal performance in any application. In other words, such measurements require consistency not only in the pulse energy but also in the pulse shape, i.e., its intensity and phase evolution during the pulse. For instance, high harmonic generation, which serves as the foundation for many important disciplines, including attosecond light sources [
1], ultrafast optoelectronics [
2], and attosecond spectroscopy [
3,
4,
5], demands high stability in all pulse characteristics. Moreover, spectroscopic methods [
6,
7,
8,
9,
10,
11,
12,
13] use ultrashort pulses to enable the exploration of fundamental properties of matter, such as transient photo-induced phenomena, electronic structure, the dynamics of bound and free electrons, quantum coherence, and quantum spin, all of which require pulse-train stability to prevent fluctuating excitations of the sample with each pulse, especially when nonlinear dependencies or small temporal or spectral variations in the ultrashort pulse are being investigated [
14]. The generation of ultrashort pulses at exotic wavelengths, e.g., with optical parametric oscillators and amplifiers, requires pump lasers whose stability is especially consequential, as the output pulses depend nonlinearly on the properties of the input light. Stable pump lasers are also necessary to achieve consistent supercontinuum generation in optical fibers, in particular, hollow-core fibers [
15,
16], and for applications in high-intensity (>TW) ultrashort pulses, especially as the repetition rates and average power of these systems continue to increase [
17]. Indeed, the medical [18, 19, 20, 21, 22] and industrial [23, 24] domains now rely on amplified ultrashort laser pulse trains for surgical procedures, imaging, and micro-fabrication, including ultrashort pulse ablation manufacturing [25, 26, 27], selective laser-induced etching [28], and powder bed alloy fusion [29, 30].
Furthermore, existing short-pulse petawatt-class lasers have now demonstrated new high-intensity laser-matter interactions that lead to secondary sources of high-energy photons [
31], neutrons [32], and charged particles [33, 34, 35], with applications to medical imaging [36], cancer radiotherapy [37], multi-modal radiography [38] and tomography [39], and even fast ignition for inertial fusion energy [
40,
41]. Looking ahead, realizing the necessary secondary-source flux for such applications will require petawatt lasers with high average power (hundreds of kW) and increasing repetition rates (>kHz [42, 43]). Over the last few years, high-intensity (>10¹⁸ W/cm²), higher-repetition-rate (10 to 50 Hz) laser systems have been commissioned [
44,
45] and are now online [
17,
46], with 10 kHz high-average-power petawatt systems now designed [
47]. A central need for these high average power systems as their repetition rates approach the MHz level is rapid characterization and feedback [
48] to ensure the highest quality pulse trains and the most stable performance. In all the above applications, laser pulse-shape stability is crucial to ensure reliable secondary source generation.
Unfortunately, amplified ultrashort-laser pulses are often plagued by instability, which can be due to a number of factors, including thermal fluctuations, unstable pump sources, inconsistent mode locking, vibrational pointing jitter, and turbulence in the surrounding air, to name only a few such culprits. The resulting instability tends to be more prevalent in laser systems with higher energies and shorter pulses, and it is especially pronounced in cutting-edge laser systems that advance ultrashort-pulse technology, such as few-cycle-pulse systems and lasers at atypical wavelengths.
From the birth of the field of ultrafast optics, shot-to-shot variations in the pulse shape, that is, the intensity-and-phase vs. time, of such laser pulses have presented a particularly difficult challenge for ultrashort-pulse laser measurement [
49,
50]. When faced with a train of pulses with unstable pulse shapes, the temporal intensity autocorrelation (the first method for measuring ultrashort pulses) produces a broad background with a narrow spike atop it. The width of this spike, which has come to be known as the
coherence spike or
coherent artifact, corresponds only to the potentially much shorter coherent temporal component of the unstable pulses in the train. The coherent artifact is therefore narrower than the typical pulse in the unstable train. While it is an obvious spike-like feature when the instability is significant, it can be quite misleading when the instability is more modest; in this case, it may blend into the rest of the trace, yielding a pulse-length estimate that is too short by a factor of two or more, as well as the mistaken conclusion that the pulse shape is stable.
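The origin of the coherence spike can be illustrated with a small numerical sketch. It is not the simulation used in this work: the noise model (a smooth envelope multiplied by band-limited complex Gaussian noise), the grid size, and all names are assumptions chosen only for illustration. For such noise-like pulses, the multi-shot intensity autocorrelation at zero delay should be roughly twice the value on the broad background just outside the spike.

```python
import numpy as np

rng = np.random.default_rng(0)
n, shots = 1024, 200
t = np.arange(n) - n // 2                # time axis (arbitrary units)
envelope = np.exp(-(t / 150.0) ** 2)     # slow envelope shared by all shots

def random_pulse():
    # Assumed instability model: complex white noise smoothed over a few
    # samples (short coherence time), multiplied by the slow envelope.
    noise = rng.normal(size=n) + 1j * rng.normal(size=n)
    kernel = np.exp(-(np.arange(-20, 21) / 5.0) ** 2)
    return envelope * np.convolve(noise, kernel, mode="same")

def intensity_autocorrelation(intensity):
    # A(tau) = sum_t I(t) I(t - tau), computed with zero-padded FFTs so that
    # the correlation is linear rather than circular; zero delay ends up at index n.
    f = np.fft.fft(intensity, 2 * n)
    return np.fft.fftshift(np.fft.ifft(np.abs(f) ** 2).real)

# Multi-shot measurement: the autocorrelation averaged over the whole train.
avg_ac = np.mean(
    [intensity_autocorrelation(np.abs(random_pulse()) ** 2) for _ in range(shots)],
    axis=0,
)

center = n                               # index of zero delay
spike_to_shoulder = avg_ac[center] / avg_ac[center + 30]
print(f"spike / background just outside the spike: {spike_to_shoulder:.2f}")
```

In this toy model the spike width is set by the smoothing kernel (the coherence time), not by the 150-sample envelope, which is why reading the pulse length from the spike underestimates it.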
While it is now possible to measure the complete intensity and phase vs. time for a stable pulse train [
51], many such measurement techniques have not been thoroughly studied for the case of pulse-shape instability and, unfortunately, like autocorrelation, can be badly confused by it. Currently, no device dedicated to the independent verification of pulse-shape instability exists, and, as a result, the responsibility of identifying the existence, severity, and kind of instability falls to the pulse-measurement methods themselves, which are not, in general, designed for this challenge. As a result, insufficient progress has been made in addressing this problem, and recent work [
52,
53,
54,
55,
56] has revealed, among other results, the startling fact that, in the presence of pulse-shape instability, some widely used (mostly interferometric) methods measure
only the coherent artifact.
There are numerous complications in attempting to measure such an unstable pulse train by any technique, as all measurement techniques necessarily provide a single resulting pulse from a given measured trace, and no one pulse can properly represent the many possible different pulses over which a measurement is made. As a result, the task is fundamentally impossible, and one must settle for quantities that are in some sense averaged. Unfortunately, this can yield results that are of little to no value.
Over the centuries, traditional spectrometers have provided a simple “average” spectrum. Of course, such measurements average out any spectral structure and so necessarily deliver a smoother spectrum than is in fact present. An extreme example of the highly misleading information provided by such a measurement is the multi-shot spectrometer measurement of the spectrum of a supercontinuum pulse from microstructure fiber, which yielded an extremely smooth spectrum, despite the fact that each individual pulse spectrum was wildly different and actually comprised over a
thousand sharp spikes [
58,
59].
As a result, a
typical spectrum, which often differs significantly from the
average spectrum, could be, and often is, significantly more complex, but it would be much more informative. Remarkably, in the above-mentioned study, the Cross-correlation Frequency-Resolved Optical Gating (XFROG) technique [
58,
59] used to measure the continuum averaged over
100 billion pulses but nevertheless actually provided such a typical spectrum. As a result, XFROG has become the standard method for measuring such light pulses.
The spectral phase requires even more significant consideration. It is well known that, for a given spectrum, the shortest pulse corresponds to a spectral phase that is flat, while that of a longer, typically more complex, pulse is necessarily complex. Of course, simply averaging the spectral phase over many pulses with random complex spectral phases will, like a spectrometer-measured spectrum, also yield a much simpler and smoother curve, indeed, often a flat spectral phase. As a result, measuring the average spectral phase of an unstable pulse train usually erroneously yields the shortest possible pulse for a given measured spectrum (which, by the way, is also usually averaged over many pulses in these methods and so is also anomalously smooth). Indeed, measuring the average spectral phase will always yield a pulse that is shorter than any of the individual pulses in the train, and often by a large factor. In other words, the average spectral phase is the frequency-domain description of the coherent artifact.
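A short numerical sketch may make this averaging effect concrete. It assumes a hypothetical instability model in which every pulse has the same Gaussian spectrum but a random quadratic and cubic spectral phase; the scales, grid, and names are illustrative and are not taken from this work. The pulse built from the shot-averaged spectral phase is compared with the individual pulses and with the flat-phase limit.

```python
import numpy as np

rng = np.random.default_rng(1)
n, shots = 512, 500
w = np.linspace(-1.0, 1.0, n)               # frequency offset from carrier (arb. units)
amplitude = np.exp(-0.5 * (w / 0.2) ** 2)   # same spectral amplitude for every shot

# Assumed instability model: random quadratic and cubic spectral phase per shot.
phases = np.array([
    rng.normal(scale=300.0) * w**2 + rng.normal(scale=1000.0) * w**3
    for _ in range(shots)
])

def rms_duration(phase):
    """RMS width, in time samples, of the pulse with the given spectral phase."""
    field_w = amplitude * np.exp(1j * phase)
    field_t = np.fft.fftshift(np.fft.ifft(np.fft.ifftshift(field_w)))
    intensity = np.abs(field_t) ** 2
    intensity /= intensity.sum()
    samples = np.arange(n)
    center = np.sum(samples * intensity)
    return np.sqrt(np.sum((samples - center) ** 2 * intensity))

individual = np.array([rms_duration(p) for p in phases])
from_average_phase = rms_duration(phases.mean(axis=0))
flat_phase_limit = rms_duration(np.zeros(n))

print(f"mean individual RMS duration : {individual.mean():.1f} samples")
print(f"pulse from averaged phase    : {from_average_phase:.1f} samples")
print(f"flat-phase (shortest) pulse  : {flat_phase_limit:.1f} samples")
```

Because the random phase coefficients average toward zero, the averaged phase is nearly flat, and the corresponding pulse is close to the shortest pulse the spectrum supports, even though the typical pulse in this toy train is far longer.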
As a result, it is crucial that a pulse-measurement technique
not provide an average spectral phase, which is an essentially useless quantity. If it does, it will invariably yield a shorter pulse than is, in fact, present—unless the pulse train is perfectly stable (and spatially uniform). Indeed, such a method will be unable to differentiate a stable train of short, simple pulses from an unstable train of long, complicated pulses—the best- and worst-case scenarios, respectively, for most applications. Although critically important, this issue is often overlooked and/or poorly understood, and several popular techniques currently in use (and in use for decades) suffer from precisely this problem [
52,
53,
54,
55,
56]. As with the spectrum, but much more importantly here, a pulse-measurement technique should ideally provide a
typical spectral phase, or at least a phase as close as possible to it, which, coupled with a typical spectrum, would then more accurately reflect the average pulse length in time. If this is not possible (and it usually isn’t), the measurement should at least provide a typical pulse length.
Although single-shot measurement is the obvious solution to this problem, it is not possible for many laser systems, especially high-repetition-rate systems, for which even the shortest camera exposure times still capture multiple pulses in the train. Moreover, spatially averaging these quantities over a spatially complex beam would likely have precisely the same smoothing effect (although this effect has not been studied). Fortunately, some progress has been made: we earlier demonstrated that discrepancies between measured and retrieved Frequency-Resolved Optical Gating (FROG) traces are a good indicator of instability [58, 59]. These discrepancies are a beneficial result of FROG’s overdetermination of the pulse: FROG’s N×N data array is used to determine only 2N pulse parameters. So a trace that is averaged over many different pulses generally cannot correspond to a single pulse. This has turned out to be a very helpful feature, allowing FROG to indicate instability by the presence of systematic discrepancies between measured and retrieved FROG traces.
Unfortunately, possible pulse-retrieval algorithm stagnation can also yield similar discrepancies. Even in the absence of instability, iterative algorithms and, in particular, FROG’s standard Generalized Projections (GP) algorithm can stagnate for complex pulses, yielding a retrieved pulse that bears little resemblance to a typical pulse and also depends on the initial guess. So, distinguishing between pulse-shape instability and algorithm stagnation—two very different issues—as the cause of such trace discrepancies is critical.
This has required us to redefine what we mean by algorithm “convergence” when dealing with pulse-shape instability [
60]. In the absence of instability, convergence is easy to identify: the only differences between the measured and retrieved traces are due to random noise in the measured trace. And stagnation can be easily visually identified by the presence of systematic errors in the difference between the two traces (assuming that the measurement was made correctly).
In the presence of instability, however, the convergence concept is more subtle. In this case, stagnation, as we recently defined it [
60], occurs when the RMS difference between the measured and retrieved traces, usually referred to as
G, is higher than the lowest achievable
G value for the given measured trace. But what is this latter value? Determining it requires running the relevant FROG algorithm numerous times (assuming that the algorithm converges at least once, which nearly always occurs in FROG in practice). The pulse with the lowest value of
G is declared the converged case and hence the best estimate of the typical pulse. Comparing its trace with the measured trace then indicates the degree of instability: the larger the systematic error between the two traces, the more instability is present. This is reasonable, but, unfortunately, running an algorithm many times is neither convenient nor always completely convincing.
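The multiple-run procedure just described can be sketched as follows. This is a minimal illustration, not the code used in this work: it assumes the common convention in which G is the RMS difference between the peak-normalized measured trace and the retrieved trace, with the retrieved trace rescaled by the real factor that minimizes the error. The function names and the toy data are hypothetical.

```python
import numpy as np

def frog_error(measured, retrieved):
    """RMS trace error G (assumed convention: measured trace normalized to
    unit peak, retrieved trace rescaled by the error-minimizing factor mu)."""
    measured = measured / measured.max()
    mu = np.sum(measured * retrieved) / np.sum(retrieved ** 2)
    return np.sqrt(np.mean((measured - mu * retrieved) ** 2))

def best_retrieval(measured, retrieved_traces):
    """Return the index of the run with the lowest G, plus all G values."""
    errors = [frog_error(measured, r) for r in retrieved_traces]
    return int(np.argmin(errors)), errors

# Toy data standing in for one measured trace and three retrieval runs.
rng = np.random.default_rng(2)
measured = rng.random((32, 32))
runs = [
    3.0 * measured,                              # perfect up to a scale factor
    measured + 0.05 * rng.random((32, 32)),      # small residual discrepancy
    rng.random((32, 32)),                        # stand-in for a stagnated run
]
best_index, errors = best_retrieval(measured, runs)
print(best_index, [f"{g:.4f}" for g in errors])
```

The run with the lowest G is taken as the converged result; whatever systematic discrepancy remains in that run is then attributed to instability rather than to stagnation.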
What is needed is an algorithm that always converges to the lowest possible
G value, even in the presence of instability. Fortunately, in previous work, we showed that this can be done for the second-harmonic-generation (SHG) version of FROG. Specifically, we demonstrated that our recently introduced Retrieved-Amplitude N-grid Algorithmic (RANA) approach [
61,
62,
63] not only achieves extremely reliable (100%) pulse-retrieval in SHG FROG for trains of stable pulse shapes, even in the presence of noise, but also does so for
unstable pulse trains and so reliably distinguishes between trains of stable and unstable pulse shapes. It also provides a reasonable estimate of the average pulse length, spectral width, and time-bandwidth product (
TBPrms) in the train. This is the case because the RANA approach is extremely reliable and, in studies involving tens of thousands of simulated pulses, even in the presence of significant noise, has
never stagnated. And specifically, we showed that it also did not stagnate even in the presence of pulse-shape instability [
53]. It also provided many of the characteristics of a “typical” pulse in the unstable train [
57] (although it tended to under-estimate the amount of typical pulse structure).
As a result, an analogous algorithm that never stagnates, even in the presence of instability, is also highly desirable for the versions of FROG that are generally used to measure amplified pulses. These include the polarization-gating (PG) and transient-grating (TG) FROG variants (which are highly desirable because they effectively eliminate the direction-of-time ambiguity of SHG FROG). Additionally, PG FROG is automatically phase-matched, and TG FROG is broadly phase-matched, so neither is usually limited by the pulse’s bandwidth.
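PG FROG and the more common version of TG FROG are mathematically equivalent. The mathematical relation for their measured trace is:

$$ I_{\mathrm{FROG}}^{\mathrm{PG}}(\omega,\tau) = \left| \int_{-\infty}^{\infty} E(t)\,\left|E(t-\tau)\right|^{2}\, e^{-i\omega t}\, dt \right|^{2} $$

where E(t) is the pulse’s complex electric field as a function of time, t. Also, ω is the angular frequency, and τ is the delay between the pulses.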
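As an illustration, the standard PG FROG expression, the squared magnitude of the Fourier transform over t of E(t)|E(t−τ)|², can be evaluated on a discrete grid in a few lines. This is a minimal sketch under stated assumptions: the delay is applied as a circular shift (so the pulse must sit well inside the time window), the delay step equals the time step, and the function name and test pulse are hypothetical.

```python
import numpy as np

def pg_frog_trace(field):
    """Discrete PG FROG trace; rows are frequency, columns are delay.

    Assumes the delay step equals the time step and uses a circular shift
    for the gate, which is adequate when the pulse is well inside the window.
    """
    n = field.size
    gate = np.abs(field) ** 2
    trace = np.empty((n, n))
    for j, shift in enumerate(range(-n // 2, n // 2)):
        gated = field * np.roll(gate, shift)          # E(t) |E(t - tau)|^2
        spectrum = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(gated)))
        trace[:, j] = np.abs(spectrum) ** 2
    return trace

# Example: a flat-phase Gaussian pulse on a 64-point grid.
t = np.arange(64) - 32
field = np.exp(-(t / 6.0) ** 2).astype(complex)
trace = pg_frog_trace(field)
peak = np.unravel_index(np.argmax(trace), trace.shape)
print(trace.shape, peak)
```

A multi-shot trace of the kind discussed here would correspond to averaging such traces over the different pulses in a train.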
As a result, as we did for SHG FROG, we here consider trains of unstable complex pulses measured by PG and TG FROG using the analogous RANA approach for them. And we compare the performance of the RANA approach to that of the well-known generalized-projections (GP) algorithm without the RANA improvements for these two FROG beam geometries. We find, for these FROG variations, that the standard GP algorithm also often fails to converge for unstable pulse trains (as it occasionally does for complex stable ones), yielding variable and hence potentially confusing trace discrepancies. As a result, it is an imperfect indicator of instability. But we find that the RANA approach, on the other hand, yields minimal trace discrepancies for all the cases considered, that is, zero stagnations, even for highly unstable pulse trains. It also yields accurate pulse parameters, as well as much of the structure of a typical pulse. We conclude that PG and TG FROG, coupled with the RANA approach, like SHG FROG with RANA, provide highly reliable indicators of pulse-shape instability. And we find that PG/TG FROG yield an even better estimate of a typical pulse in the train, even in cases of high instability.
3. Results
As mentioned earlier, if an algorithm reliably converges, even in the presence of instability, we would expect it to achieve the minimum
G or
G’ error for all reasonable initial guesses. If not, then we would expect to see variations in these values depending on the initial guess.
Figure 1 depicts all the
G’ errors obtained from 100 runs using both the standard GP algorithm and RANA approach on noisy traces.
These results clearly illustrate that the RANA approach converges to the smallest value of
G’ for all runs and for all instability values. In other words, it converged in all cases. In contrast, the standard GP algorithm does fairly well for the least unstable pulse train, but, for increasing instability, it shows significant variability in the resulting
G’ errors, often requiring multiple attempts to achieve the minimum achievable
G’ error. For pulse trains with an average temporal FWHM of 26 fs, the standard GP algorithm successfully converged to acceptable pulses with traces closely matching the measured ones in 92% of the trials. This performance decreased to 86% and 44% for pulse trains with average FWHMs of 54 fs and 108 fs, respectively, that is, more instability. While this performance is not perfect and quite undesirable for the less stable trains, it is nevertheless considerably better than that of GP for SHG FROG that we found in our previous study [
60].
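Convergence percentages of this kind can be tallied from a list of final trace errors, as in the sketch below. The criterion used here, counting a run as converged when its error lies within a relative tolerance of the smallest error found, is an assumption made for illustration; the function name, the tolerance, and the sample values are hypothetical and are not the criterion or data of this study.

```python
def convergence_rate(errors, rel_tol=0.05):
    """Fraction of runs whose error is within rel_tol of the smallest error."""
    best = min(errors)
    converged = [e <= best * (1.0 + rel_tol) for e in errors]
    return sum(converged) / len(errors)

# Hypothetical final errors from five retrievals of the same trace.
sample_errors = [0.010, 0.0101, 0.010, 0.020, 0.050]
rate = convergence_rate(sample_errors)
print(f"converged in {100 * rate:.0f}% of runs")
```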
It is important to note that, in our studies, the impact of trace noise was minor compared to that of pulse-shape instability. As is well known, random noise in FROG traces can be minimized using simple preprocessing. In stable pulse trains, the converged G’ error should, and does, align with the average random noise remaining in the measured trace after preprocessing; G’ errors larger than this necessarily indicate pulse-train instability, at least for the RANA approach. For the standard GP algorithm, these discrepancies are a sum of stagnation and instability effects, and the best way to distinguish them using the standard GP algorithm at this time appears to be running the algorithm many times and using the result with the smallest G or G’ error, as we have done here and as is often done in practice. Of course, this is far less desirable than running a more reliable algorithm only once.
Figure 2a shows the resulting multi-shot trace of train #3. Note the obvious thin vertical line in it, i.e., the coherent artifact, which is intimately related to that of autocorrelation (recall that FROG is a spectrally resolved
autocorrelation). Of course, while FROG traces exhibit this artifact, this does not pose a problem in FROG, first because its presence alerts the algorithm and the user (as for autocorrelation) to the instability, and our results confirm this. Also, as such an artifact cannot occur in a stable-pulse or single-pulse FROG trace, all FROG algorithms necessarily ignore it, using it and other information in the trace to retrieve the best possible typical pulse, as we found in our earlier work for SHG FROG [
60]. This is also illustrated in the worst-case retrieved traces for both algorithms we studied, shown in
Figure 2b,c, where no such vertical line appears for either algorithm. In other words, even in the worst cases, no such artifact occurs in retrieved traces. Do note, however, that the standard GP algorithm is somewhat confused by this effect and retrieves a pulse with a larger area, that is, a larger TBP. This occurs, not only for the worst case, but also for the average retrieved pulse for GP. The RANA approach does much better and returns fairly accurate TBPs in all three cases.
Figure 2d-g show every retrieved pulse field using both the GP algorithm (d,f) and the RANA approach (e,g) from a randomly chosen noisy trace of train #3, with 108 fs average temporal FWHM. The standard GP algorithm yields a variety of pulse shapes, based to some extent on the initial guess, with GP-retrieved pulses showing a large range of variations in the temporal and spectral FWHMs. Additionally, the worst-case (and many other unshown) GP-retrieved traces bear no resemblance to the measured trace. In contrast, pulses retrieved using the RANA approach, although somewhat less structured than the actual pulses, do show some of the typical pulse structure and have pulse lengths and spectral widths that closely match those of the measured pulses in the unstable train. Although we should have no expectation of convergence to identical pulses on all runs of any algorithm in the presence of instability, when the trace corresponds to no single pulse, impressively, the maximum and minimum pulse lengths retrieved by RANA are quite similar, indicating that, even in the presence of instability, it achieves a reasonable “typical” pulse length.
Table 2 shows the average
G′ errors and
TBPs for the 100 RANA and GP retrievals from the various FROG traces. For the least unstable case, the two approaches yield similar results, indicating that the standard GP algorithm performs reasonably well for somewhat complex and unstable pulse trains. But, for the two most unstable trains, RANA yields smaller
TBPs and standard deviations than GP, indicating both better convergence and less variation in the retrieved pulses. The larger standard deviations of the
TBP values of the standard GP algorithm indicate that, when convergence is not achieved, the retrieved fields exhibit significant variations, resulting in inconsistent and arguably unreliable results.
Figure 3 plots the RMS time-bandwidth product (
TBPrms) against the
G’ error for all of the retrieved pulses using both the GP and RANA approach. Because we used the same pulse trains in this simulation as in our previous work, the average theoretical pulse length and
TBPrms values for the fluctuating pulses of the trains were the same, with averages of
τFWHM = 26, 54, and 108 fs and
TBPrms of 1.97, 4.75, and 9.28, respectively. When convergence is achieved with either the RANA approach or the GP algorithm, the
TBPrms values align well with the actual values. However,
Figure 3 also indicates that stagnated results from using the standard GP algorithm can lead to
TBPrms values that do not accurately reflect the average
TBPrms of the pulses within the train, with discrepancies increasing with increasing instability.
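For readers who wish to compute an RMS time-bandwidth product for a retrieved field, a minimal sketch follows. It assumes one common definition, the product of the standard deviations of the temporal intensity and of the spectral intensity in angular frequency, for which a flat-phase Gaussian gives 0.5; the normalization behind the values quoted here may differ, and the function names and test pulses are hypothetical.

```python
import numpy as np

def rms_width(axis, weights):
    """Standard deviation of `axis` under the distribution `weights`."""
    weights = weights / weights.sum()
    mean = np.sum(axis * weights)
    return np.sqrt(np.sum((axis - mean) ** 2 * weights))

def tbp_rms(t, field):
    """RMS time-bandwidth product (assumed definition: sigma_t * sigma_omega)."""
    dt = t[1] - t[0]
    omega = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(t.size, d=dt))
    spectrum = np.abs(np.fft.fftshift(np.fft.fft(field))) ** 2
    return rms_width(t, np.abs(field) ** 2) * rms_width(omega, spectrum)

t = np.linspace(-50.0, 50.0, 2048)
gaussian = np.exp(-t**2 / 18.0)                  # flat phase
chirped = np.exp(-(1 - 3j) * t**2 / 18.0)        # same intensity, linear chirp

tbp_gaussian = tbp_rms(t, gaussian)
tbp_chirped = tbp_rms(t, chirped)
print(f"flat-phase Gaussian: {tbp_gaussian:.3f}, chirped Gaussian: {tbp_chirped:.3f}")
```

With this definition, added phase structure raises the product above the flat-phase value, so larger values correspond to more complex pulses.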
As discussed in our previous work [
63], the RANA approach uses four directly retrieved approximate spectra to produce the initial guesses from the measured trace for input into the iterative process, and we also did this in this study of PG/TG FROG. Note that these directly retrieved spectra serve only as initial guesses for the algorithm and need not be very accurate. In the absence of instability, such spectra usually accurately approximate the pulse spectrum, but, in the presence of instability, they instead reflect the
average spectrum. This is to be expected, but it could, in principle, degrade the performance of the RANA approach. However, RANA’s other advantageous features appear to compensate for this averaging. Also, the average spectrum, whose width is fairly accurate, is still a better initial guess than noise, the usual initial guess for standard GP.
Figure 4 plots the average spectra of the trace along with the first choices among these four retrieved spectra.
It is important to note that the mean retrieval times for traces with instability are significantly longer than those for stable trains for all algorithms. This is because convergence rates are slower when the trace does not match an actual pulse, requiring more iterations for the algorithm to converge. However, the average retrieval time for the RANA approach is less than that for the GP algorithm, even in cases where the GP algorithm converges.
4. Conclusions
In this study, we explored the convergence behavior of the standard GP algorithm and the recently introduced RANA approach for PG/TG FROG traces averaged over trains of pulses with unstable shapes and hence with coherent artifacts. As expected, we found that instability yields PG/TG FROG traces that do not correspond to traces from single pulses, which implies that these techniques are very useful for identifying instability, in addition to measuring the pulse intensity and phase vs. time. But we found that the standard GP algorithm often fails to clearly distinguish between discrepancies caused by instability artifacts and those due to algorithm stagnation, making it difficult for it to identify instability accurately. This necessitates multiple retrieval attempts to find the lowest G error, with no definitive way to determine if the discrepancies are due to instability or algorithm stagnation.
In contrast, our results show that, for traces of trains with unstable pulse shapes, the RANA-retrieved trace consistently achieves the minimum discrepancy with the measured trace and hence the lowest G’ error; that is, RANA converged in all cases that we considered. Thus the RANA approach, established to have 100% reliability with stable pulses, also proves highly effective in evaluating pulse-train stability. This reliable convergence indicates that any discrepancies between the measured and retrieved traces are due to pulse-shape fluctuations, and not algorithm stagnation. Moreover, the retrieved field accurately reflects the average pulse length, spectral width, and TBP of the pulses in the train, providing the best available depiction of a typical pulse: it reproduces some of the structure present in a typical pulse in the train, although somewhat less than is actually present.
In summary, the RANA approach offers the most reliable FROG pulse-retrieval approach for both stable and unstable pulse trains, making it a highly reliable gauge of the average pulse length, TBP, and pulse-train stability or instability and, of course, the precise pulse intensity and phase vs. time and frequency for stable pulse trains. As no other pulse-measurement technique has, to our knowledge, been shown to possess all of these characteristics, RANA (in particular, in conjunction with PG/TG FROG) yields the best general performance of all existing pulse-measurement techniques.
Finally, it is worth considering possible future effort on this subject. First, performing this analysis on larger sets of pulse trains would be helpful, as is always the case for iterative algorithms. Also, additional versions of FROG, including those that use the nonlinear processes of self-diffraction (which is mathematically equivalent to the alternative arrangement of TG FROG, not considered herein) and third-harmonic generation, would also benefit greatly from the RANA approach, and their ability to reliably discern instability should be determined. In addition, the RANA approach can utilize any FROG algorithm (not just GP) as its kernel, and we believe that algorithm performance for both stable and unstable trains would be vastly improved using it in conjunction with those algorithms. Lastly, no technique, including that described herein, can determine the precise type of instability present in the pulse train. As the parameter space involved in such a study is vast, this is a challenge that is unlikely to be met any time soon, but we mention it here to inspire future generations.