# Programmable electrical coupling between stochastic magnetic tunnel junctions

Sidra Gibeault<sup>®</sup>,<sup>1</sup> Temitayo N. Adeyeye,<sup>1</sup> Liam A. Pocher,<sup>1</sup> Daniel P. Lathrop<sup>®</sup>,<sup>1</sup>

Matthew W. Daniels<sup>(D)</sup>,<sup>2</sup> Mark D. Stiles<sup>(D)</sup>,<sup>2</sup> Jabez J. McClelland,<sup>2</sup> William A. Borders<sup>(D)</sup>,<sup>2</sup>

Jason T. Ryan<sup>®</sup>,<sup>2</sup> Philippe Talatchian,<sup>3</sup> Ursula Ebels,<sup>3</sup> and Advait Madhavan<sup>®</sup>,<sup>2</sup>,\*

<sup>1</sup>Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, Maryland, USA

<sup>2</sup> Physical Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland, USA

<sup>3</sup>CEA, CNRS, Grenoble INP, SPINTEC, University Grenoble Alpes, Grenoble 38000, France

(Received 20 December 2023; revised 22 February 2024; accepted 27 February 2024; published 29 March 2024; corrected 3 January 2025)

Superparamagnetic tunnel junctions (SMTJs) are promising sources of randomness for compact and energy-efficient implementations of probabilistic computing techniques. Augmenting an SMTJ with electronic circuits, to convert the random telegraph fluctuations of its resistance state to stochastic digital signals, gives a basic building block known as a probabilistic bit or *p*-bit. Though scalable probabilistic computing methods connecting *p*-bits have been proposed, practical implementations are limited by either minimal tunability or energy-inefficient microprocessors in the loop. In this work, we experimentally demonstrate the functionality of a scalable analog unit cell, namely a pair of *p*-bits with programmable electrical coupling. This tunable coupling is implemented with operational amplifier circuits that have a time constant of approximately 1  $\mu$ s, which is faster than the mean dwell times of the SMTJs over most of the operating range. Programmability enables flexibility, allowing both positive and negative couplings, as well as coupling devices with widely varying device properties. These tunable coupling circuits can achieve the whole range of correlations from -1 to 1, for both devices with similar time scales, and devices whose time scales vary by an order of magnitude. This range of correlation allows such circuits to be used for scalable implementations of simulated annealing with probabilistic computing.

DOI: 10.1103/PhysRevApplied.21.034064

#### I. INTRODUCTION

Probabilistic computing is becoming increasingly popular given its applicability to many classes of optimization problems and its amenability to hardware acceleration [1]. The constraints of a given problem are encoded as interactions between Ising spins, which vary randomly to explore configuration space. Conventional implementations either generate random numbers in a centralized location and distribute them to the computational cores [2], or use microprocessors with digital-to-analog converters in the feedback loop to mediate interactions [3,4], neither of which are conducive to scaling. In this work, we use low barrier magnetic tunnel junctions (MTJs) to implement the Ising spins and analog circuits to mediate direct interactions between them, leading to scalable analog implementations of probabilistic computing kernels.

The central objects in probabilistic computers are probabilistic bits, also known as p-bits [5]. Unlike a conventional Boolean bit, which takes only deterministic 1 and 0 values, a p-bit uses its room-temperature instability to encode a probability value. This is particularly useful in applications such as energy-based machine-learning models [6,7], combinatorial optimization [8], and quantum simulation [1,9]. These applications map their problem statements into graphs of interconnected binary nodes with variable interaction strengths [8]. The parameters of the problem at hand are encoded in the interaction strengths, while the collective configuration of the binary nodes encodes the current proposed solution. The interaction strengths define an effective energy for each global configuration [10]. If the state of each node is probabilistic, the configuration evolves with time and can explore configuration space according to a Boltzmann distribution, tending to spend more time in configurations with lower effective energy. Optimal solutions or solutions within the required error margin can be found using simulated annealing [11], in which the interaction strengths are collectively increased in a way analogous to lowering the temperature so that the configuration spends more time in the lowest energy state.

Implementing such calculations on conventional computing systems can be time consuming and inefficient. For large problems, the interconnection weights are necessarily stored on off-chip memory, while iterative calculations

<sup>\*</sup>advait.madhavan@nist.gov

PHYS. REV. APPLIED 21, 034064 (2024)

of the solution are performed on chip [12]. Since conventional computing systems have separate locations for memory and computation, most of the performance limitations arise from having to move data between memory and computation centers [13] (the so-called von Neumann bottleneck). If stochastically fluctuating p-bits are used as the nodes, a large binary state space can be searched efficiently, in terms of both area and energy cost. Large arrays of low-barrier versions of MTJs would be particularly appealing implementations of p-bits provided they can be coupled by locally engineered, electronically tunable interactions between them. This would allow memory (interaction strengths) and computation (accumulation and randomness generation) to be co-located, promising significant performance boosts [14].

Magnetic tunnel junctions are nanoscale magnetic devices that consist of two ferromagnetic layers separated by a thin insulator [15]. The insulating layer is thin enough for electrons to tunnel from one ferromagnetic layer to another. The resistance seen by the tunneling electrons depends on the relative orientation of the magnetizations of the ferromagnetic layers [16]. There are two stable configurations of the magnetizations, namely parallel and antiparallel (denoted as P and AP). The device resistance, also known as the tunnel magnetoresistance (TMR), is smaller if the magnetizations are parallel to each other than if they are antiparallel [17]. The relative orientations between layers can be changed by applying an external magnetic field across the device or passing a current through the device [18,19].

MTJs have various properties that make them useful as memory devices when integrated into modern CMOS chips [20,21]. Their stable configurations are used to encode binary values: for example, the parallel state may encode a value of **0** and the antiparallel state a value of **1**. When used in storage applications [22,23] requiring long retention times (approximately equal to 10 years), the energy barrier between stable configurations is engineered to be  $\geq 40$  kT, where k is Boltzmann's constant and  $T \approx$ 300 K is room temperature. While MTJ-based magnetic random-access memory (MRAM) is a promising candidate for various memory and unconventional computing applications [23–27], reducing the energy barrier reduces exponentially the retention time of the MTJ [28,29].

For energy barriers less than 14 kT, thermal fluctuations at room temperature switch the device between its stable configurations [30] at time scales faster than a millisecond. In this regime, devices are said to be superparamagnetic tunnel junctions (SMTJs). A current applied across the device allows its fluctuating resistance state to be read out as a random telegraph voltage signal and simultaneously controls the relative time that the device spends in each of its stable states [29]. The ability to generate tunable random signals makes SMTJs useful for applications where cheap, energy-efficient randomness is required [31–34]. Conventional CMOS sources of randomness such as linear feedback shift registers are only pseudorandom and require comparatively large area and energy budgets [31].

Although proposals of large-scale probabilistic computing systems using SMTJs as p-bits have been reported in the literature [3,8,9], practical implementations are still limited to a handful of devices. The largest demonstrations use up to 80 p-bits and perform tasks such as integer factorization [3], invertible logic gates [5], learning Boolean functions [6], and traveling-salesman problems [4]. Larger-scale demonstrations are limited by practical factors such as a lack of integrated platforms for exploration, yield, and repeatability failures in the fabrication process, the need for an external magnetic field, and differences in time scales between memory and computation. External magnetic fields, in particular, which are required to set devices in the superparamagnetic regime in many experiments, are inefficient and impractical in commercial applications.

Mismatch of time scales is an especially relevant issue. In probabilistic computing, the states of the nodes are sampled and that set of sampled states determines the input controlling the state of each of the nodes. A key requirement for the correct operation of probabilistic circuits is that determining the input for each node must be completed before the next sample is taken so that the subsequent sample is representative of the input [5]. For free-running interactions, this requirement means that the time scales of the interaction circuits (which can become relatively complex), must be much smaller than the time scales of nodes themselves. Previous demonstrations [3,4,6] measure the device state and perform the calculations of the accumulated interactions with a microprocessor in the loop. This approach works because the devices have switching times in the tens of milliseconds or more, which is enough for thousands of microprocessor cycles to perform computations digitally.

Although using a microprocessor in the loop is useful to demonstrate the functionality and expressibility of probabilistic computing, it is impractical for scalable implementations. Recent papers report simple ways of directly coupling devices. When two SMTJs are connected in parallel [35,36] or in series [37,38] to a voltage source via a tunable resistor, the current flowing through one device depends on the state of the other. Although such approaches can scale theoretically [39], practical problems such as line resistance, device-to-device variability, and noise quickly overwhelm linear circuit solutions as the number of devices grows. Moreover, such simple connections between devices are not enough to implement the large-scale interactions required to build practical systems.

The fastest and most energy-efficient implementations would consist of CMOS integrated with SMTJs in the back end of the line [25]. Such implementations would involve

analog stages mediating interactions between SMTJ-based *p*-bit unit cells. In this work, we take a step in that direction with commercial off-the-shelf analog circuits in a printed circuit board (PCB) from factor. We experimentally demonstrate the functionality of a unit-cell coupling circuit between *p*-bits, with direct analog interaction between field-free SMTJs that have operational time scales in the tens to hundreds of microseconds. The speeds of the analog computation circuits themselves are much faster, on the order of 1 µs, allowing the devices to faithfully couple to each other. The degree of coupling can be tuned by varying the interaction strengths, while the uncoupled probabilities of the devices can be independently set by adjusting bias currents. These two features, coupled with a programmable gain circuit, provide the necessary ingredients to perform optimization based on simulated annealing. This unit cell can be readily adapted to more than two devices, making it a useful stepping stone towards the integration of arrays of devices.

The rest of this paper is organized as follows. Section II describes the coupling circuit and discusses how this circuit can be used to perform simulated annealing. Section III contains a description of the experimental setup and the details of the specific SMTJs used. Section IV presents the experimental results of coupling devices of various time scales. Section V describes a Markov model that accurately describes the switching time scales of the two-SMTJ coupled system, and Sec. VI discusses the scalability of the coupling circuit.

#### **II. SMTJ INTERACTIONS**

#### A. Interaction circuit

Our coupling circuit, which is implemented on a PCB, allows two SMTJs to mutually influence the current through each other. The circuit senses the resistive state of one SMTJ and adds or subtracts a differential current to a second SMTJ depending on the state of the first. The strength of the coupling can be tuned by adjusting the magnitude of the differential current. We make the coupling bidirectional, meaning that both SMTJs are influencing the current through each other, by using two independent, unidirectional interaction circuits. The unidirectional interaction circuit, shown in Fig. 1, consists of several operational amplifier stages that enable the raw, noisy SMTJ voltage to influence a second SMTJ. These signal conditioning stages, which can be classified as either input or output stages, have specific functions in the interaction pipeline.

The gain stage, which is the first stage in the input pipeline, receives its input from the last stage of the output pipeline of the previous device as shown in Fig. 1. Its function is to set the strength of the coupling by adjusting the signal swing. This is done by setting the value

of the  $R_{\rm GS}$  resistor, which together with the  $R_{\rm gain}$  resistor determines a programmable gain value  $G = R_{gain}/R_{GS}$ . The gain stage typically reduces the voltage spread of its digital input signal to a desired level in order to correctly influence the device it controls. In Sec. VI, we describe how the gain stage can be augmented to support multiple SMTJ inputs for scaled implementations. The next stage in the input pipeline is the level-shifting stage, which shifts the mean voltage of the gain-stage signal to some predetermined adjustable level. This level is typically appropriate for biasing the influenced device to the equal probability state. Therefore, when no input signal is received by the gain stage, a static current biases the influenced device to its maximum entropy state. The transconductance stage, which is the last stage of the input pipeline, turns the shifted voltage signal into a proportional two-level current, which is then input to the next SMTJ. Since the differential part of the current is proportional to G, the strength of the coupling from the previous SMTJ is adjusted through the gain.

The output pipeline consists of a single stage, which is the threshold stage. The noisy signal from the SMTJ is converted into a digital output signal based on a programmable reference voltage, which is controlled by  $I_{\text{fixed}}$  and  $R_{TH}$ . There is also a polarity selection in the threshold stage that can be used to invert the state of the device, effectively producing positive or negative coupling. The gain, SMTJ bias, and threshold reference voltage are programmed via digital potentiometers in this implementation.

## **B. SMTJ model and simulated annealing**

The modified Néel-Brown model [29,40] describes how the current flowing through an SMTJ affects its probability of being in either magnetoresistive state. The mean dwell time in each state at a current I is

$$\tau_{\pm}(I) = \tau_0 \exp\left[-\frac{\Delta E}{kT} \left(1 \mp \frac{I - I_0}{I_c}\right)\right], \qquad (1)$$

where  $\tau_0$  is the characteristic time scale,  $\Delta E$  is the energy barrier, *T* is the temperature, here nominally room temperature, *k* is Boltzmann's constant,  $I_c$  is the critical current, and  $I_0$  is the current at which the device spends equal time in each state. Here  $\tau_+$  is the mean dwell time in the AP state and  $\tau_-$  is the mean dwell time in the P state.

The probability of finding the device in either state can be written as the fraction of total time spent in that state

1

$$P_{\pm}(I) = \frac{\tau_{\pm}(I)}{\tau_{\pm}(I) + \tau_{\mp}(I)}$$
$$= \frac{1}{1 + \exp\left[\mp \frac{2\Delta E}{kT} \left(\frac{I-I_0}{I_c}\right)\right]}.$$
(2)

By changing the current applied to an SMTJ, the probability of being in either resistive state also changes.



FIG. 1. Unidirectional SMTJ interaction circuit. Two copies of this circuit are used, one in each direction, to achieve bidirectional coupling. The SMTJ voltage gets sensed by a threshold stage, which converts the stochastic fluctuations to a digital output signal using a programmable reference voltage set by  $I_{\text{fixed}}$  and  $R_{TH}$ . This digital signal is passed through a gain stage, which adjusts its magnitude while retaining the mid voltage level. The gain  $G = R_{\text{gain}}/R_{\text{GS}}$  is programmable. The level-shifting stage adjusts the mid voltage level while retaining the voltage difference between high and low levels.  $V_{LS}$  is adjustable and controls the size of the shift. The shifted voltage is converted by a transconductance stage to a fluctuating current, which facilitates influencing the next SMTJ. In this example, the current in the next SMTJ is an inverted version of the SMTJ state, but both positive and negative coupling are possible with our circuit. In this example,  $R_{TH} = 615 \Omega$ ,  $I_{\text{fixed}} = 1 \text{ mA}$ , G = 0.05,  $V_{\text{MID}} = 1.25 \text{ V}$ , and  $V_{DC} = 5.5 \text{ V}$ .

Our implementation of two balanced unidirectional interaction circuits can be thought of as the simplest unit cell of an Ising machine [41] and is capable of performing simulated annealing. In simulated annealing, the objective is to find the ground state of spins in an interconnected graph that minimizes the global energy as described by the Ising Hamiltonian [1]. For our simple circuit, the Hamiltonian is determined by the weights of the two-edge graph (unidirectional interactions) connecting the SMTJs. The joint state of the SMTJ pair that minimizes the energy is the desired solution. In this way, simulated annealing can be performed by starting with a gain G of 0 and gradually increasing it.

During simulated annealing, the probability of being in the *i*th joint state, e.g., (P, P), (AP, AP), (P, AP), (AP, P), is determined by a Boltzmann distribution

$$P_i(T) = \frac{1}{Z(T)} \exp\left[-\frac{E_i(G)}{kT}\right],\tag{3}$$

where  $E_i(G)$  is the effective energy of the *i*th joint state at gain value G and  $Z(T) = \sum_i \exp[-E_i(G)/kT]$ . This effective energy can be linearly related to a model energy  $E_i(G) = G\tilde{E}_i$ . Here,  $\tilde{E}_i = J_{12}S_1^iS_2^i$  where  $S_n^i$  is the state of SMTJ *n* in configuration *i* and  $J_{12}$  is the nominal coupling strength for that pair. Rewriting the probability in terms of the model energy we get the following relation:

$$P_i(T_{\text{eff}}) = \frac{1}{Z} \exp\left[\frac{-\tilde{E}_i}{kT_{\text{eff}}}\right].$$
 (4)

Here,  $T_{\text{eff}} = T/G$  is the effective temperature of the model.

The process of simulated annealing proceeds as follows. Initially, the two-device circuit, which is at room temperature, has a gain equal to zero. Equation (4) shows that even though the system is at room temperature, the model temperature diverges in the sense that all joint SMTJ states are equally probable; the devices are effectively uncoupled. As the gain increases, the interaction circuit causes the devices to become correlated. The model temperature drops and the system spends more time in lower-energy joint states as prescribed by Boltzmann statistics. In conventional simulated annealing, the system would finally converge to a solution with the lowest energy; in our case, at a maximal gain, the system still has enough energy from room temperature to explore low-energy degenerate states. SMTJs have a barrier breakdown current [42,43] that imposes a limitation on how high the gain can be set without destroying the SMTJs. Higher breakdown currents would allow us to lower the temperature further for better convergence, but infinite gain (zero temperature) would still not be possible.

### **III. EXPERIMENTAL SETUP**

To demonstrate the coupling of two SMTJs, we use perpendicular magnetic tunnel junctions with the following stack sequence: Si base/SiO2/TaN/[Co(0.5)/Pt(0.2)] 6/Ru(0.8)/[Co(0.6)/Pt(0.2)]3/Ta(0.2)/Co(0.9)/W(0.25)/Co-Fe-B(1)/MgO(0.8)/Co-Fe-B(1.4)/W(0.3)/Co-Fe-B(0.5)/ MgO(0.75)/Ta(150)/Ru(8). The numbers in parentheses refer to layer thicknesses in nanometers, while the numbers beside square brackets show the bilayer repetitions. These devices, almost circular in shape, have approximate diameters of 150 nm. They exhibit a TMR near 120% at room temperature and possess a resistance-area product of 10  $\Omega$   $\mu$ m<sup>2</sup>. These devices were initially developed as multifunctional magnetic tunnel junctions, useful for both nonvolatile memory and rf applications [42,43]. Barrier breakdown has been observed in these devices at currents around 1 mA (corresponding to voltages around 0.7 V).

Field-free stochastic magnetization fluctuations have been observed in these perpendicular MTJ devices for a large range of diameters; for details see Ref. [44]. These devices allow for field-free fluctuations for two reasons. First, at the Co-Fe-B thickness of 1.4 nm, the perpendicular magnetic anisotropy [45] nearly cancels the selfdemagnetization field, leaving only a small energy barrier. Second, the fringing field from the fixed layer is small enough that the energies of the parallel and antiparallel configurations can be aligned with relatively small external magnetic fields or moderate currents. Field-free operation of the coupling experiments ensures that operation remains practical and commercially viable. However, in the absence of a field, the current required to put the devices in a superparamagnetic state at room temperature is quite high (approximately equal to 0.95 mA), necessitating operation close to barrier failure and suggesting refinement of the device design for future applications. The SMTJs reside on a chip in 6 columns of 32 devices that can be accessed individually using a probe. These devices are nominally the same, with a diameter of 150 nm. However, due to the difficulty of fabricating these devices outside of a commercial facility, the properties of the devices on our chip vary quite significantly. Our experiments are performed with devices that have close enough properties to make the experiment convenient, including resistance ranges, operating current ranges, timing characteristics, and TMR. We access them pairwise with probes and connect them to the circuit board with low-impedance cables. The board then couples that pair of SMTJs together.

Figure 2(a) shows the average resistances of the SMTJs used in our experiments as a function of applied current.  $I_0$  here is defined as the current at which the probabilities of being in either device state are equal. The resistances are low and almost independent of current for currents much below  $I_0$  because the SMTJs are stable in the parallel state. As the current through the device approaches  $I_0$ , the



FIG. 2. Characteristics of the SMTJs used in our experiments. (a) Average resistance as a function of applied current. (b) Mean dwell time in the P (negative slope curves) and AP (positive slope) states as a function of applied current. SMTJ-1 and SMTJ-2 have similar timing characteristics, while SMTJ-3 switches an order of magnitude faster on average. The current at which the SMTJ is in the AP state 50% of the time is denoted as  $I_0$ , which is 0.95, 0.90, and 1.00 mA for SMTJ-1, SMTJ-2, and SMTJ-3, respectively. Statistical uncertainties are smaller than the plotting symbols.

increasing spin-transfer torque causes the device to start switching stochastically, spending increased time in the antiparallel state. This corresponds to the sharp increase in resistance between -3 to  $3 \mu$ A. As currents are further increased, the rate of resistance increase slows down since the device spends all of its time in the antiparallel state.

Figure 2(b) shows the mean dwell times of the devices in the parallel and antiparallel states as a function of applied current. The parallel state dwell times are monotonically decreasing, while the antiparallel state dwell times are monotonically increasing. This is again by virtue of lower currents stabilizing the parallel state, while large currents stabilize the antiparallel state. Although the resistance values of SMTJ-1 and SMTJ-2 shown in Fig. 2(a) are considerably different, the difference in resistance does not affect the coupling dynamics, because the coupling circuit allows us to set the threshold voltage and  $I_0$  separately for each SMTJ. We perform experiments with this pair of devices as the "ideal" case, treating them as identical due to their similar time scales over the current range of Fig. 2(b). On the other hand, SMTJ-3, whose resistance is between SMTJ-1 and SMTJ-2, is roughly an order of magnitude faster than the other two devices. Experiments done with SMTJ-2 and SMTJ-3 study the effect of coupling devices that have different time scales.

The PCB that houses the interaction circuit described in Sec. II A performs multiple functions. It receives inputs from the probe station on which the SMTJ chip resides via coaxial cables connected to ground-signal (GS) probes. Each stage of the interaction circuit has outputs that can be connected to an oscilloscope for debugging and data collection. Peripheral bias and control inputs required to adjust the  $I_0$  currents, threshold voltages, and interaction strengths are controlled by a microcontroller via a standard communication protocol. Both the microcontroller and the oscilloscope are connected to a host computer that performs all of the control operations, data collection, and analysis in software. It should be noted that the PCB we fabricated for our experiments is only capable of coupling two SMTJs, although scaling of this design to higher numbers of SMTJs is possible and is discussed in Sec. VI.

There is approximately 1  $\mu$ s of propagation delay in our circuit; when one SMTJ switches, it takes approximately equal to 1  $\mu$ s for the current through the other SMTJ to change in response. This delay makes our circuit suboptimal for high-speed applications; however, more specialized amplifiers could be chosen to reduce the propagation delay. As discussed in Sec. VI, the delay would also be significantly reduced in an integrated circuit compared to a PCB.

We conduct the experiment as follows. First, both SMTJs are provided their respective  $I_0$  currents to ensure that the probabilities of being in each state are equal. Then the gain is increased, increasing the effect that each SMTJ has on the other. Voltage-time traces are collected from each SMTJ at specific increments of gain. We then conduct statistical analyses on these voltage-time traces to demonstrate that increased gain concomitantly increases the degree of influence between the SMTJs. This two-SMTJ coupling experiment is performed three times. Following the naming convention in Fig. 2, we conduct one experiment with SMTJ-1 and SMTJ-2 coupled positively, one experiment with SMTJ-1 and SMTJ-2 coupled negatively, and one experiment with SMTJ-2 and SMTJ-3 coupled positively. The purpose of these experiments with different configurations is to verify the functionality of both positive and negative coupling and to explore the effect of differing SMTJ time scales on the dynamics of the interaction. The results are presented in the following section.

# **IV. RESULTS**

The SMTJs become more strongly coupled as the stochastic time between one SMTJ switching and the



FIG. 3. Demonstration of positive coupling between SMTJ-1 and SMTJ-2, which have similar switching time scales. (a) The log-linear plot of mean dwell times in each of the four joint states the pair of SMTJs can be in. Positive coupling causes the SMTJs to spend more time in the same state (both in P or both in AP) and less time in opposite states with increasing gain. Solid lines are calculated mean joint dwell times using the Markov model described in Sec. V. (b) Pearson correlation coefficient between SMTJ voltage-time traces as gain is increased from 0 to 0.06. Both top and bottom axes apply to both panels. Statistical uncertainties are smaller than the plotting symbols.

second SMTJ switching in response is reduced. If we consider the four joint states of the two-SMTJ system (P, P), (P, AP), (AP, P), and (AP, AP), we can define joint dwell times to be the mean dwell times in each of these four joint states. With positive coupling, when the system is in the (P, P) or (AP, AP) states, the currents through both SMTJs are such that there is a very low probability of either SMTJ switching out of its current state. We thus expect the dwell times in joint states (P, P) and (AP, AP) to increase with gain, and the dwell times in joint states (P, AP) and (AP, P) to decrease with gain, which is the behavior seen in Fig. 3(a). In the negatively coupled case, the joint states (P, AP) and (AP, P) are stabilized, so the dwell times in these states increase with gain as seen in Fig. 4(a).

Note that in Fig. 3(a), the (P, AP) and (AP, P) dwell times decrease linearly until around G = 0.03, where the curve begins to flatten out. This is a sampling artifact. As the gain increases, the shorter dwell times start to approach the sampling times. As these times become closer some short-time events get missed, making the mean dwell times



FIG. 4. Demonstration of negative coupling between SMTJ-1 and SMTJ-2, which have similar switching time scales. (a) Loglinear plot of mean dwell times in each of the four joint states the pair of SMTJs can be in. Negative coupling causes the SMTJs to spend more time in opposite states (one in P and the other in AP) and less time in the same state with increasing gain. Solid lines are calculated mean joint dwell times using the Markov model described in Sec. V. (b) Pearson correlation coefficient between SMTJ voltage-time traces as the gain is increased from 0 to 0.06. Both top and bottom axes apply to both panels. Statistical uncertainties are smaller than the plotting symbols.

of the shorter events artificially larger, but also combining two of the long-time events, making those mean dwell times artificially longer as well. Computing dwell times requires saving entire time traces, making faster sampling prohibitive. Though this affects model parameters for small dwell times and shows small deviations from the experiment, these issues do not affect the calculation of the experimental equal-time correlation coefficients between the two tunnel junctions, which is the main quantity of interest as a measure of the joint interaction between the two SMTJs.

The Pearson correlation coefficient can be used to quantify the similarity between the (digitized) SMTJ voltagetime traces for both positive and negative coupling [Fig. 3(b) and Fig. 4(b), respectively]. A Pearson correlation coefficient of 1 indicates high similarity between the two voltage-time traces, namely that the SMTJs spend a lot of time in the correlated joint states (P, P) and (AP, AP), and very little time in the anticorrelated joint states (P, AP) and (AP, P). A Pearson correlation coefficient of -1 indicates that the SMTJs are spending a lot of time in the joint states (P, AP) and (AP, P), and very little time in the joint states (P, P) and (AP, AP). A Pearson correlation coefficient of 0 indicates equal time in correlated and anticorrelated states. In the positive coupling experiment, the Pearson correlation starts at 0 and approaches 1 in the high gain limit. The negative coupling experiment has opposite dominant joint states to the positive coupling case, which results in the Pearson correlation approaching -1.

Previous methods of coupling between SMTJs [35,36, 38], which facilitate interaction via linear circuit elements have yielded a maximum Pearson correlation coefficient of 0.35. Such a low maximal correlation coefficient, coupled with the fact that these methods use a single resistor to control the interaction between their constituent SMTJs, make such approaches difficult to scale for simulated annealing tasks. On the other hand, our analog interaction circuit has a pair of independently programmable resistors for each pairwise set of SMTJs. It also features a global gain resistor that selects the effective temperature of the problem. Together, these features result in a Pearson correlation



FIG. 5. Demonstration of positive coupling between SMTJ-2 and SMTJ-3, which have different switching time scales. (a) The log-linear plot of mean dwell times in each of the four joint states the pair of SMTJs can be in. Positive coupling causes the SMTJs to spend more time in the same state (both in P or both in AP) and less time in opposite states with increasing gain. Solid lines are calculated mean joint dwell times using the Markov model described in Sec. V. (b) Pearson correlation coefficient between SMTJ voltage-time traces as the gain is increased from 0 to 0.06. Both top and bottom axes apply to both panels. Statistical uncertainties are smaller than the plotting symbols.

coefficient that approaches 1. This implies that our method is more viable for simulated annealing, where it is useful to implement programmable pairwise interactions and be able to lower the effective temperature as much as possible.

The results discussed so far have demonstrated an analog circuit that facilitates tunable coupling between SMTJs with similar time scales. Figure 5 shows that the circuit also successfully couples devices with very different time scales. When the positive coupling experiment is repeated with SMTJs of different time scales, the Pearson correlation and joint dwell times, plotted in Fig. 5, are very similar to the results in Fig. 3, aside from the joint state time scales being overall faster. In the next section, we examine how this difference in switching time scales affects the system's relaxation to equilibrium.

#### V. MARKOV MODELING

Since the coupling between SMTJs in our experiments is facilitated entirely by analog circuitry, device properties such as TMR and  $I_0$  have less of an impact on the coupling strength or on the behavior of the coupled system than in recent SMTJ coupling work [35,38]. However, the difference in time scales of the coupled SMTJs does have a sizeable impact on the dynamics of the coupled system. To examine this effect, we can look at the eigenvalues of the Markov transition matrix for this system. This is a  $4 \times 4$ matrix containing the transition rates between each of the four joint states (P, P), (P, AP), (AP, P), and (AP, AP), which we index as states 00, 01, 10, and 11, respectively.

We consider transitions where both SMTJs switch simultaneously [for instance, a switch from (P, P) to (AP, AP)] to have probability zero; all nonzero transition rates in our system represent single-SMTJ switching events. The switching rate of one SMTJ is dependent on the state of the other SMTJ due to the coupling; in the positive coupling case, when SMTJ-1 is in the P state, the current through SMTJ-2 will be  $I_0 - \Delta I$ . When SMTJ-1 is in the AP state, the current through SMTJ-2 will be  $I_0 + \Delta I$ . Referring to Fig. 6(a), we can then determine which inverse rate is the correct one for each of the eight possible transitions from one state into another. For example,  $1/\tau_a^+$  is the rate for the first SMTJ to switch from the antiparallel state to the parallel state when the second SMTJ is in the antiparallel state, and  $1/\tau_a^-$  is the rate for the first SMTJ to switch from the parallel state to the antiparallel state when the second SMTJ is in the parallel state.

The transition rate diagram is shown in Fig. 6(b), with Fig. 6(a) showing the relative locations of the inverse switching rates with respect to each other on a log-linear plot of mean dwell time against the current. The notation in Fig. 6 can be simplified with the assumption that both devices have the same slopes for the current dependences, in which case  $\tau_n^+ = \tau_n^- = \tau_n$ , where  $n = \{a, b, c, d\}$ . Since we are interested in examining how the difference in time



FIG. 6. (a) Conceptual mean dwell time plot [similar to Fig. 2(b)] showing the inverse switching rates in the two-SMTJ coupled system. Positive slope lines represent dwell times in the AP state, while negative slope lines represent dwell times in the P state. Note the log scale on the vertical axis. The mean dwell times in the P and AP states are equal at  $I_0$ , and these dwell times are labeled as  $\tau_{0,1}$  and  $\tau_{0,2}$ . The remaining labels are dwell times at either  $I_0 + \Delta I$  or  $I_0 - \Delta I$ , which are used to write the transition rates from one state into another. (b) Markov transition diagram for the two-SMTJ coupled system. The switching rate out of each state is effectively a single-device switching rate, which is dependent on the state of the other SMTJ due to coupling.

scales of the two devices impacts the coupled dynamics, we represent the difference in time scales by a scaling factor r on  $\tau_0$  via

$$\tau_{0,2} = \frac{1}{r} \tau_{0,1}.$$
 (5)

Note that *r* will always be greater than 1 since we will always choose SMTJ-1 to be the slower device. Rewriting the  $\tau_n$ s in terms of  $\tau_0$ , 1 yields

$$\begin{aligned} \tau_a &= g \tau_{0,1} & \tau_c = \frac{g}{r} \tau_{0,1} \\ \tau_b &= \frac{1}{g} \tau_{0,1} & \tau_d = \frac{1}{gr} \tau_{0,1}, \end{aligned}$$
 (6)

where  $g = \exp[B\Delta I]$  represents the change in the dwell times as a function of gain (*G*), since  $\Delta I$  is linearly related to the gain. We would like to emphasize that although the gain *G* is proportional to the current increment  $\Delta I$ , *g* is an exponential function of both *G* or equivalently  $\Delta I$ . For the sake of clarity, we have chosen to make *g* for the two SMTJs equal. This implies that *B* for the two SMTJs are equal, which would make the slopes of the lines in Fig. 6(a) equal.

To generate a Markov model for our coupled system, we refer to Appendix B of Ref. [35]. The Markov transition rate matrix **M** can be directly generated from Fig. 6(b) using the simplifications in Eq. (6). The steadystate probability distribution is given by the eigenvector corresponding to the zero eigenvalue of **M**, namely

$$v_0 = \left(1, \frac{1}{g^2}, \frac{1}{g^2}, 1\right).$$
 (7)

Note that this eigenvector, which encodes the (unnormalized) steady-state distribution, is only a function of g, and the probability of being in the two disfavored states decreases quadratically with the value of g. Note that gitself depends exponentially on the gain.

In addition to this zero eigenvalue, there are three negative eigenvalues. The one closest to zero provides the slowest rate of decay into the steady-state distribution. To see how selecting SMTJs with different time scales affects the coupled dynamics, we can look at the eigenvalue that produces the term that decays most slowly,

$$\lambda_1 = \frac{1}{\tau_{0,1}} \frac{-b + \sqrt{-16g^2 r + b^2}}{2g}$$
(8)  
$$b = (g^2 + 1)(r+1),$$

which depends on both g and r. So although the final distribution is governed by g alone, the time scale of the coupled system's relaxation to equilibrium increases not only with gain but also with a higher ratio of SMTJ time scales.

We use this model to predict the mean joint dwell times, and correlation coefficients during the coupled experiments. These model predictions are plotted as solid lines on the experimental dwell time and correlation coefficient plots in Sec. IV (Figs. 3–5). To apply the model to our experiment, we drop the assumption that *B* is the same for both SMTJs, reflecting the different slopes in Fig. 2. The result of this is that each SMTJ has its own *g* value, and the steady-state distribution becomes dependent on *r* in addition to *g*. We fit lines to the characteristic dwell time plots of Fig. 2 to obtain *B*, which we use to calculate *g* for each SMTJ.  $\tau_{0,1}$  and *r* were also determined from Fig. 2 for each pair of SMTJs. The negative reciprocals of the diagonal elements of **M** give the dwell times in each of the four joint states for a given  $\Delta I$ .

# **VI. DISCUSSION**

The analog circuits used to facilitate coupling in this paper can be scaled up for larger bench-top experiments or integrated implementations with simple modifications. The threshold stage, used to convert noisy SMTJ signals to clean digital levels can be implemented by skewed inverter circuits with a digitally programmable threshold, or an analog latch-based sense amplifier design. Such circuits have operating speeds <150 ps and are composed of <10 transistors [46,47]. Transconductance amplifiers for level shifting bias voltages and adding differential currents can also be practically implemented at nanosecond time scales. SMTJ switching speeds can be significantly increased before CMOS input and output signal conditioning circuits become latency bottlenecks.

In a network with many SMTJ circuits such as the one described here, the gain stage of each interaction circuit can be modified to accept inputs from each of the





FIG. 7. Interaction circuit gain stage in a network of *n* SMTJs. Each SMTJ's gain stage receives voltage inputs from the threshold stages of the other n - 1 SMTJs. This way, the output of the gain stage contains the total influence from all other SMTJs in the network. This modification is a simple method of scaling the proposed coupling method up to large values of *n*. The number of operational amplifiers scales as *n* while the number of resistors scales as  $n^2$ .

other SMTJs in the network. The gain stage will then be a summing amplifier circuit, with rows of resistors feeding into the amplifier, as shown in Fig. 7. This scalability opens the possibility of creating an Ising machine to solve realistically large problems. The number of operational amplifiers in the interaction circuits grows linearly with the number of nodes in such an Ising machine, while the number of resistors needed for the modified gain stage grows quadratically if the nodes are all-to-all connected. These quadratic circuits can be implemented with crossbars of resistive switches. The push for building hardware platforms for matrix multiplication has led to the development of low-latency, energy-efficient large-scale crossbars. CMOS-integrated resistive RAM crossbars in submicron nodes (22 and 130 nm) have recently been demonstrated on such applications with arrays as large as  $1024 \times 512$ . When scaled to representative kernel sizes  $(256 \times 256)$ , such crossbars incur latencies on the order of approximately equal to 1 to 10  $\mu$ s [48,49]. The crossbar arrays implementing the all-to-all connections are the most area, energy, and latency-intensive aspects of larger-scale designs. Solving larger problems will require codesign of devices, architectures, and algorithms. Ising Hamiltonians with large connectivity matrices will be partitioned into crossbar kernels that solve a subset of the global problem. Additional nodes to reduce the connectivity requirements for particular problems [8] would be needed, augmented with specific circuits to handle communication between kernels.

Since *p*-bit-based Ising machines operate on the principle that the programmable interaction circuits should operate at speeds faster than the nodes themselves,

microsecond switching SMTJs may be sufficient for scaled integrated implementations. However, the SMTJs used in this study are far from optimal. Estimates of enhanced SMTJ properties for integrated designs involve lower operating currents [50], faster operating speeds [33], and larger TMR [50]. Larger TMR enables easier CMOS readout, while lower operating currents enable superparamagnetic operation at room temperature without a high applied current, which is essential for energy efficiency and scalability. Recent research [51] has shown that determining the appropriate thickness of the free layer to correctly compensate for the fringing fields can allow for field-free operation at lower currents. The high energy cost of large static currents can be amortized by fast switching circuits that produce more random bits per unit of time and hence current but becomes difficult to justify when the faster operation is not beneficial for directly coupled designs.

Although we have demonstrated in this work that simple circuitry can effectively couple SMTJs in a way that can be used to solve large-scale problems, the only scalable approach will be to fabricate SMTJs in the back-endof-the-line of the chip containing the coupling circuitry. Future work will include a large-scale implementation of SMTJ coupling.

### ACKNOWLEDGMENTS

This work was funded by the National Institute of Standards and Technology, National Science Foundation and Agence Nationale de la Recherche. S.G., T.A., L.P., D.P.L., and A.M. acknowledge support under NSF Grant No. CCF-CISE-ANR-FET-2121957. A.M. also acknowledges support under the NIST Cooperative Research Agreement Award No. 70NANB14H209 through the University of Maryland. P.T. and U.E. acknowledge support under the ANR StochNet Project Award No. ANR-21-CE94-0002-01. The authors would like to thank Steve Moxim for help with experimental setup and Frank Mizrahi for valuable feedback. The authors acknowledge J. Langer and J. Wrona from Singulus Technologies for the MTJ stack deposition and N. Lamard, R. Sousa, L. Prejbeanu as well as the Upstream Technological platform PTA, Grenoble, France for the device nanofabrication.

- A. Lucas, Ising formulations of many NP problems, Front. Phys. 2, 5 (2014).
- [2] A. Grimaldi, L. Mazza, E. Raimondo, P. Tullo, D. Rodrigues, K. Y. Camsari, V. Crupi, M. Carpentieri, V. Puliafito, and G. Finocchio, Evaluating spintronicscompatible implementations of Ising machines, Preprint ArXiv:2304.04177 (2023).
- [3] W. A. Borders, A. Z. Pervaiz, S. Fukami, K. Y. Camsari, H. Ohno, and S. Datta, Integer factorization using stochastic magnetic tunnel junctions, Nature 573, 390 (2019).
- [4] J. Si, S. Yang, Y. Cen, J. Chen, Z. Yao, D.-J. Kim, K. Cai, J. Yoo, X. Fong, and H. Yang, Energy-efficient

superparamagnetic Ising machine and its application to traveling salesman problems, Preprint ArXiv:2306.11572 (2023).

- [5] K. Y. Camsari, R. Faria, B. M. Sutton, and S. Datta, Stochastic p-bits for invertible logic, Phys. Rev. X 7, 031014 (2017).
- [6] J. Kaiser, W. A. Borders, K. Y. Camsari, S. Fukami, H. Ohno, and S. Datta, Hardware-aware in situ learning based on stochastic magnetic tunnel junctions, Phys. Rev. Appl. 17, 014016 (2022).
- [7] N. A. Aadit, A. Grimaldi, M. Carpentieri, L. Theogarajan, G. Finocchio, and K. Y. Camsari, in 2021 IEEE International Electron Devices Meeting (IEDM) (IEEE, San Francisco, United States of America, 2021), p. 40.
- [8] N. A. Aadit, A. Grimaldi, M. Carpentieri, L. Theogarajan, J. M. Martinis, G. Finocchio, and K. Y. Camsari, Massively parallel probabilistic computing with sparse ising machines, Nat. Electron. 5, 460 (2022).
- [9] S. Chowdhury, K. Y. Camsari, and S. Datta, Accelerated quantum Monte Carlo with probabilistic computers, Commun. Phys. 6, 85 (2023).
- [10] A. Z. Pervaiz, B. M. Sutton, L. A. Ghantasala, and K. Y. Camsari, Weighted *p*-bits for FPGA implementation of probabilistic circuits, IEEE Trans. Neural Netw. Learn. Syst. 30, 1920 (2018).
- [11] D. Bertsimas and J. Tsitsiklis, Simulated annealing, Stat. Sci. 8, 10 (1993).
- [12] N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, and A. Borchers, *et al.*, in *Proceedings of the 44th Annual International Symposium on Computer Architecture* (ACM, Toronto, Canada, 2017), p. 1.
- [13] W. A. Wulf and S. A. McKee, Hitting the memory wall: Implications of the obvious, ACM SIGARCH Comput. Archit. News 23, 20 (1995).
- [14] A. Sebastian, M. Le Gallo, R. Khaddam-Aljameh, and E. Eleftheriou, Memory devices and applications for in-memory computing, Nat. Nanotechnol. 15, 529 (2020).
- [15] B. Jinnai, K. Watanabe, S. Fukami, and H. Ohno, Scaling magnetic tunnel junction down to single-digit nanometers—challenges and prospects, Appl. Phys. Lett. 116, 160501 (2020).
- [16] M. Julliere, Tunneling between ferromagnetic films, Phys. Lett. A 54, 225 (1975).
- [17] J. S. Moodera, L. R. Kinder, T. M. Wong, and R. Meservey, Large magnetoresistance at room temperature in ferromagnetic thin film tunnel junctions, Phys. Rev. Lett. 74, 3273 (1995).
- [18] J. C. Slonczewski, Current-driven excitation of magnetic multilayers, J. Magn. Magn. Mater. 159, L1 (1996).
- [19] D. Ralph and M. Stiles, Spin transfer torques, J. Magn. Magn. Mater. 320, 1190 (2008).
- [20] D. C. Worledge, in 2022 IEEE International Memory Workshop (IMW) (IEEE, Dresden, Germany, 2022), p. 1.
- [21] K. Garello, F. Yasin, H. Hody, S. Couet, L. Souriau, S. H. Sharifi, J. Swerts, R. Carpenter, S. Rao, and W. Kim, *et al.*, in *2019 Symposium on VLSI Circuits* (IEEE, Kyoto, Japan, 2019), p. T194.
- [22] K. Lee, J. Bak, Y. Kim, C. Kim, A. Antonyan, D. Chang, S. Hwang, G. Lee, N. Ji, and W. Kim, et al., in 2019 IEEE

*International Electron Devices Meeting (IEDM)* (IEEE, San Francisco, United States of America, 2019), p. 2.

- [23] S. Aggarwal, H. Almasi, M. DeHerrera, B. Hughes, S. Ikegawa, J. Janesky, H. Lee, H. Lu, F. Mancoff, and K. Nagel, et al., in 2019 IEEE International Electron Devices Meeting (IEDM) (IEEE, San Francisco, United States of America, 2019), p. 2.
- [24] J. Grollier, D. Querlioz, K. Y. Camsari, K. Everschor-Sitte, S. Fukami, and M. D. Stiles, Neuromorphic spintronics, Nat. Electron. 3, 360 (2020).
- [25] S. Jung, H. Lee, S. Myung, H. Kim, S. K. Yoon, S.-W. Kwon, Y. Ju, M. Kim, W. Yi, and S. Han, *et al.*, A crossbar array of magnetoresistive memory devices for in-memory computing, Nature **601**, 211 (2022).
- [26] Q. Dong, Z. Wang, J. Lim, Y. Zhang, M. E. Sinangil, Y.-C. Shih, Y.-D. Chih, J. Chang, D. Blaauw, and D. Sylvester, A 1-Mb 28-nm LTLMTJ STT-MRAM with single-cap offsetcancelled sense amplifier and in situ self-write-termination, IEEE J. Solid-State Circuits 54, 231 (2018).
- [27] O. Golonzka, J.-G. Alzate, U. Arslan, M. Bohr, P. Bai, J. Brockman, B. Buford, C. Connor, N. Das, and B. Doyle, *et al.*, in 2018 IEEE International Electron Devices Meeting (IEDM) (IEEE, San Francisco, United States of America, 2018), p. 18.
- [28] S. Kanai, K. Hayakawa, H. Ohno, and S. Fukami, Theory of relaxation time of stochastic nanomagnets, Phys. Rev. B 103, 094423 (2021).
- [29] W. Rippard, R. Heindl, M. Pufall, S. Russek, and A. Kos, Thermal relaxation rates of magnetic nanoparticles in the presence of magnetic fields and spin-transfer effects, Phys. Rev. B 84, 064439 (2011).
- [30] M. Bapna and S. A. Majetich, Current control of timeaveraged magnetization in superparamagnetic tunnel junctions, Appl. Phys. Lett. 111, 243107 (2017).
- [31] M. W. Daniels, A. Madhavan, P. Talatchian, A. Mizrahi, and M. D. Stiles, Energy-efficient stochastic computing with superparamagnetic tunnel junctions, Phys. Rev. Appl. 13, 034016 (2020).
- [32] A. Shukla, L. Heller, M. G. Morshed, L. Rehm, A. W. Ghosh, A. D. Kent, and S. Rakheja, in 2023 24th International Symposium on Quality Electronic Design (ISQED) (IEEE, San Francisco, USA, 2023), p. 1.
- [33] L. Schnitzspan, M. Kläui, and G. Jakob, Nanosecond true random number generation with superparamagnetic tunnel junctions-identification of joule heating and spin-transfertorque effects, Preprint ArXiv:2301.05694 (2023).
- [34] D. Vodenicarevic, N. Locatelli, A. Mizrahi, J. S. Friedman, A. F. Vincent, M. Romera, A. Fukushima, K. Yakushiji, H. Kubota, S. Yuasa, S. Tiwari, J. Grollier, and D. Querlioz, Low-energy truly random number generation with superparamagnetic tunnel junctions for unconventional computing, Phys. Rev. Appl. 8, 054045 (2017).
- [35] P. Talatchian, M. W. Daniels, A. Madhavan, M. R. Pufall, E. Jué, W. H. Rippard, J. J. McClelland, and M. D. Stiles, Mutual control of stochastic switching for two electrically coupled superparamagnetic tunnel junctions, Phys. Rev. B 104, 054427 (2021).
- [36] N.-T. Phan, L. Soumah, A. Sidi El Valli, L. Hutin, L. Anghel, U. Ebels, and P. Talatchian, in *Proceedings of the* 17th ACM International Symposium on Nanoscale Architectures (ACM, Oregon, USA, 2022), p. 1.

- [37] A. Mizrahi, N. Locatelli, J. Grollier, and D. Querlioz, Synchronization of electrically coupled stochastic magnetic oscillators induced by thermal and electrical noise, Phys. Rev. B 94, 054419 (2016).
- [38] L. Schnitzspan, M. Kläui, and G. Jakob, Electrical coupling of superparamagnetic tunnel junctions mediated by spin-transfer-torques, Preprint ArXiv:2307.15165 (2023).
- [39] M. W. Daniels, W. A. Borders, N. Prasad, A. Madhavan, S. Gibeault, T. Adeyeye, L. Pocher, L. Wan, M. Tran, and J. A. Katine, *et al.*, in *Spintronics XVI*, Vol. 12656 (SPIE, San Diego, United States of America, 2023), p. 84.
- [40] W. F. Brown, Thermal fluctuations of a single-domain particle, Phys. Rev. 130, 1677 (1963).
- [41] N. Mohseni, P. L. McMahon, and T. Byrnes, Ising machines as hardware solvers of combinatorial optimization problems, Nat. Rev. Phys. 4, 363 (2022).
- [42] A. Chavent, V. Iurchuk, L. Tillie, Y. Bel, N. Lamard, L. Vila, U. Ebels, R. C. Sousa, B. Dieny, and G. Di Pendina, *et al.*, A multifunctional standardized magnetic tunnel junction stack embedding sensor, memory and oscillator functionality, J. Magn. Magn. Mater. **505**, 166647 (2020).
- [43] R. Ma, A. Sidi El Valli, M. Kreißig, G. Di Pendina, F. Protze, U. Ebels, G. Prenat, A. Chavent, V. Iurchuk, and R. Sousa, *et al.*, Microwave functionality of spintronic devices implemented in a hybrid complementary metal oxide semiconductor and magnetic tunnel junction technology, Electron. Lett. 57, 264 (2021).
- [44] A. S. El Valli, Ph.D. thesis, Université Grenoble Alpes [2020-...] (2022).
- [45] B. Dieny and M. Chshiev, Perpendicular magnetic anisotropy at transition metal/oxide interfaces and applications, Rev. Mod. Phys. 89, 025008 (2017).
- [46] B. Wicht, T. Nirschl, and D. Schmitt-Landsiedel, Yield and speed optimization of a latch-type voltage sense amplifier, IEEE J. Solid-State Circuits 39, 1148 (2004).
- [47] S. Capra, F. Crescioli, L. Frontini, M. Garci, and V. Liberali, in 2018 IEEE International Symposium on Circuits and Systems (ISCAS) (IEEE, Florence, Italy, 2018), p. 1.
- [48] W. Wan, R. Kubendran, C. Schaefer, S. B. Eryilmaz, W. Zhang, D. Wu, S. Deiss, P. Raina, H. Qian, and B. Gao, *et al.*, A compute-in-memory chip based on resistive random-access memory, Nature **608**, 504 (2022).
- [49] J.-M. Hung, C.-X. Xue, H.-Y. Kao, Y.-H. Huang, F.-C. Chang, S.-P. Huang, T.-W. Liu, C.-J. Jhang, C.-I. Su, and W.-S. Khwa, *et al.*, A four-megabit compute-in-memory macro with eight-bit precision based on CMOS and resistive random-access memory for AI edge devices, Nat. Electron. 4, 921 (2021).
- [50] G. Hu, G. Lauer, J. Sun, P. Hashemi, C. Safranski, S. Brown, L. Buzi, E. Edwards, C. D'Emic, and E. Galligan, *et al.*, in 2021 IEEE International Electron Devices Meeting (IEDM) (IEEE, San Francisco, United States of America, 2021), p. 2.
- [51] L. Soumah, L. Desplat, N.-T. Phan, A. S. E. Valli, A. Madhavan, F. Disdier, S. Auffret, R. Sousa, U. Ebels, and P. Talatchian, Nanosecond stochastic operation in perpendicular superparamagnetic tunnel junctions, Preprint ArXiv:2402.03452 (2024).

*Correction:* An incorrect unit in the fourth sentence of Section III introduced during the production process has been fixed.