Abstract
Direction selectivity is a fundamental physiological property that arises from primary visual cortex (V1) circuitry, yet basic questions of how direction-selective (DS) receptive fields are constructed remain unanswered. We built a set of simple, plausible neuronal circuits that produce DS cells via different mechanisms and tested these circuits to determine how they can be distinguished experimentally. Our models consisted of populations of spiking units representing physiological cell classes ranging from LGN cells to V1 complex DS cells. They differed in network architecture and DS mechanism, including linear summation of non-DS simple-cell inputs or nonlinear pairwise combinations of non-DS inputs. The circuits also varied in the location of the DS time delay and whether the DS interaction was facilitatory or suppressive. We tested the models with visual stimuli often used experimentally, including sinusoidal gratings and flashed bars, and computed shuffle-corrected cross-correlograms (CCGs) of spike trains from pairs of units that would be accessible to extracellular recording. We found that CCGs revealed fundamental features of the DS models, including the location of signal delays in the DS circuit and the sign (facilitatory or suppressive) of DS interactions. We also found that correlation was strongly stimulus-dependent, changing with direction and temporal frequency in a manner that generalized across model architectures. Our models make specific predictions for designing, optimizing, and interpreting electrophysiology experiments aimed at resolving DS circuitry and provide new insights into mechanisms that could underlie stimulus-dependent correlation. The models are available and easy to explore at www.imodel.org.
Introduction
Direction selective (DS) neurons exist in nearly all visual animals, yet crucial questions about the underlying circuitry remain unanswered. In primate primary visual cortex (V1), two major unresolved issues are the location of the DS time delay (dt) and the nature (facilitatory vs suppressive) of the DS interaction.
All models of DS neurons essentially involve a time delay between spatially offset inputs (Reichardt, 1961; Barlow and Levick, 1965; Adelson and Bergen, 1985). Many studies suggest that this delay is presynaptic to the DS cell, for example, arising from temporal diversity in lateral geniculate nucleus (LGN) (De Valois and Cottaris, 1998; De Valois et al., 2000; Saul et al., 2005), or from latency variation in V1 simple cells (Wolfe and Palmer, 1998; Peterson et al., 2004). An alternative to presynaptic delay theories is that the DS delay arises postsynaptically, resulting from an integration stage within the DS cell itself. There is growing evidence that dendrites can act as local computational subunits (Borg-Graham and Grzywacz, 1992; Mel, 1993; Polsky et al., 2004; Branco et al., 2010), and the mechanisms responsible for generating a dendritic DS delay could be either passive (Rall, 1964; Koch et al., 1983, Agmon-Snir and Segev, 1993) or active (Koch et al., 1982; Mel 1992, 1993; Koch and Poggio, 1992; Branco et al., 2010).
Another unresolved issue is whether facilitatory or suppressive influences are predominant in creating cortical direction selectivity. Some models hold that excitatory inputs are arranged to sum optimally for stimuli moving in the preferred direction (Reichardt, 1961). Such facilitatory interactions are supported by evidence that preferred direction motion generates supralinear excitation (Albrecht and Geisler, 1991; DeAngelis et al., 1993). Conversely, the class of suppressive models involves inhibitory signals that diminish excitatory inputs for antipreferred direction motion (Barlow and Hill, 1963; Barlow and Levick, 1965). Evidence for the resulting sublinear summation has been reported in V1 DS cells (Reid et al., 1987, 1991; Tolhurst and Dean, 1991). Alternatively, excitatory and inhibitory inputs may combine to generate DS receptive fields (RFs) (Priebe and Ferster, 2005).
Spike train cross-correlation is a powerful tool for studying functional connectivity in the visual system (Perkel et al., 1967; Moore et al., 1970; Toyama et al., 1981; Michalski et al., 1983; Ts'o et al., 1986; Nelson et al., 1992; Alonso and Martinez, 1998; Usrey et al., 1998, 1999, 2000). To explore how cross-correlation can resolve DS circuits in vivo, we examined spike train cross-correlograms (CCGs) from non-DS cells and their postsynaptic DS units in a set of thalamocortical network models with populations of spiking LGN and V1 simple and complex cells with realistic synaptic connections. We built models that differed in the placement of the time delay and the sign of the DS interaction and found that the CCGs exhibit characteristic features that distinguish the different DS mechanisms. We also found that the DS mechanisms create a characteristic stimulus dependence in CCG features. Our results demonstrate the power of cross-correlation for elucidating DS circuitry and provide a guide to developing specific experimental protocols for solving long-standing questions of functional circuitry.
Materials and Methods
Model implementation.
The models presented here have been developed using a simulation framework built in our laboratory called WM, which is written in C. Input to WM consists of text files that describe models, stimuli, and desired outputs. The output is written in a binary format called nData (.nd). The WM framework can be downloaded (www.imodel.org/wm) and compiled from source code; however, to make our models more easily accessible, they are available on a website (www.imodel.org) that allows all models presented here to be viewed over the internet by anyone with a standard Java-enabled web browser. The iModel website allows users to view the visual stimuli, explore the synaptic connectivity within the models, and analyze the model outputs (e.g., spike trains) using Java applets. In addition, it specifically provides the resources needed to reproduce all simulations and plots in this manuscript (www.imodel.org/t/11/t1/).
In this study, we present results from several population models that we have built for simulating DS neurons in V1. Each model is a network of conductance-driven leaky integrate-and-fire (LIF) spiking units and consists of 2048 LGN units, 512–2048 orientation-tuned simple V1 units, and four DS units. Despite their scale, the fundamental architecture and computations performed in the models are straightforward, and a complete functional description is given below. When stating parameter values, we will provide in parentheses the textual names of the parameters in the model text files available at the iModel website.
Model classes and network architecture.
We consider three classes of DS models: presynaptic delay, postsynaptic delay, and linear models. Each contains four populations of LIF units: LGN, inhibitory (IN) and excitatory (EX) non-DS simple cells, and DS units (Fig. 1A). Each population consists of a 3D array of units in which the first two dimensions (x,y) represent spatial position in terms of either the RF center location within the visual field (for LGN; Fig. 1B) or the location of the cell body within an idealized cortical sheet (IN, EX, DS; Fig. 1D–F). The third dimension (z), referred to as the “layer,” allows each topographical location to contain multiple units that differ in some aspect of RF organization or physiological function (e.g., RF spatial phase). Units are referred to by the population name, followed by indices for the x, y, and z dimensions, e.g., EX_5_5_0.
In each model, the visual stimulus provides the input to the LGN units, which send excitatory inputs to IN and EX units in V1. From this stage, the model classes differ in architecture. In the presynaptic and postsynaptic delay models, IN units send inhibitory input to EX units, which provide synaptic input to the DS units (Fig. 1A) through a pairwise nonlinear interaction (described below). In the linear model, however, both IN and EX populations provide direct synaptic input to the DS population, without any special nonlinear interaction. Another key architectural difference among the three model classes is the addition of temporally delayed populations of LGN, IN, and EX units in the presynaptic delay model and the linear model. The delayed LGN population, LGN_D, is identical to the original but has a fixed time delay added to the temporal response. It sends input to the delayed V1 populations (IN_D and EX_D) using the same synaptic architecture as the non-delayed populations. The presence of the delayed populations are indicated by the red arrows in the circuit diagrams in Figure 2A (presynaptic delay model) and Figure 2E (linear model).
The detailed implementation of each model is described below, starting with the DS units and working back to the LGN. The LIF units and synaptic transmission, which are common to all models, are described last.
DS population.
The DS population consists of a small grid of 2 × 2 × 1 LIF units (Fig. 1F) and is aligned to the central 2 × 2 block of EX and IN units in terms of cortical position and preferred orientation. When tested with drifting sinusoidal gratings, the DS units in all models show similar tuning, with preferred spatial frequency (SF) = 1.6 cycles/°, temporal frequency (TF) = 10–12 Hz, and direction = 0° (Fig. 1G shows the latter). The DS units differ across model classes in both the populations that provide direct input and how inputs are combined.
Nonlinear subunit models: synaptic connectivity.
For the nonlinear models, the excitatory input to each DS unit is the sum of contributions from a set of nonlinear subunits, in which each subunit implements a pairwise interaction of the outputs of two spatially offset EX units (Fig. 2A,C). These inputs were chosen by first assigning synaptic weights to EX units using a Gaussian function in cortical distance (cdist), SD = 58 μm, and in orientation difference (cori), SD = 30°, in which the maximum weight, 1.0, indicates no difference in cortical position or target orientation preference. Connections are dropped for any weights <0.05 (minw), and each of the remaining connections was either kept or dropped by applying a random, independent connection probability of 0.35 (prob). The weights of the resulting set of inputs are then scaled to achieve a total equivalent weight of 50 unitary inputs (normw). Having established a set of EX inputs, each is then paired with the EX unit that has an RF spatial phase shift of 90°. In the presynaptic delay models, the matching phase-shifted unit is chosen from the temporally delayed population, thus the paired inputs are offset in space and time. Overall, single DS cells ended up having inputs that differed in preferred orientation by 26–45° and differed in preferred SF by ∼½ octave. The RF size of the DS units remained the same as that of the EX units.
In addition to Gaussian weighting, uniform weights were used in some versions of the presynaptic delay model. Uniform weights were limited to a cortical radius of 135 μm (cdist), with no explicit constraint on orientation. We varied the connection probability (prob) over the set of values {0.05, 0.15, 0.30, 0.50, 0.75}to generate networks with the following numbers of identically weighted inputs, {4, 8, 13, 24, 35}.
Pairwise nonlinear interaction.
To create DS signals, the paired spiking inputs within each subunit are combined nonlinearly. In the presynaptic delay model, in which delayed and non-delayed EX inputs are paired (Fig. 2A,B), the spike trains are each convolved with an α-function postsynaptic conductance (PSG) waveform [Kexc(t); Eq.14], to form signals, s1(t) and s2(t), which are scaled and multiplied to yield a conductance: for the ith synaptic interaction pair, where D = 1.75 (mask_amp). This facilitatory multiplicative interaction of signals offset in space and time is like a classical Reichardt detector (Reichardt, 1961). To generate an analogous suppressive interaction, we use the following: where D = 1 and [… ]+ indicates half-wave rectification.
In the postsynaptic delay models, the paired EX inputs have the same response latency, and the pairwise interaction itself includes a delay of the signal from the second unit in the pair (Fig. 2C). The delay is implemented by convolving the spike train of the second unit with a smooth, causal temporal filter, M(t), which we refer to as a mask: and M(t) = 0 for t < 0 (Fig. 2D). This function rises to a maximum of 1 at t = Δ, which is set to the value of the DS time delay, dt. Other smooth, causal functions that peak with a set delay would also be suitable. The result of this convolution forms the masking signal, s2(t) (Fig. 2D, red trace), associated with the second unit. The spikes from the first EX input to the subunit are multiplied by s2(t) and then convolved with the cortical PSG waveform, Kexc(t), to generate a conductance to the DS unit gi(t). For suppressive interactions, the function used to scale spikes from the first EX input is [1 − s2(t)]+, as in the presynaptic delay model.
The choice of dt is constrained by the SF and TF tuning of EX units in our model, which we have tuned to plausible values for V1 simple cells (De Valois et al., 1982; Foster et al., 1985; Hawken et al., 1996). Specifically, the optimal TF for our non-DS input cells is 10–12 Hz, giving a response period of 83–100 ms/cycle. In the quadrature model underlying our networks, non-DS inputs that are shifted by ¼ cycle are paired, so it takes 21–25 ms for the optimal grating to drift across the spatial shift between paired inputs. The exact value is not critical (10–40 ms is acceptable), and we set the dt to the round figure of 20 ms, which is close to optimal, except in Figure 3, G and H, in which dt is varied.
Our aim is to compare differences between the models that result from differences in the DS mechanism. In both the presynaptic and postsynaptic delay models, the populations preceding the DS level (LGN, IN, and EX units) have identical RFs and thus have the same tuning and firing rates in our simulations, given identical stimuli. The distribution over cortical distance and orientation of the EX inputs to the DS units is also identical in all of the models, as are the synaptic weights of these inputs. Thus, any differences in the behavior of the models should reflect solely differences in the DS mechanisms.
We also consider a hybrid of the presynaptic and postsynaptic delay models that has the same connectivity pattern as the presynaptic delay model but with a 10 ms time lag between delayed and non-delayed populations (rather than 20 ms). The nonlinear DS interaction is the same as for the postsynaptic delay model, but it pairs the delayed and non-delayed EX inputs and adds a postsynaptic delay of Δ = 10 ms, giving an overall dt of ∼20 ms.
Linear model: synaptic connectivity.
The linear model creates DS units with narrow orientation bandwidth (Fig. 1G, DS_Simp) in one step by directly summing conductance inputs from an appropriate combination of weakly oriented presynaptic units in the EX, IN, EX_D, and IN_D populations (Fig. 2E, gray ovals show presynaptic RFs). The inputs from each population are chosen to align with spatial Gabor-function templates: where (x,y) defines a position within the topography of the visual field, fs is the SF, φ is the phase, and σorth and σpar are the SDs of the Gaussians running orthogonal and parallel to the orientation of the sinusoid. Equation 4 corresponds to a vertically oriented template, but the template of each unit is rotated to match its target preferred orientation. The Gabor parameters are fs = 2 cycles/° (sf), σorth = 0.125° (sd_orth), and σpar = 0.215° (sd_par), with φ = 0° (phase) for EX inputs, 90° for EX_D, 180° for IN, and 270° for IN_D. The templates for the delayed populations are shifted 90° relative to the non-delayed populations, and those for the IN populations are opposite to those for the EX populations (Fig. 2E), establishing a “push–pull” arrangement. The 90° phase shift between delayed and non-delayed inputs imparts direction selectivity to both the summed excitatory and summed inhibitory conductances, and both have the same preferred direction as the DS unit that they drive. The push–pull arrangement generates excitatory and inhibitory conductances that are 180° out-of-phase in the preferred direction, matching conductance data reported for DS units in cat V1 (Priebe and Ferster, 2005).
For a given Gabor template, the algorithm for selecting specific inputs from the presynaptic population to the DS unit is as follows. For each potential presynaptic unit, an RF mask is computed by convolving a 2D array, which is 1 (−1) at the location of each ON (OFF) LGN input and 0 everywhere else with a 2D Gaussian equal to the LGN center Gaussian (Gctr in Eq. 6). A correlation coefficient (Pearson's r) is computed between the template and the mask for each unit. The mean and SD over all positive r values is computed, and initial synaptic weights are set by dividing each r value by the mean plus 2 SDs. All weights >1 are assigned to 1 to prevent a few of the most well-matched units from dominating the synaptic input. For efficiency in model execution, the number of inputs is minimized by pruning away weights <0.2 (minw). To make the connection pattern probabilistic, only 10% (prob) of these connections are maintained, chosen randomly. Finally, the remaining synaptic weights are scaled so that the total for each input population is equivalent to 30 (normw) unitary inputs.
IN and EX populations.
The IN and EX populations represent simple, non-DS V1 cells with varying degrees of orientation selectivity. In the presynaptic and postsynaptic delay models, the IN and EX populations each consist of 12 × 12 × 4 units (Fig. 1D,E) whose RFs have elongated ON and OFF subregions similar to 2D maps found in studies that have characterized the subunits of DS complex cells (Szulborski and Palmer, 1990). These RF profiles produce units with narrow orientation tuning (Fig. 1G, left). The z-layer index differentiates four spatial phases (0, 90, 180 and 270°) of the RFs (Fig. 1C). In the linear model, IN and EX units have smaller RFs (approximately half the size of their postsynaptic DS units) that are primarily unimodal, i.e., dominated by a single ON or OFF region (Fig. 2E, gray ovals), approximating the structure of subunits to DS cells also found experimentally (Livingstone and Conway, 2003). This RF structure produces weakly orientation-tuned units that approximate input-layer V1 cells (Fig. 1G, right). They are organized in a single layer, 20 × 20 × 1, with two spatial phases (0 and 180° corresponding to ON and OFF) alternating in a checkerboard manner. These smaller, less oriented units are more appropriate building blocks for the linear model, in which narrow orientation tuning and direction selectivity emerge in one synaptic step. The IN_D and EX_D populations, which are present in the presynaptic delay and linear models, are constructed identically to the IN and EX populations, except the former receive inputs from the LGN_D rather than LGN population. When characterized with classical drifting sine grating stimuli, the IN and EX units prefer on average TF = 12 Hz and SF = 1.6 cycles/° (1.2 cycles/° in the linear model).
The LGN synaptic input to each non-DS V1 unit is chosen probabilistically using a Gabor template, Gab(x,y) of Equation 4, that is rotated to match the target preferred orientation of the unit within an orientation map (Fig. 1D,E, color map) in which orientation varies from 0 to 180° around four pinwheel centers. The phase of the orientation map is such that the central units, e.g., EX_5_5_0 (highlighted in Fig. 1D), prefer approximately vertical (0°) orientation (indicated by red). For the presynaptic and postsynaptic delay models, the parameters of the Gabor function were fs = 2 cycles/° (sf), σorth = 0.125° (sd_orth), σpar = 0.215° (sd_par), φ = 0, 90, 180, or 270°, and the number of LGN inputs to each unit was 30 (nsamp). For the linear model, fs = 1 cycles/° (sf), σorth = 0.06° (sd_orth), σpar = 0.10° (sd_par), φ = 0, 180°, and the number of LGN inputs to each unit was 10 (nsamp). The specific inputs (as shown in Fig. 1B) were chosen probabilistically as follows. An LGN unit at position (i,j) is considered a match to the template if the value of |Gab(i,j)| is greater than a threshold, ε = 0.05 (eps). The probability of selecting any given match is the fraction of its value relative to the total value of all LGN units that are a match to the template. Chosen points at positive template values result in synapses from ON units, whereas negative values result in synapses from OFF units. The synaptic strengths are set to be equal for all LGN inputs. Because of the random sampling, the LGN-driven RFs of the V1 units are irregular (Fig. 1C) rather than ideal Gabor functions.
In the presynaptic and postsynaptic delay models, in which non-DS subunits with clear elongated ON and OFF subregions are desired, each EX unit receives inhibitory inputs from the IN population that are spatially anti-correlated with the EX unit in terms of the LGN-derived RFs (Fig. 1D,E). The synaptic weights of these anti-phase IN to EX connections are set by the strength of the anti-correlation. The algorithm for selecting the particular inputs is similar to that described above for the non-DS-to-DS connections in the linear model, except that the correlation r value is computed between the LGN-driven RF masks of EX and IN units. Only IN units within cortical distance 500 μm (cdist) of the EX unit are considered as possible inputs, and only negative r values are kept for additional processing, with the initial synaptic weight being set to |r|. For efficiency, weights <0.5 (minw) are pruned, and weights are scaled so that the total input to a given EX unit, wIN, is 10 (normw) unitary inhibitory inputs (defined below for the LIF model). The result of this anti-correlated connectivity is that the strongest inhibition to an EX unit comes from IN units with similar orientation but opposite ON/OFF subfields (Troyer et al., 1998), which reinforces the ON/OFF structure of the oriented EX RFs and attenuates the DC component of the feedforward excitation.
LGN population.
The LGN population (Fig. 1B) has dimensions 32 × 32 × 2 units. Layers 0 and 1 contain OFF- and ON-center units, respectively. Each step on the spatial grid represents 0.04° of visual angle, thus the representation covers a 1.28 × 1.28° patch of the visual field.
The raw input, R, to the LGN units is computed by convolving the visual stimulus, I (described below), with a difference of Gaussians (DOG) linear filter: The DOG filter is the difference of two functions that are Gaussian in x–y space and biphasic in time: where and This temporal filter, h(t), is the product of the biphasic temporal filter, hAB(t), used by Adelson and Bergen (1985): and a Gaussian envelope, hG(t) = e−t2/(2σh2), which controls the relative weight of the negative lobe of hAB(t). The parameter values for Gctr are α = 4 (amp1) and σg = 0.05° (sig1), for Gsurr are α = 0.1 (amp2) and σg = 0.25° (sig2). The temporal filter parameters are 1/k = 5 ms (tab_k), n = 3 (tab_n), δt = 8 ms (cs_delay), and σh = 25 ms (tsig).
The time-varying raw input, r(t), to a specific LGN ON unit, LGN_i_j_1, is R(i,j,t), whereas that for the OFF unit, LGN_i_j_0, is computed as Q − R(i,j,t), where Q is the integral of the DOG filter. The raw input, r(t), is transformed to an excitatory conductance, gexc(t), by scaling and adding an offset and noise, as follows: where GWN is Gaussian-filtered Gaussian white noise with SD σn and temporal SD σt, and A and B are constants. There is no inhibitory conductance to the LGN units. For LGN units in all models described here, A = 3.5 (gx_scale), B = 28.5 nS (gx_bias), σn = 2.0 nS (gx_noise/sd), and σt = 1.0 ms (gx_noise/tsd). The excitatory conductance drives our standard LIF model (described below) to produce an output spike train.
When driven with drifting sinusoidal grating stimuli, the LGN units have SF and TF tuning curves that peak at 1.6 cycles/° and 18 Hz, respectively. In DS models requiring the additional, temporally delayed LGN_D population, we simply impose on the raw temporal response, R, a fixed temporal delay, dt = 20 ms (t_delay), except when other values are stated. Otherwise, the function for the LGN and LGN_D populations are identical, except for independent noise.
Synaptic conductances.
All neurons in the models are conductance-driven LIF units, and the conductances are determined primarily by spikes, which are the sole means of communication between units. This section describes for each population how the conductances are computed. Only the stimulus-driven excitatory input to the LGN (described above) and additive conductance noise (described below) are not based on input spikes.
DS units.
In addition to the input from the nonlinear DS subunits described above, DS units also receive background spikes that activate their excitatory and inhibitory conductances. These background inputs, modeled as Poisson spike trains, provide a source of independent noise to each unit, maintain them near threshold, and establish a minimal spontaneous firing rate. The summed spike trains are represented as follows: where γ is the scale factor (amp) and NP is the number of Poisson spikes determined by the rate (rate), and tj are spike times (Table 1 shows rates and γ values). Thus, the total excitatory and inhibitory conductances for a given DS unit in the presynaptic and postsynaptic delay models are as follows: where gi(t) is the conductance produced from the ith subunit, Nsub is the number of subunits, Sexc(t) and Sinh(t) are the background excitatory and inhibitory spike trains, respectively, and Kexc/inh are the unitary PSG waveforms for the excitatory and inhibitory conductances (see below, LIF units). The PSGs are α functions: where 1/α = 3 ms is the time-to-peak (tau) and c = 1 nS is the maximal conductance (amp) for excitatory inputs, and 1/α = 4 ms (tau) and c = 4 nS (amp) for inhibitory inputs. This form of PSG was chosen for computational efficiency (Bernard et al., 1994).
In the linear model, the total conductances to a DS unit are as follows: where Sexc(t) and Sinh(t) are the background spikes, and the other S(t) terms represent the summed spike trains from the four input populations (indicated by the subscripts), each having the following general form: where Ncells is the total number of inputs to the DS unit from the given population, Nspikesi is the number of spikes from the ith unit in that population, wi is the scaled synaptic weight from the ith unit, and tij are the spike times.
EX and IN units.
The excitatory and inhibitory conductances for any given IN or EX unit can also be written as convolutions of spike trains with PSGs: where SLGN is the sum of the LGN input spikes (as in Eq. 17, with wi =1 for all i). The LGN PSG is a difference of exponentials function: for t ≥ 0 (0 otherwise), where τr =1 ms (tau_r) and τf = 4 ms (tau_f) are the rise and fall time constants, respectively.
The IN spikes, SIN(t), to a single EX unit (presynaptic and postsynaptic delay models), are also captured by Equation 17.
LIF units.
Each unit in the model is a conductance-driven LIF unit, where the intracellular voltage V is, where C is the membrane capacitance (c_x), Vexc (v_ex) and Vinh (v_in) are reversal potentials for the excitatory and inhibitory conductances, gexc and ginh, respectively, and Vleak (v_leak_x) is the reversal potential for the leak conductance, gleak (g_leak_x). When V reached Vthresh (v_th_x), a spike was discharged and V was set to Vreset (v_reset_x). To implement a refractory period, V was held at Vreset for a short refractory period (trefr_x) after each spike.
We used the same values as Troyer et al. (1998) for our IN and EX units: IN, C = 214 pF, gleak = 18 nS, Vleak = −81.6 mV, Vreset = −57.8 mV; EX, C = 500 pF, gleak = 25 nS, Vleak = −73.6 mV, Vreset = −56.5 mV. Our DS units have the same values as the EX units. For LGN, C = 200 pF, gleak = 75 nS, Vleak = −73.6 mV, and Vreset = −56.5 mV. Refractory periods were 1.5 ms for IN, 2.5 ms for EX and DS, and 0.5 ms for LGN (trefr_x). The LGN refractory period also had an added stochastic component, which was the absolute value of a zero-mean Gaussian random variable (SD = 2 ms; trefr_x_sd). For all units, Vexc = 0, Vinh = −70 mV, and Vthresh = −52.5 mV (values of Troyer et al., 1998). The voltage equation was simulated using a fifth-order Cash–Karp Runge–Kutta method with adaptive step size (Press et al., 1992). Any negative conductance values, if they occurred, were set to 0.
We also tested models in which the LIF computation was replaced with a Poisson-like spike generation mechanism, keeping the mean driven firing rates at similar levels. In these models, the total excitatory (g_exc) and inhibitory (g_inh) synaptic conductances from the LIF model are used to directly compute an output firing rate as follows: Rate(t) = offset + scale × (gexc + scale_in × ginh + offset0). In addition, we removed the background noise spikes because they are unnecessary. The Poisson versions of the models are available at www.imodel.org and are named with an additional “.Poiss ” extension, e.g., DS_Post_Fac.Poiss. The results are very similar to those obtained with the LIF spiking mechanism, demonstrating that the exact method by which spikes are generated is not critical for our conclusions. Nevertheless, we use the conductance-based mechanism here because it permits us to explore an important model (the linear model) that is based on conductance data.
Visual stimuli.
The visual stimulus, I(x,y,t) (Eq. 5), for our modeling framework was designed to closely approximate that presented on CRTs during typical electrophysiology experiments. We represented stimuli on the same spatial grid (32 × 32, 0.04°/pixel) used to define the LGN population and at a 2 ms resolution in time. Drifting sine-wave gratings were represented using values from 0 to 1 (black to white) and were presented within a circular aperture (2.5° diameter) that covered the entire RF. All simulations with grating stimuli were run at TF = 10 Hz, SF = 1.6 cycles/°, and contrast 50% unless indicated otherwise. For direction tuning curves (see Figs. 1G; 3A,B; 4A,B; 7A), we ran 10 trials of each stimulus (4 s per trial). For CCG plots using grating stimuli (see Figs. 3⇓–5, 7⇓–9), simulations were run for 100 trials.
We also tested the models with a random 1-D ternary white noise stimulus consisting of 16 adjacent bars presented simultaneously at a frame rate of 50 Hz. The bars have a width of 0.2°, length of 2.0°, and orientation 0° and are randomly assigned to be black, white, or mean gray for each 20 ms frame. Simulations with this stimulus were run for 50 trials at 8 s duration (see Fig. 6).
CCGs.
The main data analysis tool in this study is cross-correlation, and our methods have been described previously in full (Bair et al., 2001). We provide a brief description of the essential details here. Spike trains from M trials for the two neurons are represented as discrete binary signals of period T at the millisecond resolution: where k = 1,2 and 1 ≤ t ≤ T and 1 ≤ i ≤ M. The spike train cross-correlation function is defined as and the CCG is a normalized form of the cross-correlation function where λ1 and λ2 are the mean firing rates (in spikes per second) of the neurons. The function ϴ(τ) is a triangle representing the extent of overlap of the spike trains as a function of the discrete time lag τ: where T is the duration of the spike train segments used to compute C1,2. Dividing C1,2 by ϴ(τ) in Equation 24 changes the units of our CCG from raw coincidence count to coincidences per second and corrects for the triangular shape of C1,2 caused by finite duration data.
To eliminate stimulus-driven correlations, we subtract a shift predictor, which is the average correlation computed over all nonsimultaneous trials (Bair et al., 2001). After the shift predictor is subtracted, CCGs are smoothed by convolving with a Gaussian (SD = 2 ms). We defined the peak of the CCG to be the maximum positive value of the smoothed CCG between 0 and 50 ms after the presynaptic spike. Dips were similarly defined as the maximal negative value reached in this interval. The time-to-peak of the CCG was measured as the time lag at which the peak value was reached.
Results
Our results are organized in five sections as follows. First, we describe three classes of DS models that we have built, which differ in the location of the DS delay and whether the interaction of the non-DS inputs is linear or nonlinear. Second, we show how the location of the DS delay can be revealed using cross-correlation. Third, we introduce suppressive versions of the models and show that CCGs can also reveal the sign of DS nonlinearity, and we describe a novel form of stimulus-dependent correlation that is predicted by the models. Fourth, we show that these results hold for both linear and nonlinear models of direction selectivity. Fifth, we explore how the results hold up with variation in the number of presynaptic inputs, which can determine how visible key features of the CCGs would be for in vivo experiments. Finally, we outline a set of proposed experiments based on the predictions from the modeling results.
Simple network models for DS neurons
The models studied here are implemented as networks of LIF units that represent ON- and OFF-center LGN cells, V1 simple cells, and V1 DS cells (Fig. 1; see Materials and Methods). We introduce the models below in terms of three classes: presynaptic delay, postsynaptic delay, and linear. The first two classes involve multiplication of spatially offset inputs, similar to the Reichardt detector (Reichardt, 1961), whereas the latter approximates a space–time oriented linear motion filter.
Presynaptic delay models are widely supported by studies proposing that DS cells are built from non-DS inputs that have different response latencies or temporal phases (Adelson and Bergen, 1985; De Valois et al., 2000; Saul et al., 2005). In our implementation (Fig. 2A), each DS unit (yellow circle) receives input from orientation-tuned EX simple cells (green circles) that are driven by two distinct populations of LGN cells: delayed and non-delayed (red and black arrows, respectively). The LGN populations differ only in their temporal response attributable to the introduction of a delay, dt, that shifts the response latency of one population (Fig. 2B). Each input subunit (Fig. 2A, white circles) to the DS cell is formed by pairing a delayed excitatory unit (EX_D) with a non-delayed excitatory unit that has a spatial RF shifted by 90° of phase relative to the delayed unit. The signals arising from the paired non-DS cells are multiplied (see Materials and Methods), and the resulting outputs from multiple subunits are summed to form an excitatory conductance that drives the DS unit. Overall, this model involves a fundamentally nonlinear DS interaction between two inputs that are offset in space and time.
Our postsynaptic delay model is motivated by studies showing that dendrites possess mechanisms for nonlinear integration that allow different spatiotemporal patterns of input to be distinguished (Torre and Poggio, 1978; Koch et al., 1982; Borg-Graham and Gryzwacz, 1992; Mel, 1993; Branco et al., 2010). As in the presynaptic delay model, hypothetical subunits combine inputs from pairs of simple cells with spatially offset RFs, but there is no presynaptic temporal delay (Fig. 2C). The delay is implemented in the postsynaptic subunit by convolving the spikes arriving from one unit (the delayed unit) with a mask (a smooth, causal function) that peaks with a fixed dt after the spike and decays rapidly (Fig. 2D, red trace). The smooth, delayed signal is used to scale the conductance contributed by the spikes from the non-delayed unit in the pair. Thus, the mask creates a delayed window of sensitivity in the postsynaptic cell to the input from the non-delayed cell. This is not intended to represent a specific dendritic mechanism but is a simple computational representation of the impact of a postsynaptic delay on incoming spikes. For example, Koch et al. (1982) suggested this could arise in DS circuits from differences in the kinetics of conductances, and a recent cortical study has demonstrated directional, temporally structured, supralinear interactions between dendritic inputs (Branco et al., 2010). In summary, our postsynaptic and presynaptic delay models share the same architecture of synaptic connections but differ fundamentally only in where the DS delay is applied.
Our third model, the linear model, lacks the multiplicative nonlinearity of those described above. In this model, the DS cells receive direct input from both EX and IN simple non-DS units (Fig. 2E). These non-DS inputs are chosen so that collectively they create a tilted space–time filter, as motivated by demonstrations that the sum of separable filters can produce inseparable directional filters (Adelson and Bergen, 1985; Watson and Ahumada, 1985). In keeping with experimental evidence (Monier et al., 2003; Priebe and Ferster, 2005), the inhibitory and excitatory inputs are chosen so that excitatory and inhibitory conductances in the DS cell are tuned to the same preferred direction, and, to reproduce the observed time course of these conductances, the inhibitory and excitatory inputs are spatially out-of-phase by 180°. The delay (dt) in this model comes from two populations of LGN units that differ in response latency and drive separate delayed and non-delayed populations of excitatory and inhibitory units, as in our presynaptic delay model.
All three network models produced DS units with high direction indices (DIs). For example, Figure 3A shows the direction tuning curves for one DS unit (purple trace) and for a typical pair of excitatory units (black and cyan traces) that form an input subunit to the DS unit in the presynaptic delay model. Similar curves hold for the postsynaptic delay model (Fig. 3B) and the linear model (described below). The DIs ranged from 0.84 to 1.0 in these models. The preferred directions of all DS units were near 0° (rightward motion, by design), although the preferred orientations of the non-DS inputs varied substantially from 0° (see Materials and Methods). Additional functional characterization of the constituent units (LGN, IN, EX, and DS) in these models can be found online at www.imodel.org, in which the models are named DS_Post_Fac (postsynaptic delay), DS_Pre_Fac (presynaptic delay), and DS_Simp (linear model).
Probing the DS time delay
To date, there is no direct experimental evidence for where the time delay necessary for building cortical DS cells originates, but spike train cross-correlation offers a way to answer this question. Figure 3C shows the CCG between orientation-tuned simple EX units and a DS unit in the presynaptic delay model. The peak in the CCGs near 0 reflects the direct synaptic connection between the EX and DS units, and it is shifted slightly to the right of 0 because the DS unit fires with higher probability after its input. Importantly, both the delayed and the non-delayed units that form the DS subunit have the same time-to-peak in the CCG: the curves for the non-delayed (black) and delayed (cyan) inputs are superimposed and difficult to distinguish. Thus, for the presynaptic delay model, there is no information in the CCG about the relative delay between the non-DS inputs.
Figure 3D shows analogous CCGs for the postsynaptic delay model. In contrast to the previous example, the timing of the peak in the CCG depends on which EX unit is included in the CCG computation: the CCG for the input that gets delayed postsynaptically has a longer time-to-peak (cyan curve) than the CCG for the input that does not get delayed (black curve). In our network model, the DS unit receives inputs from multiple EX cells (34 in this example) with weights that fall off with cortical distance and with difference in preferred orientation (see Materials and Methods); however, the timing distinction here held for all inputs. For example, for a pair of EX inputs with lower synaptic weight, approximately half that of the strongest input to the recipient DS cell, the amplitude of the CCG peaks (Fig. 3C,D, dashed traces) are attenuated but the differences in timing remain clear.
These results demonstrate a simple but powerful principle: any delay that occurs before the point at which the presynaptic signals are recorded will not be reflected in the CCG, whereas any delay incurred after that point will be. Extracellular recordings in cortex typically reflect somatic action potentials; therefore, delays occurring in the retina, LGN, or the dendrites or soma of the EX units would all generate CCGs consistent with the presynaptic delay model. Conversely, any delay arising after spike generation in the presynaptic input will appear in the CCG between those spikes and the DS cell output. Thus, any type of postsynaptic delay, regardless of the mechanism, would generate CCGs with peaks shifted to reflect the delay, as observed for the postsynaptic delay model.
Thus, our models make different experimental predictions for DS circuits with presynaptic and postsynaptic delays. For presynaptic delays, the time-to-peak distribution for CCGs between non-DS inputs and postsynaptic DS targets would be narrow and centered close to 0 time lag, characteristic of fast monosynaptic connections in cortical circuits. Figure 3E shows this distribution for all excitatory input connections to all DS units in the presynaptic delay model (for details, see figure legend). In contrast, postsynaptic delays would produce a distribution of peak times that is far broader, encompassing short time lags for non-delayed inputs and longer lags for inputs that get delayed as part of the DS computation. Figure 3F shows this distribution for all excitatory inputs to DS units in the postsynaptic delay model. This distribution is bimodal, with the early peak (black arrow) reflecting non-delayed inputs and the late peak (cyan arrow) reflecting delayed inputs. Although the models have a single DS delay (20 ms) and fixed AMPA conductance time course (4 ms), diversity appears in the time-to-peak distributions as a result of errors in estimating the peak times for connections that have low synaptic weights, thus noisier CCGs. Even with this diversity, the scale of the postsynaptic delay can be discerned from the spread of values across the entire time-to-peak histogram.
To examine how the time-to-peak in the CCGs depended on the dt, we varied dt and computed CCGs between the delayed excitatory input unit and the postsynaptic DS unit. Reasonable dt values for these models, which involve ¼ cycle phase shifts, are ¼ period of optimal TFs. For example, 10–40 ms maps to TFs of ∼25 to 6 Hz, which covers a wide range of optimal TFs for V1 DS neurons. Varying dt over this range in the presynaptic delay model had no effect on the timing of the CCG peak, which remained at ∼7 ms (Fig. 3G). In the postsynaptic delay model, however, the peak shifted systematically as dt was increased (Fig. 3H).
It is possible that a combination of mechanisms could exist in vivo; therefore, we tested a hybrid model that combines presynaptic and postsynaptic delays. In this hybrid, the delayed excitatory input units are delayed by only 10 ms but have an additional 10 ms of postsynaptic delay. The distribution of CCG time-to-peak values from non-delayed and delayed inputs in the hybrid model are superimposed in Figure 3F (red line). The histogram has a bimodal distribution not unlike that for the postsynaptic delay model, but the second peak in the histogram now occurs 10 ms earlier than in the model with a purely postsynaptic 20 ms delay. Thus, in the hybrid model, the delayed inputs have CCG time-to-peak values that reflect the shorter postsynaptic delay. This reinforces the observation above that any postsynaptic delay will appear in CCG peak times, whereas presynaptic delays will not be visible in this analysis. In general, a more complete picture of where the DS delay originates can be built by measuring both the response latency and the CCG peak times for non-DS units that have significant CCG peaks to DS targets. Response latency distributions will be broad if substantial presynaptic delays are involved in computing direction selectivity, as suggested by De Valois and colleagues (De Valois and Cottaris, 1998; De Valois et al., 2000).
We have focused on how the CCG peak depends on dt, but it is worth noting that the CCG peak is sensitive to all the processes that shape the temporal profile of input to the DS neuron. For example, changing the time course of the synaptic conductance in the model will alter the shape and timing of the CCG peak, as has been explored in previous studies (Veredas et al., 2005; Ostojic et al., 2009). Likewise, the CCG peak for the delayed input in the postsynaptic delay model is determined primarily by the shape of the mask, which is substantially longer than, and thus obscures the shape of, the rapid AMPA PSG waveform. Our framework allows testing the relative influence of a variety of effects, e.g., background firing rates and passive time constants, in network models with functional inputs, which to date have been examined only in simpler circuit models (Melssen and Epping, 1987; Veredas et al., 2005; Ostojic et al., 2009).
Inter-neuronal correlation in facilitatory versus suppressive DS circuits
A second key unsolved problem of cortical DS circuitry concerns the type of nonlinear interaction involved. There is evidence for both suppressive interactions, in which input generated by stimulus motion in the antipreferred direction is less than expected from linear summation across the RF (Reid et al., 1991; Tolhurst and Dean, 1991), and facilitatory interactions, in which preferred direction input is greater than that predicted by a simple linear model (Albrecht and Geisler, 1991; DeAngelis et al., 1993).
To understand how CCGs could be used to address this problem, we implemented suppressive variants of the two facilitatory models described above. Both the presynaptic and postsynaptic delay models above operate by multiplying signals, s1 and s2, from the presynaptic excitatory units within a DS subunit. To implement suppression, we instead compute the product s1 × (1 − s2), where s2 is the delayed signal (see Materials and Methods). Thus, in our suppressive models, the non-delayed signal provides excitatory drive that is suppressed by the delayed signal. In these models, we do not explicitly represent the biophysical mechanism for suppression. In particular, we allow spikes arising from the EX population (normally intended to be excitatory simple cells) to cause suppression in postsynaptic DS units. The relevant units could have been formed into a separate population and relabeled as “IN” units, but for simplicity we have not done so.
The direction tuning curves for the suppressive model DS units (Fig. 4A,B, dashed lines) show less selectivity (have lower DIs) than those for the facilitatory models (solid lines). This reflects the nature of the underlying suppressive computation as follows. When the excitatory (s1) input fires, the complementary suppressive input (s2) to the DS subunit is unlikely to have fired recently by chance, because typical cortical firing rates are low. In the facilitatory models, this means it is unlikely that a given spike is facilitated, whereas in the suppressive models, it is unlikely that a given spike is suppressed. Thus, the suppressive models operate in a more permissive regime in which baseline firing rates are higher (Fig. 4A,B, dashed circles are larger than solid). The suppressive models can be adjusted to increase the DI by retuning other network parameters, for example, by tuning the excitatory units to have a higher F1/DC ratio for drifting gratings and a lower spontaneous firing rate. We built such a model and found results similar to those presented below (data not shown). We use the lower DI models here because they are identical to the facilitatory versions except in the DS mechanism.
We hypothesized that changing the sign of the DS interaction from facilitatory to suppressive would cause the CCG peak to be replaced by a dip for the suppressive connections. When stimulated by a drifting grating that was optimal for the DS unit, both the presynaptic and postsynaptic facilitatory models showed clear CCG peaks (Fig. 4C,D, solid blue traces). However, the CCGs for the suppressive versions of the models were relatively flat (Fig. 4C,D, dashed blue traces), showing little or no evidence of functional connection. The hypothesized dips in the CCG were, however, revealed when we presented a grating drifting in the antipreferred direction (Fig. 4C,D, dashed red traces). At the same time, antipreferred motion generated much smaller peaks for the facilitatory DS connections (solid red traces). These observations demonstrate that experimentally observed correlation between the non-DS and DS stage is only partly determined by the functional connection and depends strongly on the visual input.
To verify the robustness of the direction dependence, we computed CCGs for all connections from delayed excitatory units to DS units in all models (the delayed inputs being those that provide either the facilitatory or suppressive signal to the DS interaction). The CCG peak and dip amplitudes are plotted as a function of the synaptic strength in Figure 4E–H for preferred (blue) and antipreferred (red) motion. In the facilitatory models (Fig. 4E,F), the CCG peaks grew with synaptic weight for preferred motion (blue points) but remained low across all weights for antipreferred motion (red points). For suppressive models (Fig. 4G,H), CCG dips grew larger with weight for antipreferred motion (red points) but remained low across weights for preferred motion. In general, the strongest stimulus dependence was observed for high synaptic weight inputs, but moderate to weak inputs also showed clear stimulus dependence in both presynaptic (Fig. 4E,G) and postsynaptic (Fig. 4F,H) delay models. In places in which the red and blue data points overlap, a pairwise comparison revealed that this is primarily scatter across connections; the two data points for a particular connection almost always maintained the ordering consistent with the entire population of inputs. The few exceptions in each model were at the lowest synaptic weights.
In summary, facilitatory and suppressive interactions within functional DS circuits have distinct and stimulus-dependent signatures in the CCGs obtained in our models: facilitation is revealed as peaks in CCGs for preferred motion, whereas suppression is exposed as CCG dips in response to antipreferred motion. This raises important caveats for applying CCG analysis in vivo to assess the sign of DS interactions. In particular, relying on the optimal stimulus, although it has the advantage of raising DS firing rate and thereby improving CCG estimates, may completely conceal strong functional interaction. If circuitry like that of the suppressive model were encountered in vivo, in addition to CCG dips, peaks will also be found for the non-delayed excitatory inputs, as in the facilitatory models (Fig. 3C,D). Thus, experimental findings of both peaks and dips between non-DS inputs and DS units in vivo would be characteristic of suppressive DS circuits.
Understanding the mechanism behind the stimulus-dependent correlation observed in these modeling experiments is useful to help guide experimental design and to place these observations in the context of related studies. The underlying mechanism can be visualized by comparing the peristimulus time histograms (PSTHs) for a pair of non-delayed and delayed inputs that form a DS subunit in the suppressive presynaptic delay model. These signals, which enter the nonlinear interaction, have very little temporal overlap for preferred motion (Fig. 5A, shaded region indicates overlap) but have substantial overlap for antipreferred motion (Fig. 5B). It is in this region of overlap in which the delayed unit suppresses incoming spikes from the non-delayed input, thereby diminishing the firing of the DS unit. This suppression results in robust dips in the CCG for antipreferred motion. The lack of overlap in Figure 5A means that there is little chance for actual suppression during preferred motion and, thus, little or no dip in the CCG. This type of stimulus-dependent correlation is novel because it originates from the sensitivity of the DS mechanism to the relative timing of inputs rather than from changes in population recruitment or connection strengths, which have been explored extensively in the literature (see Discussion).
If it is indeed the temporal overlap of input spikes that controls the dip amplitude, then changing the stimulus TF should also affect the CCG, because a lower TF drives spikes over a longer time interval in the excitatory inputs, while the DS delay remains fixed at 20 ms. This can be seen in the PSTHs for low TF (1 Hz) gratings in both directions (Fig. 5C,D). There is more overlap for low TFs, thus more coincident spiking in the nonlinear suppressive interaction, even for the preferred direction (Fig. 5, compare C, A). This suggests that preferred direction stimuli should produce CCG dips when lower than optimal TFs are used to drive the cells. To verify this, we computed CCGs using preferred direction motion at low and high TFs in the suppressive presynaptic delay model (Fig. 5E, blue lines). At TF = 1 Hz (light blue line), the CCG dip is apparent, although it remains less than half as large as that for 10 Hz antipreferred motion (red line). Nevertheless, these results suggest that testing with low-TF, preferred-direction stimuli may aid in the detection of suppressive functional connections in experiments in which antipreferred motion produces too few spikes for reliably measuring correlation.
A similar direction and TF dependence also holds for the facilitatory model (Fig. 5F): the weak CCG peak (red line) for an antipreferred stimulus at the optimal TF (10 Hz) becomes substantially larger as TF is decreased to 1 Hz (orange line). However, because these CCG peaks are smaller than those for optimal preferred motion, facilitatory DS interactions will still be most effectively detected experimentally using the optimal TF and direction (blue line), in which the combination of a large CCG peak and high firing rate affords the best signal-to-noise ratio.
The visibility of peaks and dips in the CCGs strongly depend on the parameters of the sinusoidal gratings in the above tests, but in an experimental setting, it may be preferable to use a stimulus that reliably drives motion in multiple directions simultaneously. To test this idea, we used a stimulus that has often been used to map directional interactions in vivo: random 1-D ternary white noise. This stimulus consists of 16 adjacent bars (Fig. 6A, icons) at the optimal orientation for the DS unit, in which the luminance of each bar was assigned randomly to be black, white, or mean gray for each frame at 50 Hz (see Materials and Methods). This has been shown to drive DS complex cells in vivo (Emerson et al., 1992) at spike rates similar to those we used to generate CCGs with sinusoidal gratings (10–30 spikes/s). Figure 6 shows CCGs between delayed inputs and the recipient DS unit in the presynaptic (Fig. 6A) and postsynaptic (Fig. 6B) delay models in response to the random bar stimulus. This single stimulus is able to reveal robust CCG peaks for the facilitatory models (black traces) and dips for the suppressive models (gray traces). Because this stimulus involves a random sequence, unlike the sine stimulus, we tested 10 different randomization seeds and found that the CCG shape had very little dependence on the particular sequence (multiple traces show runs for different seeds). As with the grating stimuli, shift predictors (insets) have been subtracted. Thus, random dynamic stimuli appear to be better able to activate simultaneously the inputs to a variety of nonlinear DS interactions. More generally, such stimulation could be valuable for broadly screening CCGs across large populations in which the underlying mechanisms and the nature of temporal interactions are not well established.
In summary, Figures 4⇑–6 show how CCG features in our DS circuits depend strongly on the sign of the DS interaction and on the visual stimulus used. These results are instructive for both the design and analysis of CCGs obtained experimentally. In networks with facilitatory interactions, CCGs will show only peaks when there is a functional connection. These peaks should be strongest for stimuli in the preferred direction for the DS cell. In contrast, networks with suppressive interactions can produce CCGs that have both peaks and dips. Furthermore, dips will be most apparent when using stimuli that have antipreferred direction motion. Thus, to ensure that all DS interactions are revealed, it is necessary to use a stimulus protocol that drives both preferred and antipreferred direction motion, as we have shown for both gratings with varying direction and TF, and dynamic random stimuli. When determining the amount of data to collect, it is important to consider that CCG dips may only be revealed by presenting non-optimal stimuli.
Inter-neuronal correlation in a linear model for DS cells
Experimental evidence supports another important class of DS model that requires only linear summation of non-DS synaptic inputs followed by spike threshold to produce strongly DS responses (Reid et al., 1991; Jagadeesh et al., 1993, 1997; Priebe and Ferster, 2005); therefore, we examined whether the principles derived from the multiplicative nonlinear subunit models above also apply to this linear model. In its most basic form, this is a DS simple cell model, and we have implemented it as such (Fig. 2E; see Materials and Methods) and made it available as the DS_Simp model at www.imodel.org.
This linear model generated highly selective DS units (e.g., DS_0_1; Fig. 7A), with DIs ranging from 0.84 to 0.94. We computed CCGs between the non-DS inputs, which include both EX and IN units, and the postsynaptic DS unit for antipreferred and preferred motion (Fig. 7B). The CCG peaks for both delayed and non-delayed inputs occurred at similar, short time lags (data not shown). This is expected because the dt in the linear model arises in the LGN populations. It is therefore also a presynaptic delay model and behaves like the nonlinear presynaptic delay model with respect to CCG timing.
In the linear model, the responses of the DS units are determined by summation of non-DS inputs. We found that excitatory inputs generated peaks in the CCGs, whereas inhibitory inputs created dips. These peaks and dips were visible for both antipreferred and preferred (Fig. 7B) motion. However, the amplitude of the CCG features varied with direction in a manner consistent with the behavior of the nonlinear models above: the excitatory-to-DS CCG peaks were larger for preferred motion, whereas inhibitory-to-DS CCG dips were larger for antipreferred motion. This is somewhat difficult to observe in the plots of CCG amplitude against synaptic weight for preferred and antipreferred motion (Fig. 7C,D), but it is more clear when plotting the difference (preferred minus antipreferred) in CCG peak and dip amplitude (Fig. 7E,F, respectively). Stimulus dependence was apparent across all synaptic strengths for excitatory inputs (Fig. 7C,E) but less consistent for inhibitory inputs (Fig. 7D,F). However, for both sets of inputs, the difference in CCG amplitude with stimulus direction was statistically significant (see legend of Fig. 7).
We demonstrated above for the multiplicative models that direction- and TF-dependent correlation arises because the relative timing of spikes from paired non-DS inputs engages the DS interaction in a stimulus-dependent manner (Fig. 5). In the linear model, delayed and non-delayed inputs are not paired as in the multiplicative models. Therefore, to demonstrate how relative timing between inputs shapes the stimulus dependence of CCGs, we modified the linear model by simply eliminating the dt. This preserves the total amount of input to the DS unit in the two directions and disrupts only the relative timing of inputs. We found that this manipulation eliminated the difference in CCG peak and dip amplitude with stimulus direction (excitatory inputs, p = 0.59; inhibitory inputs p = 0.91, two-tailed paired t test). Thus, the relative timing of inputs underlying the DS mechanism in the linear model is also essential for stimulus dependence of the CCG.
That we find similar results for CCGs obtained with the multiplicative and linear models, which are substantially different but encompass the most widely cited schemes for direction selectivity, suggests that the temporal dependence of CCGs may be a fundamental characteristic of diverse DS circuits. This increases the likelihood that the trends revealed here may be observable in electrophysiological experiments.
Although the CCG across a non-DS-to-DS synapse can reveal much about the interaction, we found that it did not readily reveal whether that connection corresponds to one in a linear model (e.g., DS_Simp) or to a nonlinear multiplicative subunit (as in our presynaptic and postsynaptic delay models). This follows because, for a multiplicative subunit, the effect of an observed input on the output is determined by weighting (multiplying) its value by the value of the other (unobserved) input. This is essentially the same as for a single, weighted input to the linear model, which is multiplied by its synaptic weight. Analysis methods that use knowledge of responses for three units, including two nonlinearly paired inputs and the postsynaptic unit, can distinguish such a mechanism from a linear model, but collecting the relevant data in vivo would be substantially more difficult.
Network connectivity affects the amplitude of features in CCGs
Whether the patterns of correlation described above can be observed in vivo, when recording time is limited, may depend on network connectivity. If there are many independent inputs with low connection weights, CCG features could become too small to reliably reflect functional connections. Furthermore, the amplitude of the CCG peak generated by a particular connection may depend on the firing rate of the presynaptic neuron and how closely its tuning matches that of the postsynaptic DS cell. Although these issues have received little attention experimentally, they are important in practice and can be readily demonstrated and examined in our models.
A fundamental question about DS circuits is whether the total input to a DS cell is provided by many inputs with small weights, or fewer, strongly weighted inputs. To examine this in the presynaptic delay model, we assumed a uniform weight distribution over the excitatory inputs to the DS units, as opposed to the approximately Gaussian distributions for the models in the previous figures (see Materials and Methods). To maintain a constant DS firing rate as the number of inputs, n, increases, the synaptic weight of each input was decreased to keep the total input constant. Figure 8A shows CCGs for a single delayed excitatory-to-DS connection obtained while varying n in the presynaptic delay model (non-delayed inputs behaved similarly). The amplitude of the CCG peak decreases as n increases, reflecting the reduced relative contribution of one among many inputs. Although CCG amplitude is strongly dependent on synaptic weight, CCG time-to-peak remains constant as input strength is varied.
The CCG peak amplitudes for all delayed excitatory inputs to the same DS unit are plotted in Figure 8B as a function of n. The black lines connect all points corresponding to two particular synaptic connections that persisted as n increased. All connections show consistent decreases in CCG amplitude as the number of inputs to the DS unit increases. Despite the decrease with n, the peak measurements remained above baseline (Fig. 8B, red line). The CCG amplitudes are expected to decay toward a limit near 0 that depends on the correlation in the input population. Our delayed and non-delayed excitatory units have modest spike count correlation (rSC = 0.024 on average for neighbors; rSC is Pearson's correlation coefficient of spike counts for 100 4-s-long trials of a grating drifting at direction 0°), which arises from their common LGN and inhibitory inputs (see Materials and Methods). If correlation among inputs were higher, the EX-to-DS CCG peaks would be even larger than shown. This suggests that an input providing ∼
Although synaptic weight strongly influences the size, and thus visibility, of CCG peaks, Figure 8B shows that there remains a considerable spread in CCG peak amplitude, even when synaptic weights are identical. We found that this spread related to differences in firing rates and tuning across the presynaptic population. Figure 9A shows the full CCGs for the four delayed excitatory connections corresponding to the points at n = 4 in Figure 8B generated from responses to motion in the direction (0°) preferred by the postsynaptic DS unit. Plotting CCG peak amplitude versus mean firing rate for each input (Fig. 9B) reveals a clear positive correlation. Thus, inputs that fire more strongly tend to generate larger CCG peaks. Much of the variation in firing rate across units in response to a particular stimulus arises from variation in the RFs, or tuning, of those units. This variation is caused partly by probabilistic connectivity, e.g., the sparse sampling of LGN inputs to build simple V1 RFs, but more so by the broad bandwidth of orientation preferences (Gaussian, SD = 30°) of inputs to the DS unit. The diversity across orientation tuning curves for the DS unit and its inputs is shown in Figure 9E–I. By changing the visual stimulus from 0 to 337° motion, the CCGs and the firing rates reorder (Fig. 9C,D), and the input unit EX_D 5_8_0 (black trace) now has the largest response and CCG peak. This peak is as large as the largest obtained for the 0° stimulus, although the firing rate of the DS cell is significantly reduced for the 337° grating (Fig. 9E). The reordered CCG peak amplitudes still follow the firing rates of the inputs (Fig. 9D). This result is similar to previous experimental results showing that correlation among simultaneously recorded cells with similar spatial properties was strongest when the stimulus orientation was close to preferred for all the recorded cells (Snider et al., 1998; Lampl et al., 1999). Our result goes a step farther because it predicts that, even when tuning preferences of the recorded neurons are very similar, CCG peaks can be largest when stimuli are optimized to drive the input non-DS cells, not the postsynaptic DS cell. This suggests that it could be important to tailor stimuli to the tuning preferences of particular units in paired and multiunit recordings to fully reveal the nature of functional connections.
Overall, these results suggest that non-DS-to-DS CCG peaks should be visible for a broad range of physiologically plausible conditions on the number and distribution of inputs. Furthermore, we predict that CCG amplitude will depend on the preferred orientation of the non-DS input, with the strongest correlation when the stimulus is at the optimal orientation for the input. This again highlights the importance of careful selection of stimulus parameters for revealing functional correlations in DS circuits.
Proposed experiments
Our results suggest the following experimental protocols for revealing functional DS interactions. (1) Record spikes simultaneously from many DS and non-DS neurons with overlapping RFs, e.g., from layers 4C-α and 4B, respectively, in macaque (Hawken et al., 1988; Nassi and Callaway, 2007). (2) To detect both CCG peaks and dips, try altering the temporal coordination of non-DS signals by presenting gratings in both the preferred and antipreferred directions or at lower than optimal TFs, or by presenting dynamic random stimuli. (3) Vary orientation to match preferences for non-DS cells (putative inputs) because this may give the largest correlation values (Fig. 9).
The distribution of key parameters of peaks/dips of the resulting CCGs may then reveal the DS mechanism. The distribution of time-to-peak/dip in the CCG will depend on the location of DS delay, with postsynaptic delays generating a broader, possibly bimodal time-to-peak distribution (Fig. 3E,F). The distribution of CCG amplitudes reveals the sign of the DS interaction, with facilitatory interactions generating only peaks and suppressive DS interactions generating both peaks and dips. The amplitude of CCG features will depend on synaptic efficacy, which in principle decreases for large numbers of independent inputs (Fig. 8). V1 simple cells may receive 10–30 LGN inputs (Tanaka, 1983; Peters and Payne, 1993; Alonso et al., 2001), and, if this holds for other feedforward circuits, our simulations suggest that the peaks should be easily detectable.
In our study, we have presented non-DS-to-DS cross-correlation, but in vivo many neurons also have intermediate DI values, referred to as direction-biased (DB) cells. It is likely that, in these proposed experiments, in which multiple units are simultaneously recorded, there will be data from DB neurons, and computing CCGs between these DB units and both DS and non-DS units can shed light on how DB RFs are constructed. For instance, DB RFs may arise from circuitry similar to that of DS neurons (as would be the case with incomplete antipreferred direction suppression, e.g., our DS_Pre_Sup model), in which case DB neurons would receive delayed and non-delayed non-DS inputs that would be revealed by the CCG analyses proposed here.
Discussion
Our goals were to (1) build cortical circuit models that differ in their DS mechanism, (2) distinguish them using cross-correlation analysis, (3) examine the practicality and means of applying this in vivo, and (4) present the models in an online framework for use by others. Each is discussed below.
DS circuit models
We built a set of simple yet plausible circuit models implementing some predominant ideas regarding direction selectivity: (1) the temporal delay, (2) spatially offset detectors, (3) null-direction suppression, (4) multiplicative nonlinearity, and (5) linear, oriented spatiotemporal RFs. Ours differ from filter-based models (Adelson and Bergen, 1985; Heeger, 1993), which are simpler and have explained much electrophysiological data but do not address implementation in terms of spiking circuits. Without representing and processing spikes in relevant intermediate cell classes, our CCG analysis would be impossible. More broadly, the LGN and simple cells represented in our models are also involved in form, depth, and color processing, allowing a diverse set of future studies to refine these circuits. Our models relate more to detailed biophysical models (Suarez et al., 1995; Maex and Orban, 1996; Rao and Sejnowski, 2003) that include synaptic conductances, spike generation, and interconnected populations. Those studies focused on representing particular DS mechanisms in more detail, whereas we have emphasized the construction of a set of alternative models for developing experimental approaches.
Cross-correlation in DS circuit models
We found that spike train cross-correlation provides several key insights into DS circuits. First, correlation analysis can reveal the location of the DS delay. DS delays arising from variation in response latency of non-DS inputs produce a CCG time-to-peak consistent with a fast monosynaptic connection, whereas DS delays generated postsynaptically produce a CCG time-to-peak reflecting the additional duration of postsynaptic processing (Fig. 3). Second, we found that cross-correlation can reveal the sign of the DS interaction (Fig. 4). Facilitatory interactions generate CCG peaks, whereas suppressive interactions generate dips. Third, however, the peaks and dips are visible only to the extent that the stimulus appropriately activates the DS mechanism; thus, inter-neuronal correlation was stimulus dependent (Figs. 4, 5). These results were robust across a range of connection strengths (Fig. 8) and held for nonlinear and linear models (Fig. 7). Our results regarding stimulus-dependent correlation were unanticipated, offering new insight explained next.
Much attention has been given to the idea that inter-neuronal correlation is not fixed but varies with task (Vaadia et al., 1995), over time (Aertsen et al., 1989), or across stimuli (Snider et al., 1998; Kohn and Smith, 2005). These studies proposed mechanisms, including changes in connection strengths and changes in the distribution of input firing rates (“population recruitment”), that cannot explain the stimulus-dependent correlation in our models. Our connection strengths are fixed and, as stimulus direction reverses, the firing rates of the input EX units do not change on average. Instead, changes in correlation depend on the temporal coordination of activity from presynaptic inputs, which vary with direction and TF (Fig. 5). This holds in both nonlinear models in which non-DS inputs are pairwise combined and a linear model in which inputs are summed independently on the DS cell. A second feature of the observed stimulus-dependent correlation is that the correlation can be strongest for non-optimal stimuli (in terms of firing rate). Past experimental studies of stimulus-dependent correlation in V1 (Snider et al., 1998; Lampl et al., 1999; Kohn and Smith, 2005) found the strongest correlation for optimal stimuli. We found that CCG dips were largest for antipreferred motion (Fig. 5E) and that CCG peaks could be higher at orientations suboptimal for the DS unit (Fig. 9). Overall, our results suggest an alternative mechanism for stimulus-dependent correlation based on the temporal coordination of input at a subneuronal level.
Viability of proposed experiments
It is reasonable to believe that the experiments proposed above (last section of Results) should be viable because previous CCG studies have successfully identified monosynaptic connections between hierarchical levels in the visual system, e.g., retina and LGN (Mastronarde, 1987; Usrey et al., 1998, 1999), LGN and V1 (Tanaka, 1983; Reid and Alonso, 1995; Usrey et al., 2000), and between simple and complex cells within V1 (Toyama et al., 1981; Alonso and Martinez, 1998). Peterson et al. (2004, their Fig. 7) presented an example CCG between non-DS and DS cortical cells with a delayed time-to-peak, consistent with a postsynaptic delay mechanism. Multiunit array recordings in V1 facilitate such experiments (Kohn and Smith, 2005), and two-photon calcium imaging could reveal dendritic processing, as shown in retina, optic tectum, and mouse visual cortex (Hausselt et al., 2007; Bollmann and Engert, 2009; Jia et al., 2010). Thus, the experimental challenges that have precluded the type of analyses modeled here are being overcome.
Model framework and interface
The models we presented are publicly accessible and designed to be easy to test and manipulate. The online interface at www.imodel.org offers tools for three major interactive functions: (1) a library of visual stimuli that can be examined frame by frame and in real time as parameters are changed, (2) neuronal outputs, stored in our nData format, can be displayed and analyzed using the nData Viewer, and (3) model architecture and maps of synaptic connections can be visualized using the iModel Viewer. The iModel and nData Viewers are integrated to allow anatomical and physiological data to be examined together.
The underlying modeling application, WM, is also downloadable (www.imodel.org) and allows users to customize stimulus and model files, visualize the resulting model architecture, including monosynaptic and polysynaptic connectivity, animate visual stimuli superimposed on the architecture, and select units and signals to be displayed on-the-fly during simulation. WM can run on workstation clusters, allowing rapid execution of experiments with numerous stimuli. We aim to set a precedent in encouraging others to reproduce our results and understand features and limitations of various models through direct experimentation.
Limitations and future extensions
Our models do not include the potentially significant recurrent intracortical input present in V1 (Douglas and Martin, 1991; Peters and Payne, 1993). Several models of DS cells include recurrent connections (Suarez et al., 1995; Maex and Orban, 1996); however, DS in these models is still implemented using computations (delays and spatial offsets) like those here. Suarez et al. (1995) used Barlow–Levick null direction inhibition to achieve DS (like our suppressive postsynaptic delay model), with recurrence simply acting to amplify weak feedforward input. Maex and Orban (1996) used recurrent excitation to generate a delay loop for facilitation, making a facilitatory postsynaptic delay model. Recurrence is one of multiple mechanisms that could implement components within our models; thus, our principal observations may also hold for these models.
We focused on correlation across the non-DS-to-DS synapse, but correlation is often measured in vivo between similar units, e.g., DS–DS. We noted above that the EX–EX spike count correlation was 0.024 on average, consistent with recent reports that granular layer cells in macaque show virtually no correlated variability (Hansen et al., 2011). It will be important to characterize these correlations in the model, comparing them with data as it becomes available for specific cell types. Appropriate levels of DS–DS correlation will be crucial for extending the network to include spatial pooling in visual cortical area V5/middle temporal area MT, in which correlation impacts information transmission (Zohary et al., 1994).
Our nonlinear DS mechanisms are phenomenological black boxes. Although this makes the results more general, we may overlook signatures of DS interactions that might only be revealed if the exact biophysical mechanisms were simulated. To address this, additional experimental evidence for particular mechanisms will be required.
Footnotes
This work was funded by the Wellcome Trust, St. John's College Oxford, and the Department of Biological Structure, University of Washington (Seattle, WA).
- Correspondence should be addressed to Pamela M. Baker, Department of Biological Structure, University of Washington, 1959 NE Pacific Street, Box 357420, Seattle, WA 98195. pmbaker{at}uw.edu