Figures 2D–2G show the marginal moments for each cochlear envelope of each sound in our ensemble. All four statistics vary considerably across natural sound textures. Their values for noise are also informative. The envelope means, which provide a coarse measure of the power spectrum, do not have exceptional values for noise, lying in the middle of the set of natural sounds. However, the remaining envelope moments for noise all lie near the lower bound of the values obtained for natural textures, indicating that natural sounds tend to be
sparser than noise (see also Experiment 2b) (Attias and Schreiner, 1998).

$$C_{jk} = \sum_t \frac{w(t)\,(s_j(t)-\mu_j)\,(s_k(t)-\mu_k)}{\sigma_j \sigma_k}, \quad j,k \in [1 \ldots 32] \ \text{such that} \ (k-j) \in [1,2,3,5,8,11,16,21].$$

Our model included the correlation of each cochlear subband envelope with a subset of eight of its neighbors, a number that was typically sufficient to reproduce the qualitative
form of the full correlation matrix (interactions between overlapping subsets of filters allow the correlations to propagate across subbands). This was also perceptually sufficient: we found informally that imposing fewer correlations sometimes produced perceptually weaker synthetic examples, and that incorporating additional correlations did not noticeably improve the results. Figure 3B shows the cochlear correlations for recordings of fire, applause, and a stream. The broadband events present in fire and applause, visible as vertical streaks in the spectrograms of Figure 4B, produce correlations between the envelopes of different cochlear subbands. Cross-band correlation, or “comodulation,” is common in natural sounds (Nelken et al., 1999), and we found it to be a major source
of variation among sound textures. The stream, for instance, contains much weaker comodulation. The mathematical form of the correlation does not uniquely specify the neural instantiation. It could be computed directly, by averaging a product as in the above equation. Alternatively, it could be computed with squared sums and differences, as are common in functional models of neural computation (Adelson and Bergen, 1985):

$$C_{jk} = \sum_t \frac{w(t)\left[(s_j(t)-\mu_j+s_k(t)-\mu_k)^2 - (s_j(t)-\mu_j-s_k(t)+\mu_k)^2\right]}{4\,\sigma_j \sigma_k}.$$

For the modulation bands, the variance (power) was the principal marginal moment of interest. Collectively, these variances indicate the frequencies present in an envelope. Analogous quantities appear to be represented by the modulation-tuned neurons common to the early auditory system (whose responses code the power in their modulation passband). To make the modulation power statistics independent of the cochlear statistics, we normalized each by the variance of the corresponding cochlear envelope; the measured statistics thus represent the proportion of total envelope power captured by each modulation band:

$$M_{k,n} = \frac{\sum_t w(t)\, b_{k,n}(t)^2}{\sigma_k^2}, \quad k \in [1 \ldots 32], \ n \in [1 \ldots 20].$$
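As an illustration, the cross-band correlation (in both its direct form and its squared sum-and-difference form) and the normalized modulation power can be sketched numerically. This is a minimal NumPy sketch, not the implementation used for the model: the function names, the array layouts, the zero-based band indices, and the assumption that the window w(t) sums to 1 are all choices made here for illustration. The modulation bands b_{k,n}(t) are taken as given, since the modulation filterbank is not specified in this section.

```python
import numpy as np

# Offsets (k - j) between correlated cochlear subbands, as listed in the text.
OFFSETS = (1, 2, 3, 5, 8, 11, 16, 21)


def weighted_moments(s, w):
    """Windowed mean and standard deviation of each envelope.

    s: array (n_bands, n_samples) holding the envelopes s_k(t).
    w: array (n_samples,) holding the window w(t); assumed to sum to 1.
    """
    mu = s @ w
    var = ((s - mu[:, None]) ** 2) @ w
    return mu, np.sqrt(var)


def band_pairs(n_bands, offsets=OFFSETS):
    """Yield zero-based (j, k) pairs with k - j in the offset set."""
    for j in range(n_bands):
        for off in offsets:
            if j + off < n_bands:
                yield j, j + off


def cochlear_correlations(s, w, offsets=OFFSETS):
    """Direct form: windowed average of the product of centered envelopes."""
    mu, sigma = weighted_moments(s, w)
    d = s - mu[:, None]
    return {
        (j, k): np.sum(w * d[j] * d[k]) / (sigma[j] * sigma[k])
        for j, k in band_pairs(s.shape[0], offsets)
    }


def cochlear_correlations_squares(s, w, offsets=OFFSETS):
    """Same statistic computed from squared sums and differences."""
    mu, sigma = weighted_moments(s, w)
    d = s - mu[:, None]
    return {
        (j, k): np.sum(w * ((d[j] + d[k]) ** 2 - (d[j] - d[k]) ** 2))
        / (4.0 * sigma[j] * sigma[k])
        for j, k in band_pairs(s.shape[0], offsets)
    }


def modulation_power(b, sigma, w):
    """Modulation band power normalized by the cochlear envelope variance.

    b: array (n_bands, n_mod_bands, n_samples) holding b_{k,n}(t).
    sigma: array (n_bands,) of envelope standard deviations.
    """
    return ((b ** 2) @ w) / sigma[:, None] ** 2


# Illustrative use on synthetic data (random values, not real sound envelopes).
rng = np.random.default_rng(0)
n_samples = 2000
envelopes = np.abs(rng.standard_normal((32, n_samples)))
window = np.full(n_samples, 1.0 / n_samples)

C_direct = cochlear_correlations(envelopes, window)
C_squares = cochlear_correlations_squares(envelopes, window)
_, sigma = weighted_moments(envelopes, window)
mod_bands = rng.standard_normal((32, 20, n_samples))
M = modulation_power(mod_bands, sigma, window)
```

The two correlation functions agree up to floating-point error because (a + b)^2 − (a − b)^2 = 4ab, which is why the choice between them concerns the possible neural instantiation rather than the value of the statistic.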