Journal publications
2011
- "Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment", Pietro Berkes, Gergő
Orbán, Máté Lengyel, József Fiser, Science 2011.
331:83-87, doi:.
Abstract: The brain maintains internal models of its environment to interpret sensory inputs and to prepare actions. Although behavioral studies have demonstrated that these internal models are optimally adapted to the statistics of the environment, the neural underpinning of this adaptation is unknown. Using a Bayesian model of sensory cortical processing, we related stimulus-evoked and spontaneous neural activities to inferences and prior expectations in an internal model and predicted that they should match if the model is statistically optimal. To test this prediction, we analyzed visual cortical activity of awake ferrets during development. Similarity between spontaneous and evoked activities increased with age and was specific to responses evoked by natural scenes. This demonstrates the progressive adaptation of internal models to the statistics of natural stimuli at the neural level.
- "Right hemisphere dominance in visual statistical learning", Matthew E.
Roser, József Fiser, Richard N. Aslin, Michael S. Gazzaniga, J Cogn
Neurosci. 2011. 23:1088-1099.
Abstract: Several studies report a right hemisphere (RH) advantage for visuo-spatial integration and a left hemisphere (LH) advantage for inferring conceptual knowledge from patterns of covariation. The present study examined hemispheric asymmetry in the implicit learning of new visual-feature combinations. A split-brain patient and normal control participants viewed multi-shape scenes presented in either the right or left visual fields. Unbeknownst to the participants the scenes were composed from a random combination of fixed pairs of shapes. Subsequent testing found that control participants could discriminate fixed-pair shapes from randomly combined shapes when presented in either visual field. The split-brain patient performed at chance except when both the practice and test displays were presented in the left visual field (RH). These results suggest that the statistical learning of new visual features is dominated by visuospatial processing in the right hemisphere and provide a prediction about how fMRI activation patterns might change during unsupervised statistical learning.
2010
- "Statistically optimal perception and learning: from behavior to neural
representations", József Fiser, Pietro Berkes, Gergő
Orbán, Máté Lengyel, Trends Cogn Sci. 2010.
14(3):119-130, doi: 10.1016/j.tics.2010.01.003.
Abstract: Human perception has recently been characterized as statistical inference based on noisy and ambiguous sensory inputs. Moreover, suitable neural representations of uncertainty have been identified that could underlie such probabilistic computations. In this review, we argue that learning an internal model of the sensory environment is another key aspect of the same statistical inference procedure and thus perception and learning need to be treated jointly. We review evidence for statistically optimal learning in humans and animals, and re-evaluate possible neural representations of uncertainty based on their potential to support statistically optimal learning. We propose that spontaneous activity can have a functional role in such representations leading to a new, sampling-based, framework of how the cortex represents information and uncertainty.
2009
- "Perceptual learning and representational learning in humans and
animals", József Fiser, Learning and Behavior, 2009.
37:141-153.
Abstract: Traditionally, perceptual learning in humans and classical conditioning in animals have been considered as two very different research areas, with separate problems, paradigms, and explanations. However, a number of themes common to these fields of research emerge when they are approached from the more general concept of representational learning. To demonstrate this, I present results of several learning experiments with human adults and infants, exploring how internal representations of complex unknown visual patterns might emerge in the brain. I provide evidence that this learning cannot be captured fully by any simple pairwise associative learning scheme, but rather by a probabilistic inference process called Bayesian model averaging, in which the brain is assumed to formulate the most likely chunking/grouping of its previous experience into independent representational units. Such a generative model attempts to represent the entire world of stimuli with optimal ability to generalize to likely scenes in the future. I review the evidence showing that a similar philosophy and generative scheme of representation has successfully described a wide range of experimental data in the domain of classical conditioning in animals. These convergent findings suggest that statistical theories of representational learning might help to link human perceptual learning and animal classical conditioning results into a coherent framework.
- "The other kind of perceptual learning", József Fiser,
Learning and Perception 1, pp. 69-87, 2009.
Abstract: In the present review we discuss an extension of classical perceptual learning called the observational learning paradigm. We propose that studying the process how humans develop internal representation of their environment requires modifications of the original perceptual learning paradigm which lead to observational learning. We relate observational learning to other types of learning, mention some recent developments that enabled its emergence, and summarize the main empirical and modeling findings that observational learning studies obtained. We conclude by suggesting that observational learning studies have the potential of providing a unified framework to merge human statistical learning, chunk learning and rule learning.
- "A Structured Model of Video Reproduces Primary Visual Cortical
Organisation", Pietro Berkes, Richard E. Turner, Maneesh Sahani,
PLoS Computational Biology, 2009. 5(9):
e1000495. doi:10.1371/journal.pcbi.1000495
Abstract: The visual system must learn to infer the presence of objects and features in the world from the images it encounters, and as such it must, either implicitly or explicitly, model the way these elements interact to create the image. Do the response properties of cells in the mammalian visual system reflect this constraint? To address this question, we constructed a probabilistic model in which the identity and attributes of simple visual elements were represented explicitly and learnt the parameters of this model from unparsed, natural video sequences. After learning, the behaviour and grouping of variables in the probabilistic model corresponded closely to functional and anatomical properties of simple and complex cells in the primary visual cortex (V1). In particular, feature identity variables were activated in a way that resembled the activity of complex cells, while feature attribute variables responded much like simple cells. Furthermore, the grouping of the attributes within the model closely parallelled the reported anatomical grouping of simple cells in cat V1. Thus, this generative model makes explicit an interpretation of complex and simple cells as elements in the segmentation of a visual scene into basic independent features, along with a parametrisation of their moment-by-moment appearances. We speculate that such a segmentation may form the initial stage of a hierarchical system that progressively separates the identity and appearance of more articulated visual elements, culminating in view-invariant object recognition.
- "Modular toolkit for Data Processing (MDP): a Python data
processing framework", Tiziano Zito, Niko Wilbert, Laurenz Wiskott,
Pietro Berkes, Frontiers in Neuroinformatics 2:8,
2009. doi:10.3389/neuro.11.008.2008
Abstract: Modular toolkit for Data Processing (MDP) is a data processing framework written in Python. From the user's perspective, MDP is a collection of supervised and unsupervised learning algorithms and other data processing units that can be combined into data processing sequences and more complex feed-forward network architectures. Computations are performed efficiently in terms of speed and memory requirements. From the scientific developer's perspective, MDP is a modular framework, which can easily be expanded. The implementation of new algorithms is easy and intuitive. The new implemented units are then automatically integrated with the rest of the library. MDP has been written in the context of theoretical research in neuroscience, but it has been designed to be helpful in any context where trainable data processing algorithms are used. Its simplicity on the user's side, the variety of readily available algorithms, and the reusability of the implemented units make it also a useful educational tool.
2008
- "Bayesian learning of visual chunks by human observers", Gergő
Orbán, József Fiser, Richard N. Aslin, Máté
Lengyel, Proc Natl Acad Sci USA 2008; 105(7):2745-2750, doi:
10.1073/pnas.0708424105.
Abstract: Efficient and versatile processing of any hierarchically structured information requires a learning mechanism that combines lower-level features into higher-level chunks. We investigated this chunking mechanism in humans with a visual pattern-learning paradigm. We developed an ideal learner based on Bayesian model comparison that extracts and stores only those chunks of information that are minimally sufficient to encode a set of visual scenes. Our ideal Bayesian chunk learner not only reproduced the results of a large set of previous empirical findings in the domain of human pattern learning but also made a key prediction that we confirmed experimentally. In accordance with Bayesian learning but contrary to associative learning, human performance was well above chance when pair-wise statistics in the exemplars contained no relevant information. Thus, humans extract chunks from complex visual patterns by generating accurate yet economical representations and not by encoding the full correlational structure of the input.
2007
- "Perceived Object Trajectories During Occlusion Constrain
Visual Statistical Learning", József Fiser, Brian J. Scholl,
Richard N. Aslin, Psychon Bull Rev. 2007 Feb;14(1):173-178.
Abstract: Visual statistical learning of shape sequences was examined in the context of occluded object trajectories. In a learning phase, participants viewed a sequence of moving shapes whose trajectories and speed profiles elicited either a bouncing or a streaming percept: The sequences consisted of a shape moving toward and then passing behind an occluder, after which two different shapes emerged from behind the occluder. At issue was whether statistical learning linked both object transitions equally, or whether the percept of either bouncing or streaming constrained the association between pre- and postocclusion objects. In familiarity judgments following the learning, participants reliably selected the shape pair that conformed to the bouncing or streaming bias that was present during the learning phase. A follow-up experiment demonstrated that differential eye movements could not account for this finding. These results suggest that sequential statistical learning is constrained by the spatiotemporal perceptual biases that bind two shapes moving through occlusion, and that this constraint thus reduces the computational complexity of visual statistical learning.
2005
- "Methodological Challenges for
Understanding Cognitive Development in Infants", Richard N. Aslin
and József Fiser, Trends Cogn Sci. 2005 Mar; 9: 92-98.
Abstract: Studies of cognitive development in human infants have relied almost entirely on descriptive data at the behavioral level - the age at which a particular ability emerges. The underlying mechanisms of cognitive development remain largely unknown, despite attempts to correlate behavioral states with brain states. We argue that research on cognitive development must focus on theories of learning, and that these theories must reveal both the computational principles and the set of constraints that underlie developmental change. We discuss four specific issues in infant learning that gain renewed importance in light of this opinion.
- "Encoding Multielement Scenes: Statistical Learning of Visual
Feature Hierarchies", József Fiser and Richard N. Aslin, J Exp
Psychol Gen. 2005 Nov; 134: 521-537.
Abstract: The authors investigated how human adults encode and remember parts of multielement scenes composed of recursively embedded visual shape combinations. The authors found that shape combinations that are parts of larger configurations are less well remembered than shape combinations of the same kind that are not embedded. Combined with basic mechanisms of statistical learning, this embeddedness constraint enables the development of complex new features for acquiring internal representations efficiently without being computationally intractable. The resulting representations also encode parts and wholes by chunking the visual input into components according to the statistical coherence of their constituents. These results suggest that a bootstrapping approach of constrained statistical learning offers a unified framework for investigating the formation of different internal representations in pattern and scene perception.
2004
- "Small Modulations of Ongoing Cortical Dynamics by Sensory Input
During Natural Vision", József Fiser, Chiayu Chiu and Michael
Weliky, Nature. 2004 Sep 30; 431:573-578.
Abstract: During vision, it is believed that neural activity in the primary visual cortex is predominantly driven by sensory input from the environment. However, visual cortical neurons respond to repeated presentations of the same stimulus with a high degree of variability. Although this variability has been considered to be noise owing to random spontaneous activity within the cortex, recent studies show that spontaneous activity has a highly coherent spatio-temporal structure. This raises the possibility that the pattern of this spontaneous activity may shape neural responses during natural viewing conditions to a larger extent than previously thought. Here, we examine the relationship between spontaneous activity and the response of primary visual cortical neurons to dynamic natural-scene and random-noise film images in awake, freely viewing ferrets from the time of eye opening to maturity. The correspondence between evoked neural activity and the structure of the input signal was weak in young animals, but systematically improved with age. This improvement was linked to a shift in the dynamics of spontaneous activity. At all ages including the mature animal, correlations in spontaneous neural firing were only slightly modified by visual stimulation, irrespective of the sensory input. These results suggest that in both the developing and mature visual cortex, sensory evoked neural activity represents the modulation and triggering of ongoing circuit dynamics by input signals, rather than directly reflecting the structure of the input signal itself.
2003
- "Contrast Conservation in Human Vision",
József Fiser, Peter J. Bex, Walter Makous, Vision Res 2003
Nov; 43: 2637-2648.
Abstract: Visual experience, which is defined by brief saccadic sampling of complex scenes at high contrast, has typically been studied with static gratings at threshold contrast. To investigate how suprathreshold visual processing is related to threshold vision, we tested the temporal integration of contrast in the presence of large, sudden changes in the stimuli such as occur during saccades under natural conditions. We observed completely different effects under threshold and suprathreshold viewing conditions. The threshold contrast of successively presented gratings that were either perpendicularly oriented or of inverted phase showed probability summation, implying no detectable interaction between independent visual detectors. However, at suprathreshold levels we found complete algebraic summation of contrast for stimuli longer than 53 ms. The same results were obtained during sudden changes between random noise patterns and between natural scenes. These results cannot be explained by traditional contrast gain-control mechanisms or the effect of contrast constancy. Rather, at suprathreshold levels, the visual system seems to conserve the contrast information from recently viewed images, perhaps for the efficient assessment of the contrast of the visual scene while the eye saccades from place to place.
- "Coding of Natural Scenes in Primary Visual Cortex", Michael
Weliky, József Fiser, Ruskin H. Hunt, David N. Wagner, Neuron, 2003;
Feb 20; 37: 703-718.
Abstract: Natural scene coding in ferret visual cortex was investigated using a new technique for multi-site recording of neuronal activity from the cortical surface. Surface recordings accurately reflected radially aligned layer 2/3 activity. At individual sites, evoked activity to natural scenes was weakly correlated with the local image contrast structure falling within the cells’ classical receptive field. However, a population code, derived from activity integrated across cortical sites having retinotopically overlapping receptive fields, correlated strongly with the local image contrast structure. Cell responses demonstrated high lifetime sparseness, population sparseness, and high dispersal values, implying efficient neural coding in terms of information processing. These results indicate that while cells at an individual cortical site do not provide a reliable estimate of the local contrast structure in natural scenes, cell activity integrated across distributed cortical sites is closely related to this structure in the form of a sparse and dispersed code.
2002
- "Statistical Learning of New Visual Feature Combinations by
Infants", József Fiser and Richard N. Aslin, Proc Natl Acad Sci USA
2002 Nov 26; 99:15822-15826.
Abstract: The ability of humans to recognize a nearly unlimited number of unique visual objects must be based on a robust and efficient learning mechanism that extracts complex visual features from the environment. To determine whether statistically optimal representations of scenes are formed during early development, we used a habituation paradigm with 9-month-old infants and found that, by mere observation of multielement scenes, they become sensitive to the underlying statistical structure of those scenes. After exposure to a large number of scenes, infants paid more attention not only to element pairs that cooccurred more often as embedded elements in the scenes than other pairs, but also to pairs that had higher predictability (conditional probability) between the elements of the pair. These findings suggest that, similar to lower-level visual representations, infants learn higher-order visual features based on the statistical coherence of elements within the scenes, thereby allowing them to develop an efficient representation for further associative learning.
- "Statistical Learning of Higher-Order Temporal Structure From
Visual Shape Sequences", József Fiser and Richard N. Aslin, J Exp
Psychol Learn Mem Cogn. 2002 May; 28: 458-467.
Abstract: In 3 experiments, the authors investigated the ability of observers to extract the probabilities of successive shape co-occurrences during passive viewing. Participants became sensitive to several temporal-order statistics, both rapidly and with no overt task or explicit instructions. Sequences of shapes presented during familiarization were distinguished from novel sequences of familiar shapes, as well as from shape sequences that were seen during familiarization but less frequently than other shape sequences, demonstrating at least the extraction of joint probabilities of 2 consecutive shapes. When joint probabilities did not differ, another higher-order statistic (conditional probability) was automatically computed, thereby allowing participants to predict the temporal order of shapes. Results of a single-shape test documented that lower-order statistics were retained during the extraction of higher-order statistics. These results suggest that observers automatically extract multiple statistics of temporal events that are suitable for efficient associative learning of new temporal features.
2001
- "Unsupervised Statistical Learning of Higher-Order Spatial
Structures from Visual Scenes", József Fiser and Richard N. Aslin,
Psychological Science, November 2001; 12: 499-504.
Abstract: Three experiments investigated the ability of human observers to extract the joint and conditional probabilities of shape co-occurrences during passive viewing of complex visual scenes. Results indicated that statistical learning of shape conjunctions was both rapid and automatic, as subjects were not instructed to attend to any particular features of the displays. Moreover, in addition to single-shape frequency, subjects acquired in parallel several different higher-order aspects of the statistical structure of the displays, including absolute shape-position relations in an array, shape-pair arrangements independent of position, and conditional probabilities of shape co-occurrences. Unsupervised learning of these higher-order statistics provides support for Barlow’s theory of visual recognition, which posits that detecting "suspicious coincidences" of elements during recognition is a necessary prerequisite for efficient learning of new visual features.
- "Size Tuning in the Absence of Spatial
Frequency Tuning in Object Recognition", József Fiser, Suresh
Subramaniam, Irving Biederman, Vision Research, 2001; 41, 1931-1950.
Abstract: How do we attend to objects at a variety of sizes as we view our visual world? Because of an advantage in identification of lowpass over highpass filtered patterns, as well as large over small images, a number of theorists have assumed that size-independent recognition is achieved by spatial frequency (SF) based coarse-to-fine tuning. We found that the advantage of large sizes or low SFs was lost when participants attempted to identify a target object (specified verbally) somewhere in the middle of a sequence of 40 images of objects, each shown for only 72 ms, as long as the target and distractors were the same size or spatial frequency (unfiltered or low or high bandpassed). When targets were of a different size or scale than the distractors, a marked advantage (pop out) was observed for large (unfiltered) and low SF targets against small (unfiltered) and high SF distractors, respectively, and a marked decrement for the complementary conditions. Importantly, this pattern of results for large and small images was unaffected by holding absolute or relative SF content constant over the different sizes and it could not be explained by simple luminance- or contrast-based pattern masking. These results suggest that size/scale tuning in object recognition was accomplished over the first several images (576 ms) in the sequence and that the size tuning was implemented by a mechanism sensitive to spatial extent rather than to variations in spatial frequency.
- "Invariance of Long-term Visual Priming to Scale, Reflection,
Translation, and Hemisphere", József Fiser and Irving Biederman,
Vision Research, 2001; 41, 221-234.
Abstract: The representation of shape mediating visual object priming was investigated. In two blocks of trials, subjects named images of common objects presented for 185 ms that were bandpass filtered, either at high (10 cpd) or at low (2 cpd) center frequency with a 1.5 octave bandwidth, and positioned either 5° right or left of fixation. The second presentation of an image of a given object type could be filtered at the same or different band, be shown at the same or translated (and mirror reflected) position, and be the same exemplar as that in the first block or a same-name different-shaped exemplar (e.g. a different kind of chair). Second block reaction times (RTs) and error rates were markedly lower than they were on the first block, which, in the context of prior results, was indicative of strong priming. A change of exemplar in the second block resulted in a significant cost in RTs and error rates, indicating that a portion of the priming was visual and not just verbal or basic-level conceptual. However, a change in the spatial frequency (SF) content of the image had no effect on priming despite the dramatic difference it made in appearance of the objects. This invariance to SF changes was also preserved with centrally presented images in a second experiment. Priming was also invariant to a change in left–right position (and mirror orientation) of the image. The invariance over translation of such a large magnitude suggests that the locus of the representation mediating the priming is beyond an area that would be homologous to posterior TEO in the monkey. We conclude that this representation is insensitive to low level image variations (e.g. SF, precise position or orientation of features) that do not alter the basic part-structure of the object. Finally, recognition performance was unaffected by whether low or high bandpassed images were presented either in the left or right visual field, giving no support to the hypothesis of hemispheric differences in processing low and high spatial frequencies.
- "Experience-dependent Visual Cue Integration Based on
Consistencies Between Visual & Haptic Percepts", Joseph E. Atkins,
József Fiser and Robert A. Jacobs, Vision Research, 2001; 41,
449-461.
Abstract: We study the hypothesis that observers can use haptic percepts as a standard against which the relative reliabilities of visual cues can be judged, and that these reliabilities determine how observers combine depth information provided by these cues. Using a novel visuo-haptic virtual reality environment, subjects viewed and grasped virtual objects. In Experiment 1, subjects were trained under motion relevant conditions, during which haptic and visual motion cues were consistent whereas haptic and visual texture cues were uncorrelated, and texture relevant conditions, during which haptic and texture cues were consistent whereas haptic and motion cues were uncorrelated. Subjects relied more on the motion cue after motion relevant training than after texture relevant training, and more on the texture cue after texture relevant training than after motion relevant training. Experiment 2 studied whether or not subjects could adapt their visual cue combination strategies in a context-dependent manner based on context-dependent consistencies between haptic and visual cues. Subjects successfully learned two cue combination strategies in parallel, and correctly applied each strategy in its appropriate context. Experiment 3, which was similar to Experiment 1 except that it used a more naturalistic experimental task, yielded the same pattern of results as Experiment 1 indicating that the findings do not depend on the precise nature of the experimental task. Overall, the results suggest that observers can involuntarily compare visual and haptic percepts in order to evaluate the relative reliabilities of visual cues, and that these reliabilities determine how cues are combined during three-dimensional visual perception.
2000
- "Minimizing Binding Errors Using Learned Conjunctive Features",
Bartlett W. Mel and József Fiser, Neural Computation, 12, 247-278.
Abstract: We have studied some of the design trade-offs governing visual representations based on spatially invariant conjunctive feature detectors, with an emphasis on the susceptibility of such systems to false-positive recognition errors (Malsburg’s classical binding problem). We begin by deriving an analytical model that makes explicit how recognition performance is affected by the number of objects that must be distinguished, the number of features included in the representation, the complexity of individual objects, and the clutter load, that is, the amount of visual material in the field of view in which multiple objects must be simultaneously recognized, independent of pose, and without explicit segmentation. Using the domain of text to model object recognition in cluttered scenes, we show that with corrections for the nonuniform probability and nonindependence of text features, the analytical model achieves good fits to measured recognition rates in simulations involving a wide range of clutter loads, word sizes, and feature counts. We then introduce a greedy algorithm for feature learning, derived from the analytical model, which grows a representation by choosing those conjunctive features that are most likely to distinguish objects from the cluttered backgrounds in which they are embedded. We show that the representations produced by this algorithm are compact, decorrelated, and heavily weighted toward features of low conjunctive order. Our results provide a more quantitative basis for understanding when spatially invariant conjunctive features can support unambiguous perception in multiobject scenes, and lead to several insights regarding the properties of visual representations optimized for specific recognition tasks.
1999
- "Subordinate-level Object Classification Reexamined", Irving
Biederman, Suresh Subramaniam, Moshe Bar, Peter Kalocsai, József
Fiser, Psychological Research, 1999; 62: 131-153.
Abstract: The classification of a table as round rather than square, a car as a Mazda rather than a Ford, a drill bit as 3/8-inch rather than 1/4-inch, and a face as Tom have all been regarded as a single process termed "subordinate classification". Despite the common label, the considerable heterogeneity of the perceptual processing required to achieve such classifications requires, minimally, a more detailed taxonomy. Perceptual information relevant to subordinate-level shape classifications can be presumed to vary on continua of (a) the type of distinctive information that is present, nonaccidental or metric, (b) the size of the relevant contours or surfaces, and (c) the similarity of the to-be-discriminated features, such as whether a straight contour has to be distinguished from a contour of low curvature versus high curvature. We consider three, relatively pure cases. Case 1 subordinates may be distinguished by a representation, a geon structural description (GSD), specifying a nonaccidental characterization of an object’s large parts and the relations among these parts, such as a round table versus a square table. Case 2 subordinates are also distinguished by GSDs, except that the distinctive GSDs are present at a small scale in a complex object so the location and mapping of the GSDs are contingent on an initial basic-level classification, such as when we use a logo to distinguish various makes of cars. Expertise for Cases 1 and 2 can be easily achieved through specification, often verbal, of the GSDs. Case 3 subordinates, which have furnished much of the grist for theorizing with "view-based" template models, require fine metric discriminations. Cases 1 and 2 account for the overwhelming majority of shape-based basic- and subordinate-level object classifications that people can and do make in their everyday lives. These classifications are typically made quickly, accurately, and with only modest costs of viewpoint changes. Whereas the activation of an array of multiscale, multiorientation filters, presumed to be at the initial stage of all shape processing, may suffice for determining the similarity of the representations mediating recognition among Case 3 subordinate stimuli (and faces), Cases 1 and 2 require that the output of these filters be mapped to classifiers that make explicit the nonaccidental properties, parts, and relations specified by the GSDs.
1998
- "Distance Modulation of Neural Activity in the
Visual Cortex", Allan C. Dobbins, Richard M. Jeo, József
Fiser, John M. Allman, Science, 24 July 1998; 281 (5376):552-555.
Abstract / .PDF / Commentary
Humans use distance information to scale the size of objects. Earlier studies demonstrated changes in neural response as a function of gaze direction and gaze distance in the dorsal visual cortical pathway to parietal cortex. These findings have been interpreted as evidence of the parietal pathway’s role in spatial representation. Here, distance-dependent changes in neural response were also found to be common in neurons in the ventral pathway leading to inferotemporal cortex of monkeys. This result implies that the information necessary for object and spatial scaling is common to all visual cortical areas.
1996
- "To what extent can matching algorithms based on direct outputs
of low level generic descriptors account for human object
recognition?", József Fiser, Irving Biederman, Eric
E. Cooper, Spatial Vision, 1996; 10(3): 237-271.
Abstract / .PDF
A number of recent successful models of face recognition posit only two layers, an input layer consisting of a lattice of spatial filters and a single subsequent stage by which those descriptor values are mapped directly onto an object representation layer by standard matching methods such as stochastic optimization. Is this approach sufficient for modeling human object recognition? We tested whether a highly efficient version of such a two-layer model would manifest effects similar to those shown by humans when given the task of recognizing images of objects that had been employed in a series of psychophysical experiments. System accuracy was quite high overall, but was qualitatively different from that evidenced by humans in object recognition tasks. The discrepancy between the system’s performance and human performance is likely to be revealed by all models that map filter values directly onto object units. These results suggest that human object recognition (as opposed to face recognition) may be difficult to approximate by models that do not posit hidden units for explicit representation of intermediate entities such as edges, viewpoint invariant classifiers, axes, shocks and/or object parts.
1995
- "Size invariance in visual object priming of
gray scale images", József Fiser, Irving Biederman,
Perception, 1995; 24(7):741-748.
Abstract / .PDF
The strength of visual priming of briefly presented gray scale pictures of real world objects, measured by naming reaction times and errors, was independent of whether the primed picture of the object was presented in the same or different size than the original picture. These findings replicate Biederman & Cooper’s (1992) results on size invariance in shape recognition, which were obtained with line drawings, and extend them to the domain of gray level images. Entry-level shape identification is based either predominantly on scale-invariant representations incorporating orientation and depth discontinuities which are well captured by line drawings, or both discontinuities and the representation derived from smooth gradual surface changes are scale invariant.
Conferences
2010
- "Sparseness is not actively optimized in V1.", Pietro Berkes, Benjamin L. White, & József Fiser, Frontiers in Systems Neuroscience.
Conference Abstract: Computational and systems neuroscience, 2010.
Abstract
Sparse coding is a powerful idea in computational neuroscience referring to the general principle that the cortex exploits the benefits of representing every stimulus by a small subset of neurons. Advantages of sparse coding include reduced dependencies, improved detection of co-activation of neurons, and a more efficient encoding of visual information. Computational models based on this principle have reproduced the main characteristics of simple cell receptive fields in the primary visual cortex (V1) when applied to natural images.
However, direct tests on neural data of whether sparse coding is an optimization principle actively implemented in the brain have been inconclusive so far. Although a number of electrophysiological studies have reported high levels of sparseness in V1, these measurements were made in absolute terms and thus it is an open question whether the observed high sparseness indicates optimality or simply high stimulus selectivity. Moreover, most of the recordings have been performed in anesthetized animals, but it is not clear how these results generalize to the cell responses in the awake condition.
To address this issue, we have focused on relative changes in sparseness. We analyzed neural data from ferret and rat V1 to verify two basic predictions of sparse coding: 1) Over learning, neural responses should become increasingly sparse, as the visual system adapts to the statistics of the environment. 2) An optimal sparse representation requires active competition between neurons that is realized by recurrent connections. Thus, as animals go from awake state to deep anesthesia, which is known to eliminate recurrent and top-down inputs, neural responses should become less sparse, since the neural interactions that support active sparsification of responses are disrupted.
To test the first prediction empirically, we measured the sparseness of neural responses in awake ferret V1 to natural movies at various stages of development, from eye opening to adulthood. Contrary to the prediction of sparse coding, we found that the neural code does adapt to represent natural stimuli over development, but sparseness steadily decreases with age. In addition, we observed a general increase in dependencies among neural responses. We addressed the second prediction by analyzing neural responses to natural movies in rats that were either awake or under different levels of anesthesia ranging from light to very deep. Again, contrary to the prediction, sparseness of cortical cells increased with increasing levels of anesthesia. We controlled for reduced responsiveness of the direct feedforward connections under anesthesia by using appropriate sparseness measures and by quantifying the signal-to-noise ratio across levels of anesthesia, which did not change significantly.
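The abstract does not say which sparseness measures were used. As an illustration only, the sketch below uses one common choice, the Treves-Rolls measure in the normalized form of Vinje and Gallant, which is 0 for a perfectly uniform response and 1 when a single element carries all the activity. The function names and the `responses[neuron][time_bin]` layout are assumptions of this sketch, not taken from the study.

```python
def sparseness(rates):
    """Normalized Treves-Rolls sparseness of a list of non-negative rates.

    Returns 0.0 for a uniform response and 1.0 when only one entry is
    non-zero. Requires at least two entries. This is an illustrative
    choice of measure, not necessarily the one used in the abstract.
    """
    n = len(rates)
    if n < 2:
        raise ValueError("need at least two responses")
    mean = sum(rates) / n
    mean_sq = sum(r * r for r in rates) / n
    if mean_sq == 0:
        return float("nan")  # no activity at all: sparseness is undefined
    return (1.0 - mean * mean / mean_sq) / (1.0 - 1.0 / n)


def population_sparseness(responses, time_bin):
    """Sparseness across neurons within one time bin."""
    return sparseness([neuron[time_bin] for neuron in responses])


def lifetime_sparseness(responses, neuron_index):
    """Sparseness across time bins for one neuron."""
    return sparseness(list(responses[neuron_index]))
```

Population sparseness asks how few neurons are active at a given moment, while lifetime sparseness asks how rarely a given neuron is active over time; comparing either across ages or anesthesia levels gives the kind of relative measurement the abstract argues for.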
These findings suggest that the representation in V1 is not actively optimized to maximize the sparseness of neural responses. A viable alternative is that the concept of efficient coding is implemented in the form of optimal statistical learning of parameters in an internal model of the environment.
- "The flow of expected and unexpected sensory information through the
distributed forebrain network.", Maolong Cui, Donald B. Katz, Alfredo
Fontanini, & József Fiser, Frontiers in Systems Neuroscience.
Conference Abstract: Computational and systems neuroscience, 2010.
Abstract
Forebrain taste information processing is accomplished mainly by three reciprocally connected forebrain regions - primary gustatory cortex (GC), (basolateral) amygdala (AM), and orbitofrontal cortex (OFC) - loosely characterized as the neural sources of sensory, palatability-related, and cognitive information, respectively. It has been proposed that the perception of complex taste stimuli involves an intricate flow of information between these regions in real time. However, empirical confirmation of this hypothesis and a detailed analysis of the multidirectional flow of information during taste perception have not yet been presented.
We have simultaneously recorded local field potentials from GC, AM, and OFC in awake behaving rats under two conditions as controlled aliquots of either preferred or non-preferred taste stimuli were placed directly on their tongues via intra-oral cannulae. Half of the deliveries were "active", as the rat pressed a bar to receive the taste upon receiving an auditory 'go' signal; the other half were "passive", as the rat received a tastant at random times. Peri-delivery signals from the three areas were analyzed by computing transfer entropy, a method that measures directional information transfer between coupled dynamic systems by assessing the reduction of uncertainty in predicting the current state of the systems based on their previous states.
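To make the transfer entropy idea above concrete, here is a minimal sketch of a plug-in estimator for two discrete time series. It assumes the signals have already been discretized into symbols and uses a history length of one step; both are simplifications chosen for this sketch, and the abstract does not describe the estimator actually applied to the field potentials. The function name `transfer_entropy` is hypothetical.

```python
import math
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate of transfer entropy from source to target, in bits.

    TE = sum over (y1, y0, x0) of
         p(y1, y0, x0) * log2( p(y1 | y0, x0) / p(y1 | y0) )
    where y1 is the next target state, y0 the current target state and
    x0 the current source state. History length is fixed at one step
    (an assumption of this sketch).
    """
    if len(source) != len(target):
        raise ValueError("series must have equal length")

    # Count joint occurrences of (next target, current target, current source).
    triples = Counter()
    for t in range(len(target) - 1):
        triples[(target[t + 1], target[t], source[t])] += 1
    n = sum(triples.values())

    # Marginal counts needed for the two conditional probabilities.
    count_y0_x0 = Counter()
    count_y1_y0 = Counter()
    count_y0 = Counter()
    for (y1, y0, x0), c in triples.items():
        count_y0_x0[(y0, x0)] += c
        count_y1_y0[(y1, y0)] += c
        count_y0[y0] += c

    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_with_source = c / count_y0_x0[(y0, x0)]        # p(y1 | y0, x0)
        p_without = count_y1_y0[(y1, y0)] / count_y0[y0]  # p(y1 | y0)
        te += (c / n) * math.log2(p_with_source / p_without)
    return te
```

Because the measure is asymmetric, computing it in both directions for each pair of areas (for example GC to AM and AM to GC) is what allows ascending and descending flow to be separated. Note that plug-in estimates are biased upward for short recordings, so real analyses usually compare against shuffled surrogates.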
The results of this analysis reveal the complexity and context specificity of perceptual neural taste processing. Passive taste deliveries caused an immediate and strong flow of information that ascended from GC to both AM and OFC (p<0.001). However, within the 1.5-2.0 sec in which our rats typically identified and acted on (swallowing or expelling) the tastes, feedback from AM to GC became a prominent feature of the field potential activity (p<0.001). This finding confirms and extends earlier single cell results showing that palatability-related information appears in AM single-neuron responses soon after taste delivery, and that there is a sudden shift in the content of both GC and AM single-neuron responses at ~1.0 sec following delivery, as palatability-related information appears in GC and subsides in AM.
The neural response to active taste deliveries differed from that to passive deliveries in important ways. The massive immediate GC to AM/OFC flow was greatly decreased and delayed. Instead, there was an increased and lasting information flow from OFC to GC (p<0.01) immediately after the tone. The likely reason for this reduction was obvious: tone onset led to an anticipation of taste delivery that activated a descending flow of information from the "cognitive centers" in OFC to the primary sensory cortex, which greatly changed the actual neural processing of the stimulus itself in GC.
These results place earlier single-neuron findings into a functional dynamic framework, and offer an explanation of how the parts of the sensory system work together to give rise to complex perception. They suggest that perception is not a simple bottom-up process in which a stimulus is coded by progressively "higher" centers of the brain; rather, various bottom-up and top-down effects jointly define and greatly alter stimulus processing as early as in the primary sensory areas.
- "Neural activity as samples from a probabilistic representation: evidence
from the auditory cortex.", Pietro Berkes, Stephen V. David, Jonathan Fritz,
Shihab A. Shamma, & József Fiser, Frontiers in Systems Neuroscience.
Conference Abstract: Computational and systems neuroscience, 2010.
Abstract
In the past years, there has been a paradigm shift in the field of cognitive neuroscience as a number of behavioral studies demonstrated that animals and humans can take into account statistical uncertainties of task, reward, and their own behavior, in order to achieve optimal task performance. These results have been interpreted in terms of statistical inference in probabilistic models. However, such an interpretation raises the question of how cortical networks represent and make use of the probability distributions necessary to carry out such computations.
Recently, we have proposed that neural activity patterns correspond to samples from the posterior distribution over interpretations of the sensory input, a hypothesis that is consistent with several experimental observations (e.g. trial-to-trial variability). Last year, using this framework, we verified experimentally that the distribution of spontaneous activity in such probabilistic representations adapts over development to match that of evoked activity averaged over stimuli, based on recordings from V1 of awake ferrets.
In the present study, we define and test two novel predictions of this framework. First, we predict that the match between evoked and spontaneous activity should be specific to the distribution of neural activity evoked by natural stimuli, and not to that evoked by artificial stimulus ensembles. We expect this match to hold for instantaneous neural activity, and for temporal transitions between activity patterns. Second, if this hypothesis captures the general computational strategy in the sensory cortex, it should be valid across sensory modalities. To test these predictions, we analyzed single unit data (N=32 over 6 recordings) recorded simultaneously from multiple electrodes in the primary auditory cortex (A1) of awake ferrets in three stimulus conditions: a natural condition consisting of a stream of continuous speech, a white noise (0-20 kHz) condition, and a spontaneous activity condition where the animal was listening in silence. Speech was chosen since its spectrotemporal characteristics are similar to those of natural sounds. The neural data were discretized in 25 ms bins and binarized, and the distribution of instantaneous joint activity and the transition probability from one activity pattern to the next were estimated in the three conditions. We measured dissimilarity between the silence and stimulus condition distributions using Kullback-Leibler divergence. The robustness of our results was estimated using a bootstrapping technique.
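The analysis pipeline described above (bin, binarize, estimate pattern distributions, compare with Kullback-Leibler divergence) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the input is taken to be spike counts per time bin and unit, the distributions are estimated over the patterns observed in either condition, a pseudocount is added to avoid division by zero, and the direction of the divergence (evoked relative to spontaneous) is a choice made here because the abstract does not specify it. All function names are hypothetical.

```python
import math
from collections import Counter


def binarize(counts):
    """Turn per-bin spike counts (one list per time bin, one entry per
    unit) into binary activity patterns represented as tuples."""
    return [tuple(1 if c > 0 else 0 for c in bin_counts) for bin_counts in counts]


def pattern_distribution(patterns, support, pseudocount=1.0):
    """Estimate pattern probabilities over a shared support.

    The pseudocount is an assumption of this sketch; it keeps every
    probability strictly positive so the divergence stays finite.
    """
    tally = Counter(patterns)
    total = len(patterns) + pseudocount * len(support)
    return {w: (tally[w] + pseudocount) / total for w in support}


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p if p[w] > 0)


def activity_divergence(evoked_counts, spontaneous_counts):
    """KL divergence of evoked from spontaneous pattern distributions."""
    evoked = binarize(evoked_counts)
    spontaneous = binarize(spontaneous_counts)
    support = set(evoked) | set(spontaneous)
    p = pattern_distribution(evoked, support)
    q = pattern_distribution(spontaneous, support)
    return kl_divergence(p, q)
```

Under this scheme, the prediction in the abstract corresponds to `activity_divergence` being smaller for the speech condition than for the white noise condition. The same functions could be applied to transition probabilities by treating each pair of consecutive patterns as a single symbol.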
In agreement with our predictions, we found that the distribution of speech-evoked activity is consistently more similar to spontaneous activity than the distribution of noise-evoked activity, for both the instantaneous distribution of activity and for transition probability. These results provide new evidence for stimulus specific adaptation in the cortex that leads to preference for natural stimuli, and also provide additional support for the sampling hypothesis. Our findings in A1 complement our earlier data from V1, suggesting that the match between spontaneous and evoked activity might be a universal hallmark of representation and computation in sensory cortex.
- "Suppression of intrinsic cortical response variability is state- and
stimulus-dependent", Benjamin L. White, Pietro Berkes, József Fiser,
Frontiers in Systems Neuroscience. Conference Abstract: Computational and
systems neuroscience, 2010.
Abstract
Neural responses to identical sensory stimuli can be highly variable across trials, even in primary sensory areas of the cortex. This raises the question of how such areas reliably transmit sensory-evoked responses to guide appropriate behavior. Internally-generated, spontaneous activity, which is ubiquitous in the cortex, is a leading candidate for causing much of the observed response variability. Recent theoretical analyses suggested that chaotic spontaneous activity generated by a recurrent network model can be strongly suppressed by external input in a stimulus-dependent manner. A hallmark feature of this result is a non-monotonic temporal frequency dependence, which implies that there is an optimal stimulus frequency for suppression of internally generated noise.
To test the prediction that cortical areas operate similarly to such models, we investigated spontaneous and visually-evoked extracellular neural activity from 57 units, mostly multi-units (MUs), in the primary visual cortex (V1) of 6 rats. We recorded from the rats under five conditions: while fully awake and while under 4 different levels of isoflurane anesthesia. The anesthetized conditions were included to investigate the responses of the neural circuitry as its dynamic behavior is gradually modified. Anesthesia ranged from very light to deep, and stable levels were verified by various physiological parameters such as breathing rate, reflex response, and local field potential structure. Rats were head-fixed in a sound- and light-attenuating box while passively viewing flashing stimuli on a monitor 6 inches away from the retina. Five different stimulus conditions were used for all rats in all states. Full-field flashing visual stimuli were presented at four frequencies, ranging from 1 Hz to 7.5 Hz, and spontaneous neural activity was also recorded during periods of complete darkness. Stimulus appearance was interleaved and randomized. Variability was assessed by computing Fano-factors over a range of spike-counting intervals.
We found that variability in spontaneous neural firing is actively and selectively suppressed by visual stimulation both in awake and anesthetized conditions. However, the pattern of suppression was different: in the awake case, it followed the theoretical prediction showing a significant dip in the Fano-factor across the different temporal frequencies of the stimuli. This frequency-dependency vanished with increased anesthesia. In addition, we found that the lowest level of noise and the largest amount of suppression compared to the spontaneous condition across all evoked conditions occurred in the awake state. Importantly, power spectrum analysis showed that this pattern of frequency-dependent noise suppression could not be explained by differences in intrinsic neural oscillations.
These results suggest the existence of an active noise-suppression mechanism in the primary visual cortex of the awake animal that is tuned to operate maximally in the awake state for stimuli modulated at behaviorally relevant frequencies.
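The variability measure named in the abstract above, the Fano factor over a range of spike-counting intervals, is the variance of the spike count across trials divided by its mean. The sketch below is a minimal illustration assuming spike times are given in seconds as one list per trial; the function names and the choice of the unbiased sample variance are assumptions of this sketch rather than details from the study.

```python
def spike_count(spike_times, start, window):
    """Number of spikes falling in the interval [start, start + window)."""
    return sum(1 for t in spike_times if start <= t < start + window)


def fano_factor(trials, start, window):
    """Fano factor (variance / mean of spike counts) across trials.

    `trials` is a list of spike-time lists, one per trial. Uses the
    unbiased sample variance, so at least two trials are required.
    Returns NaN when no spikes occur in the window on any trial.
    """
    if len(trials) < 2:
        raise ValueError("need at least two trials")
    counts = [spike_count(trial, start, window) for trial in trials]
    mean = sum(counts) / len(counts)
    if mean == 0:
        return float("nan")
    variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return variance / mean


def fano_profile(trials, start, windows):
    """Fano factor for each counting-window length in `windows`."""
    return {w: fano_factor(trials, start, w) for w in windows}
```

A Poisson process has a Fano factor of 1 at every window length, so values below the spontaneous baseline in a given stimulus condition are what would indicate suppression of variability.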
2009
- "No evidence for active sparsification in the visual cortex", Pietro
Berkes, Benjamin L. White, József Fiser, In: Advances in Neural
Information Processing Systems 22 (NIPS*2009), Y. Bengio, D. Schuurmans,
J. Lafferty, C. K. I. Williams, and A. Culotta (eds), pp. 108-116, 2009.
Abstract / .PDF
The proposal that cortical activity in the visual cortex is optimized for sparse neural activity is one of the most established ideas in computational neuroscience. However, direct experimental evidence for optimal sparse coding remains inconclusive, mostly due to the lack of reference values on which to judge the measured sparseness. Here we analyze neural responses to natural movies in the primary visual cortex of ferrets at different stages of development and of rats while awake and under different levels of anesthesia. In contrast with predictions from a sparse coding model, our data show that population and lifetime sparseness decrease with visual experience, and increase from the awake to anesthetized state. These results suggest that the representation in the primary visual cortex is not actively optimized to maximize sparseness.
- "Neural evidence for statistically optimal
inference and learning in the
primary visual cortex", Pietro Berkes, Gergő Orbán,
Máté Lengyel, József Fiser, Society of Neuroscience
Abstracts (39), 2009.
Abstract
How do we infer from sensation the state of the external world? Humans and animals have been shown to perform statistically optimal inference and learning during perception in the presence of noise and uncertainty in the presented stimuli. This points to a probabilistic representation of the sensory input, where evidence coming from sensation is optimally combined with an internal model of the environment. Indeed, neural correlates of the uncertainty and probability of behaviorally relevant stimuli have been reported in brain areas related to decision-making. Moreover, manipulations of the statistics of the environment are known to be reflected in changes in the neural representation, which are compatible with some probabilistic accounts of learning. However, there has been so far no evidence of statistically optimal inference and learning at the neural level. We have investigated general consequences of probabilistic inference in the sensory system under the assumption that neural activity reflects sampling from the internal, probabilistic model of the world. This assumption makes the strong prediction that the joint distribution of spontaneous activity and that of evoked activity averaged over stimuli have to be identical. We analyzed multielectrode data from awake ferrets at various stages of postnatal development. Neural activity was recorded during evoked and spontaneous activity. We found that the similarity between activity evoked by natural movie stimuli and spontaneous activity significantly increased with visual experience, until, at the end of visual development, the two distributions were not significantly distinguishable (P>0.95). This similarity was brought about by a match between the spatial and temporal correlational structure of the activity patterns, rather than merely by preserved firing rates across conditions. Moreover, the match was specific to activity evoked by natural stimuli, and not by noise or grating stimuli.
These results suggest that neural variability samples from a probabilistic model of the environment that is gradually being tuned to natural scene statistics by sensory experience as the visual system develops. The interpretation of neural activity as samples provides a missing link between the computational and neural level, opening the way to a systematic exploration of functional principles of cortical organization.
- "Neural evidence for statistically optimal inference and
learning in primary visual cortex", Pietro Berkes, Gergő
Orbán, Máté Lengyel, József Fiser, Sloan-Swartz Centers for Theoretical
Neurobiology Annual Meeting, Boston, 2009.
Abstract
How do we infer from sensation the state of the external world? Human and animal subjects are able to take into account noise and uncertainty in behavioral tasks and perform statistically optimal inference and learning. Moreover, statistical models of natural images have been shown to reproduce many features of receptive field organization in primary visual cortex. However, there has been so far no evidence of optimal inference and learning at the neural level. In this talk, I will derive a general consequence of the statistical framework, predicting that the distribution of neural spontaneous activity and that of activity evoked by natural stimuli must become more and more similar with visual experience, and be identical in the ideal case, under the assumption that neural activity represents samples from an internal, probabilistic model of the environment. I will present data from multielectrode recording in awake ferrets at various stages of postnatal development that support this prediction. The increasing similarity between the two distributions is found to be due to an increasing match between the spatial and temporal correlational structure of the activity patterns, and is specific to activity evoked by natural stimuli, and not by noise or grating stimuli. These results provide support for the statistical framework at the neural level, and suggest a novel interpretation for neural variability and spontaneous activity.
- "Implicit and explicit knowledge in visual statistical
learning", Kimberly MacKenzie, József Fiser, Conference
Abstract: Annual Meeting of the Vision Sciences Society, 2009.
Abstract / Poster
Visual statistical learning has been established as a paradigm for testing implicit knowledge that accumulates gradually with experience. Typically, subjects are presented with a stream of scenes composed of simple shapes arranged according to co-occurrence rules. Subjects observe the scenes without a defined task, and during the test subjects' familiarity with the building blocks of the scenes is measured. However, the test in this paradigm usually directly follows the practice, while long-term effects are usually considered to last for hours or days. In addition, while the learning is implicit, the underlying structure of scenes can be summarized by a few explicit rules, and when these are told to the subject, the task becomes trivial. It is not clear, however, whether the implicit learning leads to explicit knowledge of the rules, or if the two types of learning are unrelated. To address these issues, we ran a modified visual statistical learning study, where subjects were tested one hour after the practice session. In addition, we varied the length of practice from 144 to 216 to 288 scenes. At short length, subjects showed no learning (55%, p>.05), in strong contrast with earlier results (74.7%, p<0.0001) where the practice and test without intermission yielded strong implicit learning. As the length of practice increased to 216, implicit familiarity emerged (82%, p<0.004), whereas with 288 trials not only did performance improve further (85%, p<0.0004), but explicit knowledge of the rules was reported by a majority of the subjects. Thus, even though visual statistical learning contributes to immediate familiarity, it is also the basis of more prolonged representations in long-term memory. Moreover, this type of learning gradually leads to the emergence of explicit knowledge of the rules observed in the scenes, thus questioning the idea that implicit statistical and explicit rule learning are two separate processes.
- "Visual Field Loss, Eye Movements and Visual
Search", Lee McIlreavy, József Fiser, Peter J. Bex,
Conference Abstract: Annual Meeting of the Vision Sciences Society,
2009.
Abstract / Poster
Objectives: In performing search tasks, the visual system encodes information across the visual field and deploys a saccade to place a visually interesting target upon the fovea. The process of saccadic eye movements, punctuated by periods of fixation, continues until the desired target has been located. Loss of peripheral vision restricts the available visual information with which to plan saccades, while loss of central vision restricts the ability to resolve the high spatial information of a target. We investigate visuomotor adaptations to visual field loss with gaze-contingent peripheral and central scotomas.
Methods: Spatial distortions (peak frequency 2 cpd) were placed at random locations in 25 deg square natural scenes, with transitions from distorted to undistorted regions smoothed by a Gaussian (sd = 2 deg). Gaze-contingent central or peripheral simulated Gaussian scotomas (sd = 1, 2, or 4 deg) were updated at the screen rate (75 Hz) based on a 250 Hz eyetracker. The observer's task was to search the natural scene for the spatial distortion and to indicate its location using a mouse-controlled cursor.
Results: As the diameter of central scotomas increased or the diameter of peripheral scotomas decreased, mean search times and the mean number of saccades and fixations increased. Fixation duration, saccade size and saccade duration were relatively unchanged across conditions.
Conclusions: Both central and peripheral visual field loss cause functional impairment in visual search. The deficit is largely attributed to an increase in the number of saccades and fixations, with little change in visuomotor dynamics. Subjects frequently made saccades into blind areas and did not modify fixation durations to compensate for reduced acuity or change in temporal integration, suggesting that adaptations to visual impairment are not automatic and may benefit from rehabilitation training.
- "The Less-Is-More principle in realistic visual statistical
learning", Aaron Glick & József Fiser, Conference Abstract:
Annual Meeting of the Vision Sciences Society, 2009.
Abstract / Poster
While in previous studies, a number of abstract characteristics of visual statistical learning have been clarified under various 2-dimensional settings, little effort was directed to understand how real visual dimensions in 3-dimensional scenes interact during such learning. In a series of experiments using realistic 3D shapes and the dimensions of color, texture, and motion, we tested the Less-Is-More principle of learning, namely the proposal that information in independent dimensions does not interact in a simple additive manner to help learning. Following the original statistical learning paradigm, twelve arbitrary 3D shapes were used to compose large 3'x3' scenes, where shape pairs followed particular co-occurrence patterns and scenes were composed of random combinations of such pairs. Similarly to the results with abstract 2D shapes, subjects automatically and implicitly learned the underlying structure of the scenes. However, there were notable differences in learning depending on the features of the stimuli. Humans performed well above chance in the baseline experiment with colored and textured shapes (63% correct, p<0.001). When they received the same training but with colors only, using a single type of shape and no texture, performance dropped to chance (51%, n.s.), showing that providing the same color label information without "hooks" was not useful. However, removing color and texture or color and shape improved performance (both 68%, p<0.001) showing that reducing the richness of the representations is not always detrimental. Finally, adding characteristic motion pattern to each shape did not elevate performance (65%, p<0.001) demonstrating that even the most effective type of visual information does not necessarily speed up learning.
These results support the Less-Is-More idea that the most effective learning requires the maximum amount of information that the system can reliably process based on its capacity limit and internal representation, which is not equivalent to having the most possible information.
- "Orientation integration in complex visual processing", Henry
Galperin, Peter Bex, József Fiser, Conference Abstract: Annual
Meeting of the Vision Sciences Society, 2009.
Abstract / Poster
How does the visual system integrate local features to represent global object forms? Previously we quantified human orientation sensitivity in complex natural images and found that orientation is encoded only with limited precision defined by an internal threshold that is set by predictability of the stimulus (VSS 2007). Here we tested the generality of this finding by asking whether local orientation information is integrated differently when orientation noise was distributed across a scene, and in an object identification task for natural images that were reconstructed from a fixed number of Gabor wavelets. In the noise discrimination task, subjects viewed pairs of images where orientation noise was added to the elements of only one image, both images, or was distributed evenly between the two images, and were required to identify the noisier pair of images. Sensitivity to orientation noise with the addition of external noise produced a dipper function that did not change with the manner in which noise was distributed, suggesting that orientation information is integrated consistently irrespective of the distribution of orientation information across the scene. In the identification task, subjects identified an object from four categories, randomly selected from a total of 40 categories. The proportion of signal Gabors, whose orientation and position were taken from the object, and noise Gabors, whose positions were randomly assigned, was adjusted to find the form coherence threshold for 75% correct object identification. Signal elements consisted of pairs of adjacent Gabors whose orientation difference was low (contour-defining), high (corner-defining), or randomly selected. Thresholds for image identification were only slightly elevated compared with earlier discrimination results, and were equal for all types of signal elements used.
These results suggest that orientation information is integrated by perceptual templates that depend on orientation predictability but not on the complexity level of the visual task.
- "What eye-movements tell us about online learning of the
structure of scenes", Maolong Cui, Gergő Orbán, Pietro Berkes,
József Fiser, Conference Abstract: Annual Meeting of the
Vision Sciences Society, 2009.
Abstract / Poster
We have recently proposed that representations of novel multi-element visual displays learned and stored in visual long-term memory encode the independent chunks of the underlying structure of the scenes (Orban et al. 2008 PNAS). Here we tested the hypothesis that this internal representation guides eye movement as subjects explore such displays in a memory task. We used scenes composed of two triplets of small black shapes randomly selected from an inventory of four triplets and arbitrarily juxtaposed on a grid shown on a 3'x3' screen. In the main part of the experiment, we showed 144 trials with two scenes for 2 sec each with 500 msec blank between them, where the two scenes were identical except for one shape that was missing from the second scene. Subjects had to select from two alternatives the missing shape, and their eye movements were recorded during the encoding phase while they were looking at the first scene. In the second part of the experiment, we established the subject's confusion matrix between the shapes used in the experiment in the given configurations. We analyzed the amount of entropy reduction with each fixation in a given trial based on the individual elements of the display and based on the underlying chunk-structure, and correlated these entropies with the performance of the subject. We found that, on average, the difference in entropy reduction between the first and last 10 trials was significantly increased and correlated with improved performance when entropy was calculated based on chunks, but no such reduction was detected when entropy calculation was based on individual shapes. These findings support the idea that subjects gradually learned about the underlying structure of the scenes and their eye movements were optimized to gain maximal information about the underlying structure with each new fixation.
- "Coarse-to-fine learning in scene perception: Bayes trumps
Hebb", József Fiser, Gergő Orbán, Máté Lengyel, Richard
N. Aslin, Conference Abstract: Annual Meeting of the Vision Sciences
Society, 2009.
Abstract / PosterRecent studies suggest that the coherent structures learned from multi-element visual scenes and represented in human memory can be best captured by Bayesian model comparison rather than by traditional iterative pair-wise associative learning. These two learning mechanisms are polar opposites in how their internal representation emerges. The Bayesian method favors the simplest model until additional evidence is gathered, which often means a global, approximate, low-pass description of the scene. In contrast, pair-wise associative learning, by necessity, first focuses on details defined by conjunctions of elementary features, and only later learns more extended global features. We conducted a visual statistical learning study to test explicitly the process by which humans develop their internal representation. Subjects were exposed to a family of scenes composed of unfamiliar shapes that formed pairs and triplets of elements according to a fixed underlying spatial structure. The scenes were composed hierarchically so that the true underlying pairs and triplets appeared in various arrangements that probabilistically, and falsely, gave rise to more global quadruple structures. Subjects were tested for both true vs. random pairs and false vs. random quadruples at two different points during learning -- after 32 practice trials (short) and after 64 trials (long). After short training, subjects were at chance with pairs (51%, p>0.47) but incorrectly recognized the false quadruples (60%, p<0.05). Showing a classic double dissociation after long training, subjects recognized the true pairs (59%, p<0.05) and were at chance with the quadruples (53%, p>0.6). These results are predicted well by a Bayesian model and impossible to capture with an associative learning scheme. 
Our findings support the idea that humans learn new visual representations by probabilistic inference instead of pair-wise associations, and provide a principled explanation of coarse-to-fine learning. - "Matching spontaneous and evoked activity in V1: a hallmark of
probabilistic inference", Pietro Berkes, Gergő
Orbán, Máté Lengyel,
József Fiser, Frontiers in Systems Neuroscience, Conference
Abstract: Computational and systems neuroscience, 2009. doi:
10.3389/conf.neuro.06.2009.03.314.
Abstract - "Characterizing neural dependencies with copula models", Pietro
Berkes, Frank Wood, Jonathan Pillow, Advances in Neural Information
Processing Systems 21, 2009. D. Koller, D. Schuurmans,
Y. Bengio, and L. Bottou, Eds., 119-136.
Abstract / .PDF / PosterThe coding of information by neural populations depends critically on the statistical dependencies between neuronal responses. However, there is no simple model that can simultaneously account for (1) marginal distributions over single-neuron spike counts that are discrete and non-negative; and (2) joint distributions over the responses of multiple neurons that are often strongly dependent. Here, we show that both marginal and joint properties of neural responses can be captured using copula models. Copulas are joint distributions that allow random variables with arbitrary marginals to be combined while incorporating arbitrary dependencies between them. Different copulas capture different kinds of dependencies, allowing for a richer and more detailed description of dependencies than traditional summary statistics, such as correlation coefficients. We explore a variety of copula models for joint neural response distributions, and derive an efficient maximum likelihood procedure for estimating them. We apply these models to neuronal data collected in macaque pre-motor cortex, and quantify the improvement in coding accuracy afforded by incorporating the dependency structure between pairs of neurons. We find that more than one third of neuron pairs show dependency concentrated in the lower or upper tails of their firing rate distribution.
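The copula idea described in the abstract above can be illustrated with a minimal sketch. It assumes a Gaussian copula and Poisson marginals, which is only one of many possible choices and is not taken from the paper; the function names (`normal_cdf`, `poisson_quantile`, `sample_counts`, `pearson`) and all parameter values are hypothetical. Note that a Gaussian copula has no tail dependence, so it would not reproduce the tail-concentrated dependencies the abstract reports.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def poisson_quantile(u, rate):
    """Smallest count k with P(X <= k) >= u for X ~ Poisson(rate)."""
    u = min(u, 1.0 - 1e-12)  # guard against u == 1.0, which has no finite quantile
    k = 0
    pmf = math.exp(-rate)
    cdf = pmf
    while cdf < u:
        k += 1
        pmf *= rate / k
        cdf += pmf
    return k


def sample_counts(n, rate_a, rate_b, rho, seed=0):
    """Draw n spike-count pairs: Poisson marginals joined by a Gaussian copula.

    rho is the correlation of the latent Gaussian variables (an assumption of
    this sketch), not the correlation of the resulting counts.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map each latent Gaussian to a uniform, then to a Poisson count.
        a = poisson_quantile(normal_cdf(z1), rate_a)
        b = poisson_quantile(normal_cdf(z2), rate_b)
        pairs.append((a, b))
    return pairs


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    pairs = sample_counts(5000, rate_a=5.0, rate_b=2.0, rho=0.8)
    a, b = zip(*pairs)
    print("count correlation:", pearson(a, b))
```

The marginals stay Poisson regardless of `rho`, which is the point of the construction: the dependency structure is chosen separately from the single-neuron count distributions.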
2008
- "Sensitivity of implicit visual rule-learning to the saliency of
the stimuli", Kimberly MacKenzie, József Fiser, Conference
Abstract: Journal of Vision, 2008; 8(6):474, 474a,
http://journalofvision.org/8/6/474/,
doi:10.1167/8.6.474.
Abstract / PosterHuman infants have been shown to implicitly learn rules, such as the repetition of ABB or ABA patterns, regardless of the identity of the participating items, both with sequential information during language development and with simultaneously presented visual patterns. However, in these studies the ABB or ABA patterns were defined by the identity of the items themselves. This leaves open the question of how successful humans are in extracting such rules in more complex situations when the rule is defined by a particular feature dimension of the items rather than by their identity. We examined the performance of adults presented with an implicit rule-learning task where both the color and the size of the items followed some underlying rules. Subjects were first exposed to a series of three different shapes presented simultaneously: five triplet scenes were viewed ten times each in random order during the learning phase. Patterns within each triplet varied in both size and color saturation following two different rules (AAB vs. ABA). The test phase consisted of triplets made of new elements not seen in the learning phase, which varied in size but had identical color saturation. In each trial, subjects saw two triplets, an AAB and an ABA pattern, and judged which triplet seemed more familiar. Surprisingly, adult subjects did not find the pattern of sizes shown during practice more familiar than the alternative, with a size difference of either 100 or 150 percent. These results suggest that successful visual rule-learning requires a much higher saliency of the rule in the given feature dimension than is expected based on the discrimination results. - "Linking implicit chunk learning and the capacity of working
memory", József Fiser, Gergő
Orbán, Máté Lengyel, Conference
Abstract: Journal of Vision, 2008; 8(6):213, 213a, http://journalofvision.org/8/6/213/,
doi:10.1167/8.6.213.
Abstract / PosterClassical studies of the capacity of working memory have posited a fixed limit for the maximum number of items humans can store temporarily in their memory, such as 7±2 or 4±1. More recent results showed that when the stored items are viewed as complex multi-dimensional objects capacity can be increased and conversely, when distinctiveness of these items is minimized capacity is reduced. These results suggest a strong link between working memory and the nature of the representation of information based on the observer's long-term memory. To test this conjecture, we formalized the information content of a set of stimuli by its description length, which relates the "cost", the number of bits assigned to a particular stimulus, to its appearance likelihood given the representation the observer has. This formalism highlights that high-complexity but familiar stimuli need fewer resources to encode and recall correctly than novel stimuli with lower complexity. Using this formalism, we developed a novel two-stage test to investigate the above conjecture. First, participants were trained in an unsupervised visual statistical learning task using multi-element scenes in which they are known to develop implicitly a chunked representation of the scenes. Next, they performed a change detection task using novel scenes that were composed from the same elements either with or without the chunk arrangements of the training session. Change detection results were significantly better with scenes that were composed of elements that retained the chunk arrangement. Thus the capacity of working memory is determined by how easily the stimulus can be mapped onto the internal representation of the observer, and integrated object-based coding is a special case of this mapping. - "Modeling neural dependencies with Poisson copulas", Pietro
Berkes, Frank Wood, Jonathan Pillow, Frontiers in Computational
Neuroscience. Conference Abstract: Bernstein Symposium, 2008. doi:
10.3389/conf.neuro.10.2008.01.031
Abstract / PosterThe coding of information by neural populations depends critically on the statistical dependencies between neuronal responses. At the moment, however, we lack a simple model that can simultaneously account for (1) marginal distributions over single-neuron spike counts that are typically close to Poisson; and (2) joint distributions over the responses of multiple neurons that are often strongly dependent. Here, we show that both marginal and joint properties of neural responses can be captured using Poisson copula models. Copulas are joint distributions that allow random variables with arbitrary marginals to be combined while incorporating arbitrary dependencies between them. Different copulas capture different kinds of dependencies, allowing for a richer and more detailed description of dependencies than traditional summary statistics, such as correlation coefficients. We explore a variety of Poisson copula models for joint neural response distributions, and derive an efficient maximum likelihood procedure for estimating them. We apply these models to neuronal data collected in the macaque pre-motor cortex, and quantify the improvement in coding accuracy afforded by incorporating the dependency structure between pairs of neurons. - "Relating evoked and spontaneous cortical activities in a
generative modeling framework", Gergő
Orbán, Pietro Berkes, Máté
Lengyel, József Fiser, Conference Abstract: Sloan-Swartz
Meeting of Theoretical Neurobiology, Princeton, NJ, USA, 2008.
AbstractRecently we proposed a computational framework in which we assumed that the visual cortex implicitly implements a generative model of the natural visual environment and performs its functions such as recognition and discrimination by inferring the underlying external causes of the visual input. In the present work, we test this framework by relating synthetic and measured neural data to the predictions of the underlying generative model. Two key elements of the proposal are that the firing activity of individual neurons represents samples from the underlying probability density function (pdf) that those cells represent, and that the spontaneous activity of the cortex represents the prior knowledge of the system about the external world. In order to test these ideas, a reliable method was developed to estimate the difference between the pdfs of the spontaneous and visually evoked activities based on a limited number of samples. Our method exploits the full statistical structure of the data to estimate the Kullback-Leibler divergence between pdfs of neural activities recorded under different conditions. First, we tested the method on synthetic data to demonstrate its feasibility, then we applied it to analyze neural recordings from the primary visual cortex of awake behaving ferrets. Our results confirm the predictions of the generative framework and show how this framework can successfully describe the link between spontaneous and visually evoked activity and give a novel interpretation to the response variability of cortical responses. - "Looking for hallmarks of generative models in the visual
cortex", Gergő
Orbán, Pietro Berkes, Máté Lengyel, József Fiser,
Conference Abstract: Computational and systems neuroscience (Cosyne), 2008.
Abstract / .PDF / Poster - "The relationship between local feature distributions and object
recognition", Henry Galperin, Peter Bex, József Fiser, Conference
Abstract: Annual Meeting of the Vision Sciences Society, 2008.
Abstract / PosterWe investigated the structure of image features that support human object recognition using a novel 2-AFC form coherence paradigm. Grayscale images of everyday objects were analyzed with a multi-scale bank of Gabor-wavelet filters whose responses defined the positions, orientations and phases of Gabor patches that were used to reconstruct a facsimile of the original image. Signal Gabors were assigned the parameters of the original image; noise Gabors were assigned random positions, leaving the other parameters, and therefore the overall amplitude spectrum, unchanged. Observers were shown the reconstructed, 100% signal image and were then required to discriminate a target image containing a proportion of signal elements from one containing only noise elements. A staircase determined the proportion of signal elements that were required for correct identification on 75% of trials. We used the statistics of the original image to determine which elements were designated signal and which were designated noise in seven conditions. Signal elements were selected at random or from areas where local orientation variability, density or luminance contrast was either high or low in the original scene. Thresholds were the same for random, orientation variability and density conditions, but were significantly lower for the high contrast and significantly higher for the low contrast conditions. Importantly, the latter result held whether the contrast of the Gabors in the reconstructed scene was fixed at the same value or followed the contrast of the original scene. This means that recognition performance is determined by the feature structure of the original scene that has high contrast and not the high contrast elements of the experimental image. 
These results show that, in general, image identification depends on specific relationships among local features that define natural scenes and not basic statistical measures such as feature density, variability or the contrast values of individual features. - "Coding of Position Information During Object Perception", Henry
Galperin, Peter Bex, József Fiser, Perception 37, ECVP
Abstract Supplement, 2008.
AbstractWe examine how local position information of different complex scenes is represented in the visual system. A 2AFC paradigm was used to examine internal noise and sampling efficiency for three classes of stimuli: natural objects, fractal patterns and random circular patterns, all synthesized from the same set of Gabor wavelets. On each trial, a noiseless source image was presented first for 1 sec, followed by a reference image that contained a fixed amount of external position noise (σ) on each element, and a target image containing additional position noise (σ+Δσ) under the control of a staircase. Subjects identified the image with less noise. Equivalent noise functions fitting the results indicated approximately identical internal noise but sampling efficiency that increased with predictability across image classes. This suggests a flexible position representation that compares the observed structure with prior experience. - "Expectation of reward modulates responses in rat primary visual
cortex", Henry Galperin, Benjamin L. White, József Fiser,
Society of Neuroscience Abstracts (38), 2008.
Abstract / PosterClassical views of information flow in primary visual cortex suggest that orientation information is encoded early in a feedforward architecture and passed to higher levels of cortex for further processing. More recent studies suggest that top-down information can modulate processing of even basic visual attributes. We investigated whether responses in primary visual cortex are modulated by top-down effects evoked by differential rewarding of oriented grating stimuli. Multiunit extracellular recordings were obtained using a microwire electrode array chronically implanted in rat primary visual cortex while grating stimuli were presented under different reward conditions. An awake head-fixed animal viewed alternating +45° and -45° sinusoidal grating stimuli. During a control session, gratings were passively presented with no reward. In three subsequent sessions, one grating (CS+) was paired with a water reward while the other grating (CS-) remained unrewarded. On the third rewarded session, units showed a two-fold increase in firing that plateaued and then returned to baseline during the CS+, while firing rates for the CS- remained relatively constant across sessions. In addition, coherence among units reflected timing of an expected visual stimulus change. These results suggest a more complex model of visual processing where top-down contextual information strongly and continuously influences stimulus-specific bottom-up processes at even the earliest stages of visual processing. - "The relationship between awake and anesthetized neural
responses in the primary visual cortex of the rat", Benjamin
L. White, József Fiser, Society of Neuroscience Abstracts
(38), 2008.
Abstract / PosterMuch of what we know about visual processing in the brain is based on neural data collected in anesthetized animals assuming that the essential aspects of the computations are preserved under such conditions. However, recent findings support an alternative view that visual processing depends upon ongoing activity, which is significantly altered in anesthetized preparations. Therefore, it is critical to assess how well the characteristics of neural responses to various stimuli in the anesthetized animal can predict responses in the awake animal. We collected multi-electrode recordings from the primary visual cortex of adult rats under different levels of anesthesia and while awake. Anesthesia was maintained by isoflurane concentrations between 0.6% and 2.0%, ranging from very lightly anesthetized to deeply anesthetized. Isolated unit and local field potential (LFP) activity were collected from sixteen electrodes. Responses were compared between conditions of darkness (the spontaneous condition), a natural scene movie, and full-field white-black modulation at frequencies of 1 Hz, 2 Hz, 4 Hz, and 8 Hz. There were significant, up to two-fold modulations of measurements of average firing rates, bursting rates, power spectral densities, population sparseness, and coherence between stimulus conditions in awake and anesthetized animals. However, there were strong interactions between the particular stimuli used and the condition of the animal, and due to these interactions responses in the awake condition could not be well predicted by the anesthetized responses. While, in general, coherence decreased with lower concentrations of isoflurane as suggested by previous findings, coherence in the theta band actually peaked at 4 Hz visual stimulus modulation while awake, and coherence in the gamma and alpha bands reached a minimum at 1-2 Hz stimulation while under anesthesia. 
We suggest that anesthesia selectively modulates the neural dynamics in the cortex, and thus the patterns of visually-evoked responses in the awake animal and under anesthesia are not related to each other in a straightforward manner. - "Characterizing internal dynamic states and their emergence in
the primary visual cortex of the awake ferret", Maolong Cui, Alfredo
Fontanini, Donald B. Katz, József Fiser, Society of
Neuroscience Abstracts (38), 2008.
Abstract / PosterAccording to recently emerging views on visual cortical processing, activity in the primary visual cortex is governed by dynamically changing internal states of the system modulated by the incoming information rather than being fully determined by the visual stimulus. We analyzed systematically the dynamical nature of these states and the conditions required for their emergence. Multi-electrode recordings in the primary visual cortex of awake behaving ferrets (N=30) were analyzed after normal and visually deprived development at different ages spanning the range between postnatal day (P) 24 and P170. Visual deprivation was achieved by bilateral lid suture up to the time of the visual tests. Multi-unit recordings were obtained in three different conditions: in the dark, while the animals watched random noise sequences, and while they saw a natural movie. 10-second segments of continuous recordings under these conditions were used to train two alternative state-dependent models, one based on Hidden Markov modeling that assumes internal dynamical dependencies among subsequent internal states and the other based on Independent Component Analysis which does not assume such dependencies. HMM significantly outperformed ICA (p<0.001) for both normal and lid sutured animals. In addition, HMM performance increased with age (p<0.001), more so than ICA did (p<0.001). We also assessed the similarity between different underlying states across different conditions (Movie, Noise and Dark), by computing the Kullback-Leibler distance between the probability distributions of the observed population activity generated by the underlying states. We found that, in general, similarity between underlying states across conditions strongly increased with age for normal animals, but this similarity remained significantly lower than that for lid sutured animals (p<0.0001). 
In addition, the number of transitions in the oldest age group was higher in normal animals compared to lid sutured ones (p<0.001). The results suggest that positing dynamic underlying states that emerge with age and can capture the behavior of cell assemblies is critical in characterizing the neural activity in the primary visual cortex. However, both the behavior and the emergence of these states depend only partially on proper visual input, and are determined to a large extent by internal processes.
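The Kullback-Leibler comparison mentioned in the abstract above can be sketched in a few lines. This is an illustrative assumption, not the authors' procedure: it treats population activity as binarized patterns (one 0/1 entry per recording site), estimates each condition's distribution from pattern counts with an additive pseudocount, and computes the divergence in bits. The names `pattern_distribution` and `kl_divergence`, the pseudocount, and the toy data are all hypothetical.

```python
import math
from collections import Counter
from itertools import product


def pattern_distribution(patterns, n_units, pseudocount=1.0):
    """Estimate a distribution over all binary patterns of n_units sites.

    patterns is a list of tuples of 0/1 values. The pseudocount keeps every
    pattern's probability above zero so the divergence stays finite.
    """
    counts = Counter(patterns)
    states = list(product((0, 1), repeat=n_units))
    total = len(patterns) + pseudocount * len(states)
    return {s: (counts[s] + pseudocount) / total for s in states}


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits for distributions over the same states."""
    return sum(p[s] * math.log2(p[s] / q[s]) for s in p if p[s] > 0)


if __name__ == "__main__":
    # Toy binarized activity of two recording sites in two conditions.
    dark = [(0, 0), (0, 0), (1, 1), (0, 1), (0, 0), (1, 1)]
    movie = [(1, 1), (1, 0), (1, 1), (0, 1), (1, 1), (0, 0)]
    p_dark = pattern_distribution(dark, n_units=2)
    p_movie = pattern_distribution(movie, n_units=2)
    print("D(movie || dark) in bits:", kl_divergence(p_movie, p_dark))
```

The number of possible patterns grows as 2 to the power of the number of sites, so this direct counting approach is only practical for a handful of sites; the divergence is also asymmetric, so the order of the two conditions matters.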
2007
- "Integrating central and peripheral information during object
categorization", Kimberly MacKenzie, Yaihara Fortis-Santiago,
József Fiser, Conference Abstract: Journal of Vision, 2007;
7(9):194, 194a,
http://journalofvision.org/7/9/194/,
doi:10.1167/7.9.194
Abstract / PosterImages presented at fixation provide more information to the visual system than images presented parafoveally. However, it is not clear whether it is more beneficial to receive the larger amount of information first in sequential categorical comparisons. Theories based on activation of mental sets, pure information content, or interference make different predictions on the likely outcomes of such tasks. In our study, subjects made same-different category judgments on a large set of briefly appearing pairs of grayscale images of everyday objects, which were presented on a gray background. Each image extended 5 degrees of visual angle, could appear in either the center (C) or corners (S) of the screen for 12.5, 25, or 50 msec, and was followed by a random mask presented for 25 msec. Pairings of position, timing, and category were fully randomized and balanced across trials, and the ISI between the two images within a trial was kept at 12.5 msec. Subjects were instructed to fixate at the center of the screen, and their eye movements were monitored. There was a significant advantage in conditions where the central image appeared first and the peripheral image second (C-S) compared to the opposite order (S-C) (t(16)=0.02, p < 0.05). However, the relation between stimulus presentation time and categorization performance in the C-S condition was non-monotonic: longer duration was not always paired with better performance. These results rule out pure information-based explanations and suggest that object information received earlier constrains how efficiently information received subsequently is processed in categorization tasks. - "Human Orientation Sensitivity During Object Perception", Henry
Galperin, Peter Bex, József Fiser, Conference Abstract: Annual
Meeting of the Vision Sciences Society, 2007.
Abstract / PosterThe accurate representation of local contour orientation is crucial for object perception, yet little is known about how humans encode this information while viewing complex images. Using a novel image manipulation method, we assessed sensitivity to the local orientation structure of natural images of differing complexity. We found that the visual system involuntarily discounts substantial levels of orientation noise until it exceeds levels that are considerably higher than the smallest orientation change that can be discriminated for a single contour. The much higher threshold and a characteristic dipper function we observe do not fit the classic view of orientation processing, but can be readily explained by a higher-level template-based process that provides an a priori reference for the expected form of objects. - "The effect of anesthesia on neural activity in the primary
visual cortex of the rat", Benjamin L. White, József Fiser,
Society of Neuroscience Abstracts (33), 2007.
Abstract / PosterMuch of what we know about visual processing in the brain is based on neural data collected in anesthetized animals assuming that the essential aspects of the computations are preserved under such conditions. However, recent findings support an alternative view on visual perception that puts a strong emphasis on the role of ongoing activity which is all but eliminated in anesthetized preparations. According to this view, spontaneous activity represents momentary biases, contextual information and internal states of the brain that are essential for interpreting the incoming sensory information. Thus it is critical to understand what aspects of spontaneous activity carry relevant information for perception. To study this question, we collected and analyzed multi-electrode recordings in the primary visual cortex of adult rats under different levels of anesthesia. Anesthesia was induced by isoflurane ranging from 1.0 to 3.0% in increments of 0.5%. Isolated unit and local field potential (LFP) activity was collected from sixteen electrodes. Coherence analysis on LFPs revealed a clear increase from low to high levels of isoflurane anesthesia. Specifically, the mean coherence between electrodes over the LFP frequency range decreased with each increase in isoflurane concentration, from 1.0% to 3.0%. Variation about the mean also increased with higher levels of anesthesia. In addition, peaks in correlation were broader under light levels of anesthesia than under deep levels. Therefore, isoflurane anesthesia seems not only to reduce overall levels of cortical activity, but also to decrease the amount of correlation and coherence in ongoing activity. These results suggest that ongoing activity in the primary visual cortex of the rat has a structure that is appropriate for conveying relevant information for visual processing. 
In contrast to the presently dominant feed-forward view on perceptual processing, using this information requires a rapid dynamic integration of bottom-up and top-down signals in the primary visual cortex. -
"Do we develop visual representations based on pair-wise
statistics of the visual scene?" Gergő
Orbán, József Fiser,
Richard N. Aslin and Máté Lengyel, Conference abstract: Sloan-Swartz
Meeting of Theoretical Neurobiology, UC San Diego, CA, USA, 2007.
AbstractThe dominant view on how humans develop new visual representations is based on the paradigm of iterative associative learning. According to this account, new features are developed based on the strength of the pair-wise correlations between sub-elements, and complex features are learned by recursively associating already obtained features. In addition, Hebbian mechanisms of synaptic plasticity seem to provide a natural neural substrate for associative learning. However, this account has two major shortcomings. First, in associative learning, even the most complex features are extracted solely on the basis of pair-wise correlations between their sub-elements, while it is conceivable that there are features for which higher order statistics are necessary to learn. Second, learning about all pair-wise correlations can already be intractable since the storage requirement for such representations grows exponentially with the number of elements in a scene, and learning progressively higher order statistics only exacerbates this combinatorial explosion. We present the results of a series of experiments that assessed how humans learn about higher-order statistics. We found that learning in an unsupervised visual task is above chance even when pair-wise statistics contain no relevant information. We implemented a formal normative model of learning to group elements into features based on statistical contingencies using Bayesian model comparison, and demonstrate that humans perform close to Bayes-optimal. Although the computational requirements of learning based on model comparison are considerable, they are not incompatible with Hebbian plasticity, and offer a principled solution to the storage-requirement problem by generating optimally economical representations. 
The close fit of the model to human performance in a large set of experiments suggests that humans learn new complex information by generating the simplest sufficient representation based on previous experience and not by encoding the full correlational structure of the input. - "Spontaneous activity in V1: a probabilistic framework" Gergő
Orbán, Máté Lengyel, József Fiser, Conference abstract:
Sloan-Swartz Meeting of Theoretical Neurobiology, UC San Diego, CA,
USA, 2007.
AbstractIn this talk, we focus on two puzzles coming from two lines of research. First, cortical neurons show a high level of spontaneous activity. The role of this metabolically expensive and richly structured ongoing neural signal with strong "stimulus independent" variance is presently unknown. Second, previous theoretical approaches proposed that neural activity in the primary visual cortex can be explained by a formal computational goal: cells in V1 are optimized for providing a sparse but complete and efficient representation of the structure of natural scene stimuli. According to the proposal, this efficient code for statistical estimates of natural scene stimuli would be learned via unsupervised learning using a set of natural image patches as stimuli. However, these codes can give an account for only the mean responses of cells obtained by averaging across multiple presentations of the same stimuli. Therefore, such codes generate correct responses only to a limited number of bar stimuli and they cannot explain any of the rich repertoire of responses to more complex stimuli. Neither can they clarify the within-trial variability observed in cells. Our proposal consists of two parts. First, we suggest that ongoing activity and the variance observed in the responses of cortical neurons to stimuli are not mere noise but contribute to the more faithful representation of the stimulus. Second, we propose that neural activity encodes not just the most probable single interpretation of the stimulus but also its uncertainty in the form of a probability distribution over possible interpretations. We explored the idea that activity in V1 reflects sampling of the "recognition distribution", the probability distribution of possible hypotheses that are congruent with both the present and past inputs to the system. 
We also used this sampled approximation to the true recognition distribution in a variant of the expectation-maximization algorithm in an unsupervised learning scheme to adapt the synaptic weights between cells so that they form the efficient code postulated by earlier studies. This learning scheme reproduced the linear filter properties of simple cells, just like the previous studies did. However, our results can also account for several properties of V1 receptive fields such as non-classical behaviors of receptive fields without the need for extra lateral connections or divisive gain control mechanisms. - "V1 activity as optimal Bayesian inference",
Gergő Orbán,
József Fiser, Máté Lengyel, Conference Abstract:
Computational and systems neuroscience (Cosyne), Salt Lake City, UT, 2007.
Abstract / .PDF
2006
- "Mapping different states in neural activity in the primary
visual cortex of the awake ferret", József Fiser, Mark
Bourjaily, Chiayu Chiu, Michael Weliky, Society of Neuroscience
Abstracts (32), Atlanta, GA, 2006.
Abstract: There is a discrepancy between the generally accepted role of ongoing activity during visual development, where spontaneous firing is viewed as an important guiding activity indispensable for proper emergence of the visual structure, and during visual perception, where spontaneous neural activity is considered to be unwanted noise. This discrepancy stems from the presently dominant view, which posits that visual information is analyzed in a feedforward signal-processing manner where ongoing activity is accidental and can be neglected. To study this discrepancy, we analyzed multi-electrode recordings in the primary visual cortex of awake behaving ferrets (N=20) at postnatal day (P) 24-26, P44-45, P71-90 and P131-168. Multi-unit recordings were obtained in three different conditions: in the dark, when the animals watched random noise sequences, and when they saw a natural movie. At all ages there was a significant spatio-temporal structure in the observed neural activity, and this structure showed a distinctively evolving pattern across ages. The high spatial correlations across different recording sites during the dark condition ruled out the possibility of averaging out the "noise" correlations and thus called into question the validity of feedforward signal-processing models. An alternative model is based on a generative Bayesian framework in which ongoing activity represents momentary perceptual biases of the brain based on previously obtained information and internal states. To test the validity of this framework, the same data were analyzed using a Hidden Markov Model (HMM). We found clearly distinct internal states in all conditions, defined by approximately stationary firing rates and abrupt transitions between states. The identified HMMs were specific to particular conditions, correctly classifying neural activity not used for training about 90% of the time.
These findings suggest that even in the primary visual cortex neural processing can be best described as a rapid dynamic transition between a large number of states, where the external input modulates the intrinsic dynamics by selectively boosting particular states.
- "Analysis of spontaneous and sensory-driven activity in ferret
V1", J. Zhao, G. Szirtes, M. Eisele, J. Fiser, C. Chiu, M. Weliky,
K.D. Miller, Society of Neuroscience Abstracts (32), Atlanta, GA,
2006.
Abstract: We analyze multiunit recordings from linear arrays of 16 electrodes spanning 3 or 9 mm in awake ferret V1, as in Fiser et al., Nature 431:573 (2004). Recordings were made at ages ranging from 29 to 168 days postnatal. Fiser et al. 2004 found that activity from P30 to P90 was dominated by similar activity patterns whether in the dark or when stimulated by white noise or a natural movie. They showed that temporal correlations on a single electrode were long at early ages but became progressively shorter, while spatial correlations at a single time were short-range at early ages but became long-range at later ages. Correspondingly, activity patterns became dominated by bursts spanning all electrodes. We compute the principal components of simultaneous activity across the electrodes. At later ages, most of the variance is in the first component, which is uniform across electrodes (each electrode deviates by the same number of standard deviations from its mean activity). This component's autocorrelation shows some tendency to oscillate, with a bump of power in the range 10-17 Hz. This temporal structure is quite similar for dark and movie stimuli. However, for noise stimuli, particularly at ages >= P120, a very long-lasting oscillatory autocorrelation at 11-12 Hz is seen. This may represent alpha activity, which has been argued to represent an "idle" or "disengaged" state, suggesting that the awake animal may disengage from the noise stimulus. More generally, this dominant first component seems likely to represent a global state rather than specific visual input. After subtracting the first principal component, the remaining activity shows correlations that are much more localized in space and time. Power in the remaining activity seems to fall off as a power of spatial frequency, suggesting that it might have no characteristic spatial scale.
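The kind of analysis described in the abstract above (standardize each electrode, find the principal components of simultaneous activity, then subtract the dominant one) can be illustrated with a minimal sketch. This is not the authors' code: the function name `first_component_analysis`, the synthetic data, and the mixing weight of 3.0 are assumptions made purely for illustration, and no real recordings are involved.

```python
import numpy as np


def first_component_analysis(activity):
    """Hypothetical helper: PCA of z-scored multi-electrode activity.

    activity: array of shape (n_samples, n_electrodes).
    Returns the explained-variance fractions, the first component's
    loadings, and the activity left after removing that component.
    """
    # Express each electrode in standard deviations from its own mean.
    z = (activity - activity.mean(axis=0)) / activity.std(axis=0)

    # Principal components = eigenvectors of the across-electrode covariance.
    eigvals, eigvecs = np.linalg.eigh(np.cov(z, rowvar=False))
    order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()

    # Fix the arbitrary sign so that loadings are positive on average.
    pc1 = eigvecs[:, 0]
    if pc1.sum() < 0:
        pc1 = -pc1

    # Project out the first component to obtain the remaining activity.
    scores = z @ pc1
    residual = z - np.outer(scores, pc1)
    return explained, pc1, residual


# Synthetic stand-in data (an assumption, not ferret recordings):
# one shared "global state" signal plus independent noise per electrode.
rng = np.random.default_rng(0)
n_samples, n_electrodes = 5000, 16
global_state = rng.standard_normal((n_samples, 1))
local_noise = rng.standard_normal((n_samples, n_electrodes))
activity = 3.0 * global_state + local_noise

explained, pc1, residual = first_component_analysis(activity)
print("variance fraction in first component:", explained[0])
print("first-component loadings:", np.round(pc1, 2))
```

On data built this way, the first component carries most of the variance and has nearly equal loadings on all 16 electrodes, which is the signature the abstract describes as "uniform across electrodes". The residual is what a subsequent analysis of more localized correlations would start from.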
- "Distinct states of firing patterns in the primary visual cortex of awake ferrets", József Fiser, Mark Bourjaily, Chiayu Chiu, Michael Weliky, Conference abstract: Sloan-Swartz Meeting of Theoretical Neurobiology, Columbia University, USA, 2006.