My research focuses on how structured visual information is acquired and converted into sophisticated internal representations that control behavior. We use an integrated approach with three main components: human visual and learning experiments, computational modeling of learning, and multi-electrode recordings from behaving animals. The recurring theme of our work is the pursuit of a statistically grounded and biologically sound framework that links low-level visual mechanisms (e.g., adaptation) with the development and learning of higher-level complex features and constancies for efficient visual representations of objects and scenes.
During development, humans and animals learn to understand their visual environment from their sensory experience. Despite decades of research, it is still not clear what representations the brain uses in this process or how it acquires them. We follow a systematic research program to clarify these issues. We have conducted a series of adult and infant experiments showing that, from a very early age, humans possess a fundamental ability to extract the statistical regularities of unknown visual scenes automatically, in both time and space. We argue that this basic ability is key to the formation of visual representations at every level, from the simplest luminance changes to conscious memory traces, rules, and abstract knowledge. We are currently investigating how this learning ability interacts with various perceptual constraints, such as those imposed by eye movements, clutter, and occlusion; with presumably more hardwired constraints such as Gestalt rules; and with the consolidating effect of sleep. Using advanced low-level psychophysical tools, we also investigate which visual features humans use for object recognition.
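The kind of regularity extraction described above can be illustrated with a toy sketch (not any model from our papers): in temporal statistical-learning experiments, a stream of shapes is built from fixed "triplets", and learners pick up that transitions within a triplet are far more predictable than transitions between triplets. The function name below is purely illustrative.

```python
from collections import Counter

def transition_probabilities(sequence):
    """Estimate P(next | current) from a sequence of items."""
    pair_counts = Counter(zip(sequence, sequence[1:]))
    first_counts = Counter(sequence[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# A stream built from two shape "triplets" (ABC and DEF):
stream = list("ABCDEFABCABCDEF")
probs = transition_probabilities(stream)
# Within-triplet transitions are fully predictable (e.g., A -> B has
# probability 1.0), while at triplet boundaries the prediction splits
# (here C -> D is 2/3 and C -> A is 1/3).
```

An observer tracking only these pairwise statistics can already segment the stream into its underlying chunks, which is the simplest version of the spatial and temporal learning effects measured in the experiments.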
Our computational modeling work interprets our experimental data in a Bayesian framework. We have demonstrated that probabilistic models, specifically learning based on generative model selection, capture the human behavior observed in our experiments better than simple associative learning can. This suggests that humans interpret their sensory input through an "unconscious inference" process that faithfully tracks the statistical structure of the environment while aiming at the simplest possible internal description of the input. We have shown that this framework provides a statistically grounded interpretation of empirical Gestalt rules, decision making, attention, and chunking, and offers a tightly coupled account of visual recognition and visual learning.
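The contrast between generative model selection and mere association can be sketched with a minimal, hypothetical example (a simplification, not the actual models used in our work): given scenes in which two shapes almost always appear together, we compare the Bayesian evidence for a model that treats the shapes as independent against a model that posits a single latent "chunk". The `eps` noise parameter and the scene counts are assumptions for illustration.

```python
from math import comb

def beta_bernoulli_evidence(k, n):
    """Marginal likelihood of k successes in n Bernoulli trials
    under a uniform Beta(1, 1) prior on the success probability."""
    return 1.0 / ((n + 1) * comb(n, k))

# Each scene records whether shape A and shape B are present.
scenes = [(1, 1)] * 9 + [(0, 0)] * 6 + [(1, 0)]  # A and B almost always co-occur
n = len(scenes)

# Independent model: A and B each get their own appearance probability.
ev_indep = (beta_bernoulli_evidence(sum(a for a, _ in scenes), n)
            * beta_bernoulli_evidence(sum(b for _, b in scenes), n))

# Chunk model: a latent "AB chunk" is present or absent in each scene;
# scenes where only one shape appears are unexplained noise (probability eps).
eps = 1e-3
matches = sum(1 for a, b in scenes if a == b)
k_both = sum(1 for a, b in scenes if a == 1 and b == 1)
ev_chunk = beta_bernoulli_evidence(k_both, matches) * eps ** (n - matches)

# Despite its noise penalty, the chunk model has higher evidence here,
# because it gives a simpler description of the co-occurrence structure.
```

The point of the sketch is the trade-off itself: the chunk model wins not by fitting the data more tightly but by describing it more economically, which is the sense in which the inferred representation "aims at the simplest possible internal description."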
The Bayesian framework raises the question of how such a scheme could feasibly be implemented in the brain, as it requires continuous reciprocal interaction between groups of elements at different levels of the hierarchical representation encoded in the cortex. This dynamic collective coding contrasts with the traditional feedforward view of how visual information is processed in the cortex. We have shown that, both in the primary visual cortex and in higher areas, the representation of visual information is best described as the activity pattern of cell assemblies rather than as a set of individual feature detectors. We have also shown that the precise developmental pattern and the correlational structure of cell responses in the primary visual cortex call into question the notion that ongoing cortical activity is accidental noise unrelated to visual coding. Instead, we proposed the “Sampling Hypothesis”: the membrane potential of ongoing activity represents samples from the computed posterior distribution, the result of the probabilistic computation carried out by cell assemblies as they combine incoming information with relevant internal knowledge of the world for perception and learning. We are currently evaluating this proposition by matching the predictions of the framework to empirically measured cell responses, and by extending the evidence for the Sampling Hypothesis from the primary visual cortex to higher visual areas, other modalities, and prefrontal areas. This framework and hypothesis support Hebb's original notion that internal dynamical states are crucial for integrating cognitive processes beyond simple stimulus-response associations, and they can potentially close the gap between direct response functions and complex behavior.
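The core computational claim of the Sampling Hypothesis, that neural activity can represent a posterior distribution through the samples it visits over time, can be sketched with a generic sampler on a one-dimensional toy problem. This is a minimal illustration assuming Gaussian prior (internal model) and likelihood (stimulus evidence); the sampler, parameters, and variable names are not taken from our work.

```python
import math
import random

def metropolis_samples(log_post, n_samples, x0=0.0, step=1.0, burn=500):
    """Random-walk Metropolis sampler for a 1-D log-density."""
    x, out = x0, []
    for i in range(burn + n_samples):
        prop = x + random.gauss(0.0, step)
        if math.log(random.random()) < log_post(prop) - log_post(x):
            x = prop
        if i >= burn:
            out.append(x)
    return out

# Gaussian prior over a feature value (the "internal model") and a
# Gaussian likelihood for a noisy observation y (the stimulus).
mu0, s0 = 0.0, 2.0
y, sy = 1.5, 1.0

def log_posterior(x):
    return -0.5 * ((x - mu0) / s0) ** 2 - 0.5 * ((x - y) / sy) ** 2

random.seed(0)
samples = metropolis_samples(log_posterior, 5000)
mean_est = sum(samples) / len(samples)

# For conjugate Gaussians the posterior mean is available in closed form,
# so we can check that the samples indeed represent the posterior:
post_mean = (mu0 / s0**2 + y / sy**2) / (1 / s0**2 + 1 / sy**2)
```

The analogy is that the trajectory of activity, rather than any single activity value, carries the probabilistic information; with no stimulus (an uninformative likelihood), the same dynamics would sample the prior, which parallels the observed match between spontaneous and stimulus-evoked activity statistics.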
Roser, M. E., Fiser, J., Aslin, R. N., & Gazzaniga, M. S. (2011). Right hemisphere dominance in visual statistical learning. Journal of Cognitive Neuroscience, 23, 1088-1099.
Berkes, P., Orbán, G., Lengyel, M., & Fiser, J. (2011). Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science, 331, 83-87.
Fiser, J., Berkes, P., Orbán, G., & Lengyel, M. (2010). Statistically optimal perception and learning: from behavior to neural representations. Trends in Cognitive Sciences, 14, 119-130.
Fiser, J. (2009). The other kind of perceptual learning. Learning and Perception, 1, 69-87.
Fiser, J. (2009). Perceptual learning and representational learning in humans and animals. Learning & Behavior, 37, 141-153.
Orbán, G., Fiser, J., Aslin, R. N., & Lengyel, M. (2008). Bayesian learning of visual chunks by human observers. Proceedings of the National Academy of Sciences, 105, 2745-2750.
Fiser, J., Scholl, B. J., & Aslin, R. N. (2007). Perceived object trajectories during occlusion constrain visual statistical learning. Psychonomic Bulletin & Review, 14, 173-178.
Fiser, J., & Aslin, R. N. (2005). Encoding multi-element scenes: Statistical learning of visual feature hierarchies. Journal of Experimental Psychology: General, 134.
Aslin, R. N., & Fiser, J. (2005). Methodological challenges for understanding cognitive development in infants. Trends in Cognitive Sciences, 9, 92-98.
Fiser, J., Chiu, C., & Weliky, M. (2004). Small modulation of ongoing cortical dynamics by sensory input during natural vision. Nature, 431, 573-578.
Fiser, J., Bex, P. J., & Makous, W. L. (2003). Contrast conservation in human vision. Vision Research, 43, 2637-2648.
Weliky, M., Fiser, J., Hunt, H. R., & Wagner, D. N. (2003). Coding of natural scenes in primary visual cortex. Neuron, 37, 703-718.
Fiser, J., & Aslin, R. N. (2002). Statistical learning of new visual feature combinations by infants. Proceedings of the National Academy of Sciences, 99, 15822-15826.
Fiser, J., & Aslin, R. N. (2002). Statistical learning of higher-order temporal structure from visual shape sequences. Journal of Experimental Psychology: Learning, Memory, & Cognition, 28, 458-467.
Fiser, J., & Aslin, R. N. (2001). Unsupervised statistical learning of higher-order spatial structures from visual scenes. Psychological Science, 12, 499-504.
Fiser, J., Subramaniam, S., & Biederman, I. (2001). Size tuning in the absence of spatial frequency tuning in object recognition. Vision Research, 41, 1931-1950.
Atkins, J., Fiser, J., & Jacobs, R. A. (2001). Experience-dependent visual cue integration based on consistencies between visual and haptic percepts. Vision Research, 41, 449-461.
Fiser, J., & Biederman, I. (2001). Invariance of long-term visual priming to scale, reflection, translation and hemisphere. Vision Research, 41, 221-234.
Mel, B. W., & Fiser, J. (2000). Minimizing binding errors using learned conjunctive features. Neural Computation, 12, 731-762.
Dobbins, A. C., Jeo, R. M., Fiser, J., & Allman, J. M. (1998). Distance modulation of neural activity in the visual cortex. Science, 281, 552-555.
Last review: August 18, 2011