DD303 Recognition

Question	Answer
Difference between perception & recognition?	>Perception: what you see - shapes, colours etc (enough for you to navigate around the environment) >Recognition: what you see things as - dogs, faces etc (top-down, involves knowledge)
Types of recognition - object & face	>Object: between-category; "what" - eg apple >Face: within-category; "who" - eg Freud >Typically, separate research for each
General recognition steps	>Basic sensory description (eg Marr's 2 1/2D) >Turned into 3D description/representation >Matches what's been seen before (IRRESPECTIVE of any angle)
Recognising 2D objects - different cognitive processes than using 3D?	>Template matching: sensed image compared w/ range of templates until match found - unlikely as very generic templates or large # templates req'd >Feature recognition: key features extracted & compared w/ internal representation until match found - more generic; issue with ambiguity eg lines & curves >Structural description: structural description & key features w/ how organised in relation to each other compared w/ internal representation until match found - good re variety & ambiguity; good for 3D version of 2D object
Marr's 2 1/2D Sketch - importance for recognition?	>Final stage of Marr's model of early visual processing - integration of outputs from various modules to form 2 1/2D sketch >Viewpoint specific >Enables interaction w/ environment (according to Marr) >STARTING point for models of object recognition - Marr & Biederman
Problem with trying to recognise 3D object?	>Retinal image is 2D & objects look very different depending on viewpoint - primal sketch changes with viewpoint >Either we need to store: 3D representation of an object (viewpoint independent/invariant) OR many 2D representations (viewpoint dependent) >Marr & Biederman: agree 3D representation generated BUT disagree on process
Marr (& Nishihara) Process?	>Object made of components = generalised cones 1) derive object's shape from 2 1/2D sketch. Assumptions made re points on the object which produce the silhouette (Marr: contour generator) - eg each point on CG = different point on object 2) id the major axis/axes. (evidence: Warrington & Taylor '78; Humphreys & Riddoch '84) Areas of concavity link component cones/primitives. Shape of object described in relation to axis/axes 3) Compare 3D structural description w/ mental catalogue to find match >Doesn't depend on viewing angle as description & model entries are 3D
Evaluation of Marr (& Nishihara)	For: >location of central axis key to recognition >Lawson & Humphreys '96 - rotation of line drawings doesn't affect recognition unless major axis tilted toward observer; result of major axis appearing foreshortened? >Warrington & Taylor '78; Humphreys & Riddoch '84 - Ps w/ right hemisphere damage recognise typical but not atypical view of objects; similar for photos of objects Against: >Works well for animals, not good for furniture, fruit etc >Within-category discrimination - hard to explain, as conversion to generalised cones maps all exemplars of a category to the same representation >Suggests we can't tell difference between 1 instance of a thing and another (eg all Westies are the same?)
Biederman process?	>Same as M(&N) >Complex objects represented as hierarchies of simpler shapes >Concavity used to sub-divide objects >Recognition via comparing 3D representation w/ previous exemplars >Different to M(&N) >Geons not generalised cones >5 invariant non-accidental properties - no axis req'd >Clues from lines/outlines >Matching geons assembled into 3D representation
Evaluation of Biederman	For: >Explains why objects harder to recognise when parts of image w/ greater concavity removed (Biederman) >Biederman & Gerhardstein - object priming only works if viewpoints <135 degrees apart; performance declines if geons hidden between views Against: >Bülthoff & Edelman - Ps unable to recognise objects from novel POV >Tarr - Recognition may not rely on forming object-centred model; some factors viewpoint dependent (eg faces?) >Within-category discrimination hard per M(&N) ie allows id of dog or cat but not 'our' dog or cat >Bottom-up focus = de-emphasis on importance of top-down processes >Theory states objects consist of invariant geons - but object recognition more flexible w/ some objects not having identifiable geons
3 stages of object recognition?	1) Structural - id object as familiar 2) Semantic - access semantic knowledge about object 3) Naming - access object's name >Evidence from CNP studies - patients with breaks @ different stages
Similarities of Marr (&Nishihara) and Biederman	>Similarity in approach >Info processing approach >Marr is more broad (covering all stages) - Biederman developed later stages of Marr's theory >Both assume shape is critical w/ starting point being outline >Both see representations being built from component parts - not holistic >Both deal with invariance via generation of 3D representation - ie not storing lots of 2D views >Both assume concavity important >Both assume recognition via matching 3D representation to representations in LTM
Differences between Marr (&Nishihara) and Biederman	>How cues used to generate 3D representation - contour generator vs non-accidental properties >basic components (primitives) of 3D representation - generalised cones vs geons >overall 3D representation - Marr = id of central axes w/ the representation being a description of components relative to central axis; Biederman = geons & how they fit together
Viewpoint-Dependent Challenge - Tarr & Bülthoff	>Viewpoint invariant models can't explain all object recognition evidence >Tarr & Bülthoff '95 - viewpoint-dependent representations req'd where object representations involve collections of views from specific viewpoints >Tarr - RTs & errors for naming familiar objects from unfamiliar viewpoint increase systematically w/ increased rotation distance from nearest familiar viewpoint >T&B - response to Biederman & Gerhardstein arguing evidence for viewpoint-dependence is strong >Also - demos of invariance only work due to small exemplar set w/ few key features OR use common everyday objects where previous experience comes into play >T&B problems with Biederman: Geon structural descriptions (GSD) not sensitive enough - can't distinguish cow & horse >GSDs can't distinguish @ subordinate level - eg makes of car; faces
Reasons for perceiving faces other than recognition?	>Expression analysis: determine people's emotions >CNP - some lose ability to recognise facial expressions; BUT not same patients who can't recognise faces - suggests different systems >Lip reading: McGurk Effect >Some patients don't show it & can't lip read - suggests different system
Face recognition - 3 stages?	1) Structural - id face as familiar 2) Semantic - access semantic knowledge about person 3) Naming - access person's name >Evidence - Diary study (Young) >CNP - patients w/ problems @ different stages >Experiments - naming slower than category judgements which are slower than familiarity judgements
Familiar vs Unfamiliar faces?	>Good pictorial memory - Shepherd '67 - Ps 98.5% correct in forced-choice recognition of 600 pictures >Unfamiliar faces ≠ good >Burton & Bruce '99 - Watch video, then try to id 10 still pics. Mis-id in 20% of cases >Kemp et al '97 - cashiers poor @ matching faces to photographs
Problems with unfamiliar faces?	>If shown photo of someone we've just seen but from different angle, lighting etc then difficult to id >When only viewed once, a face will be represented pictorially rather than structurally - SO recognition dominated by external features & viewing conditions
Bruce & Young model?	>Bruce & Young 1986 >Conceptual model (not computer) >Separate sub-systems for: facial speech analysis; expression analysis; directed visual processing >When face well known, it's encoded structurally. Recognition less dependent on viewing conditions
Burton et al IAC model	>Based on Bruce & Young 1986 >Connectionist model - interactive activation & competition network >Info flows top down >All about node activation - have to be active enough. If not then issues such as prosopagnosia >FRU - face recognition unit >PIN - person identity node - info about them >SIU - semantic information unit - types of info eg names of occupation >WRU - word recognition unit (like FRU for words) >NRU - name recognition unit - linked to PIN >3 pathways >Face - FRU> PIN> SIU> Lexical output >Name - WRU> NRU> PIN> SIU> Lexical output >Other info - WRU> SIU> Lexical output
Covert face recognition	>Recognition normally judged by what someone tells us - ie conscious awareness >Recognition w/o awareness? >Prosopagnosics can't consciously recognise faces - but they show different physiological responses to pictures of people they know - implicit/covert recognition >Modelled in Burton et al model
3 questions re whether face recognition separate from object recognition...	1) Special area of brain involved? 2) Ability learned or innate? 3) Do specific features help face recognition or do we recognise "whole face"?
Are faces special Q1 Special area of the brain?	>Prosopagnosia = double dissociation; suggests different neural pathways >Capgras delusion - separate for objects & faces >Monkey temporal lobes - cells respond differently to monkeys & humans >fMRI - FFA activation
Are faces special Q2 Learned or innate?	>Johnson & Morton ('91) - >newborns notice faces more than other stimuli >So, look @ faces a lot >So, learn more about expressions etc. >Different skill to other recognition - ie generic features @ 1 level (2 eyes, nose etc) >Yin ('69) - better memory for faces than other objects; inverting faces has more impact on recognition than inverting object - BUT different types of categorisation involved! >Diamond & Carey ('86) - dog experts disproportionately affected by inversion (care - results not replicated!) >Configural/holistic processing used by experts? We're all experts @ face recognition >Thompson ('80) - Thatcher effect
Are faces special Q3 use of specific features or whole face?	>Yin ('69) - inverted faces delay suggests processing upright faces holistically >Farah ('93) - features better recognised in context of the whole face than when presented alone >Bruce ('94) - moving features up/down impacts recognition >All suggest holistic/configural processing
Bruce & Young model (1986) - 8 components?	1) Structural encoding - produces various representations of faces 2) Expression analysis - other people's emotional states inferred from facial expression 3) Facial speech analysis - speech perception aided by watching lip movements 4) Directed visual processing - specific facial info processed selectively 5) Face recognition units (FRUs) - contain structural info about known faces 6) Person identity nodes (PINs) - provide info about these individuals (eg occupation) 7) Name generation - person's name stored separately 8) Cognitive system - contains additional info (eg most actors/actresses have attractive faces); influences which other components receive attention
Predictions from Bruce & Young 1986 model?	1) Major differences processing familiar & unfamiliar faces - unfamiliar requires more processing 2) Separate processing routes for processing facial id & processing facial expression 3) When looking @ familiar face, familiarity info from the FRU accessed 1st, then info about that person (eg occupation) from PIN, then person's name from name generation component Thus - familiarity decisions faster than decisions based on PINs, in turn faster than decisions re person's name
Limitations of Bruce & Young (1986) model?	1) Omits 1st stage of processing - ie when observers detect that they're looking at a face (Duchaine & Nakayama '06) 2) Assumption re facial id & facial expression involving separate routes too extreme? (Calder & Young '05). Majority of prosopagnosics have problems processing expression as well as id - 2 processing routes probably only partially separate 3) Assumption that processing names is always after processing other personal info is too rigid (Bredart et al '05). More flexibility req'd - provided in various models (eg Burton, Bruce & Hancock '99)
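
Worked sketch (not from the course material): the face route on the Burton et al IAC card (FRU > PIN > SIU) and the "active enough" idea can be illustrated with a toy Python script. Everything in it is an assumption made for illustration: the node names, the link tables, the 0.5 threshold, the inhibition value and the simple one-pass update. It is not the published model, which uses a full interactive network.

```python
# Toy sketch of the face route (FRU > PIN > SIU).
# All names and numbers below are illustrative assumptions.

THRESHOLD = 0.5  # assumed activation a PIN needs for overt recognition

# Hypothetical links: each FRU excites one PIN, each PIN excites its SIUs.
FRU_TO_PIN = {"fru_freud": "pin_freud", "fru_marr": "pin_marr"}
PIN_TO_SIU = {"pin_freud": ["psychoanalyst"], "pin_marr": ["vision scientist"]}


def compete(pool, inhibition=0.5):
    """Within-pool competition: each node is inhibited by the others' activation."""
    total = sum(pool.values())
    return {n: max(0.0, a - inhibition * (total - a)) for n, a in pool.items()}


def face_route(face_input, link_strength=1.0):
    """Spread activation FRU > PIN > SIU; return PIN and SIU activations."""
    frus = compete(face_input)
    pins = compete({FRU_TO_PIN[f]: a * link_strength for f, a in frus.items()})
    sius = {}
    for pin, act in pins.items():
        for siu in PIN_TO_SIU[pin]:
            sius[siu] = act
    return pins, sius


def recognised(pins):
    """PINs whose activation reaches the assumed threshold."""
    return [p for p, a in pins.items() if a >= THRESHOLD]


seen_face = {"fru_freud": 0.9, "fru_marr": 0.1}

# Intact links: the matching PIN wins the competition and crosses threshold.
intact_pins, intact_sius = face_route(seen_face)

# Weakened FRU > PIN links: the PIN stays below threshold (no overt
# recognition) but is still partly active, a rough analogue of covert recognition.
weak_pins, weak_sius = face_route(seen_face, link_strength=0.3)
```

With intact links, recognised(intact_pins) contains only "pin_freud". With the weakened links, recognised(weak_pins) is empty even though weak_pins["pin_freud"] is still above zero, which is one simple way to picture recognition without awareness in prosopagnosia.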

	Created by Ken Adams over 10 years ago