Using Evoked Magnetoencephalographic Responses for the Cognitive Neuroscience of Language
Alec Marantz
MIT, KIT/MIT MEG Joint Research Lab, Department of Linguistics and Philosophy

From Cog Sci to Cog Neurosci
Cognitive Science, including Linguistics, has used behavioral data to develop computational theories of language representation and use.
These theories play out along the dimensions of time (sequential processing stages), space (separation of processing functions), and complexity (difficulty of processing).

Cognitive Neuroscience of Language
Cognitive Science moves to Cognitive Neuroscience when the temporal, spatial, and complexity dimensions of cognitive theories are mapped onto the time course, localization, and intensity of brain activity.
However, because of the lack of temporal information, the development of Neurolinguistics with fMRI and PET techniques has tended to flatten theories of the Cognitive Neuroscience of Language.

Cognitive Science: Taft & Forster 1977 (traditional articulated Cog Sci)
Affix stripping, followed by recombination of stem and affix.
Sample prediction from model:
-semble is a stem, since assemble, resemble, dissemble are words.
-sassin (assassin) is not a stem, since only assassin is a word.
It should take longer to reject semble as a non-word than sassin, since semble is a lexical item (semble requires looping from box 4 through box 5 in the model before reaching box 7, while sassin pushes directly from box 4 to box 7, "No").

Taft 2004: further behavioral support for articulated model of processing stages
More contemporary instantiation of model; makes predictions about RTs based, e.g., on a theory of the experimental task.

Flattened computational model: Gonnerman & Plaut (2000)
Masked priming experiment compares responses to
Semantic: sofa-COUCH
Morphological: hunter-HUNT
Orthographic: passive-PASS
Unrelated: award-MUNCH
Claim: failure to find a special location for the morphological condition (using fMRI) supports a flat model in which morphology is an emergent property of semantic and phonological/orthographic relatedness.
fMRI experiment consistent with flattened computational model. Temporal/sequential processing not at issue.
But the masked priming experimental design is confounded with respect to predictions from a Taft-style model with affix stripping, since the orthographic items consist of possible stems and strippable affixes (e.g., tenable/ten, passive/pass).

Articulated vs. Flattened Model
Taft's articulated affix-stripping model predicts that tenable and bendable should be processed in the same places (in the model/brain) and in the same temporal sequence (affix stripping followed by stem activation followed by recombination), with differences in complexity (measured, e.g., by level of brain activity or latency of brain events).
Thus the cognitive science model predicts the fMRI results and makes further predictions testable with techniques that allow exploration of the latency of brain responses.

MEG allows cognitive neuroscience to fully embrace cognitive science
MEG records the magnetic fields generated by electrical activity in the brain, millisecond by millisecond.
MEG has the spatial resolution, the temporal resolution, and the sensitivity necessary to test predictions from cognitive science along the space, time, and complexity dimensions.

Plot
Examples of MEG experiments exploiting the temporal, spatial, and intensity resolution of the technique
A return to Taft's stages
The future: even closer ties between experimental designs in cognitive science and cognitive neuroscience

KIT/MIT MEG Lab
Magnetoencephalography (MEG) = study of the brain's magnetic fields
http://www.ctf.com/Pages/page33.html
Liina Pylkkänen, Aug 03, Tateshina

Magnetoencephalography (MEG)
Distribution of magnetic field at 93 ms (auditory M100)
Averaged epoch of activity in all sensors, overlapping waveforms, one line per sensor
[Figure labels: Outgoing, Ingoing]

MEG exemplified
Parametric variation in letter string length and in added visual noise
Categorical symbol vs. letter manipulation
M100 response varies in intensity with visual noise; M170 response varies in intensity with string length.
Note separation in space and temporal sequence (M100 vs. M170), consistent with sequential processing model.
M100 response
M170 response
Intensity of M170 response to letters as compared to symbols confirms function of processing at M170 time & location (visual word form or letter string area).
Reaction time to read words predicted by combination of M170 amplitude and latency.

Latency coding?
Response latency correlates with stimulus properties.
Auditory M100 (from auditory cortex): frequency of tone predicts latency of M100 peak.

Temporal coding?
Shape of response over time at M100 latency and source location correlates with phonetic category of stimulus.
Voiced (b, d) vs. voiceless (p, t) consonant auditory evoked response
Different ways of measuring the shape of the M100 response to voiced vs. voiceless consonants yield good computational experts that can classify data from a single response as either a pa/ta or a ba/da with significantly greater than chance accuracy.

Sequential processing of words
What happens in the brain when we read words?
Letter string processing
Lexical activation
(Tarkiainen et al. 1999)
(Pylkkänen and Marantz, Trends in Cognitive Sciences)
Note left lateralization of responses in standard M350.
Latency of M350 is sensitive to lexical factors such as lexical frequency and repetition.
[Figures: Frequency; Repetition. Behavioral data: reaction time by frequency category (frequent to infrequent). Latency of M350 component by frequency category (frequent to infrequent). Categories (n/million): 1: 700, 2: 140, 3: 30, 4: 6, 5: 1, 6: .2]
(Embick, Hackl, Schaeffer, Kelepir, Marantz, Cognitive Brain Research, 2001)
(Pylkkänen, Stringfellow, Flagg, Marantz, Biomag2000 Proceedings, 2000)

M350 is (in time and place) the locus of lexical activation; lexical decision, modulated by competition among activated items, occurs later and elsewhere.

Vitevitch and Luce (1998), stages of word processing
Phonotactic probability (sub-lexical frequency of bits of words) affects lexical activation, with frequency being facilitatory.
Phonological neighborhood density affects lexical decision (after activation), with density being inhibitory.
Phonotactic probability and neighborhood density are usually highly correlated, so the same items that facilitate activation inhibit decision.
So, words with high phonotactic probabilities from dense neighborhoods should show quicker M350 latencies but slower RTs in lexical decision.

Words and non-words with high-probability sound sequences, from dense neighborhoods, show quicker M350s and slower RTs
Pylkkänen et al. (2002)
M350: not sensitive to competition from phonological neighbors; RT is.
[Figure: high phon. prob. word (LINE) vs. low phon. prob. word (PAGE)]
[Figure panel: RT neighborhood competition effect]

Irregular Past Tense Priming: Stockall & Marantz (to appear in Mental Lexicon)
In cross-modal priming (hear one word, make a lexical decision on a letter string presented immediately after), irregulars don't generally prime their stems behaviorally: gave-GIVE, taught-TEACH.
Allen & Badecker show that orthographic overlap in this experimental design leads to RT inhibition and that past-tense/stem pairs with higher orthographic overlap yield less priming than those with less overlap.

Prediction of linguistic theories (e.g., Distributed Morphology)
Irregular past tense/stem priming paradigms (gave/give, taught/teach) should yield identity priming at the stage of root/stem activation (the M350) and form competition effects among allomorphs subsequently, slowing reaction time relative to pure stem/stem identity priming.
MEG irregular past-tense priming experiment
Design: visual-visual immediate priming, lexical decision on the target (see Pastizzo and Feldman 2002)
[Figure: trial timeline (+, prime, target), duration of trial in ms; tick values 450, 50, 200; 0 to 2500 ms]

MEG Results: M350
Priming for Past Tense/Stem equivalent to identity priming
[Figure: Amount of Priming, n=8]
Significant priming for Identity condition (*p=0.01)
TAUGHT-TEACH vs. SMACK-TEACH (*p=0.04)
GAVE-GIVE vs. PLUM-GIVE (*p=0.05)
No reliable effect for STIFF-STAFF vs. GRAB-STAFF (p=0.13)

RT Results: Competition effects; no significant priming for TAUGHT-TEACH
Significant priming for Identity condition
Significant inhibition for STIFF-STAFF (*p=0.01)
No reliable effect for TAUGHT-TEACH (p=0.21) (but trend

MEG & RT Results: MEG taps stem activation; RT reflects decision in the face of competition
[Figure panels: M350 Latency; RT]
[Figure: amount of priming by condition (gave-give, ident, stiff-staff, taught-teach)]

Follow-up: Add regulars and ritzy/glitzy condition
Regulars: walk-walked
Orthographic & Semantic Overlap: boil-broil
Reverse order, stem before past tense

ritzy-glitzy items (boil-broil)
Order effect on RT; i.e., on form competition
[Figure: Amount of Priming (ms/fT) by condition (gave-give, ident, stiff-staff, taught-teach)]

Linguistic Computational Models of Morphology fully supported
Relation between irregular past tense form and stem is like that between regular past tense form and stem (or between identical stems), not like that between words phonologically/orthographically and semantically related (boil-broil).
Root priming separates from form competition (between allomorphs of stem) in time course of lexical access.

Taft (2004), Morphological Decomposition and the Reverse Base Frequency Effect
Claim: Base frequency effects (RT to complex word correlates with frequency of stem) reflect access of the stem of morphologically complex forms, whereas surface frequency effects (RT to complex word correlates with frequency of complex word) reflect the stage of checking the recombination of stem and affix for existence and/or well-formedness.
"The suggestion being made, then, is that the advantage at the early stages of processing of having a relatively high base frequency could be potentially obscured by counterbalancing factors happening at later stages of processing." [750-1]

Lexical Decision Task
Non-word foils consisting of existing words with ungrammatical affixes (mirths, kettled, joying, redly, iratest) (just like the Devlin orthographic cases)
Three classes of words:
mending class: low surface frequency, low base frequency
seeming class: low surface frequency, high base frequency
growing class: mid surface frequency, high base frequency
Claim: advantage of high base frequency for seem at the stem access stage (indexed by the M350) is offset in RT by a disadvantage for the low frequency of the use of -ing with the seem stem, i.e., at the post-affix recombination stage, indexed by RT.
(For Taft, manipulating the foils in lexical decision attenuated the surface frequency effect, arguing for two stages of processing in the indirect fashion typical of good cognitive science.)

Reilly and Holt 2004, with the KIT/MIT MEG Team
Replicate Taft's experiment in the MEG Lab
Predict:
base frequency affects root access and thus M350 latency
surface frequency affects post-M350 recombination stage and thus RT

Results: M350 Latency tracks Base Frequency, RT tracks Surface Frequency

Class                                  Surface freq.  Base freq.  RT Taft (ms)  RT MIT (ms)  M350 MIT (ms)
Mending (low surface, low base)        7.8            36.5        687           780          375
Seeming (low surface, high base)       7.7            460.3       701           805          362
Growing (mid surface, high base)       75.9           456.9       653           746          356

Surface frequency effect at RT (significant at .05 level): Mending and Seeming slower than Growing
Base frequency effect at M350 latency (significant at .05 level), Mending

Conclusion
MEG serves as a tool to upgrade cognitive science (& linguistics) to cognitive neuroscience without losing the empirically motivated richness of cognitive computational theories.
Cog Sci notions of space, time, and complexity map onto brain space, latency, and magnitude of neural activity.

What's the next step?
Traditional approaches to MEG analysis involve averaging together many responses (repeated from an experimental bin) prior to computing differences in responses by condition within each subject.
This contrasts with standard cognitive science practice (e.g., with RT) of including a dependent measure from each trial in the ANOVA.
To fully incorporate cognitive theories into cognitive neuroscience, including the correlation of continuous variables with continuous response measures and the use of item analyses in complex designs, we need to include single trial MEG data in our analyses.

Why not single trial MEG?
For the type of experiment discussed in this talk, we would need to extract response amplitude and latency information from each trial, given a response defined in terms of source localization.
So, we would look at each single response for dipole source activation (latency of peak response, amplitude of response) for a source identified from grand averaged data for a subject.

M100 Latency, Single Trials (Marantz, in preparation)
Left hemisphere M100 source computed via single dipole model from grand averaged response to 60 tones, 30 at 200 Hz, 30 at 1 kHz.
Weight matrix from dipole source used as spatial filter over raw data to derive dipole activation latency for each tone individually.

Single trial M100 latencies
[Figure: latency of left hemisphere M100 as a function of stimulus tone frequency, 200 Hz vs. 1 kHz]
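The spatial-filter step described above can be sketched in a few lines. This is a minimal illustration and not the analysis code from the talk: the function names, the four-sensor layout, the search window, and the noise-free Gaussian test signal are all assumptions made for the example. Taking the peak on the absolute activation is also an assumption.

```python
import math


def source_time_course(trial, weights):
    """Apply a spatial filter: weighted sum over sensors at each sample.

    trial   : list of per-sensor sample lists (n_sensors x n_times)
    weights : one weight per sensor, derived from the dipole fit
    """
    n_times = len(trial[0])
    return [sum(w * sensor[t] for w, sensor in zip(weights, trial))
            for t in range(n_times)]


def peak_latency(trial, weights, times, window):
    """Return (latency, amplitude) of the largest absolute source
    activation inside the search window (start_ms, end_ms)."""
    source = source_time_course(trial, weights)
    candidates = [(abs(value), t) for value, t in zip(source, times)
                  if window[0] <= t <= window[1]]
    if not candidates:
        raise ValueError("search window contains no samples")
    amplitude, latency = max(candidates, key=lambda c: c[0])
    return latency, amplitude


def synthetic_trial(true_latency, field, times, width_ms=10.0):
    """Hypothetical noise-free trial: one Gaussian source bump,
    projected to the sensors through a fixed field pattern."""
    bump = [math.exp(-0.5 * ((t - true_latency) / width_ms) ** 2)
            for t in times]
    return [[f * b for b in bump] for f in field]


if __name__ == "__main__":
    times = [float(t) for t in range(300)]   # 0..299 ms, 1 ms steps
    field = [0.5, -0.5, 0.5, -0.5]           # made-up field pattern
    weights = field                          # field . field == 1 here
    for true_latency in (110.0, 95.0):       # invented latencies
        trial = synthetic_trial(true_latency, field, times)
        print(peak_latency(trial, weights, times, (50.0, 200.0)))
```

With real data the weights would come from the dipole fitted to the grand average, and the per-trial latencies would then enter the statistics directly, as RTs do in behavioral studies.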
Single trial analysis as in behavioral studies is possible using only normal MEG techniques and tools:
No fancy pre-processing
No fancy localization or statistical tools
For responses less automatic than the M100, expect overlap in scatter plots to be greater (approaching that for RTs in, e.g., lexical decision experiments).

Taft & Forster re-visited
Is RT slow-down for -semble (bound stem) over -sassin (pseudo-stem) attributable to lexical access for semble but not for sassin, as Taft claims, or to response competition from words (resemble, dissemble, assemble vs. assassin)?
Prediction: slow-down at lexical access should show up at M350, while slow-down for response competition should occur after (as shown by neighborhood density and past tense studies).

Brown & Marantz (in preparation)
3 subjects
20 real stems, 20 pseudo stems (matched by Taft & Forster along various dimensions) per condition
Single trial analysis of MEG data: M350 dipole activation peak analysis, with M350 dipole fitted over left-hemisphere sensors on the grand average to all stimuli in the experiment
Slow-down is observed at M350: for 3 subjects and 108 observations, difference is significant over the single trial MEG data but not yet for RT

                                   Real Stems (semble)  Pseudo Stems (sassin)  p
Reaction time                      784 ms               719 ms                 0.16
M350 latency (over single trials)  356 ms               339 ms                 0.005

Taft theory of decomposition in which bound stems have lexical entries is fully supported by the MEG data.
Single trial MEG data is at least as consistent as reaction time data.
MEG can be used on par with RT to add additional dependent variables to experiments testing computational theories within cognitive neuroscience.
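Once each trial has its own latency, two conditions can be compared the same way RTs are compared. The sketch below uses a two-sided permutation test on the difference of means. The slides do not say which test produced the p-values above, so this is one possible choice and not the authors' method. The function name and the latency values in the demo are invented for illustration.

```python
import random
from statistics import mean


def permutation_test(a, b, n_permutations=10000, seed=0):
    """Two-sided permutation test on mean(a) - mean(b).

    Returns (observed_difference, p_value). Condition labels are
    shuffled across the pooled single-trial values; the p-value is
    the share of shuffles at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Invented single-trial latencies in ms, NOT the study's data.
    real_stems = [360, 355, 350, 365, 358, 362]
    pseudo_stems = [340, 335, 338, 342, 336, 339]
    diff, p = permutation_test(real_stems, pseudo_stems, n_permutations=2000)
    print(f"difference = {diff:.1f} ms, p = {p:.4f}")
```

The same function applies unchanged to the RTs from the same trials, so both dependent measures can be tested with one procedure.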