Development of a Naïve Bayesian Classifier for Item Domains


Automatic Generation of Verbal Analogy Items
Alan D. Mead
Illinois Institute of Technology

AIG in employment testing
  • Rise of unproctored Internet testing (UIT)
  • UIT may cause many security problems
      • One is item theft and coaching
  • Solution: Generate entire test from scratch for each examinee
      • Item theft less of a problem
      • Coaching less effective
      • Items could be watermarked
  • Also reduces cost and speeds deployment

AIG in employment testing (cont.)
  • Need a variety of test content
      • Verbal analogies
      • Vocabulary
      • Math
      • Perceptual speed and accuracy
      • Spatial ability
      • Personality
      • Situational Judgment
      • Etc.

Verbal Analogies
  Pair responses:
    Shovel:Dig
    a) Bag:Buy
    b) Baby:Cry
    c) Fork:Eat
    d) Car:Stop
  Word responses:
    Shovel:Dig::Fork
    a) Buy
    b) Cry
    c) Eat
    d) Stop
  • Identify a bridge; you DIG with a SHOVEL
  • Find a matching answer; you EAT with a FORK

Generating Verbal Analogies
  • Identified database of relationships (e.g., RIDER operates a BIKE)
  • Identified additional bridge relationships (BOVINE means COW-like & ABSENT is the opposite of PRESENT)
  • Gathered data on word frequency and (part of this study) word familiarity

Generating Verbal Analogies (cont.)
  1. Randomly select a bridge
  2. Randomly select TWO pairs for this bridge (one for the stem, one for the key)
  3. Randomly select 2-3 additional pairs from other bridges
  4. Randomly assign key pair; fill in remaining pairs
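The four generation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: the `BRIDGES` dictionary, its bridge labels, and the function name `generate_item` are hypothetical, and the toy database borrows a handful of pairs from the examples in these slides.

```python
import random

# Hypothetical toy database: bridge label -> list of (word1, word2) pairs.
# The real relationship database is not shown in the slides.
BRIDGES = {
    "you Y with an X": [("shovel", "dig"), ("fork", "eat")],
    "Y operates X": [("rocket", "astronaut"), ("jet", "pilot"), ("bike", "rider")],
    "X and Y are opposites": [("absent", "present"), ("demand", "supply"), ("unfold", "fold")],
    "Y is described by X": [("paternal", "father"), ("juvenile", "child"), ("bovine", "cow")],
}


def generate_item(bridges, n_foils=3, rng=random):
    """Generate one pair-response analogy item following steps 1-4."""
    # Step 1: randomly select a bridge (sorted for reproducibility with a seed).
    bridge = rng.choice(sorted(bridges))
    # Step 2: randomly select two distinct pairs for this bridge (stem and key).
    stem, key = rng.sample(bridges[bridge], 2)
    # Step 3: randomly select foil pairs from the other bridges.
    other_pairs = [
        pair
        for name, pairs in bridges.items()
        if name != bridge
        for pair in pairs
    ]
    foils = rng.sample(other_pairs, n_foils)
    # Step 4: place the key at a random position among the foils.
    options = foils + [key]
    rng.shuffle(options)
    return {
        "bridge": bridge,
        "stem": stem,
        "options": options,
        "key_index": options.index(key),
    }


if __name__ == "__main__":
    item = generate_item(BRIDGES, rng=random.Random(0))
    print("{}:{} :: ?".format(*item["stem"]))
    for letter, (w1, w2) in zip("abcd", item["options"]):
        print(f"  {letter}. {w1}:{w2}")
```

Passing a seeded `random.Random` instance makes a generated form reproducible. A purely random sketch like this does nothing to prevent the infeasible items that the deck reports finding by manual review.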

Sample Items
  1. paternal:father:: ?
     a. juvenile:child
     b. microphone:sound
     c. chalk:writer
     d. unfold:fold
  3. rocket:astronaut:: ?
     a. lamp:light
     b. stick:skating rink
     c. jet:pilot
     d. demand:supply

Alternative format
  1. paternal:father::juvenile:?
     a. child
     b. sound
     c. writer
     d. fold
  3. rocket:astronaut::jet:?
     a. light
     b. skating rink
     c. pilot
     d. supply

Keys
  1. paternal:father:: ? [Bridge: FATHER is described by PATERNAL]
     a. juvenile:child ***
     b. microphone:sound (unrelated: sound is a (typical) theme of microphone)
     c. chalk:writer (unrelated: writer is a (typical) agent of chalk)
     d. unfold:fold (unrelated: unfold and fold are opposites/opposed)
  3. rocket:astronaut:: ? [Bridge: ASTRONAUT operates ROCKET]
     a. lamp:light (unrelated: lamp is a (typical) result of light)
     b. stick:skating_rink (unrelated: skating_rink is a (typical) location of stick)
     c. jet:pilot ***
     d. demand:supply (unrelated: supply and demand are opposites/opposed)

Present Study
  • H1: Two forms of AIG analogies (word responses and pair responses) will have comparable reliability & validity
  • H2: AIG scales will have reliability comparable to manually-written scale
  • H3: AIG scales will have construct and criterion validity comparable to manually-written scale

Method
  • Sample of N=251 gathered online and from psychology classes
  • Measures:
      • n=20 AIG & human-written verbal analogy scales
      • n=40 vocabulary
      • Self-reported performance at work & school

Feasibility
  • Manually examined items for feasibility
  • 40/64 (63%) items were feasible
  • Reasons for infeasibility
      • Over-use of a bridge or a pair (some bridges have few pairs)
      • Ambiguous pairs (drum:drum?)
      • Foil inadvertently a correct key

Results for H1

     Variable                         Mean   SD     n    1       2       3       4
  1  Vocabulary                       0.75   0.14   40   (0.86)  0.66    0.66    0.69
  2  Human-written items              0.65   0.14   20   0.46    (0.57)  0.97    1.04
  3  AIG items with pair responses    0.73   0.16   20   0.52    0.63    (0.73)  0.94
  4  AIG items with word responses    0.81   0.14   19   0.54    0.67    0.68    (0.72)
  5  Self-Rated Performance           3.72   0.61   6    -0.04   -0.01   0.05    0.10
  6  Academic Performance             0.02   0.72   3    0.14    0.22    0.20    0.14
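The slides do not explain how the correlation table is laid out. The parenthesized diagonal values look like scale reliabilities. The values above the diagonal are consistent with the below-diagonal correlations after the standard correction for attenuation, r_xy / sqrt(r_xx * r_yy). That reading is an assumption, not something the deck states. A minimal check, using only numbers from the table and hypothetical short names for the four scales:

```python
import math


def disattenuate(r_xy, r_xx, r_yy):
    """Correct an observed correlation for unreliability in both measures."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Diagonal entries of the table, assumed here to be reliabilities.
reliability = {"vocab": 0.86, "human": 0.57, "aig_pairs": 0.73, "aig_words": 0.72}

# (below-diagonal observed r, above-diagonal value reported in the table)
table = {
    ("vocab", "human"): (0.46, 0.66),
    ("vocab", "aig_pairs"): (0.52, 0.66),
    ("vocab", "aig_words"): (0.54, 0.69),
    ("human", "aig_pairs"): (0.63, 0.97),
    ("human", "aig_words"): (0.67, 1.04),
    ("aig_pairs", "aig_words"): (0.68, 0.94),
}

if __name__ == "__main__":
    for (x, y), (observed, reported) in table.items():
        corrected = disattenuate(observed, reliability[x], reliability[y])
        print(f"{x:>9} ~ {y:<9} observed={observed:.2f} "
              f"corrected={corrected:.3f} reported={reported:.2f}")
```

Under this reading, each recomputed value falls within about 0.01 of the reported one, a gap that rounding of the published inputs would explain. A corrected value above 1.0 (human-written items versus AIG word responses) is possible with this formula when reliabilities are low or estimated with error.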

H1: Two forms of AIG analogies (word responses and pair responses) will have comparable reliability & validity
CONFIRMED

Results for H2
(Same table of means, SDs, and correlations as under Results for H1.)

H2: AIG scales will have reliability comparable to manually-written scale
NOT CONFIRMED because the AIG scales had better reliability

Results for H3
(Same table of means, SDs, and correlations as under Results for H1.)

H3: AIG scales will have construct and criterion validity comparable to manually-written scale
CONFIRMED

Predicting Item Difficulty

  Predictor                                              Correlation
  Automatically generated (1) or manually written (0)    0.28*
  Familiarity of least familiar word in item             0.33*
  Familiarity of second least familiar word in item      0.39**
  Mean familiarity of all words in item                  0.37**
  Lowest log(count(word))                                0.14
  Second lowest log(count(word))                         -0.06
  Mean log(count(word))                                  0.17

Future Directions
  • Better handling of senses (DRUM is for DRUMMING)
  • Better difficulty calculations based on larger sample of items
  • Automated feasibility checking
  • Enhanced database of relationships
  • Choosing foils to have more semantic similarity to other words

Thank you!
[email protected]
