Research Methods for the Learning Sciences

Core Methods in Educational Data Mining
HUDK4050
Fall 2014

The Homework

Let's go over the homework
Was it harder or easier than basic homework 1?

What was the answer to Q1? What tool(s) did you use to compute it?
What was the answer to Q2? What tool(s) did you use to compute it?
What was the answer to Q3? What tool(s) did you use to compute it?
What was the answer to Q4? What tool(s) did you use to compute it?
What was the answer to Q5? What tool(s) did you use to compute it?
What was the answer to Q6? What tool(s) did you use to compute it?
What was the answer to Q7? What tool(s) did you use to compute it?
What was the answer to Q8? What tool(s) did you use to compute it?
What was the answer to Q9? What tool(s) did you use to compute it?
What was the answer to Q10?
Who did Q11? Challenges?

Questions? Comments? Concerns?

Textbook/Readings

Detector Confidence
Any questions about detector confidence?

Detector Confidence
What are the pluses and minuses of making sharp distinctions at 50% confidence?

Detector Confidence
Is it any better to have two cut-offs?

Detector Confidence
How would you determine where to place the two cut-offs?

Cost-Benefit Analysis
Why don't more people do cost-benefit analysis of automated detectors?

Detector Confidence
Is there any way around having intervention cut-offs somewhere?

Goodness Metrics

Exercise

                       Detector: Academic Suspension    Detector: No Academic Suspension
Data: Suspension                     2                                  3
Data: No Suspension                  5                                140

What is accuracy?
What is kappa?

Accuracy
Why is it bad?
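For the suspension exercise above, both metrics can be read off the four cells of the table. The following is a minimal sketch, assuming "suspension" is treated as the positive class and that kappa means Cohen's kappa; the function name `accuracy_and_kappa` and its argument names are illustrative and do not come from the course materials.

```python
def accuracy_and_kappa(tp, fn, fp, tn):
    """Accuracy and Cohen's kappa from a 2x2 confusion matrix.

    tp: data = suspension,    detector = suspension
    fn: data = suspension,    detector = no suspension
    fp: data = no suspension, detector = suspension
    tn: data = no suspension, detector = no suspension
    """
    n = tp + fn + fp + tn
    observed = (tp + tn) / n  # observed agreement, i.e. accuracy
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Cell counts taken from the exercise table.
acc, kappa = accuracy_and_kappa(tp=2, fn=3, fp=5, tn=140)

# Accuracy of a trivial detector that always predicts the majority class.
majority_baseline = max(2 + 3, 5 + 140) / 150

print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
print(f"always-predict-majority accuracy = {majority_baseline:.3f}")
```

Under these assumptions accuracy comes out near 0.947 and kappa near 0.306, while a detector that always says "no suspension" would already reach about 0.967 accuracy on this data.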

Kappa
What are its pluses and minuses?

ROC Curve
Is this a good model or a bad model? (asked in turn of five example ROC curves)

ROC Curve
What are its pluses and minuses?

A'
What are its pluses and minuses?
Any questions about A'?

Precision and Recall

\[ \text{Precision} = \frac{TP}{TP + FP} \qquad \text{Recall} = \frac{TP}{TP + FN} \]

Precision and Recall
What do they mean?

Precision = The probability that a data point classified as true is actually true
Recall = The probability that a data point that is actually true is classified as true

Precision and Recall
What are their pluses and minuses?
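As a worked instance of the two definitions, here is a short sketch applied to the cell counts of the suspension exercise earlier in the deck. It assumes "suspension" is the positive ("true") class, so TP = 2, FP = 5, FN = 3; the function name `precision_recall` is illustrative.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positives, false positives, false negatives."""
    precision = tp / (tp + fp)  # of the points classified as true, how many are true
    recall = tp / (tp + fn)     # of the points that are true, how many were caught
    return precision, recall


# Counts from the suspension exercise, with suspension as the positive class.
precision, recall = precision_recall(tp=2, fp=5, fn=3)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

Neither value uses the 140 true negatives, which is one reason the two metrics can look very different from accuracy on the same data.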

Correlation vs RMSE
What is the difference between correlation and RMSE?
What are their relative merits?
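One way to see the difference is with two toy prediction sets. This is a sketch on made-up numbers (the data and the helper names `pearson` and `rmse` are assumptions for illustration only): the first set follows the trend of the actual values exactly but is offset by a constant, and the second stays numerically close to the actual values but orders them backwards.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Case 1: predictions follow the trend but are shifted by +0.5.
actual_1 = [0.1, 0.3, 0.5, 0.7, 0.9]
shifted = [a + 0.5 for a in actual_1]
r_shifted, rmse_shifted = pearson(actual_1, shifted), rmse(actual_1, shifted)

# Case 2: predictions are numerically close but ordered backwards.
actual_2 = [0.50, 0.52, 0.48, 0.51, 0.49]
reversed_close = [0.50, 0.48, 0.52, 0.49, 0.51]
r_close, rmse_close = pearson(actual_2, reversed_close), rmse(actual_2, reversed_close)

print(f"shifted:        r = {r_shifted:.2f}, RMSE = {rmse_shifted:.3f}")
print(f"reversed/close: r = {r_close:.2f}, RMSE = {rmse_close:.3f}")
```

In this sketch the shifted predictions have a correlation of about 1 with a large RMSE, while the reversed predictions have a negative correlation with a small RMSE: correlation rewards getting the relative ordering right, RMSE rewards getting the values right.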

What does it mean?
1. High correlation, low RMSE
2. Low correlation, high RMSE
3. High correlation, high RMSE
4. Low correlation, low RMSE

RMSE vs MAE
Radek Pelánek argues that MAE is inferior to RMSE (and notes this opinion is held by many others)

Radek's Example

Take a student who makes correct responses 70% of the time
And two models
Model A predicts 70% correctness
Model B predicts 100% correctness

In other words
70% of the time the student gets it right: Response = 1
30% of the time the student gets it wrong: Response = 0
Model A Prediction = 0.7
Model B Prediction = 1.0

MAE
70% of the time the student gets it right (Response = 1)
  Model A (0.7): Absolute Error = 0.3
  Model B (1.0): Absolute Error = 0
30% of the time the student gets it wrong (Response = 0)
  Model A (0.7): Absolute Error = 0.7
  Model B (1.0): Absolute Error = 1

MAE
Model A: (0.7)(0.3) + (0.3)(0.7) = 0.21 + 0.21 = 0.42
Model B: (0.7)(0) + (0.3)(1) = 0 + 0.3 = 0.3
Model B is better. Do you buy that?

RMSE

70% of the time the student gets it right (Response = 1)
  Model A (0.7): Squared Error = 0.09
  Model B (1.0): Squared Error = 0
30% of the time the student gets it wrong (Response = 0)
  Model A (0.7): Squared Error = 0.49
  Model B (1.0): Squared Error = 1

RMSE
Model A: (0.7)(0.09) + (0.3)(0.49) = 0.063 + 0.147 = 0.21
Model B: (0.7)(0) + (0.3)(1) = 0 + 0.3 = 0.3
(These sums are mean squared errors; taking the square root gives an RMSE of about 0.458 for Model A and about 0.548 for Model B, so the ordering is unchanged.)
Model A is better. Does this seem more reasonable?
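The arithmetic in Radek's example can be checked with a few lines of code. This is a minimal sketch under the example's own assumptions (a student who is correct 70% of the time, Model A always predicting 0.7, Model B always predicting 1.0); the names `P_CORRECT`, `PREDICTIONS`, and `expected_errors` are illustrative.

```python
import math

P_CORRECT = 0.7  # the student answers correctly 70% of the time
PREDICTIONS = {"Model A": 0.7, "Model B": 1.0}


def expected_errors(prediction, p_correct=P_CORRECT):
    """Return (MAE, MSE, RMSE) for a constant prediction on 0/1 responses."""
    err_right = 1 - prediction  # error when the response is 1
    err_wrong = 0 - prediction  # error when the response is 0
    mae = p_correct * abs(err_right) + (1 - p_correct) * abs(err_wrong)
    mse = p_correct * err_right ** 2 + (1 - p_correct) * err_wrong ** 2
    return mae, mse, math.sqrt(mse)


results = {name: expected_errors(pred) for name, pred in PREDICTIONS.items()}
for name, (mae, mse, rmse) in results.items():
    print(f"{name}: MAE = {mae:.3f}, MSE = {mse:.3f}, RMSE = {rmse:.3f}")
```

MAE prefers Model B (0.3 against 0.42), while the squared-error metrics prefer Model A (MSE 0.21 against 0.3), matching the figures worked through above.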

AIC/BIC vs Cross-Validation
AIC is asymptotically equivalent to LOOCV
BIC is asymptotically equivalent to k-fold CV

Why might you still want to use cross-validation instead of AIC/BIC?
Why might you still want to use AIC/BIC instead of cross-validation?

AIC vs BIC
Any comments or questions?

LOOCV vs k-fold CV
Any comments or questions?
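To make the comparison concrete, here is a small sketch of the quantities involved. It assumes the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where k is the number of fitted parameters, n the number of data points, and ln(L) the maximized log-likelihood. The helper `kfold_indices` is a hypothetical, unshuffled fold splitter written for illustration; it is not taken from any library.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * log_likelihood


def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous test folds.

    Returns a list of (train_indices, test_indices) pairs.
    With k == n every test fold holds one point, which is LOOCV.
    """
    folds = []
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)  # spread the remainder evenly
        test = list(range(start, start + size))
        train = [j for j in range(n) if j < start or j >= start + size]
        folds.append((train, test))
        start += size
    return folds


# Illustrative numbers only: log-likelihood -100, 3 parameters, 100 data points.
print(aic(-100.0, k=3))
print(bic(-100.0, k=3, n=100))
print([len(test) for _, test in kfold_indices(10, 3)])
print([test for _, test in kfold_indices(5, 5)])
```

AIC and BIC need only one model fit, whereas the fold splitter implies k fits (n fits for LOOCV), which is one practical reason to consider both approaches.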

Other questions, comments, concerns about textbook?

Creative HW 2

Due October *8*

Creative HW 2
Yes, you get to breathe for a few days
(Sorry about assignment timing; my getting sick the second week of class threw off the class timeline a little)

Questions about Creative HW 2?

Other questions or comments?

No Class Next Week

Next Class
Monday, October 6
Feature Engineering -- What

Baker, R.S. (2014) Big Data and Education. Ch. 3, V3
Sao Pedro, M., Baker, R.S.J.d., Gobert, J. (2012) Improving Construct Validity Yields Better Models of Systematic Inquiry, Even with Less Information. Proceedings of the 20th International Conference on User Modeling, Adaptation and Personalization (UMAP 2012), 249-260.

The End
