Can You Automate OER Evaluation With The RISE Framework?

The RISE Framework is a learning analytics methodology for identifying OER resources in a course that may need improvement. On one level, this is an interesting development, since so few learning analytics projects are actually getting into how to improve the actual education of learners. But on the other hand, I am not sure if this framework has a detailed enough understanding of instructional design, either. A few key points seem to be missing. It’s still early, so we will see.

The basic idea of the RISE Framework is that analytics will create a graph that plots page clicks in OER resources on the x-axis, and grades on assessments on the y-axis. This will create a grid that shows where there were higher than average grades with higher than average clicks, higher than average grades with lower than average clicks, lower than average grades with higher than average clicks, and lower than average grades with lower than average clicks. This is meant to identify the resources that teachers should consider examining for improvement (especially focusing on the ones that got a high number of clicks but lower grade scores). Note that this is not meant to definitively say “this is where there is a problem, so fix it now” but more “there may or may not be a problem here, so check it out.” Keep that in mind while I explore some of my doubts here, because I would be a lot harsher on this if it were presented as a tool to definitively point out exact problems rather than what it is: a way to start the search for problems.
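To make the grid concrete, here is a minimal sketch of how I read the four-quadrant idea. It is not the RISE authors' implementation: the function name, the labels, and the numbers are all made up for illustration, and I am assuming a plain comparison against the course averages, with ties counted as "low."

```python
from statistics import mean

def rise_quadrants(resources):
    """Label each resource by comparing its clicks and grade to the averages.

    `resources` maps a resource name to (page_clicks, average_grade).
    Hypothetical helper: a value equal to the average is treated as "low".
    """
    avg_clicks = mean(clicks for clicks, _ in resources.values())
    avg_grade = mean(grade for _, grade in resources.values())
    labels = {}
    for name, (clicks, grade) in resources.items():
        use = "high use" if clicks > avg_clicks else "low use"
        result = "high grades" if grade > avg_grade else "low grades"
        labels[name] = f"{result}, {use}"
    return labels

# Made-up numbers, for illustration only.
example = {
    "chapter_1": (120, 88.0),
    "chapter_2": (40, 91.0),
    "chapter_3": (150, 62.0),
    "chapter_4": (30, 58.0),
}
labels = rise_quadrants(example)
for name, label in labels.items():
    print(name, "->", label)
```

With these invented numbers, chapter_3 lands in the "low grades, high use" quadrant, which is the one the framework suggests looking at first.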

Of course, any system of comparing grades with clicks itself is problematic on many fronts, and the creators of the RISE Framework do take this into consideration when spelling out what each of the four quadrants could mean. For example, in the quadrant that specifies high grades with low content usage, they not only identify “high content quality” as the cause of this, but also “high prior knowledge,” “poorly written assessment,” and so on. So this is good – many factors outside of grades and usage are taken into account. This is because, on the grade front, we know that scores are a reflection of a massive number of factors – the quality of the content being only one of those (and not always the biggest one). As noted, prior knowledge can affect grades (sometimes negatively – not always positively like the RISE framework appears to assume). Exhaustion or boredom or anxiety can impact grades. Again, I am glad that these are in the framework, but the effect these have on grades is assumed to run in one direction – rather than the complex directions they take in real life. For example, students that game the test or rubric can inflate scores without using the content much – even on well-designed assessments (I did that all of the time in college).

However, the bigger concern with the way grades are addressed in the RISE framework is that they are plotting assessment scores instead of individual item scores. Anyone that has analyzed assessment data can tell you that the final score on a test is actually an aggregate of many smaller items (test questions). That aggregate grade can mask many deficiencies at the micro level. That is why instructors prefer to analyze individual test questions or rubric lines rather than the aggregate scores of the entire test. Assessments could cover, say, 45 questions of content that were well covered in the resources, and then 5 questions that are poorly covered. But the high scores on the 45 questions, combined with the fact that many will get some questions right by random guessing on the other 5, could result in test scores that mask a massive problem with those 5 questions. But teachers can most likely figure that out quickly without the RISE framework, and I will get to that later.
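A quick back-of-the-envelope calculation shows how much the aggregate can hide. The success rates here are my own assumptions, not data from any course: students answer the 45 well-covered questions correctly 90% of the time, and they are purely guessing on the other 5, which I am treating as four-option multiple choice (so a 25% hit rate).

```python
def expected_test_score(n_good, p_good, n_bad, p_bad):
    """Expected fraction correct on a test mixing two groups of questions."""
    total = n_good + n_bad
    return (n_good * p_good + n_bad * p_bad) / total

# Assumed rates: 90% on the 45 well-covered items, 25% (guessing) on the other 5.
overall = expected_test_score(45, 0.90, 5, 0.25)
print(f"Overall test score: {overall:.1%}")
print(f"Score on the 5 poorly covered items alone: {0.25:.0%}")
```

Under those assumptions the test average comes out around 83.5%, which looks perfectly healthy, while the five problem questions sit at chance level.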

The other concern is with clicks on the OER. Well, they say that you can measure “pageviews, time spent, or content page ratings”… but those first two are still clicks, and the last one is a bit too dependent on the happiness of the raters (students) at any given moment to really be that quantitative. I wouldn’t outright discount it as a factor, but I will state that you are always going to find a close alignment with the test scores on that one for many reasons. In other words, it is a pre-biased factor – students that get a high score will probably rate the content as effective even if it wasn’t, and students that get a low score will probably blame the content quality whether it was really a factor or not.

Also, now that students know their clicks are being recorded, they are more and more often clicking around to make sure they get good numbers on those data points. I even do that when taking MOOCs, just in case: click through the content at a realistic pace even if I am really doing something else other than reading. People have learned to skim resources while checking their phone, clicking through at a pace that makes it seem like they are reading closely. Most researchers are very wary of using click data like pageviews or time spent to tell anything other than where students clicked, how long between clicks, and what was clicked on. Guessing what those mean beyond that? More and more, that is being discouraged in research (and for good reason).

Of course, I don’t have time to go into how relying on only content and assessment is a poor way to teach a course, but I think we all know that. A robust and helpful learning community in a class can answer learning questions and help learners overcome bad resources to get good grades. And I am not referring to cheating here – Q&A forums in courses can often really help some learners understand bad readings – while also possibly making them feel like they are the problem, not the content.

Still, all of that is somewhat or directly addressed in the framework, and because it is a guide rather than a definitive answer, variations like those discussed above are to be expected. I covered them just to make sure I was covering all critical bases.

The biggest concern I have with the RISE framework really comes here: “The framework assumes that both OER content and assessment items have been explicitly aligned with learning outcomes, allowing designers or evaluators to connect OER to the specific assessments whose success they are designed to facilitate.”

Well, since that doesn’t happen in many courses due to time constraints, that eliminates large chunks of courses. I can also tell you, as an instructional designer, that many people think they have well-aligned outcomes… but don’t.

But, let’s assume that you do have a course where “content and assessment items have been explicitly aligned with learning outcomes.” If you have explicitly aligned assessments, you don’t need the RISE framework. To explicitly align an assessment with content is not just a matter of making sure the question tests exactly what is in the content, but also of pointing to exactly where the aligned content is for each question. Not just the OER itself, but the chapter and page number. Most testing systems today will give you an item-by-item breakdown of each assessment (because teachers have been asking for it). A low score across the course on any specific question indicates some problem. At that point, it is best (and quickest) to just ask your learners:

  1. Did the question make sense? Was it well written?
  2. Did it connect to the content?
  3. Did the content itself make sense?

Plus, most content hosting systems have ways to track page clicks, so you can easily make your own matrix using clicks if you need to. The matrix in the framework might give you a good way to organize the data to see where your problem lies… but to be honest, I think it would be quicker and more accurate to focus on the assessment questions instead of the whole test, and ask the learners about specific questions.

Also, explicit alignment can itself hide problems with the content. An explicit alignment would require that you test what is in the content, even if the content is bad. This is one of the many things you learn as an ID: don’t test what students don’t learn; write your test questions to match the content no matter what. A decently aligned assessment can still produce good grades from a very bad content source. One of my ID professors once told me something along the lines of “a good instructional designer can help students pass even with bad textbooks; a bad instructional designer can help them fail with the best textbook.”

Look – instructional designers have been dealing with good and bad textbooks for decades now. Same goes for instructors that serve as their own IDs. We have many ways to work around those.

I may be getting the RISE framework wrong, but comparing overall scores on assessments to certain click-stream activity in OER (sometimes an entire book) comes across as a shotgun approach, especially when well-aligned test questions can pinpoint specific sources of problems at a fairly fine-grained level.

Now then, if you could actually compare the grades on individual assessment items with the amount of time spent on the page or area that that specific item came from, you might be on to something. Then, if you could group students into the four quadrants on each item, and then compare quadrant results on all items in the same assessment together, you could probably identify the questions that are most likely to have some kind of issue. Then, have the system send out a questionnaire about the test to each student – but have the questionnaire be custom-built depending on which quadrant the student was placed in. In other words, each learner gets questions about the same, say, 5 test questions that were identified as problematic, but the specific prompt they get about each test question will be changed to match which quadrant they were placed in for that question:

We see that you missed Question 4, but you did spend a good amount of time on page 25 of the book, where this question was taken from. Would you say that:

  • The text on page 25 was not well-written
  • Question 4 was not well-written
  • The text on page 25 doesn’t really match Question 4
  • I visited page 25, but did not spend the full time there reading the text

Of course, writing it out this way sounds creepy. You would have to make sure that learners opt in to this after fully understanding that this is what would happen, and then you would probably need to make sure that the responses go to someone who is not directly responsible for their grades, so that they can be analyzed anonymously. Then report those results in a generic way: “survey results identified that there is probably not a good alignment between page 25 and question 4, so please review both to see if that is the case.”
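For what it is worth, the item-level grouping I am describing could be sketched roughly like this. Everything here is hypothetical: the record layout, the function names, the choice of the median time on each page as the cutoff, and the 40% threshold for flagging a question are my own assumptions, and the data is invented.

```python
from collections import Counter
from statistics import median

def item_quadrants(records):
    """Place each (student, item) pair in a quadrant.

    `records` is a list of (student_id, item_id, correct, seconds_on_page).
    Assumption: "high time" means above the median time for that item's page.
    """
    times_by_item = {}
    for _, item, _, seconds in records:
        times_by_item.setdefault(item, []).append(seconds)
    cutoffs = {item: median(times) for item, times in times_by_item.items()}

    placed = {}
    for student, item, correct, seconds in records:
        time_label = "high time" if seconds > cutoffs[item] else "low time"
        score_label = "correct" if correct else "missed"
        placed[(student, item)] = f"{score_label}, {time_label}"
    return placed

def flag_items(placed, threshold=0.4):
    """Return items where the share of 'missed, high time' meets the threshold."""
    totals = Counter(item for _, item in placed)
    worrying = Counter(
        item for (_, item), quadrant in placed.items()
        if quadrant == "missed, high time"
    )
    return sorted(i for i in totals if worrying[i] / totals[i] >= threshold)

# Invented records: (student, question, answered correctly?, seconds on page).
records = [
    ("s1", "Q4", False, 300), ("s2", "Q4", False, 280),
    ("s3", "Q4", True, 60),   ("s4", "Q4", True, 90),
    ("s1", "Q1", True, 200),  ("s2", "Q1", True, 50),
    ("s3", "Q1", True, 220),  ("s4", "Q1", False, 40),
]
placed = item_quadrants(records)
flagged = flag_items(placed)
print(flagged)
```

Note that `flag_items` only ever reports question identifiers, not student identifiers, which is in the spirit of handing the instructor a generic result rather than a list of who did what.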

In the end, though, I am not sure if you can get detailed enough to make this framework effective without diving deep into surveillance monitoring. Maybe the answer is putting the learner in control of these tools, and giving them the option of sharing the results with their instructor if they feel comfortable?

But, to be honest, I am probably not in the target audience for this tool. My idea of a well-designed course involves self-determined learning, learner autonomy, and space for social interaction (for those that choose to do so). I would focus on competencies rather than outcomes, with learners being able to tailor the competencies to their own needs. All of that makes assessment alignment very difficult.