Using Learning Analytics to Predict Cheating Has Been Going on for Longer Than You Think

Hopefully by now you have heard about the Dartmouth Medical School Cheating Scandal, where Dartmouth College officials used questionable methods to “detect” cheating in remote exams. At the heart of the matter is how College officials used click-stream data to “catch” so-called “cheaters.” Invasive surveillance was used to track students’ activity during the exams, officials used the data without really understanding it to make accusations, and then students were pressured to quickly react to the accusations without much access to the “proof.” Almost half of those accused (7 of 17 or 41%) have already had their cases dismissed (aka – they were falsely accused. Why is this not a criminal act?). Out of the remaining 10, 9 pleaded guilty, but 6 of those have now tried to appeal that decision because they feel they were forced to plead guilty. FYI – counting the 7 dismissed cases, that is 13 of 17, or 76%(!), that are claiming they were falsely accused. Only one of those six wanted to be named – the other 5 are afraid of reprisals from the College if they speak up.

That is intense. Something is deeply wrong with all of that.

The frustrating thing about all of this is that plenty of people have been trying to warn that this is a very likely outcome of Learning Analytics research studies that look to detect cheating from the data. Of course, this particular area of research focus is not a major aim of Learning Analytics in general, but several studies have been published through the years. I wanted to take a look at a few that represent the common themes.

The first study is a kind of pre-Learning Analytics paper from 2006 called “Detecting cheats in online student assessments using Data Mining.” Learning Analytics as a field is usually traced back to about 2011, but various aspects of it existed before that. You can even go back to the 1990s – Richard A. Schwier describes the concept of “tracking navigation in multimedia” (in the 1995 2nd edition of his textbook Instructional Technology: Past, Present, and Future – p. 124, Gary J. Anglin editor). Schwier really goes beyond tracking navigation into foreseeing what we now call Learning Analytics. So all of that to say: tracking students’ digital activity has a loooong history.

But I start with this paper because it contains some of the earliest ways of looking at modern data. The concerning thing with this study is that the overall goal is to predict which students are most likely to be cheating based on demographics and student perceptions. Yes – not only do they look at age, gender, and employment, but also a learner’s personality, social activities, and perceptions (did they think the professor was involved or indifferent? Did they find the test “fair” or not? etc).

You can see by the chart on p.207 that males with lower GPAs are mostly marked as cheating, while females with higher GPAs are mostly marked as not cheating. Since race is not considered in the analysis, systemic discrimination could create incredibly racist oppression from this method.

Even more problematic is the “next five steps to data mining databases,” with one step recommending the collection of “responses of online assessments, surveys and historical information to detect cheats in online exams.” This includes the clarification that:

  • “information from students must be collected from the historical data files and surveys” (hope you didn’t have a bad day in the past)
  • “at the end of each exam the student will be is asked for feedback about exam, and also about the professor and examination conditions” (hope you have a wonderful attitude about the test and professor)
  • “professor will fill respective online form” (hope the professor likes you and isn’t racist, sexist, transphobic, etc if any of that would hurt you).

Of course, one might say this is pre-Learning Analytics and the current field is only interested in predicting failure, retention, and other aspects like that. Not quite. Let’s look at the 2019 article “Detecting Academic Misconduct Using Learning Analytics.” The focus in this study is a bit more specific: they seek to use keystroke logging and clickstream data to tell if a student is writing an authentic response or transcribing a pre-written one (which is assumed to only be from contract cheating).

The lit review of this study also shows that this study is not the only one digging into this idea. The idea goes back several years through multiple studies.

While this study does not get to the same Minority Report-level concerns that the last one did, there are still some problematic issues here. First of all is this:

“Keystroke logging allows analysis of the fluency and flow of writing, the length and frequency of pauses, and patterns of revision behaviour. Using these data, it is possible to draw conclusions about students’ underlying cognitive processes.”

I really need to carve out some time to write about how you can’t use clickstream data of any kind to detect cognitive processes in any way, shape or form. Most people that read this blog know why this is true, so I won’t take the time now. But the Learning Analytics literature is full of people that think they can detect cognitive activities, processes, or presence through clickstream data… and that is just not possible.

The paper does address the difficulties in using keystroke data to analyze writing, but proposes analysis of clickstream data as a much better alternative. I’m not really convinced by the arguments they present – but the gist is they are looking to detect revision behaviors, because authentic writing involves pauses and deletions.

Except that is not really true for everyone. People that write a lot (like, say, by blogging) can get to a place where they can write a lot without taking many pauses. Or, if they really do know the material, they might not need to pause as much. On the other hand, the paper assumes that transcription of an existing document is a mostly smooth process. I know it is for some, but it is something that takes me a while.

In other words, this study relies on averages and clusters of writing activities (words added/deleted, bursts of writing activity, etc) to classify your writing as original or copied. Which may work for the average, but what about students with disabilities that affect how they write? What about people that just work differently than the average? What about people from various cultures that approach writing in a different way, or even those that have to translate what they want to write into English first and then write it down?

Not everyone fits so neatly into the clusters.

Of course, this study had a small sample size. Additionally, while they did collect demographic data and had students take self-regulated learning surveys, they didn’t use any of that in the study. The SRL data would seem to be a significant aspect to analyze here. Not to mention that they could have at least provided some details on the students who didn’t speak English as a primary language.

Now, of course, writing out essay exam answers is not common in all disciplines, and even when it is, many instructors will encourage learners to write out answers first and then copy them into the test. So these results may not concern many people. What about more common test types?

The last article to look at is “Identifying and characterizing students suspected of academic dishonesty in SPOCs for credit through learning analytics” from 2020. There are plenty of other studies to look at, but this post is already getting long. SPOC here means “Small Private Online Course”… a.k.a. “a regular online course.” The basic gist is that they are clustering students by how close their answers are to each other and how close their submission times are. If they get the exact same answers (including choosing the same wrong choice) and turn in their test at about the same time, they are considered “suspect of academic dishonesty.” It should also be pointed out that the Lit Review here also shows they are not the first or only people to be looking into this in the Learning Analytics realm.
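To make that rule a bit more concrete, here is a minimal Python sketch of the kind of pairwise check described above. To be clear, this is my own simplification and not the paper's actual algorithm: the `Submission` class, the `flag_suspect_pairs` function, the five-minute `max_gap` threshold, and the sample data are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import combinations


@dataclass
class Submission:
    student: str            # hypothetical student identifier
    answers: tuple          # chosen option for each question
    submitted_at: datetime  # when the test was turned in


def flag_suspect_pairs(submissions, answer_key, max_gap=timedelta(minutes=5)):
    """Flag pairs with identical answers, at least one wrong answer,
    and submission times within max_gap of each other."""
    flagged = []
    for a, b in combinations(submissions, 2):
        same_answers = a.answers == b.answers
        # If the answers are identical, any wrong answer is a shared wrong answer.
        has_wrong = any(x != k for x, k in zip(a.answers, answer_key))
        close_in_time = abs(a.submitted_at - b.submitted_at) <= max_gap
        if same_answers and has_wrong and close_in_time:
            flagged.append((a.student, b.student))
    return flagged


# Made-up data: s1 and s2 picked the same wrong option on the last question.
answer_key = ("B", "D", "A", "C")
submissions = [
    Submission("s1", ("B", "D", "A", "A"), datetime(2020, 5, 1, 9, 0)),
    Submission("s2", ("B", "D", "A", "A"), datetime(2020, 5, 1, 9, 3)),
    Submission("s3", ("B", "D", "A", "C"), datetime(2020, 5, 1, 9, 1)),
    Submission("s4", ("B", "C", "A", "C"), datetime(2020, 5, 1, 21, 0)),
]

flagged = flag_suspect_pairs(submissions, answer_key)
print(flagged)  # [('s1', 's2')]
```

Notice what the rule cannot see: it has no way to tell whether s1 and s2 worked together, or whether they just picked up the same misconception and happen to share a morning schedule. The data looks identical either way.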

The researchers are basically looking for students that meet together and give each other answers to the test. Which, yes – it is suspicious if you see students turn in all the same answers at about the same time and get the same grade. Which is why most students make sure to change up a few answers, as well as space out submissions. I don’t know if the authors of this study realized they probably missed most cheaters and just caught the ones not trying that hard.

Or… let me propose something else here. All students are trying to get the right answers. So there are going to be similarities. Sometimes a lot of students getting the same wrong answer on a question is seen as a problem to fix on the teaching side (it could have been taught wrong). Plus, students can have similar schedules – working the same jobs, taking the same other classes that meet in the morning, etc. It is possible that out of the 15 or so they flagged as “suspect,” 1 or 2 or even 3 just happened to get the same questions wrong and submit at about the same time as the others. They just had bad luck.

I’m not saying that happened to all, but look: you do have this funnel effect with tests like these. All of your students are trying to get the same correct answer and finish before the same deadline. So it’s quite possible there will be overlap that is very coincidental. Not for all, but isn’t it at least worth a critical examination if even a small number of students could get hurt by coincidentally turning in their test at the same time others are?

(This also makes a good case for ungrading, authentic assessment, etc.)

Of course, the “suspected” part gets dropped by the end of the paper: “We have applied this method in a for credit course taught in Selene Unicauca platform and found that 17% of the students have performed academic dishonest actions, based on current conservative thresholds.” How did they get from “suspected” to “have performed?” Did they talk to the students? Not really. They looked at five students and felt that there was no way their numbers could be anything but academic dishonesty. Then they talked to the instructor and found that three students had complained about low grades. The instructor looked at their tests, found they had the exact same wrong answers, and… case closed.

This is why I keep saying that Learning Analytics research projects should be required to have an instructional designer or learning research expert on the team. I can say after reviewing course results for decades that it is actually common for students to get the same wrong answers and be upset about it because they were taught wrong. Instructors and Instructional Designers do make mistakes, so always find out what is going on. It’s also possible that there was a conversation weeks ago where one student with the wrong information spread that information to several students when discussing the class. It happens.

But this is what happens when you don’t investigate fully and assume the data is all you need. Throwing in a side of assuming that cheaters act a certain way certainly goes a long way as well. So you can see a direct line from assumptions made about personality and demographics of who cheaters are, to using clickstream data to know what is going on in the brain, to assuming the data is all you need…. all the way to the Dartmouth Medical School scandal. Where there is at least a 41%-76% false accusation rate currently.
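For anyone who wants to check the math on that 41%-76% range, here is a quick Python sketch using the counts reported at the top of this post. The variable names are mine, and the upper figure assumes (as I did above) that every student who is appealing is counted as claiming a false accusation.

```python
# Counts as reported at the top of this post.
accused = 17
dismissed = 7        # cases already dismissed
pleaded_guilty = 9   # of the remaining 10
appealing = 6        # of those 9, now appealing

remaining = accused - dismissed

# Lower bound: only the dismissed cases.
dismissed_rate = dismissed / accused
# Upper bound: dismissed cases plus those appealing their plea.
disputed_rate = (dismissed + appealing) / accused

print(f"Dismissed: {dismissed_rate:.0%}")               # Dismissed: 41%
print(f"Dismissed or appealing: {disputed_rate:.0%}")   # Dismissed or appealing: 76%
```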

The Problem of Learning Analytics and AI: Empowering or Resistance in the Age of “AI”

So where to begin with this series I started on Learning Analytics and AI? The first post started with a basic and over-simplified view of the very basics. I guess the most logical place to jump to is… the leading edge of the AI hype? Well, not really… but there is an event in that area happening this week, so I need to go there anyways.

I was a bit surprised that the first post got some attention – thank you to those that read it. Since getting booted out of academia, I have been unsure of my place in the world of education. I haven’t really said much publicly or privately, but it has been a real struggle to break free from the toxic elements of academia and figure out who I am outside of that context. I was obviously surrounded by people that weren’t toxic, and I still adjunct at a university that I feel supports its faculty… but there were still other systemic elements that affect all of us that are hard to process once you are gone.

So, anyway, I just wasn’t sure if I could still write anything that made a decent point, and I wasn’t too confident I did that great of a job writing about such a complex topic in a (relatively) short blog post last time. Maybe I didn’t, but even a (potentially weak) post on the subject seems to resonate with some. Like I said in the last post, I am not the first to bring any of this up. In fact, if you know of any article or post that makes a better point than I do, please feel free to add it in the comments.

So, to the topic at hand: this week’s Empowering Learners in the Age of AI conference in Australia. My concern with this conference is not with who is there – it seems to be a great group of very knowledgeable people. I don’t know some of them, but many are big names in the field that know their stuff. What sticks out to me is who is not there, as well as how AI is being framed in the brief descriptions we get. But neither of those points is specific to this conference. In fact, I am not really looking at the conference as much as some parts of the field of AI, with the conference just serving as proof that the things I am looking at are out there.

So first of all, to address the name of the conference. I know that “empowering learners” is a common thing to say not just in AI, but education in general. But it is also a very controversial and problematic concept. This is one concern that I hang on all of education and even myself, as I like the term “empower” as well. No matter what my intentions (or anyone else’s), the term still places the institution and the faculty as the center of the power in the learning process – there to decide whether the learners get to be empowered or not. One of the best posts on this topic is by Maha Bali: The Other Side of Student Empowerment in a Digital World. At the end of the post, she gets to some questions that I want to ask of the AI field, including these key ones:

“In what ways might it reproduce inequality? How participatory has the process been? How much have actual teachers and learners, especially minorities, on the ground been involved in or consulted on the design, implementation, and assessment of these tools and pedagogies?”

I’ll circle back to those throughout the post.

Additionally, I think we should all question the “Age of AI” and “AI Society” part. It is kind of complicated to get into what AI is and isn’t, but the most likely form of AI we will see emerge first is what is commonly called “Artificial General Intelligence” (AGI), which is a deceptive way of saying “pretending to act like humans but not really be intelligent like we are.” AGI is really a focus on creating something that “does” the same tasks humans can, which is not what most people would attribute to an “Age of AI” or “AI Society.” This article on Forbes looks at what this means, and how experts are predicting that we are 10-40 years away from AGI.

Just as an FYI, I remember reading in the 1990s that we were 20-40 years away from AGI then as well.

So we aren’t near an Age of AI, probably not in many of our lifetimes, and even the expert opinions may not end up being true. The Forbes article fails to mention that there were many problems with the work that claimed to be able to determine sexuality from images. In fact, there is a lot to be said about differentiating AI from BS that rarely gets brought up by the AI researchers themselves. Tristan Greene best sums it up in his article about “How to tell the difference between AI and BS”:

“Where we find AI that isn’t BS, almost always, is when it’s performing a task that is so boring that, despite there being value in that task, it would be a waste of time for a human to do it.”

I think it would have been more accurate to say you are “bracing learners for the age of algorithms” than empowering for an age of AI (that is at least decades off but may never actually happen according to some). But that is me, and I know there are those that disagree. So I can’t blame people for being hopeful that something will happen in their own field sooner than it might in reality.

Still, the most concerning thing about the field of AI is who is not there in the conversations, and the Empowering Learners conference follows the field – at least from what I can see on their website. First of all, where are the learners? Is it really empowering for learners when you can’t really find them on the schedule or in the list of speakers and panelists? Why is their voice not up front and center?

Even bigger than that is the problem that has been highlighted this week – but one that has been there all along:

The specific groups she is referring to are BIPOC, LGBTQA, and Disabilities. We know that AI has discrimination coded into it. Any conference that wants to examine “empowerment” will have to make justice front and center because of long existing inequalities in the larger field. Of course, we know that different people have different views of justice, but “empowerment” would also mean each person that faces discrimination gets to determine what that means. It’s really not fair to hold a single conference accountable for issues that long existed before the conference did, but by using the term “empowerment” you are holding yourself to a pretty big standard.

And yes, “empowerment” is in quotes because it is a problematic concept here, but it is the term the field of AI and really a lot of the world of education uses. The conference web page does ask “who needs empowering, why, and to do what?” But do they mean inequality? And if so, why not say it? There are hardly any more mentions of this question after it is brought up, much less anything connecting the question to inequality, in most of the rest of the program. Maybe it will be covered in the conference – it is just not very prominent at all as the schedule stands. I will give them the benefit of the doubt until after the conference happens, but if they do ask the harder questions, then they should have highlighted that more on the website.

So in light of the lack of direct reference to equity and justice, the concept of “empowerment” feels like it is taking on the role of “equality” in those diagrams that compare “equality” with “equity” and “justice”:

Equality vs equity vs justice diagram
(This adaption of the original Interaction Institute for Social Change image by Angus Macguire was found on the Agents of Good website. Thank you Alan Levine for helping me find the attribution.)

If you aren’t going to ask who is facing inequalities (and I say this looking at the fields of AI, Learning Analytics, Instructional Design, Education, all of us), then you are just handing out empowerment the same to all. Just asking “who needs empowering, why, and to do what?” doesn’t get to critically examining inequality.

In fact, the assumption is being made by so many people in education that you have no choice but to utilize AI. One of the best responses to the “Equality vs Equity vs Justice” diagrams has come from Bali and others: what if the kids don’t want to play soccer (or eat an apple or catch a fish or whatever else is on the other side of the fence in various versions)?

Resistance is a necessary aspect of equity and justice. To me, you are not “empowering learners” unless you are teaching them how to resist AI itself first and foremost. But resistance should be taught to all learners – even those that “feel they are safe” from AI. This is because 1) they need to stand in solidarity with those that are the most vulnerable, to make sure the message is received, and 2) they aren’t as safe as they think.

There are many risks in AI, but are we really taking the discrimination seriously? In the linked article, Princeton computer science professor Olga Russakovsky said

“A.I. researchers are primarily people who are male, who come from certain racial demographics, who grew up in high socioeconomic areas, primarily people without disabilities. We’re a fairly homogeneous population, so it’s a challenge to think broadly about world issues.”

Additionally, (now former) Google researcher Timnit Gebru said that scientists like herself are

“some of the most dangerous people in the world, because we have this illusion of objectivity.”

Looking through the Empowering Learners event, I don’t see that many Black and Aboriginal voices represented. There are some People of Color, but not nearly enough considering they would be the ones most affected by discrimination that would impede any true “empowerment.” And where are the experts on harm caused by these tools, like Safiya Noble, Chris Gilliard, and many others? The event seems weighted towards those voices that would mostly praise AI, and it is a very heavily white set of voices as well. This is the way many conferences are, including those looking at education in general.

Also, considering that this is in Australia, where are the Aboriginal voices? It’s hard to tell on the schedule itself. I did see on Twitter that the conference will start with an Aboriginal perspective. But when is that? In the 15 minute introductory session? That is nowhere near enough time for that. Maybe they are elsewhere on the schedule and just not noted well enough to tell. But why not make that a prominent part of the event rather than part of a 15 minute intro (if that is what it is)?

There are some other things I want to comment on about the future of AI in general:

  • The field of AI is constantly making references to how AI is affecting and improving areas such as medicine. I would refer you back to the “How to tell the difference between AI and BS” article for much of that. But something that worries me about the entire AI field talking this way is that they are attributing “artificial intelligence” to things that boil down to advanced pattern recognition mainly using human intelligence. Let’s take, for example, recognizing tumors in scans. Humans program the AI to recognize patterns in images that look like tumors. Everything that the AI knows to look for comes directly from human intelligence. Just because you can then get the algorithm to repeat what the humans programmed it to do thousands of times per hour, that doesn’t make it intelligence. It is human intelligence pattern recognition that has been digitized, automated, and repeated rapidly. This is generally what is happening with AI in education, defense, healthcare, etc.
  • Many leaders in education in general like to say that “institutions are ill-prepared for AI” – but how about how ill-prepared AI is for equity and reality?
  • There is also often talk in the AI community about building trust between humans and machines, and we see examples of that at the conference as well: “can AI truly become a teammate in group learning or a co-author of a ground-breaking scientific discovery?” I don’t know what the speaker plans to say, but the answer is no. No we shouldn’t build trust and no we shouldn’t anthropomorphize AI. We should always be questioning it. But we also need to be clear, again, that AI is not the one that is writing (or creating music or paintings). This is the weirdest area of AI – they feed a bunch of artistic or music or literary patterns into AI, tell it how to assemble the patterns, and when something comes out it is attributed to AI rather than the human intelligence that put it all together. Again, the machine being able to repeat and even refine what the human put there in the first place is not the machine creating it. Take, for example, these different AI generated music websites. People always send these to me and say “look how well the machine put together ambient or grindcore music or whatever.” Then I listen… and it is a mess. They take grindcore music and chop it up into bits and then run those bits through pattern recognition and spit out this random mix – that generally doesn’t sound like very good grindcore. Ambient music works the best to uninitiated ears, but to fans of the music it still doesn’t work that great.
  • I should also point out about the conference that there is a session on the second day that asks “Who are these built for? Who benefits? Who has the control?” and then mentions “data responsibility, privacy, duty of care for learners” – which is a good starting point. Hopefully the session will address equity, justice, and resistance specifically. The session, like much of the field of AI, rests on the assumption that AI is coming and there is nothing you can do to resist it. Yes the algorithms are here, and it is hard to resist – but you still can. Besides, experts are still saying 10-40 years for the really boring stuff to emerge as I examined above.
  • I also hope the conference will discuss the meltdown that is happening in AI-driven proctoring surveillance software.
  • I haven’t gotten much into surveillance yet, but yes all of this relies on surveillance to work. See the first post. Watch the Against Surveillance Teach-In Recording.
  • I was about to hit publish on this when I saw an article about a Deepfake AI Santa that you can make say whatever you want. The article says “It’s not nearly as disturbing as you might think”… but yes, it is. Again, people saying something made by AI is good and realistic when it is not. The Santa moves and talks like a robot with zero emotion. Here again, they used footage of a human actor and human voice samples and the “AI” is an algorithm that chops it up into the parts that make your custom message. How could this possibly be misused?
  • One of the areas of AI that many in the field like to hype is “conversational agents” aka chatbots. I want to address that as well since that is an area that I have (tried) to research. The problem with researching agents/bots is that learners just don’t seem to be impressed with them – it’s just another thing to them. But I really want to question how these count as AI after having created some myself. The process for making a chatbot is that you first organize a body of information into chunks of answers or statements that you want to send as responses. You then start “training” the AI to connect what users type into the agent (aka “bot”) with specific statements or chunks of information. The AI makes a connection and sends the statement or information or next question or video or whatever it may be back to the user. But the problem is, the “training” is you guessing dozens of ways that the person might ask a question or make a statement (including typos or misunderstandings) that matches with the chunk of information you want to send back. You literally do a lot of the work for the AI by telling it all the ways someone might type something into the agent that matches each chunk of content. They want at least 20 or more. What this means is that most of the time, when you are using a chatbot, it gives you the right answer because you typed in one of the most likely questions that a human guessed and added to the “training” session. In the rare cases where someone types something a human didn’t guess, then the Natural Language Processing kicks in to try and guess the best match. But even then it could be a percentage of similar words more than “intelligence.” So, again, it is human intelligence that is automated and re-used thousands of times a minute – not something artificial that has a form of intelligence.
Now, this might be useful in a scenario when you have a large body of information (like an FAQ bot for the course syllabus) that could use something better than a search function. Or maybe a branching scenarios lesson. But it takes time to create a good chatbot. There is still a lot of work and skill to creating the questions and responses well. But to use chatbots for a class of 30, 50, 100? You probably will spend so much time making it that it would be easier to just talk to your students.
  • Finally, please know that I realize that what I am talking about still requires a lot of work and intelligence to create. I’m not doubting the abilities of the engineers and researchers and others that put their time into developing AI. I’m trying to get at the pervasive idea that we are in an Age of AI that can’t be avoided. It’s a pervasive idea that was even made in a documentary web series a year ago. I also question whether “artificial intelligence” is the right term for all of this, rather than something more accurate like “automation algorithms.”
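To make the chatbot “training” process from the list above a bit more concrete, here is a minimal Python sketch. This is not how any particular chatbot platform is implemented. The `INTENTS` dictionary, the `reply` function, the sample phrases and responses, and the 0.5 word-overlap threshold are all hypothetical, and the simple word-overlap score is just a stand-in for whatever Natural Language Processing a real product uses.

```python
import re

# Human-written "training": guessed phrasings mapped to a chunk of content.
# (A real bot would want 20 or more phrasings per chunk.)
INTENTS = {
    "due_dates": {
        "phrases": [
            "when is the assignment due",
            "what is the due date",
            "when is it due",
        ],
        "response": "Assignments are due Sundays at 11:59pm.",
    },
    "office_hours": {
        "phrases": [
            "when are office hours",
            "how do i meet with the instructor",
        ],
        "response": "Office hours are Tuesdays 2-4pm.",
    },
}


def words_in(text):
    """Lowercase the text and return its words in order, without punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def reply(user_text, min_overlap=0.5):
    """Return (response, how_it_matched) for what the user typed."""
    cleaned = " ".join(words_in(user_text))

    # 1) Most of the time: the user typed something a human already guessed.
    for intent in INTENTS.values():
        if cleaned in intent["phrases"]:
            return intent["response"], "exact"

    # 2) Otherwise: pick the phrase sharing the highest fraction of words.
    typed = set(words_in(user_text))
    best_score, best_response = 0.0, None
    for intent in INTENTS.values():
        for phrase in intent["phrases"]:
            known = set(words_in(phrase))
            union = typed | known
            score = len(typed & known) / len(union) if union else 0.0
            if score > best_score:
                best_score, best_response = score, intent["response"]

    if best_score >= min_overlap:
        return best_response, "overlap"
    return "Sorry, I don't know that one.", "fallback"


print(reply("When is the assignment due?"))  # matched a human-guessed phrase
print(reply("when is my assignment due"))    # matched on shared words
print(reply("can I bring my dog"))           # nothing close enough
```

Every response the bot can give, and nearly every question it can recognize, was typed in by a person ahead of time. The “guessing” step is just counting shared words.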

Again, everything I touch on here is not as much about this conference, as it is about the field of AI since this conference is really just a lot of what is in the AI field concentrated into two days and one website. The speakers and organizers might have already planned to address everything I brought up here a long time ago, and they just didn’t get it all on the website. We will see – there are some sessions with no description and just a bio. But still, at the core of my point, I think that educators need to take a different approach to AI than we have so far (maybe by not calling it that when it rarely is anything near intelligent) by taking justice issues seriously. If the machine is harming some learners more than others, the first step is to teach resistance, and to be successful in that all learners and educators need to join in the resistance.

The Problem of Learning Analytics and AI

For some time now, I have been wanting to write about some of the problems I observed during my time in the Learning Analytics world (which also crosses over into Artificial Intelligence, Personalization, Sentiment Analysis, and many other areas as well). I’m hesitant to do so because I know the pitchforks will come out, so I guess I should point out that all fields have problems. Even my main field of instructional design is far from perfect. Examining issues within a field is (or should be) a healthy part of the growth of a field. So this will probably be a series of blog posts as I look at publications, conferences, videos, and other aspects of the LA/PA/ML/AI etc world that are in need of a critical examination. I am not the first or only person to do this, but I have noticed a resistance by some in the field to consider these viewpoints, so hopefully adding more voices to the critical side will bring more attention to these issues.

But first I want to step back and start with the basics. At the core of all analytics, machine learning, AI, etc are two things: surveillance and algorithms. Most people wouldn’t put it this way, but let’s face it: that is how it works. Programs collect artifacts of human behavior by looking for them, and then process those through algorithms. Therefore, the core of all of this is surveillance and algorithms.

At the most basic level, the surveillance part is a process of downloading a copy of data from a database that was intentionally recording data. That data is often a combination of click-stream data, assignment and test submissions, discussion forum comments, and demographic data. All of this is surveillance, and in many cases this is as far as it goes. A LOT of the learning analytics world is based on click stream data, especially with an extreme focus on predictive analytics. But in a growing number of examples, there are also more invasive forms of surveillance added that rely on video recordings, eye and motion detection, bio-metric scans, and health monitoring devices. The surveillance is getting more invasive.

I would also point out that none of this is accidental. People in the LA and AI fields like to say that digital things “generate” data, as if it is some unintentional by-product of being digital: “We turned on this computer, and to our surprise, all this data magically appeared!”

Data has to be intentionally created, extracted, and stored to exist in the first place. In fact, there usually is no data in any program until programmers decide they need it. They will then create a variable to store that data for use within the program. And this moment is where bias is introduced. The reason why certain data – like names, for example – are collected and others aren’t has to do with a bias towards controlling who has access and who doesn’t. Then that variable is given a name – it could be “XD4503” for all the program cares. But to make it easier for programmers to work together, they create variable names that can be understood by everyone on the team: “firstName,” “lastName,” etc.

Of course, this designation process introduces more bias. What about cultures that have one name, or four names? What about those that have two-part names, like the “al” that is common in Arabic names, but isn’t really used for alphabetizing purposes? What about cultures that use their surname as their first name? What about random outliers? When I taught eighth grade, I had two students that were twins, and their parents gave them both nearly identical sets of five names. The only difference between the two was that the third name was “Jevon” for one and “Devon” for the other. So much of the data that is created – as well as how it is named, categorized, stored, and sorted – is biased towards certain cultures over others.
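To make the naming problem concrete, here is a minimal Python sketch. The `TwoSlotName` record, the `split_name` helper, and the sample names are all hypothetical and made up for illustration; the only thing borrowed from the discussion above is the idea of a two-slot firstName / lastName design.

```python
from dataclasses import dataclass


@dataclass
class TwoSlotName:
    # Hypothetical record mirroring a "firstName" / "lastName" schema.
    first_name: str
    last_name: str


def split_name(full_name: str) -> TwoSlotName:
    """Force any name into two slots, the way a rigid schema would."""
    parts = full_name.split()
    if len(parts) == 1:
        # A person with one name is given an empty "last name" they never had.
        return TwoSlotName(first_name=parts[0], last_name="")
    # Everything between the first and last word is silently dropped.
    return TwoSlotName(first_name=parts[0], last_name=parts[-1])


# Made-up sample names, chosen only to show the failure modes.
samples = [
    "Maria Lopez",
    "Madhu",
    "Ana Maria Jevon Lee Park",
    "Ana Maria Devon Lee Park",
]
records = [split_name(name) for name in samples]
```

In this sketch the single-name record ends up with a blank field, and the two five-name records become identical once the middle names are dropped. Nothing about the people changed; the schema decided what counted.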

Also note here that there is usually nothing that causes this data to leave the program utilizing it. In order for some outside process or person to see this data, programmers have to create a method for displaying and / or storing that data in a database. Additionally, any click stream, video, or bio-metric data that is stored has to be specifically and intentionally captured in ways that can be stored. For example, a click in itself is really just an action that makes a website execute some function. It disappears after that function happens – unless someone creates a mechanism for recording what was clicked on, when it was clicked, what user was logged in to do the click, and so on.
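A small Python sketch of that difference, under stated assumptions: the `click_log` list stands in for a database, and the function names, page names, and user ID are hypothetical. It is not how any particular learning platform works, just an illustration that the record only exists because someone wrote the code that creates it.

```python
from datetime import datetime, timezone

# Hypothetical in-memory "database"; nothing is stored unless code writes here.
click_log: list[dict] = []


def open_page(page: str) -> str:
    """Handle a click: do the work and keep no record of it."""
    return f"showing {page}"


def open_page_with_tracking(page: str, user_id: str) -> str:
    """Same click, but someone chose to record what, who, and when."""
    click_log.append(
        {
            "target": page,
            "user": user_id,
            "clicked_at": datetime.now(timezone.utc).isoformat(),
        }
    )
    return open_page(page)


open_page("syllabus")                            # leaves no trace
open_page_with_tracking("quiz-1", "student-42")  # leaves one stored row
```

Both calls do the same thing for the person clicking. Only the second one produces data, and every field in that row is there because a programmer decided it should be.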

All of this to say that none of this is coincidental, accidental, or unplanned. There is a specific plan and purpose for every piece of data that is created and collected outside of the program utilizing it. None of the data had to be collected just because it was magically “there” when the digital tools were turned on. The choice was made to create the data through surveillance, and then store it in a way that it could be used – perpetually if needed.

Therefore, different choices could be made to not create and collect data if the people in control wanted it that way. It is not inevitable that data has to be generated and collected.

Of course, most of the few people that will read this blog already know all of this. The reason I state this all here is for anybody that might still be thinking that the problems with analytics and AI are created during the design of the end user products. For example, some believe that the problems that AI proctoring has with prejudice and discrimination started when the proctoring software was created… but really this part is only the continuation of problems that started when the data that these AI systems utilized was intentionally created and stored.

I think that the basic fundamental lens or mindset or whatever you want to call it for publishing research or presenting at conferences about anything from Learning Analytics to AI has to be a critical one rooted in justice. We know that surveillance and algorithms can be racist, sexist, ableist, transphobic, and the list of prejudices goes on. Where people are asking the hard questions about these issues, that is great. Where the hard questions seem to be missing, or people are not digging deep enough to see the underlying biases as well, I want to blog about it. I have also noted that the implementation of LA/ML/AI tools in education too often lacks input from the instructional design / learning sciences / etc fields – so that will probably be in the posts as well.

While this series of posts is not connected to the Teach-In Against Surveillance, I was inspired to get started on this project based on reflecting on why I am against surveillance. Hopefully you will join the Teach-In tomorrow, and hopefully I will get the next post on the Empowering Learners for the Age of AI conference written in this lifetime. :)

Is Learning Analytics Synonymous with Learning Surveillance, or Something Completely Different?

It all started off simply enough. Someone saw a usage of analytics that they didn’t like, and thought they should speak up and make sure that this didn’t cross over into Learning Analytics:

The responses of “Learning Analytics is not surveillance” came pretty quickly after that:

[tweet 1187857679206637568 hide_thread=’true’]

But some disagreed with the idea, feeling they are very, very similar:

[tweet hide_thread=’true’]

(a couple of protected accounts that I can’t really embed here did come out and directly say they see Learning Analytics and Learning Surveillance as the same thing)

I decided to jump in the conversation and ask some questions about the difference between the two, and see if anyone could give definitions of the two that explained their difference, or perhaps prove they are the same.

My main point was that there is a lot of overlap between the two ideas. Both Learning Analytics and Learning Surveillance collect a lot of student data (grades, attendance, click stream, demographics, etc). If you look at the dictionary definition of surveillance (“close watch kept over someone or something (as by a detective)”), the overlap between the two only grows. Both rely on the collection of data to detect, keep watch, and predict future outcomes, all under the banner of being about the learning itself. Both Learning Analytics researchers and Learning Surveillance companies claim they do their work for the greater good of helping us to understand and optimize learning itself and/or the environments we learn in. The reality is that all surveillance (learning or otherwise) is now based on data that has been analyzed. If we don’t define the difference between Learning Analytics and Learning Surveillance, then the surveillance companies will continue to do what they want with Learning Analytics. Just saying “they are not the same” or “they are the same” without providing quantitative definitions of how they are or are not the same is not enough.

It seems that the questions that were raised in replies to my thread showcase how there is not a clear consensus on many aspects of this discussion. Some of the questions raised that need to be acknowledged and hashed out include:

  1. What counts as data, especially throughout the history of education?
  2. What exactly counts as surveillance and what doesn’t?
  3. Is surveillance an inherently evil, oppressive thing; a neutral force that completely depends on the answer to other questions here; or a benign overall positive force in society? Who defines this?
  4. Does the purpose of data collection (which is driven by who has access to it and who owns it) determine its category (analytics or surveillance)?
  5. Does the intent of those collecting data determine its category?
  6. Does consent change the nature of what is happening?
  7. Is Learning Analytics the same as, similar in some ways but not others to, or totally different from Learning Surveillance?
  8. What do we mean by the word “learning” in Learning Analytics?
  9. Are the benefits of Learning Analytics clear? Who gets to determine what is a “benefit” or not, or what counts as “clear”?

I am sure there are many other questions (feel free to add in the comments). But let’s dig into each of these in turn.

The Historical Usage of Data in Education

There have been many books and papers written on the topic of what data is, but I got the sense that most people recognize that data has been used in education for a long time. Many took issue with equating Learning Analytics with collecting one data point:

This is a good point. Examining one data factor falls well short of any Learning Analytics goal I have ever read. Seeing that certain data points such as grades, feedback, attendance, etc have always been used in education, at what point or level does the historically typical analysis of information about learners become big data or Learning Analytics? If someone is just looking at one point of data, or they are looking at a factor related to the educational experience but not at learning itself, do we count it as “Learning Analytics”? If not, at what point does statistical information cross the line into becoming data that can be analyzed? How many different streams of data does one have to analyze before it becomes learning analytics? How close does the data have to be to the actual educational process to be considered Learning Analytics (or something else)? Does Learning Analytics even really ever look at actual learning? (more on that last one later)

What is Surveillance Anyways?

It seems there is a range of opinions on this, from surveillance meaning only specific methods of governmental oppression, to the very broad general definition in various dictionaries. Some would say that if you make your data collection research (collected in aggregate, de-identified, and protected by researchers), then it is not surveillance. Others say that analytics requires surveillance. Others take those ideas in a different direction:

I don’t know if I would ever go that far (and if you know George, this is not his definitive statement on the issue. I think.), or if I even feel the dictionary definition is the most helpful in this case. But you also can’t disagree with Merriam-Webster, right? Still, there are some bigger questions about what exactly is the line between surveillance and other concepts:

[tweet 1188147893246410752 hide_thread=’true’]

Oversight, supervision, corporate interest, institutional control, etc… don’t they all affect where we draw the line between analytics and surveillance (if we even do)? Or even deeper still….

Is All Surveillance Evil?

In some corners, there seems to be an assumption that all surveillance is evil. Some even equate it with oppression and governmental control. However, if that is what everyone thinks of the idea, then why do grocery stores and hotels and other businesses blatantly post signs that say “Surveillance in Progress”? My guess is that this shows there are a lot of people that don’t see it as automatically bad, and even more that don’t care that it is happening. Or do they really not care, or just think there is nothing they can do about it? Either way, these signs would be a PR disaster for the companies if there was consensus that all surveillance is evil. Then again, I’m not so sure many would be so accepting of surveillance if we really knew all of the risks.

However, many do see surveillance as evil. Or at least, something that has gone too far and needs to stop:

But taking attendance and tracking bathroom breaks for points are two different things, right? So does that mean that…

Does the Purpose of Data Collection Change Anything?

Many people pointed out that the purpose for why data was collected would change whether we label the actions “Learning Analytics” or “Learning Surveillance.” Of course, the purpose of data collection is also driven by who has access to the data, who owns it, and what they need the data for (control? make money? help students? All of the above?). There is sometimes this assumption that research always falls into the “good” category, but that would ignore the history of why we have IRBs in the first place. Research can still cause harm even with the best of intentions (and not everyone has the best of intentions). This is the foundation of why we do the whole IRB thing, and that is not a perfect system. But the bigger view is that research is all about detective work, watching others closely to see what is going on, etc. Bringing the whole “purpose” angle into the debate will just cause the definition of Learning Analytics to move closer to the dictionary definition of surveillance.

On the other hand, a properly executed research project does keep the data in the hands of the researchers – and not in the hands of a company that wants to monetize the data analysis. Does the presence of a money making purpose cross the line from analytics to surveillance? Maybe in the minds of some, but this too causes confusion in that some analytics researchers are making sellable products from their research. They may not be monetizing the product itself, but they may sell their services to help people use the tools. And it’s not wrong to sell your expertise on something you created. But many see a blurry line there. Purpose does have an effect, but not always a clear cut or easy to define one. Plus, some would point out that purpose is not as important as your intentions…

The Road to Surveillance is Paved With Good Intentions

Closely related to purpose is intent – both of which probably influence each other in most cases. While some may look at this as a clear-cut issue of “good” intentions versus “bad” intentions, I don’t personally see that as the reality of how people view themselves. Most companies view themselves as doing a good thing (even if they have to justify some questionable decisions). Most researchers see themselves as doing a good thing with their research. But we have IRBs and government regulation for a reason. We still have to check the intentions of researchers and businesses all the time.

But even beyond that – who gets to determine which intentions are good and which aren’t? Who gets to say what intentions still cause harm and which ones don’t? The people with the intentions, or the people affected by the intentions? What if there are different views among those that are affected? Do analytics researchers or surveillance companies get to choose who they listen to? Or if they listen at all? And are the lines between “harmful good intention” and “positive results of intention” even that clear? Where do we draw the line between harm and okay?

Some would say that the best way to deal with possibly harmful good intentions is to get consent….

Does the Line Between Analytics and Surveillance Change Due to Consent?

Some say one of the lines between Learning Surveillance and Learning Analytics is created by consent. Learning Analytics is research, and ethical research cannot happen without consent.

[tweet 1188124784942551040 hide_thread=’true’]

Of course, the surveillance companies would come back and point to User Agreements and Terms of Service. So they are okay with consent, right?

Well, no. Who really reads the Terms of Service, anyways? Besides, they typically don’t clearly spell out what they do with your data anyways, right?

While this is often true, we see the same problem in research. We often don’t spell out the full picture for research participants, and then don’t bother to check to see if they really read the Informed Consent document or not. To be honest, consent in research as well as agreement with Terms of Service is more of a rote activity than a true consent process. We are really fooling ourselves if we think these processes count as consent. They really count more as a legal “covering the buttocks” than anything else.

Of course, many would point out that Learning Surveillance is often decided at the admin level and forced on all students as a condition of participating in the institution. And sadly, this is often the case. Since research is always (supposed to be) voluntary, there is some benefit to Informed Consent over Terms of Service, even if both are imperfect. But after all of this…

So, For Real, What is the Difference Between Analytics and Surveillance?

I think some people see the difference as:

Learning Analytics: informed consent, not monetized, intending to help education/learners, based on multiple data points that have been de-identified and aggregated.

Learning Surveillance: minimal consent sought from end users (forced by admin even), monetized, intending to control learners, typically focused on fewer data points that can identify individuals in different ways.

…or, something like that. But as I have explored above, this is not always the clear-cut case. Learning Analytics is sometimes monetized. Learning Surveillance often sells itself as helping learners more than controlling. De-identified data can be re-identified easier and easier as technology advances. Learning Surveillance can utilize a lot of data points, while some Learning Analytics studies focus in on a very small number. Both Learning Analytics and Learning Surveillance have consent systems that are full of problems. Learning Analytics can be used to control rather than help. And so on.
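The re-identification point can be illustrated with a toy Python sketch. Everything in it is an assumption for illustration: the two data sets, the field names, and the `reidentify` helper are hypothetical, and real linkage attacks are more sophisticated than an exact match. The idea is only that a “de-identified” export can be matched back to named people when it keeps fields that also appear somewhere else.

```python
# Hypothetical "de-identified" course export: names removed, other fields kept.
deidentified = [
    {"zip": "75001", "birth_year": 1990, "major": "Biology", "quiz_avg": 62},
    {"zip": "75001", "birth_year": 1988, "major": "History", "quiz_avg": 91},
]

# Hypothetical second data set that still carries names (e.g. a directory).
directory = [
    {"name": "Student A", "zip": "75001", "birth_year": 1990, "major": "Biology"},
    {"name": "Student B", "zip": "75001", "birth_year": 1988, "major": "History"},
    {"name": "Student C", "zip": "75002", "birth_year": 1990, "major": "Biology"},
]

# Fields that are not names but can still single a person out in combination.
QUASI_IDENTIFIERS = ("zip", "birth_year", "major")


def reidentify(anon_rows, named_rows, keys=QUASI_IDENTIFIERS):
    """Attach a name to each anonymous row that matches exactly one named row."""
    matches = {}
    for row in anon_rows:
        candidates = [
            person["name"]
            for person in named_rows
            if all(person[k] == row[k] for k in keys)
        ]
        if len(candidates) == 1:
            matches[candidates[0]] = row["quiz_avg"]
    return matches


linked = reidentify(deidentified, directory)
```

In this made-up case, both “anonymous” quiz averages end up attached to a name without any name ever being stored in the export.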

And we haven’t even touched on the problem of Learning Analytics not really even analyzing actual “learning” itself…

Learning Analytics or Click Stream Analytics?

Much of the criticism of Learning Surveillance focuses on how these tools and companies seek to monitor and control learning environments (usually online), while having very little effect on the actual learning process. A fair point, one that most Surveillance companies try to downplay with research of their own. That’s not really an admission of guilt as much as it is just the way the Ed-Tech game goes: any company that wants to sell a product to any school is going to have to convince the people with the money that there is a positive effect on learning. Somehow.

But does Learning Analytics actually look at learning itself?

[tweet 1188243487071834112 hide_thread=’true’]

So while Learning Analytics does often get much closer to examining actual learning than Learning Surveillance usually does, it is generally still pretty far away. But so is most of educational research, to be honest. It is not possible yet to tap into brains and observe actual learning in the brain. And a growing number of Learning Analytics papers are taking into account the fact that they are looking at artifacts or traces of learning activities, not the learning activities themselves or the actual learning process.

However, the distinction that “Analytics is looking at learning itself” and “Surveillance is looking at factors outside of learning” still comes apart to some degree when you look at what is really happening. Both of them are examining external traces or evidence of internal processes. This leaves us with the idea that there has to be a clear benefit to one or the other if there is a true difference between the two….

What is Clear and What is a Benefit Anyways?

Through the years, I have noticed that many say that the benefits of analytics and/or surveillance are clear. The problem is, who gets to say they are clear, or that they are beneficial? All kinds of biases have been found in data and algorithms. If you are a white male, there are fewer risks of bias against you… so you may see the benefits as clear. To those that see a long history of bias being programmed into the systems… not so much. Is it really a “benefit” if it leaves out large parts of society because a bias was hard-coded into the system?

Where some people see benefits of analytics, others see reports tailored for upper level admin that tell them what we already know from research. Having participated in a few Learning Analytics research projects myself, I know that it takes a lot of digging to find results, and then an even longer time to explain to others what is there. And then, if you create some usable tool out of it, how long does it take to train people to use those results in “user-friendly” dashboards? Obviously, in academia we don’t have a problem with complex processes in and of themselves. But we should also be reluctant to call them “clear” if they are time-consuming to discover, understand, communicate, and make useful for others.

Then, on top of all of this, what we have had so far is a bunch of instructors, admins, and researchers arguing over whether analytics is surveillance, and if either one of them is okay or not. Do the students get a say? When are we going to take the time to see if students clearly understand what all this is about (and then clearly explain it to them if they don’t), and then see what they have to say? Some already understand the situation very well, but we need to get to a place where most understand it fairly well, and then include their voice in the discussion.

So Back to the Question: How Do You Define These Two?

Like many have stated, analytics and surveillance have existed for a long time, especially in formal educational settings:

If you really think about it, Instructivism has technically been based on surveillance and analysis all along. This has kind of been baked into educational systems from the beginning. We can’t directly measure learning in the brain, so education has traditionally chosen to keep close watch over students while searching for evidence that they learned something (usually through tests, papers, etc). Our online tools have just replicated instructor-centered structures for the most part, bringing along the data analysis and user surveillance that those structures were known for before the digital era. Referring to teachers as “learning detectives” is an obscure trope, but one that I have heard from time to time.

(Of course, there are those that choose other ways of looking at education, utilizing various methods to support learner agency. This is outside the focus of this rambling article. But it is also the main focus of the concepts I research, even when digging into data analytics.)

So if you are digging through large data sets of past student work and activity like a detective, in order to find ways to improve educational environments or the learning process…. am I describing Learning Analytics, or Learning Surveillance?

Yes, I intentionally chose a sentence that could easily describe both.

To be honest, I think if we pull back too far and compare any type of data analysis in learning with any form of student surveillance in learning, there won’t be much difference between the two terms. And some people that only work occasionally with either one will probably be okay with that.

I think we need to start looking at Learning Analytics (with capital L-A) vs. analytics (little a), and Learning Surveillance (capital L-S) vs. surveillance (little s). This way, you can look at the more formal work of both fields, as well as general practices of the general ideas. For example, you can look at the problems with surveillance in both Learning Analytics as well as in Learning Surveillance.

However, if I was really pressed, I would say that Learning Analytics (with capital L-A) seeks to understand what is happening in the learning process, in a way that utilizes surveillance (little s) of interface processes, regardless of monetary goals of those analyzing the data. Learning Surveillance (capital L-S) seeks to create systems that control aspects of the learning environment in a way that monetizes the surveillance process itself, utilizing analytics (little a) from learning activities as a primary source of information.

You may look at my poor attempt at definitions and feel that I am describing them as the exact same thing. You may look at my definitions and see them as describing two totally different ideas. Maybe the main true difference between the two is in the eye of the beholder.

So What Do You Want From Learning Analytics?

If you haven’t noticed lately, there is a growing area of concern surrounding the field of learning analytics (also sometimes combined with artificial intelligence). Of course, there has always been some backlash against analytics in general, but I definitely noticed at the recent Learning Analytics and Knowledge (LAK) conference that it was more than just a random concern raised here and there that you usually get at any conference. There were several voices loudly pointing out problems both online and in the back channel, as well as during in-person conversations at the conference. Many of those questioning what they saw were people with deep backgrounds in learning theory, psychology, and the history of learning research. But it’s not just people pointing out how these aspects are missing from so much of the Learning Analytics field – it is also people like Dr. Maha Bali questioning the logic of how the whole idea is supposed to work in blog posts like Tell Me, Learning Analytics…

I have been known to level many of the current concerns at the Learning Analytics (LA) field myself, so I probably should spell out what exactly it is that I want from this field as far as improvement goes. There are many areas to touch on, so I will cover them in no particular order. This is just what comes to mind off the top of my head (probably formed by my own particular bias, of course):

  • Mandatory training for all LA researchers in the history of educational research, learning theory, educational psychology, learning science, and curriculum & instruction. Most of the concerns I heard voiced at any LAK I have attended were that these areas are sorely missing in several papers and posters. Some papers were even noticed as “discovering” basic educational ideas, like students that spend more time in a class perform better. We have known this from research for decades, so… why was this researched in the first place? And why was none of this earlier research cited? But you see this more than you should in papers and posters in the LA field – little to no theoretical backing, very little practical application, no connection to psychology, and so on. This is a huge concern, because the LAK Conference Proceedings is in the Top 10 Educational Technology journals as ranked by Google. But so many of the articles published there would not even go beyond peer review in many of the other journals in the Top 10 because of their lack of connection to theory, history, and practice. This is not to say these papers are lacking rigor for what they include – it is just that most journals in Ed-Tech require deep connections to past research and existing theory to even be considered. Other fields do not require that, so it is important to note this. Also, as many have pointed out, this is probably because of the Computer Science connection in LA. But we can’t forego a core part of what makes human education, well… human… just because people came from a background where those aspects aren’t as important. They are important to what makes education work, so just like a computer engineer that wants to get into psychology would have to learn the core facets of psychology to publish in that area, we should require LA researchers to study the core educational topics that the rest of us had to study as well.
This is, of course, something that could be required to change many areas in Education itself as well – just having an education background doesn’t mean one knows a whole lot about theory and/or educational research. But I have discussed that aspect of the Educational world in many places in the past, so now I am just focusing on the LA field.
  • Mandatory training for all LA researchers in structural inequalities and the role of tech and algorithms in creating and enforcing those inequalities. We have heard the stories about facial recognition software not recognizing black faces. We know that algorithms often contain the biases of their creators. We know that even the perfect algorithms have to ingest imperfect data that will contain the biases of those that generated it. But it’s time to stop treating equality problems as an afterthought, to be fixed only when they get public attention. LA researchers need to be trained in recognizing bias by the people that have been working to fight the biases themselves. Having a white male instructor mention the possibility of bias here and there in LA courses is not enough.
  • Require all LA research projects to include instructional designers, learning theorists, educational psychologists, actual instructors, real students, people trained in dealing with structural inequalities, etc as part of the research team from the very beginning. Getting trained in all of the fields I mentioned above does not make one an expert. I have had several courses on educational psychology as part of my instructional design training, but that does not make me an expert in educational psychology. We need a working knowledge of other fields to inform our work, but we also need to collaborate with experts as well. People with experience in these fields should be a required part of all LA projects. These don’t all have to be separate people, though. A person that teaches instructional design would possibly have experience in several areas (practical instruction, learning theory, structural inequality, etc). But you know whose voice is incredibly rare in the LA research? Students. Their data traces DO NOT count as their voice. Don’t make me come to a conference with a marker and strike that off your poster for you.
  • Be honest about the limitations and bias of LA. I read all kinds of ideas for what data we need in analytics – from the idea that we need more data to capture complex ways learning manifests itself after a course ends, to the idea that analytics can make sense of the world around us. The only way to get more (or better) data is to increase surveillance in some way or form. The only way to make more sense is to get more data, which means… more surveillance. We should be careful not to turn our entire lives into one mass of endless data points. Because even if we did, we wouldn’t be capturing enough to really make sense of the world. For example, we know that click stream data is a very limited way to determine activity in a course. A click in an online course could mean hundreds of different things. We can’t say that this data tells us what learners are doing or watching or learning – only just what they are clicking on. Every data point is just that – a click or contact or location or activity with very little context and very little real meaning by itself. Each data point is limited, and each data point has some type of bias attached to it. Getting more data points will not overcome limitations or bias – it will collect and amplify them. So be realistic and honest with those limitations, and expose the bias that exists.
  • Commit to creating realistic practical applications for instructors and students. So many LA projects are really just ways to create better reports for upper level admin. Either that, or ways to try and decrease drop-outs (or increase persistence across courses as the new terminology goes). The admin needs their reports and charts, so you can keep doing that. But educators need more than drop-out/persistence stuff. Look, we already have a decent to good idea what causes those issues and what we can do to improve them. Those solutions take money, and throwing more data at them is not going to decrease the need for funding once a more data-driven problem (which usually looks just like the old problems) is identified. Please: don’t make “data-driven” become a synonym for “ignore past research and re-invent the wheel” in educators’ eyes. Look for practical ways to address practical issues (within the limitations of data and under the guiding principle of privacy). Talk to students, teachers, learning theorists, psychologists, etc while you are just starting to dig into the data. See what they say would be a good, practical way to do something with the data. Listen to their concerns. Stop pushing for more data when they say stop pushing.
  • Make protecting privacy your guiding principle. Period. So much could be said here. Explain clearly what you are doing with the data. Opt-in instead of opt-out. Stop looking for ways to squeeze every bit of data out of everything humans do and say (it’s getting kind of gross). Remember that while the data is incomplete and biased, it is still a part of someone else’s self-identity. Treat it that way. If the data you want to collect was actual physical parts of a person in real life – would you walk around grabbing it off of them the way you are collecting data digitally now? Treat it that way, then. Or think of it this way: if data was the hair on our heads, are you trying to rip or cut it off of people’s heads without permission? Are you getting permission to collect the parts that fall to the floor during a haircut, or are you sneaking in to hair cutting places to try and steal the stuff on the floor when no one is looking? Or even worse – are you digging through the trash behind the hair salon to find your hair clippings? Also – even when you have permission – are you assuming that just because the person who got the hair cut is gone, that this means the identity of each hair clipping is protected… or do you realize that there are machines that can identify DNA from those hair clippings still?
  • Openness. All of what I have covered here will require openness – with the people you collect data from, with the people you report the analytical results to, with the general public about the goals and results, etc. If you can’t easily explain the way the algorithms are working because they are so complex, then don’t just leave it there. Spend the time to make the algorithms make sense, or change the algorithm.

There are probably more that I am missing, or ways that I failed to explain the ones I covered correctly. If you are reading this and can think of additions or corrections, please let me know in the comments. Note: the first bullet point was updated due to misunderstandings about the educational journal publishing system. Also see the comments below for good feedback from Dr. Bali.

Can You Automate OER Evaluation With The RISE Framework?

The RISE Framework is a learning analytics methodology for identifying OER resources in a course that may need improvement. On one level, this is an interesting development, since so few learning analytics projects are actually getting into how to improve the actual education of learners. But on the other hand, I am not sure if this framework has a detailed enough understanding of instructional design, either. A few key points seem to be missing. It’s still early, so we will see.

The basic idea of the RISE Framework is that analytics will create a graph that plots page clicks in OER resources on the x-axis, and grades on assessments on the y-axis. This will create a grid that shows where there were higher than average grades with higher than average clicks, higher than average grades with lower than average clicks, lower than average grades with higher than average clicks, and lower than average grades with lower than average clicks. This is meant to identify the resources that teachers should consider examining for improvement (especially focusing on the ones that got a high number of clicks but lower grade scores). Note that this is not meant to definitively say “this is where there is a problem, so fix it now” but more “there may or may not be a problem here, so check it out.” Keep that in mind while I explore some of my doubts here, because I would be a lot harsher on this if it was presented as a tool to definitively point out exact problems rather than what it is: a way to start the search for problems.
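To make the grid concrete, here is a minimal Python sketch of the four-quadrant idea as I have described it. It is my own illustration, not the framework authors' implementation: the function name `rise_quadrants`, the chapter names, and every number in `sample` are made up, and I am simply splitting on the plain average of each axis.

```python
from statistics import mean


def rise_quadrants(resources):
    """Place each resource in one of four quadrants.

    `resources` maps a resource name to (clicks, average_grade).
    A resource counts as "high" on an axis when it is above the
    mean of all resources on that axis (an assumption of this sketch).
    """
    avg_clicks = mean(clicks for clicks, _ in resources.values())
    avg_grade = mean(grade for _, grade in resources.values())

    quadrants = {}
    for name, (clicks, grade) in resources.items():
        grade_label = "high grades" if grade > avg_grade else "low grades"
        click_label = "high clicks" if clicks > avg_clicks else "low clicks"
        quadrants[name] = f"{grade_label} / {click_label}"
    return quadrants


# Hypothetical data for illustration only: (page clicks, average grade).
sample = {
    "chapter 1": (120, 88.0),
    "chapter 2": (40, 91.0),
    "chapter 3": (150, 62.0),
    "chapter 4": (30, 65.0),
}

result = rise_quadrants(sample)
for name, quadrant in result.items():
    print(name, "->", quadrant)
```

With these invented numbers, "chapter 3" lands in the low grades / high clicks quadrant, which is the one the framework suggests looking at first. All the sketch tells you is where to start looking, not what is wrong.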

Of course, any system of comparing grades with clicks itself is problematic on many fronts, and the creators of the RISE Framework do take this into consideration when spelling out what each of the four quadrants could mean. For example, in the quadrant that specifies high grades with low content usage, they not only identify “high content quality” as the cause of this, but also “high prior knowledge,” “poorly written assessment,” and so on. So this is good – many factors outside of grades and usage are taken into account. This is because, on the grade front, we know that scores are a reflection of a massive number of factors – the quality of the content being only one of those (and not always the biggest one). As noted, prior knowledge can affect grades (sometimes negatively – not always positively like the RISE framework appears to assume). Exhaustion or boredom or anxiety can impact grades. Again, I am glad that these are in the framework, but the effect these have on grades is assumed in one direction – rather than the complex directions they take in real life. For example, students that game the test or rubric can inflate scores without using the content much – even on well-designed assessments (I did that all of the time in college).

However, the bigger concern with the way grades are addressed in the RISE framework is that they are plotting assessment scores instead of individual item scores. Anyone that has analyzed assessment data can tell you that the final score on a test is actually an aggregate of many smaller items (test questions). That aggregate grade can mask many deficiencies at the micro level. That is why instructors prefer to analyze individual test questions or rubric lines rather than the aggregate scores of the entire test. Assessments could cover, say, 45 questions of content that were well covered in the resources, and then 5 questions that are poorly covered. But the high scores on the 45 questions, combined with the fact that many will get some questions right by random guessing on the other 5, could result in test scores that mask a massive problem with those 5 questions. But teachers can most likely figure that out quickly without the RISE framework, and I will get to that later.
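To put rough numbers on the 45-and-5 example, here is a small Python sketch. The success rates are assumptions I picked for illustration: 90% correct on the well-covered questions, and 25% on the poorly covered ones (what pure guessing would give on four-option multiple choice).

```python
def expected_score(groups):
    """Expected overall proportion correct.

    `groups` is a list of (number_of_questions, probability_correct).
    """
    total_questions = sum(count for count, _ in groups)
    expected_correct = sum(count * p for count, p in groups)
    return expected_correct / total_questions


# Assumed rates, not real data.
well_covered = (45, 0.90)    # questions the resources cover well
poorly_covered = (5, 0.25)   # questions answered mostly by guessing

overall = expected_score([well_covered, poorly_covered])
weak_only = expected_score([poorly_covered])

print(f"overall test score: {overall:.1%}")
print(f"score on the 5 weak questions: {weak_only:.1%}")
```

Under those assumptions the overall test score comes out around 83.5%, which looks perfectly healthy, while the five weak questions sit at 25%. The aggregate hides exactly the part you would want to see.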

The other concern is with clicks on the OER. Well, they say that you can measure “pageviews, time spent, or content page ratings”… but those first two are still clicks, and the last one is a bit too dependent on the happiness of the raters (students) at any given moment to really be that quantitative. I wouldn’t outright discount it as a factor, but I will state that you are always going to find a close alignment with the test scores on that one for many reasons. In other words, it is a pre-biased factor – students that get a high score will probably rate the content as effective even if it wasn’t, and students that get a low score will probably blame the content quality whether it was really a factor or not.

Also, now that students know their clicks are being recorded, they are more and more often clicking around to make sure they get good numbers on those data points. I even do that when taking MOOCs, just in case: click through the content at a realistic pace even if I am really doing something else other than reading. People have learned to skim resources while checking their phone, clicking through at a pace that makes it seem like they are reading closely. Most researchers are very wary of using click data like pageviews or time spent to tell anything other than where students clicked, how long between clicks, and what was clicked on. Guessing what those mean beyond that? More and more, that is being discouraged in research (and for good reason).

Of course, I don’t have time to go into how relying on only content and assessment is a poor way to teach a course, but I think we all know that. A robust and helpful learning community in a class can answer learning questions and help learners overcome bad resources to get good grades. And I am not referring to cheating here – Q&A forums in courses can often really help some learners understand bad readings – while also possibly making them feel like they are the problem, not the content.

Still, all of that is somewhat or directly addressed in the framework, and because it is a guide rather than a definitive answer, variations like those discussed above are to be expected. I covered them just to make sure I was covering all critical bases.

The biggest concern I have with the RISE framework really comes here: “The framework assumes that both OER content and assessment items have been explicitly aligned with learning outcomes, allowing designers or evaluators to connect OER to the specific assessments whose success they are designed to facilitate.”

Well, since that doesn’t happen in many courses due to time constraints, that eliminates large chunks of courses. I can also tell you as an instructional designer, many people think they have well-aligned outcomes…. but don’t.

But, let’s assume that you do have a course where “content and assessment items have been explicitly aligned with learning outcomes.” If you have explicitly aligned assessments, you don’t need the RISE framework. To explicitly align assessment with content is not just a matter of making sure the question tests exactly what is in the content, but to also point to exactly where the aligned content is for each question. Not just the OER itself, but the chapter and page number. Most testing systems today will give you an item-by-item breakdown of each assessment (because teachers have been asking for it). Any low course score on any specific question indicates some problem. At that point, it is best (and quickest) to just ask your learners:

  1. Did the question make sense? Was it well written?
  2. Did it connect to the content?
  3. Did the content itself make sense?

Plus, most content hosting systems have ways to track page clicks, so you can easily make your own matrix using clicks if you need to. The matrix in the framework might give you a good way to organize the data to see where your problem lies…. but to be honest, I think it would be quicker and more accurate to focus on the assessment questions instead of the whole test, and ask the learners about specific questions.

Also, explicit alignment can itself hide problems with the content. An explicit alignment would require that you test what is in the content, even if the content is bad. This is one of the many things you learn as an ID: don’t test what students don’t learn; write your test questions to match the content no matter what. A decently-aligned assessment can still produce grades from a very bad content source. One of my ID professors once told me something along the lines of “a good instructional designer can help students pass even with bad textbooks; a bad instructional designer can help them fail with the best textbook.”

Look – instructional designers have been dealing with good and bad textbooks for decades now. Same goes for instructors that serve as their own IDs. We have many ways to work around those.

I may be getting the RISE framework wrong, but comparing overall scores on assessments to certain click-stream activity in OER (sometimes an entire book) comes across like shooting fish in a barrel with a shotgun approach. Especially when well-aligned test questions can pinpoint specific sources of problems at a fairly micro-fine level.

Now then, if you could actually compare the grades on individual assessment items with the amount of time spent on the page or area that that specific item came from, you might be on to something. Then, if you could group students into the four quadrants on each item, and then compare quadrant results on all items in the same assessment together, you could probably identify the questions that are most likely to have some kind of issue. Then, have the system send out a questionnaire about the test to each student – but have the questionnaire be custom-built depending on which quadrant the student was placed in. In other words, each learner gets questions about the same, say, 5 test questions that were identified as problematic, but the specific question they get about each question will be changed to match which quadrant they were placed in for that question:

We see that you missed Question 4, but you did spend a good amount of time on page 25 of the book, where this question was taken from. Would you say that:

  • The text on page 25 was not well-written
  • Question 4 was not well-written
  • The text on page 25 doesn’t really match Question 4
  • I visited page 25, but did not spend the full time there reading the text

Of course, writing it out this way sounds creepy. You would have to make sure that learners opt-in for this after fully understanding that this is what would happen, and then you would probably need to make sure that the responses go to someone that is not directly responsible for their grade to be analyzed anonymously. Then report those results in a generic way: “survey results identified that there is probably not a good alignment between page 25 and question 4, so please review both to see if that is the case.”
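For what it is worth, the item-level version could be sketched in Python roughly like this. Everything here is hypothetical: the record layout, the student names, the timings, the `PROMPTS` wording, and the choice to split time on the mean for each item. It also assumes the opt-in and anonymized handling described above are already in place.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical survey openers, one per quadrant.
PROMPTS = {
    ("missed", "high time"): "You missed this question but spent a good amount of time on its source page.",
    ("missed", "low time"): "You missed this question and spent little time on its source page.",
    ("correct", "high time"): "You got this question right after spending a good amount of time on its source page.",
    ("correct", "low time"): "You got this question right while spending little time on its source page.",
}


def place_students(records):
    """Map (student, item) to a quadrant.

    `records` holds (student, item, correct, seconds_on_source_page).
    "High time" means above the mean time for that item (an assumption).
    """
    by_item = defaultdict(list)
    for student, item, correct, seconds in records:
        by_item[item].append((student, correct, seconds))

    placements = {}
    for item, rows in by_item.items():
        avg_seconds = mean(seconds for _, _, seconds in rows)
        for student, correct, seconds in rows:
            score = "correct" if correct else "missed"
            time_label = "high time" if seconds > avg_seconds else "low time"
            placements[(student, item)] = (score, time_label)
    return placements


def missed_high_time_share(placements):
    """Share of students per item who missed it despite high time on page."""
    totals, flagged = Counter(), Counter()
    for (_, item), quadrant in placements.items():
        totals[item] += 1
        if quadrant == ("missed", "high time"):
            flagged[item] += 1
    return {item: flagged[item] / totals[item] for item in totals}


# Made-up records for illustration only.
records = [
    ("ana", "Q4", False, 300),
    ("ben", "Q4", False, 280),
    ("cal", "Q4", True, 60),
    ("ana", "Q5", True, 200),
    ("ben", "Q5", True, 90),
    ("cal", "Q5", False, 40),
]

placements = place_students(records)
shares = missed_high_time_share(placements)
print(shares)
print(PROMPTS[placements[("ana", "Q4")]])
```

In this invented data, two of three students missed Q4 despite above-average time on its page, so Q4 would be the one to ask about, and each student would get the opener that matches their own quadrant. The amount of per-student tracking this needs is the whole problem.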

In the end, though, I am not sure if you can get detailed enough to make this framework effective without diving deep into surveillance monitoring. Maybe putting the learner in control of these tools, and giving them the option of sharing the results with their instructor if they feel comfortable?

But, to be honest, I am probably not in the target audience for this tool. My idea of a well-designed course involves self-determined learning, learner autonomy, and space for social interaction (for those that choose to do so). I would focus on competencies rather than outcomes, with learners being able to tailor the competencies to their own needs. All of that makes assessment alignment very difficult.