Updates on the Never-Ending Reclaim Project

So with all of the weirdness going on in the Ed Tech world recently (and the world in general), I needed something to take my mind off of things. I wanted to add a quick update about my Never-Ending Reclaim Project at the end of that post… but it ended up being too long! So, in the interest of archiving the good, the bad, and the ugly of what I am finding out there (not all of it is being kept, even if I am reclaiming access)… here are some interesting (to me, at least) updates on where things stand.

First of all, it’s pretty weird trying to make sure you have ownership of every account you have created. Random things in life suddenly remind you of things you had totally forgotten. Walking by a store one day reminds you “oh, hey – Redbox still exists, and I think I had an online account there as well.” Or a random link reminds you that you also had a Reddit account at one time. All reclaimed!

I finally came to a place of acceptance with the not-quite-perfect html exports of WordPress sites. It seems that everything from site suckers to WP plugins just doesn’t get what relative truly means. Or maybe I just don’t get the settings correct? Anyways – they always seem to add a slash at the beginning of base-level files like this: “/images/picture.jpg” or “/css/style.css” or whatever. That forces my computer and the websites where I deposit them to look in the base directory for everything – but I am trying to get them to go in a sub-folder of an “archive” folder. So the browser just sits there forever trying to figure out what is going on. For less complex websites, it’s easy enough to remove that slash quickly (“images/picture.jpg” or “css/style.css” or whatever) – and boom! Instant relative website that can work online or offline wherever I put it. When archiving WordPress sites with complicated folder structures, it takes a bit of thinking to know how many “../” or “../../” (and so on) to replace those “/” with – and it’s time-consuming if you have to think through every “/” in your document.
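
If it helps anyone else stuck on the same problem, here is a minimal sketch of how I imagine scripting that fix – the archive folder name is made up, and it assumes the export only uses root-relative links inside double-quoted src and href attributes:

```python
# A rough sketch: walk an html export and turn root-relative links
# ("/images/picture.jpg") into relative ones ("../../images/picture.jpg")
# based on how deep each page sits inside the archive folder.
import pathlib
import re

ARCHIVE = pathlib.Path("archive/dalmooc")  # hypothetical export folder

for page in ARCHIVE.rglob("*.html"):
    depth = len(page.relative_to(ARCHIVE).parts) - 1
    prefix = "../" * depth
    html = page.read_text(encoding="utf-8", errors="ignore")
    # rewrite src="/..." and href="/..." (but skip protocol-relative "//...")
    fixed = re.sub(r'(src|href)="/(?!/)', rf'\1="{prefix}', html)
    page.write_text(fixed, encoding="utf-8")
```

A page at the top level just gets the slash removed, while a page two folders down gets “../../” – the same manual fix described above, just automated.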

There is one workaround to make it a bit easier. I have found exporting from within WordPress to be a bit better than external site suckers, because WordPress will still get you all of your orphaned files and pages. This means that the bad link you didn’t realize was there can be fixed with one edit, rather than jumping into archive.org to hope and pray that the file is there (only about a 50/50 record on that so far for me, unfortunately). Plus, the export hard codes the full link with your website address in there – making finding and replacing absolute links with relative “../../” links very, very quick and easy per page. Which I wrote about before – but it’s the best option I have found so far.

The reason this is important is because the old LINK website has bitten the dust for now, it seems. This was apparently a problem with Google and not the people running the site. They tried everything they could to renew the website registration – but it was originally registered through Google. Let me warn you: don’t do that. It starts easy enough to register… but renewals get harder and more complicated each time. I experienced this myself a couple of years ago – and it has only gotten worse since then.

Anyways, I was able to get html archives of all LINK Lab sites just in case something went wrong (again, it just seemed inevitable the way Google was going). So I have html back-ups of DALMOOC, the Pivot to Online Learning MOOC, the Open Ed MOOC, etc. Most of these are hard coded to work on my personal website – but I have been able to get DALMOOC converted over to true relative html. I can easily move that folder wherever I want – or send the files to whatever archive site the good folks still bearing the LINK torch set up for LINK Lab. I will work on the other courses as I get time as well.

The other weird thing that happened is that I actually got control of my MySpace account back! The form that I linked to in the last post… actually worked? I mean, it took over a month to hear anything, but I am back in. And it is a sad wasteland in there. Almost all real data is gone – and only a few pictures remain of the many I uploaded. But I now control my corner of the wasteland at least.

I also was able to somewhat re-create the custom profile I made back in the day. The html template I found on GitHub was cool, but also several years beyond the last version I had used. My resurrected custom code didn’t work. But I poked around in archive.org and found a save of Tom’s profile from the date that I saved my custom code. I put the two together, and BAM! I had my profile back in html! Well, it was Tom’s profile styled like mine. So I started replacing Tom’s information with mine as best as I could remember it (or using Latin sample text where I couldn’t). I also found a way to make an image of the profile music player that plays the sample of music that I had on there if you click it.

Now… before I share the link, please keep in mind that I realize this profile has some cultural appropriation. At the time, I was married to someone that traced their heritage back to India, so I was trying to mix her heritage and mine (Irish) on my MySpace page. But anyways – today I would replace the Hindi and sitar (yes I did actually learn to play a few songs on it, even though I have forgotten how) with something from my cultural background. But this is what it was back in the day.

Now, if only I could get the Foursquare/Swarm people to be as… umm… “responsive” as the MySpace team…

I also seem to have found some of the limitations of Ruffle – you can’t really import external files (images, other SWF files, etc), which I did a lot in the E-SPY X-500. So I just had to link to an external list of the lessons that I wanted to import into the game. I set it up that way because we wanted to be able to upgrade the lessons as needed without re-doing the entire game. For example, Tobacco Lesson 11 lets the student build a simple tobacco awareness website – it was pretty basic, but we had bigger plans to make it more robust. But at least it works as originally designed now. Oh, and you have to use the back button to get back to the list.

I also found that many ActionScript functions don’t work in Ruffle, like the code that makes text scroll within small boxes. Oh, well. Maybe there has been an update that I need to look into.

After doing some poking around on Digg and Delicious, it seems that my original Digg account is gone forever (unless someone knows of a way to log in with email?), but Delicious is still around. Kind of. I was able to log in and export my posts from there. It seems like it is just a data repository of your old stuff (you can’t add new stuff), but that is a start. You can export to JSON and HTML formats – if you can remember your password (it seems like the password reset function is not implemented yet). The html format also doesn’t look that great, and it saved the tags and dates even though they aren’t displayed. So I decided to grab the html and CSS from their site to make my archive look a lot cleaner. I also decided to go for 60 results per page rather than 20, because mine were all short “Ed Tech news updates” type things anyways.
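
For anyone wanting to do something similar, here is a minimal sketch of the re-pagination idea. The file name is made up, and the “href”/“description” field names assume the classic Delicious export format – check your own export before trusting it:

```python
# A rough sketch: split a Delicious JSON export into static html
# archive pages of 60 links each.
import json

with open("delicious_export.json", encoding="utf-8") as f:  # hypothetical file name
    bookmarks = json.load(f)

PER_PAGE = 60
for start in range(0, len(bookmarks), PER_PAGE):
    chunk = bookmarks[start : start + PER_PAGE]
    items = "\n".join(
        f'<li><a href="{b["href"]}">{b["description"]}</a></li>' for b in chunk
    )
    with open(f"archive-{start // PER_PAGE + 1}.html", "w", encoding="utf-8") as out:
        out.write(f"<ul>\n{items}\n</ul>\n")
```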

Anyways, I find this type of stuff fascinating. Some of you might think I am trying too hard to save stuff that should be forgotten, and maybe you are right. Especially after seeing how my old MySpace profile looked. I still need to find a way to convert old Flash files to html5 (without buying an Adobe subscription). I also wonder if I can find a site that emulates old LAMP installs so I can get a 14-year-old export of WordPress working again (WP tells me it’s too old to import now – boo!). More things to look into!

The Quick(ish) Guide to Why Some People Don’t Like Course Hero

If you are part of certain circles in the education world (especially on Twitter), you probably saw the controversy yesterday about a well-known education critic being hired by the Ed-Tech company Course Hero. I really don’t want to wade into that controversy too much – I don’t know the people involved well enough to comment on their motives. I have never witnessed the whole “change a company from within” strategy ever work, but I know there is no shortage of people who will try. However, Course Hero has flown under many people’s radar for a while, and I thought I would go into why some people don’t like the company’s product or business model.

So what exactly is Course Hero? Well, if you read the company hype, you will find things like “partnering with, connecting, learning from and teaching educators in support of them in empowering learners.” Which doesn’t really mean anything specific, to be honest. The reality is that they are a resource sharing website, primarily driven by student labor. Students can find answers to test questions, past papers, course documents, and all kinds of materials related to courses they are taking (including entire chapters and courses). After free trials of various kinds run out, they have to pay for this access. In turn, they are encouraged to upload documents for other students.

Now, I will say that I am typically sympathetic to students that use websites like this – even though I will still warn them not to.

So before I dive into that problematic system, I will point out to students that using Course Hero can be dangerous. Your institution probably has strongly-worded “Academic Honesty” statements that spell out harsh consequences for being caught sharing your work with other students or uploading your instructor’s copyrighted content to any website without their permission (and institutions often claim copyright on course content as well). Even if your intent is to share examples to help other students (something many instructors even encourage), your institution might not see it that way. Plus, I did a quick search through Course Hero yesterday and found a large number of papers that still had the students’ names on them. That means that a random school official could be surfing through the website, see your name, and get you in trouble for a course from a couple of years ago. Course Hero does not appear to be doing much to protect the students that it uses for free labor, so “user beware.”

However, like I said, it is important to understand why students use Course Hero. So many of our institutions still promote high-stakes assessment (tests, essays, etc) as the main mode for “weeding out” students (side note: never refer to your students as “weeds”). Sometimes this even comes wrapped in poorly designed courses that don’t do enough to prepare students for these assessments. Students are then given the impression that cheating is the only way other students survive the gauntlet (and in many cases, this is probably true). Focusing on the students that use Course Hero misses the real problem: an institutional system that created the pressure to cheat in the first place.

But remember, students – if you are caught using Course Hero, your institution will most likely not do any soul-searching about how they created the pressure that moved you in that direction. They will just punish you and move on. Again – user beware.

I see nothing in Course Hero that pushes back against this problematic pedagogy. In fact, it only seems geared to empower that system. I really don’t see a way that Course Hero could co-exist with ungrading – or why students would even bother to use it if grades were low-stakes in any way.

What you have is a company that utilizes free labor (yes, just like other companies such as Facebook) and a “freemium” model to get users to start paying. It also has an internal token system that rewards uploading content (search Twitter for Course Hero, and most of what you get is users claiming to sell these tokens for cheap). Because most of the users are, at some point or another, desperate to survive a harsh academic system somewhere, many feel Course Hero is a predatory service relying on student fear. Yes, they do position themselves as a pro-student company, but honestly I don’t see how they are more pro-student than anyone else.

Also of note is the general legality of Course Hero – it’s pretty easy to find many, many examples of how they are in violation of NC licenses. But on top of that, since all material (in the U.S. at least) is automatically copyrighted once it is created – I don’t see how much on their website is technically legal at all (outside of the occasional rare public domain license). You don’t have to agree with copyright laws – I am just pointing out the statutes here as they currently stand. In addition, most institutions have added copyright rules that require you to at least get the permission of the instructor, if not the entire institution, before uploading course content to any external website. Since it would take a massive legal fund to challenge any one of these points, Course Hero probably enjoys a relative “freedom” from legal consequences. From many accounts I can find online, it is very difficult to get copyrighted material removed with a simple takedown notice. Course Hero does not have a great record of responding to critics of any kind (despite what some might say), including direct legal challenges.

Plus, many institutions will directly name Course Hero as a reason why they have to get proctoring surveillance solutions. Course Hero may not like it (or maybe they do – who knows?), but they are a major player in the course surveillance ecosystem. As many people have put it, dealing with a nuclear arms race by adding more nuclear missiles is a step in the wrong direction.

You may disagree with all of these assertions about Course Hero (I am sure the company does). I would refer you back to the title of the post – these are reasons why people don’t like Course Hero. There are many other reasons as well. I’m not here to weigh the praise alongside the criticism.

One of the oldest, most cliche moves in the book for tech companies is to hire a critic into a high-level position at their company. They hope to borrow that critic’s reputation to clean up their image. It never works that way, but companies still try all the time. Is that what Course Hero is doing now? Only time will tell. Every critic that has ever been duped by a tech company claimed beforehand that they were too smart to get duped. Sometimes, they were even hired by someone that really meant it… until that person got forced out by larger forces in the company.

Ultimately, companies don’t really care that much about any of that drama. Drama creates attention, and attention is what they need. They know that when they hire a critic, they also get the loyalty of some of that critic’s friends and colleagues along the way. They know they are getting multiple defenses of their company from many other respected voices… for free. And with Course Hero, you are already seeing that. These defenses range from the normal “I won’t attack someone just for taking a job” (agreed), to questionable justifications of the company’s actions, to downright passive-aggressive denigrations. One person even made me think “well, Headmaster Killjoy is here to swat down the plebes that dare have a different opinion!” Then there are the attacks and fights. I sincerely hope the people that become that aggressive will realize that they only make people hate Course Hero more when they do that.

Anyways, my only real message here is to please understand why there is so much distrust of Course Hero out there. Most of the disagreement with the recent announcement has been serious and respectful, despite what the defenders of the announcement will claim. Not all disagreements have been cordial, obviously… but the announcement came with the direct statement that “this will upset people.” Why tear into people when they are responding exactly as noted?

Or the bigger question: if Course Hero is a good company that truly engages with its critics… then why does it need to be subverted from within? Some people are claiming both, and the two really don’t match up if you think about it.

We Know Why You Hate Online Learning – and It Has Nothing to Do With Quality

In some ways, I get why some people are saying they hate online learning. Almost everyone was forced into it – even those that didn’t choose it originally. We live in a time where most people that enter school (or teach at school) are aware that there is an online option. There are still a few cases where people want to take or teach online courses but have no option to do so. For everyone else, though, if you wanted to learn or teach online, you probably were able to choose that. The millions that were forced to switch suddenly last year did so against their first preference, and I get how that frustrates many of them.

Let’s face it – we all know that what has been happening the past two years is often not fully implemented, funded, and institutionally-supported online learning. Most tried hard to make it work, but due to shortages in training, prep time, or funding/support, a lot of it fell short of the true potential of online learning.

Of course, this was also true about face to face learning before the pandemic – even dedicated teachers are held back because of systems that don’t give them enough time, or train them well enough, or give them the money and resources they need. We just act like this is the “Facts of Life” for on campus learning… you take the good, you take the bad, you take them both, and there you have… a gold standard…?

Nope. Any institutional leader or edu-celebrity that proclaims that on-campus learning is inherently superior to online learning is being disingenuous. They know that reality doesn’t support their claims. They just hate online learning… but not for quality reasons.

The real reason? It’s all about power and control. Leaders can’t control their students, faculty, and staff remotely like they can on campus. And that control not only brings them a power trip – it also brings in big $$$ for schools when they can manipulate students into spending more money on campus.

And that’s it really: the real reason you have leaders (institutional, thought, and otherwise) claiming that online learning is inferior, and that on campus learning is the “gold standard,” is because they lose power (and the money that comes with that power).

Now – if a student, a faculty member, or even a university president proclaims that they hate online learning in and of itself – I get it. We all have personal preferences – I love online learning, but I get why it isn’t for everyone.

But there is a difference between saying one personally doesn’t like it, and saying online learning is inferior, failed, snake oil, etc.

The difference, of course, is research. There really is research showing that there is no significant difference between various outcomes of online learning and on campus learning. Probably one of the best sources of research to look at is the “No Significant Difference” database from the National Research Center for Distance Education and Technological Advancements (DETA) at the University of Wisconsin-Milwaukee:

“This site is intended to function as an ever-growing repository of comparative media studies in education research. Both no significant differences (NSD) and significant differences (SD) studies are constantly being solicited for inclusion in the website. In addition to studies that document no significant difference (NSD), the website includes studies which do document significant differences (SD) in student outcomes based on the mode of education delivery.”

Currently, the numbers in that database are categorized as:

  • 141 studies that show no significant difference
  • 51 studies that show “Significant Difference – Better Results with Technology” (online usually being said technology)
  • 2 studies that haven’t been indexed yet
  • 0 studies showing “Significant Difference – Better Results in the Classroom”
  • 0 entries showing mixed results

Maybe it is just my bias… but it seems that the results are starting to trend towards online maybe being… better?

Recently I was in a huge Twitter argument with a group of K-12 educational leaders from the UK that were demanding that I provide an article proving that online learning could even work at all. They had already ignored two responses to these demands from a female colleague of mine – and still demanded that I provide a link of my own, even though I had pointed them to those tweets and the DETA database already. So I just refused to give out any more links to people that weren’t going to look at them anyway – and got attacked in all kinds of horrible ways.

It seems like they were under the impression that I had the web addresses of killer pro-online education studies memorized, and was just being a jerk by not spitting them out in a few seconds. Look – asking an online educator to provide one article proving that online learning is okay is like asking a geologist to provide one study proving that rocks exist within the Earth. A few might have something in mind, but most of us don’t spend a lot of time memorizing what we see as “proof of the obvious.” Others seemed to think that academics have all the time in the world to respond to tweet #54 demanding that one all-proving link. Look – no one owes you free labor. If you ask for something and they don’t give it, learn to respect people’s time enough to accept that maybe they are as busy as you. Especially if you were the one that came in swinging with the “online learning is a dying evil” rhetoric.

It’s all complicated. I will be the first person to tell you that it comes down to personal preference whether you should do online learning or not – and for most people it’s not even an either/or. Different contexts call for different modalities for each person at any given moment. We just need to kill the dated and problematic “in-person learning is the gold standard” BS.

(Cover photo by Clem Onojeghuo on Unsplash)

The Never-Ending Reclaim Project Continues

Like many of you, I have been spending a considerable amount of time reclaiming my data and spaces online. A lot of that is focused on downloading and archiving my data (especially blog posts, reviews, comments, etc) from a myriad of websites I have used through the years. Well, decades now. I don’t know if this post will be of interest to anyone, but it will be a record (Jim Groom-style) for me – and hopefully someone will stumble across a couple of problems I have run into and have some suggestions for me.

So this all started several years (or more) ago when I ran into the idea of the IndieWeb and realized I didn’t have to lose data to dying websites like MySpace and Jaiku. I could take a proactive approach by collecting my information and storing it on my own (and the awesome folks at Reclaim Hosting make it super easy in many ways). So I started downloading data from various websites, and importing blog or informational posts from any website that I could. Then I realized that two email addresses I used for a lot of websites through the years could possibly die someday, so I started going back to wherever I could find those email addresses and reclaimed access to those services. Most of them were on a bunch of dead or dying websites, but the process uncovered more posts and blogs to archive. Then several unexpected, unfortunate events happened to me last year and this year. Finding out my job in academia was being eliminated caused me to comb through 15 years of signing up for all kinds of services and journals, which uncovered even more stuff to reclaim. Then an unexpected divorce caused me to comb through even more accounts online, bringing still more stuff to reclaim to light. So here are the basics of what I found out.

Downloading your data from websites is usually the most straightforward process, as long as the site offers a data download option or an export feature for your posts. One thing I have noticed is that the data that gets downloaded does change from time to time – for instance, a good friend of mine suddenly died a few years ago and his family deleted all of his online accounts. So now there are posts on Facebook where he and I had long conversations that just look like I am arguing with myself. So instead of replacing previous data downloads with new, fresh ones – I keep an archive of past exports. Did a past one capture those conversations that are now one-sided? I don’t know, but I should go look. I really hope so.

Then there were things like Jaiku that are long gone, but I never got a chance to download the data. Bummer. However, thanks to the work of the Internet Archive, I did find a lot of my Jaiku posts in their archives. So I decided to copy the html and stitch together my own archive of some of my jaikus – including a few comments that I could also find, and some pages from the Jaiku site just for nostalgia. Clicking on any avatar on that page leads to me. Some of the other links work as well. But this little archive shows that even 12 years ago, Jaiku was way more interesting than Twitter. I also archived as much as I could of the EduGeek Journal Jaiku channel. Interesting side note: this is where Twitter hashtags directly got the # from (even though the symbol technically came from older sources, it was Jaiku’s Channels that made Twitter users start using the # to mimic that function).
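
If you want to try the same kind of salvage work, the Internet Archive has an availability API that points you to the closest saved snapshot of a page. A minimal sketch (the Jaiku URL here is just a made-up example of what you might search for):

```python
# A rough sketch: ask the Wayback Machine for the snapshot closest to a
# given date, using the Internet Archive's availability API.
import json
import urllib.parse
import urllib.request

def closest_snapshot(url, timestamp="20081001"):
    query = urllib.parse.urlencode({"url": url, "timestamp": timestamp})
    with urllib.request.urlopen("https://archive.org/wayback/available?" + query) as resp:
        data = json.load(resp)
    closest = data.get("archived_snapshots", {}).get("closest", {})
    return closest.get("url") if closest.get("available") else None

print(closest_snapshot("jaiku.com/edugeekjournal"))  # hypothetical page to hunt for
```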

One site that is sadly long gone is MySpace. I can’t even sign in or reset my password anymore (the account was probably hacked a long time ago). But the important data is gone anyway – it seems MySpace lost or deleted most of it. I should have captured the html and custom CSS I worked on for hours way back in the day. But even the mighty Internet Archive didn’t capture any of that. However, after digging around some, I found this form to submit a support ticket, and then a GitHub project that has Tom’s MySpace profile html. And then, searching through my files at home – of course I kept a copy of the CSS I created to customize my profile. So I might have to just make up a bunch of stuff about myself to replace the stuff about Tom, but I could actually have an archive of all of the time I wasted… errr… “invested” in learning how to hack a custom MySpace profile.

Of course, the biggest project has been capturing my blogs. I thought I only had a handful of Blogger sites to import to WordPress, but then I kept digging up more. WordPress sites for several grad classes. Old conference blogs. Old work blogs. Some attempts to use Known. Even a short attempt at Tumblr. So many short blogs. So I imported all that I could into one WordPress blog archive on my own site. All of that is easy. For some of the blogs that I liked, I even created html archives of the layout. The one that I am having trouble with is Instagram. I would love to import all of my Instagram posts to a WordPress blog with a template like the one I set up for my artwork gallery. I found some suggestions online for how to do that, but they only import the last 20 entries. I can import the rest one by one using copy and paste if I want to, but hopefully someone will come up with a way to automate it. Any ideas?

Of course, some of these blogs were older WordPress installations on my website, while others were attached to classes like the HumanMOOC that only make sense as a complete package. But it’s a pain to keep over a dozen WordPress installations updated and working. So I decided it was time to archive some sites as they are as html exports and shut down the WordPress versions. The problem is, I really wanted a standalone html export that could be moved to any folder or website and still work. The most recommended WordPress html export tool that I found when I started a few years ago (WP Static) doesn’t really work well for the relative links needed to do that. I could export to a defined folder on my site and it would hard code those specific links into every page, but then I can’t move it around (the Jaiku archive I created above can work anywhere I put it, or even offline if needed). WP Static does have a relative link function, but it keeps messing up the number of “../”s you need to make links work. Half the time, it just gets lost and serves up a blank page. Even a quick search and replace on a page doesn’t fix it.

So I looked around at other options; none worked any differently. Even desktop-based site suckers… well, they suck too much. What I mean is, if there is a link to another website on your site, they will try to suck in that entire site as well! Finally, I found Simply Static. It has a relative link function as well, and it doesn’t work right out of the download either. But it only messes up in one way, and a quick find and replace on a page makes your archived page spring to life. The only problem is that because of the layers upon layers of sub-directories that WordPress uses, you have to do a find and replace per page to get the correct number of “../”s right. So it’s a quick process on simple sites… but a longer process on more complex sites. But it works in the end. I have a standalone html archive of the HumanMOOC that I helped to co-design and co-teach that will work wherever I put it. A bonus feature is that I finally got to fix some of the things that I didn’t have time to get right in the WordPress version. The activity bank images never worked right, but now I can have an image per activity. The blog hub now has individual avatars per person so you can see who posted what. The DALMOOC, OpenEdMOOC, and Pivot MOOC should be coming soon. ish.
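
If you are stuck doing that same find and replace, the counting rule itself is simple: a page needs one “../” for every folder between it and the root of the archive. A tiny sketch, with a made-up path for illustration:

```python
# How many "../"s does a page need? One per folder between the page
# and the root of the archive.
def relative_prefix(page_path):
    folders = page_path.strip("/").split("/")[:-1]  # drop the file name itself
    return "../" * len(folders)

print(relative_prefix("2014/12/my-post/index.html"))  # -> "../../../"
```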

Then there were other random things I needed to archive. All of my Storify archives, which neatly exported to html, but are slowly dying out as people close accounts, or Twitter changes how they display pictures, or a hundred other reasons. Is it worth going through each one and grabbing what is left? Several chatbots I created are still kicking around, but also falling apart, as I apparently need to update the code to not point to the dead LINK Lab website. Add that one to my massive to-do list. There is even an old OLC presentation that I did “choose your own presentation topic” style with the audience.

Oh, and going way back, there are a good number of html websites I designed from 1999-2005 that I am still keeping around for memory’s sake. Most are too embarrassing to link to, but the one I like the most is the one that I mention in several bios – the website I created to help students when I was an 8th grade Science teacher: Mr. Crosslin’s Class Online. It was also my first serious attempt at putting course work online.

Speaking of old sites, I have so many sites that I built in Flash that I have been trying to figure out what to do with for years. I can still open Flash on an ancient computer I have, so I have exported all of my Flash files to image and/or movie files. But some are still a bit complex for that, and even the less complex ones are no fun to watch as a movie. Is there a way to convert FLA files to HTML5? I have looked a little and didn’t like what I found. If anyone knows of a way, even if I have to pay, please let me know.

So I thought for a while that my archives of several websites I created with Flash would be limited to still images of what happened. But then I came across Ruffle. You drop a couple of files on your site, add a few lines of code to your page, and – BAM! – your Flash files start magically working. So now I can get the old U Monthly Magazine archives back online (a lot is still missing, but I will dig it out eventually). My favorite Flash website that I (mostly) created is the E-SPY X-500 – a goofy attempt at an educational game that I built for a company I worked for after teaching. Go ahead and kick around in there – not everything works (yet – it’s on the list), but see if you can find the hidden Easter eggs. You can log in with any username or password over three characters. It has been totally disconnected from the MySQL database, so no data is collected. I should point out that the cartoon characters you will see once inside were not drawn by me, but by our staff artist at the time, Samuel Torres.

Of course, I have also been going through and making sure that my main portfolio is up to date, because it really serves as an archive of papers, presentations, videos, artwork, and other projects as well. I have also been working on things like a games archive. All kinds of random attempts to create games are in there, including some of the ones I mentioned above (I still need to create a Twine environment for the This Picture app game idea). Oh, and somewhere in the middle of all of this, I am also trying to work with my Mom to create a tribute site for my Grandfather’s artwork, since he sold paintings and worked as a staff artist for a newspaper in a major city.

Changing over email addresses is quite the chore. I had to look for old accounts tied to two old email addresses, and then I had to go through 15 years of work emails to see which accounts I would want to keep after leaving (mostly access to journals I published in, review accounts, professional website accounts, and others like that). Most places were pretty straightforward. Some places were not. It took a lot of work to get control of my Flickr account. I still can’t get control of my MySpace account – does their support team even still exist? A lot of these accounts I will probably shut down. But I was surprised at how haphazard I was in using whatever email address to sign up for whatever account. At least it’s all back with me again. And, of course, trying to separate 20 years of joint accounts from my former marriage was a huge undertaking. Some places make it nearly impossible to do that. But then I had to go back through all of these accounts I got back (or websites I created) and update bio listings about family where needed.

So, even though there isn’t a light at the end of the tunnel yet, I know that a sighting of that light should come soon. Despite all that is left, I still feel that I have cut back my online presence to a streamlined, manageable amount. Someday I will be shutting down some massive websites like this one, so I hope to find even better ways to convert WordPress to html as well. Which I guess I will… give to my son some day? Donate to a museum? Will people even care about archives like this in a few decades? I guess I will figure that out someday…

Using Learning Analytics to Predict Cheating Has Been Going on for Longer Than You Think

Hopefully by now you have heard about the Dartmouth Medical School cheating scandal, where Dartmouth College officials used questionable methods to “detect” cheating in remote exams. At the heart of the matter is how College officials used clickstream data to “catch” so-called “cheaters.” Invasive surveillance was used to track students’ activity during the exams, officials used the data without really understanding it to make accusations, and then students were pressured to quickly respond to the accusations without much access to the “proof.” Almost half of those accused (7 of 17, or 41%) have already had their cases dismissed (aka they were falsely accused – why is this not a criminal act?). Of the remaining 10, 9 pleaded guilty, but 6 of those have now tried to appeal that decision because they feel they were forced to plead guilty. FYI – that adds up to 76%(!) claiming they were falsely accused. Only one of those six wanted to be named – the other 5 are afraid of reprisals from the College if they speak up.

That is intense. Something is deeply wrong with all of that.

The frustrating thing about all of this is that plenty of people have been trying to warn that this is a very likely – if not inevitable – outcome of Learning Analytics research studies that look to detect cheating from the data. Of course, this particular area of research is not a major aim of Learning Analytics in general, but several studies have been published through the years. I wanted to take a look at a few that represent the common themes.

The first study is a kind of pre-Learning Analytics paper from 2006 called “Detecting cheats in online student assessments using Data Mining.” Learning Analytics as a field is usually traced back to about 2011, but various aspects of it existed before that. You can even go back to the 1990s – Richard A. Schwier describes the concept of “tracking navigation in multimedia” (in the 1995 2nd edition of his textbook Instructional Technology: Past, Present, and Future – p. 124, Gary J. Anglin editor). Schwier really goes beyond tracking navigation into foreseeing what we now call Learning Analytics. So all of that to say: tracking students’ digital activity has a loooong history.

But I start with this paper because it contains some of the earliest ways of looking at modern data. The concerning thing with this study is that the overall goal is to predict which students are most likely to be cheating based on demographics and student perceptions. Yes – not only do they look at age, gender, and employment, but also a learner’s personality, social activities, and perceptions (did they think the professor was involved or indifferent? Did they find the test “fair” or not? etc).

You can see from the chart on p. 207 that males with lower GPAs are mostly marked as cheating, while females with higher GPAs are mostly marked as not cheating. And since race is not considered in the analysis at all, existing systemic discrimination could turn this method into a tool for incredibly racist oppression.

Even more problematic is the “next five steps to data mining databases,” with one step recommending the collection of “responses of online assessments, surveys and historical information to detect cheats in online exams.” This includes the clarification that:

  • “information from students must be collected from the historical data files and surveys” (hope you didn’t have a bad day in the past)
  • “at the end of each exam the student will be is asked for feedback about exam, and also about the professor and examination conditions” (hope you have a wonderful attitude about the test and professor)
  • “professor will fill respective online form” (hope the professor likes you and isn’t racist, sexist, transphobic, etc if any of that would hurt you).

Of course, one might say this is pre-Learning Analytics, and the current field is only interested in predicting failure, retention, and other aspects like that. Not quite. Let’s look at the 2019 article “Detecting Academic Misconduct Using Learning Analytics.” The focus of this study is a bit more specific: they seek to use keystroke logging and clickstream data to tell if a student is writing an authentic response or transcribing a pre-written one (which is assumed to only come from contract cheating).

The lit review of this study also shows that it is not the only one digging into this idea. The idea goes back several years through multiple studies.

While this study does not get to the same Minority Report-level concerns that the last one did, there are still some problematic issues here. First of all is this:

“Keystroke logging allows analysis of the fluency and flow of writing, the length and frequency of pauses, and patterns of revision behaviour. Using these data, it is possible to draw conclusions about students’ underlying cognitive processes.”

I really need to carve out some time to write about how you can’t use clickstream data of any kind to detect cognitive processes in any way, shape or form. Most people that read this blog know why this is true, so I won’t take the time now. But the Learning Analytics literature is full of people that think they can detect cognitive activities, processes, or presence through clickstream data… and that is just not possible.

The paper does address the difficulties in using keystroke data to analyze writing, but proposes analysis of clickstream data as a much better alternative. I’m not really convinced by the arguments they present – but the gist is that they are looking to detect revision behaviors, because authentic writing involves pauses and deletions.

Except that is not really true for everyone. People that write a lot (like, say, by blogging) can get to a place where they can write a lot without taking many pauses. Or, if they really do know the material, they might not need to pause as much. On the other hand, the paper assumes that transcription of an existing document is a mostly smooth process. I know it is for some, but it is something that takes me a while.

In other words, this study relies on averages and clusters of writing activities (words added/deleted, bursts of writing activity, etc) to classify your writing as original or copied. Which may work for the average student, but what about students with disabilities that affect how they write? What about people that just work differently than the average? What about people from various cultures that approach writing differently, or even those that have to translate what they want to write into English first and then write it down?

Not everyone fits so neatly into the clusters.

Of course, this study had a small sample size. Additionally, while they did collect demographic data and had students take self-regulated learning surveys, they didn’t use any of that in the study. The SRL data would seem to be a significant aspect to analyze here. They also could have at least mentioned some details about the students who didn’t speak English as a primary language.

Now, of course, writing out essay exam answers is not common in all disciplines, and even when it is, many instructors will encourage learners to write out answers first and then copy them into the test. So these results may not concern many people. What about more common test types?

The last article to look at is “Identifying and characterizing students suspected of academic dishonesty in SPOCs for credit through learning analytics” from 2020. There are plenty of other studies to look at, but this post is already getting long. SPOC here means “Small Private Online Course”… a.k.a. “a regular online course.” The basic gist is that they cluster students by how close their answers are to each other and how close their submission times are. If students get the exact same answers (including choosing the same wrong choice) and turn in their test at about the same time, they are considered “suspect of academic dishonesty.” It should also be pointed out that the lit review here shows they are among the first (if not the only ones) looking into this in the Learning Analytics realm.
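
To make that concrete, here is a toy sketch of that kind of pairwise check – my own illustration, not the paper’s actual code or thresholds, with made-up data and an arbitrary five-minute window:

```python
# A toy version of the flagging logic: two students are "suspect" if
# their answer strings match exactly and their submissions land within
# a few minutes of each other.
from itertools import combinations

submissions = {
    # student: (answer string, submission time in minutes past the hour)
    "A": ("BCADC", 42),
    "B": ("BCADC", 44),
    "C": ("BCADA", 10),
}

TIME_WINDOW = 5  # minutes -- an arbitrary threshold for this sketch

suspects = [
    (s1, s2)
    for (s1, a1), (s2, a2) in combinations(submissions.items(), 2)
    if a1[0] == a2[0] and abs(a1[1] - a2[1]) <= TIME_WINDOW
]
print(suspects)  # -> [('A', 'B')]
```

Notice that nothing in a check like this can tell an answer-sharing session apart from two students who were taught the same wrong thing and happened to submit before the same deadline.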

The researchers are basically looking for students that meet together and give each other answers to the test. Which, yes – it is suspicious if you see students turn in all the same answers at about the same time and get the same grade. Which is why most students make sure to change up a few answers, as well as space out submissions. I don’t know if the authors of this study realized they probably missed most cheaters and just caught the ones not trying that hard.

Or… let me propose something else here. All students are trying to get the right answers. So there are going to be similarities. Sometimes a lot of students getting the same wrong answer on a question is seen as a problem to fix on the teaching side (it could have been taught wrong). Plus, students can have similar schedules – working the same jobs, taking the same other classes that meet in the morning, etc. It is possible that out of the 15 or so they flagged as “suspect,” 1 or 2 or even 3 just happened to get the same questions wrong and submit at about the same time as the others. They just had bad luck.

I’m not saying that happened to all of them, but look: you do have a funnel effect with tests like these. All of your students are trying to get the same correct answers and finish before the same deadline. So it’s quite possible there will be overlap that is purely coincidental. Not for all of them, but isn’t it at least worth a critical examination if even a small number of students could get hurt by coincidentally turning in their test at the same time as others?

(This also makes a good case for ungrading, authentic assessment, etc.)

Of course, the “suspected” part gets dropped by the end of the paper: “We have applied this method in a for credit course taught in Selene Unicauca platform and found that 17% of the students have performed academic dishonest actions, based on current conservative thresholds.” How did they get from “suspected” to “have performed”? Did they talk to the students? Not really. They looked at five students and felt that there was no way their numbers could be anything but academic dishonesty. Then they talked to the instructor and found that three students had complained about low grades. The instructor looked at their tests, found they had the exact same wrong answers, and… case closed.

This is why I keep saying that Learning Analytics research projects should be required to have an instructional designer or learning research expert on the team. After reviewing course results for decades, I can say that it is actually common for students to get the same wrong answers and be upset about it because they were taught wrong. Instructors and instructional designers do make mistakes, so always find out what is going on. It’s also possible that there was a conversation weeks ago where one student with the wrong information spread it to several other students while discussing the class. It happens.

But this is what happens when you don’t investigate fully and assume the data is all you need. Throwing in a side of assuming that cheaters act a certain way certainly goes a long way as well. So you can see a direct line from assumptions made about the personality and demographics of cheaters, to using clickstream data to know what is going on in the brain, to assuming the data is all you need… all the way to the Dartmouth Medical School scandal. Where there is at least a 41%-76% false accusation rate currently.

Video Content or Audio-Only Content For Online Courses: Which is Better?

Like many of you, I saw this Tweet about audio-only lectures making the rounds on Twitter:

https://twitter.com/sivavaid/status/1389592396820795397

Now, of course, many questioned “why lectures?” (which is a good question to ask), but the main discussion seemed to focus on the content of courses more than lectures specifically. Video content (often micro-content) is common in online courses. There were many points raised about accessibility (both of videos and audio-only lectures). Many seem to feel strongly that you should do either video content or audio-only content. My main thought was: instead of asking “either/or”… why not think “both/and”?

From certain points of view, audio-only content addresses some accessibility issues many rarely consider. When creating video content, the speaker will sometimes rely on visual-only cues and images without much narration, leaving those that are listening with gaps in their understanding. So while it is easy to say “if you don’t want video, then just play the video in the background and don’t watch,” sometimes the audio portion of a video leaves out key pieces of information. This is usually because when the content gets to a visual part, the speaker often assumes everyone playing the video can see it.

“Look at what the red line does here…”

“When you see this, what do you think of?…”

And so on. People that record podcasts often know they have to describe any visuals they reference so people listening know what they are talking about. For accessibility purposes, we really should be doing this in videos as well. Not to mention that it helps the information make more sense for everyone, regardless of disability.

There are other advantages to audio-only content as well, such as being able to download the audio file to various devices and take it with you wherever you go. Some devices do this with video files – but how often do we offer videos for download? And what if someone has limited access or storage capacity for massive video files? Audio-only mp3 files work for a wider variety of people on the technical level.

On the other hand, there are times when video is preferred. The deaf or hard of hearing often come to mind. Additionally, some people find that the focus video requires helps them understand better. Video can also help increase teacher presence. Plus, pre-recorded video content is not the same as a Zoom call (or even a video lecture broadcast live), so it’s not really fair to throw both in the same bucket.

I would also point out that just because learners like audio-only one semester, that doesn’t mean the next semester of learners will. And I would guarantee that there are those in Vaidhyanathan’s course that didn’t really like the audio-only, but didn’t want to speak up and be the outlier.

Remember: Outliers ALWAYS exist in your courses. Never underestimate the silencing power of consensus.

But again, I don’t think it takes much extra time to give learners the option to choose for themselves what they want.

First of all, every video you post in a course should be transcribed and closed-captioned as a ground rule – not only for accessibility, but also for Universal Design for Learning. But I also know that this is an ideal that is often not supported financially at many institutions. For the sake of this article, I am not going to repeat the need to be proactive in making courses accessible.

So with that in mind, the main step that you will need to add into your course design process is to think through your video content (which is hopefully focused micro-content) and add in descriptions of any visual-only content. Don’t forget intro, transition, and ending graphics – speak out everything that will be on screen.

Then, while you are editing or finalizing the video, export to mp3 in addition to your preferred video format. Or use a tool that can extract the audio from the video (this is also helpful if you already have existing videos with no visual-only aspects). Offer that mp3 as a download on the page with the video (or even create a podcast with it). Now your students have the option to choose video or audio-only (or to switch as they like).
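
If your editing tool doesn’t export audio directly, a batch tool like ffmpeg can pull the audio track out of videos you already have. A minimal sketch, assuming ffmpeg is installed and your videos sit in a (made-up) folder:

```python
# A rough sketch: extract an mp3 from every mp4 in a folder with ffmpeg.
import pathlib
import subprocess

video_dir = pathlib.Path("course_videos")  # hypothetical folder of lecture videos

for video in video_dir.glob("*.mp4"):
    mp3 = video.with_suffix(".mp3")
    # -vn drops the video stream; -q:a 2 is a good VBR mp3 quality setting
    subprocess.run(["ffmpeg", "-i", str(video), "-vn", "-q:a", "2", str(mp3)], check=True)
```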

Also, once you get the video closed-captioned, take the transcript and spend a few minutes collecting it into paragraphs to make it more readable. Maybe even add the images from the video to the document (you would already have full alt descriptions in the text). Then put this file on the page with the video as a downloadable file as well. You could even consider collecting your transcripts into PressBooks to make your own OER. However you want to do it, just make it another option for learners to get the content.

Anyways… the idea here is that students can choose for themselves to watch the video, listen to the audio file, or read the transcript – all in the manner they want to on the device they want.

One of the questions that always comes up here is how to make the video content sound natural. Spontaneous, off-the-cuff recordings can miss material or go down a rabbit-hole. Plus, you might forget to describe some visual content. But reading pre-written scripts sounds wooden and boring. One of my co-authors for Creating Online Learning Experiences (Brett Benham) wrote about how to approach this issue in Chapter 10: Creating Quality Videos. You can read more at the link, but the basic idea is to quickly record a spontaneous take on your content and have that transcribed (maybe even by an automatic service to save some money). Then take that transcript, edit out the side-trails, mistakes, and missteps, and use your edited document to record the final video. It will still be your spontaneous voice, but cleaned up where needed and ready for closed-captioning.

To recap the basic points:

  1. Think about which parts of your video content will have visual aspects, and come up with a description for those parts in words.
  2. Record your video content with the visual aspects, but make sure to cover those descriptions you came up with.
  3. Create mp3 files from your videos and add that to the course page with the video embed/link and transcription file.

If you want to go to the next level with this:

  1. Enable downloading of your videos (or store them in a service that allows downloads if that option is not possible in your LMS).
  2. Turn your mp3 files into a podcast so that learners can subscribe and automatically download to devices when you post new files.
  3. Take your transcriptions and re-format them (don’t change any words or add/delete anything) into readable text, along with the visuals from the video. Save this as an accessible PDF and let learners download if they like.
  4. Collect your PDF transcripts into a PressBook, where you can add the audio and video files/links/embeds as well.
  5. Maybe even add some H5P activities to your PressBooks chapters to make them interactive lessons.

Op-Ed: Online Proctoring is Not Essential

After one of my usual Twitter rants about proctoring software, I was asked to turn the rant into an Op-Ed. Elearning Inside liked it enough to publish it:

In a recent op-ed about online proctoring, ProctorU CEO Scott McFarland made some concerning claims about how he feels proctoring online exams is “essential” and “indispensable.” Many were quick to point out their skepticism of the owner of a proctoring company making such a claim.

One important detail that McFarland left out was that the exams or tests themselves are not essential. Not only that, he skipped over some of the largest concerns with proctoring, while also not accurately addressing the research that is happening in this area…

You can read the rest of the article, where I make wild references to assessment gods, 5000% of students cheating, and general debunking of the current “cheating is everywhere” FUD. But the main point is that there is a better way based on solid course design.

The Problem of Learning Analytics and AI: Empowering or Resistance in the Age of “AI”

So where to begin with this series I started on Learning Analytics and AI? The first post started with an over-simplified view of the very basics. I guess the most logical place to jump to is… the leading edge of the AI hype? Well, not really… but there is an event in that area happening this week, so I need to go there anyways.

I was a bit surprised that the first post got some attention – thank you to those that read it. Since getting booted out of academia, I have been unsure of my place in the world of education. I haven’t really said much publicly or privately, but it has been a real struggle to break free from the toxic elements of academia and figure out who I am outside of that context. I was obviously surrounded by people that weren’t toxic, and I still adjunct at a university that I feel supports its faculty… but there are still other systemic elements that affect all of us, and those are hard to process once you are gone.

So, anyway, I just wasn’t sure if I could still write anything that made a decent point, and I wasn’t too confident I did that great of a job writing about such a complex topic in a (relatively) short blog post last time. Maybe I didn’t, but even a (potentially weak) post on the subject seems to resonate with some. Like I said in the last post, I am not the first to bring any of this up. In fact, if you know of any article or post that makes a better point than I do, please feel free to add it in the comments.

So, to the topic at hand: this week’s Empowering Learners in the Age of AI conference in Australia. My concern with this conference is not with who is there – it seems to be a great group of very knowledgeable people. I don’t know some of them, but many are big names in the field that know their stuff. What sticks out to me is who is not there, as well as how AI is being framed in the brief descriptions we get. But neither of those points is specific to this conference. In fact, I am not really looking at the conference as much as some parts of the field of AI, with the conference just serving as proof that the things I am looking at are out there.

So first of all, to address the name of the conference: I know that “empowering learners” is a common thing to say, not just in AI but in education in general. But it is also a very controversial and problematic concept. This is a concern I hang on all of education – and even on myself, as I like the term “empower” as well. No matter what my intentions (or anyone else’s) are, the term still places the institution and the faculty at the center of the power in the learning process – there to decide whether the learners get to be empowered or not. One of the best posts on this topic is by Maha Bali: The Other Side of Student Empowerment in a Digital World. At the end of the post, she gets to some questions that I want to ask of the AI field, including these key ones:

“In what ways might it reproduce inequality? How participatory has the process been? How much have actual teachers and learners, especially minorities, on the ground been involved in or consulted on the design, implementation, and assessment of these tools and pedagogies?”

I’ll circle back to those throughout the post.

Additionally, I think we should all question the “Age of AI” and “AI Society” part. It is kind of complicated to get into what AI is and isn’t, but the most likely form of AI we will see emerge first is what is commonly called “Artificial General Intelligence” (AGI), which is a deceptive way of saying “pretending to act like humans without really being intelligent like we are.” AGI is really a focus on creating something that “does” the same tasks humans can, which is not what most people would attribute to an “Age of AI” or “AI Society.” This article on Forbes looks at what this means, and how experts are predicting that we are 10-40 years away from AGI.

Just as an FYI, I remember reading in the 1990s that we were 20-40 years away from AGI then as well.

So we aren’t near an Age of AI, probably not in many of our lifetimes, and even the expert opinions may not end up being true. The Forbes article fails to mention that there were many problems with the work that claimed to be able to determine sexuality from images. In fact, there is a lot to be said about differentiating AI from BS that rarely gets brought up by the AI researchers themselves. Tristan Greene best sums it up in his article about “How to tell the difference between AI and BS“:

“Where we find AI that isn’t BS, almost always, is when it’s performing a task that is so boring that, despite there being value in that task, it would be a waste of time for a human to do it.”

I think it would have been more accurate to say you are “bracing learners for the age of algorithms” than empowering them for an age of AI (one that is at least decades off, and may never actually happen according to some). But that is me, and I know there are those that disagree. So I can’t blame people for being hopeful that something will happen in their own field sooner than it might in reality.

Still, the most concerning thing about the field of AI is who is not there in the conversations, and the Empowering Learners conference follows the field – at least from what I can see on their website. First of all, where are the learners? Is it really empowering for learners when you can’t really find them on the schedule or in the list of speakers and panelists? Why is their voice not up front and center?

Even bigger than that is the problem that has been highlighted this week – but one that has been there all along:

The specific groups being referred to are BIPOC, LGBTQA, and people with disabilities. We know that AI has discrimination coded into it. Any conference that wants to examine “empowerment” will have to make justice front and center because of long-existing inequalities in the larger field. Of course, we know that different people have different views of justice, but “empowerment” would also mean each person that faces discrimination gets to determine what that means. It’s really not fair to hold a single conference accountable for issues that existed long before the conference did, but by using the term “empowerment” you are holding yourself to a pretty big standard.

And yes, “empowerment” is in quotes because it is a problematic concept here, but it is the term the field of AI – and really a lot of the world of education – uses. The conference web page does ask “who needs empowering, why, and to do what?” But do they mean inequality? And if so, why not say it? The question is hardly mentioned again in the rest of the program, much less connected to inequality. Maybe it will be covered at the conference – it is just not very prominent at all as the schedule stands. I will give them the benefit of the doubt until after the conference happens, but if they do ask the harder questions, they should have highlighted that more on the website.

So in light of the lack of direct reference to equity and justice, the concept of “empowerment” feels like it is taking on the role of “equality” in those diagrams that compare “equality” with “equity” and “justice”:

Equality vs equity vs justice diagram
(This adaption of the original Interaction Institute for Social Change image by Angus Macguire was found on the Agents of Good website. Thank you Alan Levine for helping me find the attribution.)

If you aren’t going to ask who is facing inequalities (and I say this looking at the fields of AI, Learning Analytics, Instructional Design, Education, all of us), then you are just handing out empowerment the same to all. Just asking “who needs empowering, why, and to do what?” doesn’t get to critically examining inequality.

In fact, the assumption is being made by so many people in education that you have no choice but to utilize AI. One of the best responses to the “Equality vs Equity vs Justice” diagrams has come from Bali and others: what if the kids don’t want to play soccer (or eat an apple or catch a fish or whatever else is on the other side of the fence in various versions)?

Resistance is a necessary aspect of equity and justice. To me, you are not “empowering learners” unless you are teaching them how to resist AI itself first and foremost. But resistance should be taught to all learners – even those that “feel they are safe” from AI. This is because 1) they need to stand in solidarity with those that are the most vulnerable, to make sure the message is received, and 2) they aren’t as safe as they think.

There are many risks in AI, but are we really taking the discrimination seriously? In the linked article, Princeton computer science professor Olga Russakovsky said:

“A.I. researchers are primarily people who are male, who come from certain racial demographics, who grew up in high socioeconomic areas, primarily people without disabilities. We’re a fairly homogeneous population, so it’s a challenge to think broadly about world issues.”

Additionally, (now former) Google researcher Timnit Gebru said that scientists like herself are

“some of the most dangerous people in the world, because we have this illusion of objectivity.”

Looking through the Empowering Learners event, I don’t see that many Black and Aboriginal voices represented. There are some People of Color, but nowhere near enough considering they would be the ones most affected by the discrimination that would impede any true “empowerment.” And where are the experts on the harm caused by these tools, like Safiya Noble, Chris Gilliard, and many others? The event seems weighted towards voices that would mostly praise AI, and it is a very heavily white set of voices as well. This is the way many conferences are, including those looking at education in general.

Also, considering that this is in Australia, where are the Aboriginal voices? It’s hard to tell from the schedule itself. I did see on Twitter that the conference will start with an Aboriginal perspective. But when is that? In the 15-minute introductory session? That is nowhere near enough time for that. Maybe they are elsewhere on the schedule and just not noted well enough to tell. But why not make that a prominent part of the event rather than part of a 15-minute intro (if that is what it is)?

There are some other things I want to comment on about the future of AI in general:

  • The field of AI is constantly making references to how AI is affecting and improving areas such as medicine. I would refer you back to the “How to tell the difference between AI and BS” article for much of that. But something that worries me about the entire AI field talking this way is that they are attributing “artificial intelligence” to things that boil down to advanced pattern recognition built mainly on human intelligence. Let’s take, for example, recognizing tumors in scans. Humans program the AI to recognize patterns in images that look like tumors. Everything that the AI knows to look for comes directly from human intelligence. Just because you can then get the algorithm to repeat what the humans programmed it to do thousands of times per hour, that doesn’t make it intelligent. It is human pattern recognition that has been digitized, automated, and repeated rapidly (see the first code sketch after this list). This is generally what is happening with AI in education, defense, healthcare, etc.
  • Many leaders in education like to say that “institutions are ill-prepared for AI” – but how about how ill-prepared AI is for equity and reality?
  • There is also often talk in the AI community about building trust between humans and machines, and we see examples of it at the conference as well: “can AI truly become a teammate in group learning or a co-author of a ground-breaking scientific discovery?” I don’t know what the speaker plans to say, but the answer is no. No we shouldn’t build trust and no we shouldn’t anthropomorphize AI. We should always be questioning it. But we also need to be clear, again, that AI is not the one that is writing (or creating music or paintings). This is the weirdest area of AI – they feed a bunch of artistic or musical or literary patterns into AI, tell it how to assemble the patterns, and when something comes out it is attributed to AI rather than the human intelligence that put it all together. Again, the machine being able to repeat and even refine what the human put there in the first place is not the machine creating it. Take, for example, these different AI-generated music websites. People always send these to me and say “look how well the machine put together ambient or grindcore music or whatever.” Then I listen… and it is a mess. They take grindcore music and chop it up into bits and then run those bits through pattern recognition and spit out this random mix – that generally doesn’t sound like very good grindcore. Ambient music works the best to uninitiated ears, but to fans of the music it still doesn’t work that great.
  • I should also point out about the conference that there is a session on the second day that asks “Who are these built for? Who benefits? Who has the control?” and then mentions “data responsibility, privacy, duty of care for learners” – which is a good starting point. Hopefully the session will address equity, justice, and resistance specifically. As described, though, the session, like much of the field of AI, rests on the assumption that AI is coming and there is nothing you can do to resist it. Yes, the algorithms are here, and it is hard to resist – but you still can. Besides, experts are still saying 10-40 years for the really boring stuff to emerge, as I examined above.
  • I also hope the conference will discuss the meltdown that is happening in AI-driven proctoring surveillance software.
  • I haven’t gotten much into surveillance yet, but yes all of this relies on surveillance to work. See the first post. Watch the Against Surveillance Teach-In Recording.
  • I was about to hit publish on this when I saw an article about a Deepfake AI Santa that you can make say whatever you want. The article says “It’s not nearly as disturbing as you might think”… but yes, it is. Again, people are saying something made by AI is good and realistic when it is not. The Santa moves and talks like a robot with zero emotion. Here again, they used footage of a human actor and human voice samples, and the “AI” is an algorithm that chops it all up into the parts that make up your custom message. How could this possibly be misused?
  • One of the areas of AI that many in the field like to hype is “conversational agents,” aka chatbots. I want to address that as well since that is an area that I have (tried) to research. The problem with researching agents/bots is that learners just don’t seem to be impressed with them – it’s just another thing to them. But I really want to question how these count as AI after having created some myself. The process for making a chatbot is that you first organize a body of information into chunks of answers or statements that you want to send as responses. You then start “training” the AI to connect what users type into the agent (aka “bot”) with specific statements or chunks of information. The AI makes a connection and sends the statement or information or next question or video or whatever it may be back to the user. But the problem is, the “training” is you guessing dozens of ways that a person might ask a question or make a statement (including typos or misunderstandings) that matches the chunk of information you want to send back. You literally do a lot of the work for the AI by telling it all the ways someone might type something into the agent that matches each chunk of content. Most tools want at least 20 of these per chunk. What this means is that most of the time, when you are using a chatbot, it gives you the right answer because you typed in one of the most likely questions that a human guessed and added to the “training” session. In the rare cases where someone types something a human didn’t guess, the Natural Language Processing kicks in to try and guess the best match. But even then, the match could be little more than a percentage of similar words rather than “intelligence” (see the second code sketch after this list). So, again, it is human intelligence that is automated and re-used thousands of times a minute – not something artificial that has a form of intelligence. Now, this might be useful in a scenario where you have a large body of information (like an FAQ bot for the course syllabus) that could use something better than a search function. Or maybe a branching-scenarios lesson. But it takes time to create a good chatbot. There is still a lot of work and skill in creating the questions and responses well. But to use chatbots for a class of 30, 50, 100? You probably will spend so much time making it that it would be easier to just talk to your students.
  • Finally, please know that I realize that what I am talking about still requires a lot of work and intelligence to create. I’m not doubting the abilities of the engineers and researchers and others that put their time into developing AI. I’m trying to get at the pervasive idea that we are in an Age of AI that can’t be avoided – an idea that was even pushed in a documentary web series a year ago. I also question whether “artificial intelligence” is the right term for all of this, rather than something more accurate like “automation algorithms.”
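
Since I keep saying “human pattern recognition that has been digitized,” here is roughly what I mean. This is a minimal sketch of the tumor example in Python – the features, labels, and numbers are made-up stand-ins, not any real medical system:

```python
# A minimal sketch: a "tumor detector" is pattern matching trained
# entirely on human judgment. All data here is a hypothetical stand-in.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Step 1: humans look at scans and label them (this is the intelligence).
# Each row stands in for features extracted from one scan image.
scan_features = np.array([
    [0.9, 0.8, 0.7],  # a scan a radiologist labeled "tumor"
    [0.1, 0.2, 0.1],  # a scan a radiologist labeled "no tumor"
    [0.8, 0.9, 0.6],
    [0.2, 0.1, 0.2],
])
human_labels = np.array([1, 0, 1, 0])  # 1 = tumor, 0 = no tumor

# Step 2: the "AI" encodes the statistical pattern in the human labels.
model = LogisticRegression().fit(scan_features, human_labels)

# Step 3: that pattern is repeated thousands of times per hour. Nothing
# here was learned that a human didn't first encode through the labels.
new_scan = np.array([[0.85, 0.75, 0.65]])
print(model.predict(new_scan))  # -> [1], echoing the human labelers
```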
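
And since I brought up chatbot “training” above, here is a second minimal sketch of that workflow stripped of the marketing. The intents, phrases, and threshold here are hypothetical illustrations, not any specific chatbot platform’s API:

```python
# A minimal sketch of the chatbot "training" workflow. The intents,
# phrases, and threshold are hypothetical, not any real platform's API.
import string

# Step 1: a human chunks the course content into canned responses.
responses = {
    "due_date": "The final project is due at the end of week 15.",
    "late_policy": "Late work loses 10% per day, for up to five days.",
}

# Step 2: a human guesses the ways students might ask for each chunk.
# Real tools typically want 20+ phrases per chunk; two shown here.
training_phrases = {
    "due_date": ["when is the project due", "project deadline"],
    "late_policy": ["what if i turn it in late", "late penalty"],
}

def reply(user_text: str) -> str:
    """Match the user's words against the human-written phrases."""
    cleaned = user_text.lower().translate(
        str.maketrans("", "", string.punctuation))
    words = set(cleaned.split())
    best_intent, best_overlap = None, 0.0
    for intent, phrases in training_phrases.items():
        for phrase in phrases:
            phrase_words = set(phrase.split())
            # The "NLP" fallback: just a percentage of shared words.
            overlap = len(words & phrase_words) / len(phrase_words)
            if overlap > best_overlap:
                best_intent, best_overlap = intent, overlap
    if best_intent is not None and best_overlap >= 0.5:
        return responses[best_intent]
    return "Sorry, I don't understand. Try rephrasing?"

print(reply("When is the project due?"))  # -> the due_date chunk
```

Notice that every answer, and every way of asking for it, was written by a human – the “AI” part is just picking the biggest word overlap.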

Again, everything I touch on here is not as much about this conference as it is about the field of AI, since this conference is really just a lot of what is in the AI field concentrated into two days and one website. The speakers and organizers might have already planned to address everything I brought up here a long time ago, and they just didn’t get it all on the website. We will see – there are some sessions with no description and just a bio. But still, at the core of my point, I think that educators need to take a different approach to AI than we have so far (maybe by not calling it that when it rarely is anything near intelligent) by taking justice issues seriously. If the machine is harming some learners more than others, the first step is to teach resistance, and to be successful in that, all learners and educators need to join in the resistance.

The Problem of Learning Analytics and AI

For some time now, I have been wanting to write about some of the problems I observed during my time in the Learning Analytics world (which also crosses over into Artificial Intelligence, Personalization, Sentiment Analysis, and many other areas as well). I’m hesitant to do so because I know the pitchforks will come out, so I guess I should point out that all fields have problems. Even my main field of instructional design is far from perfect. Examining issues within a field (should be) a healthy part of its growth. So this will probably be a series of blog posts as I look at publications, conferences, videos, and other aspects of the LA/PA/ML/AI etc world that are in need of a critical examination. I am not the first or only person to do this, but I have noticed a resistance by some in the field to consider these viewpoints, so hopefully adding more voices to the critical side will bring more attention to these issues.

But first I want to step back and start with the basics. At the core of all analytics, machine learning, AI, etc are two things: surveillance and algorithms. Most people wouldn’t put it this way, but let’s face it: that is how it works. Programs collect artifacts of human behavior by looking for them, and then process those through algorithms. Therefore, the core of all of this is surveillance and algorithms.

At the most basic level, the surveillance part is a process of downloading a copy of data from a database that was intentionally recording data. That data is often a combination of click-stream data, assignment and test submissions, discussion forum comments, and demographic data. All of this is surveillance, and in many cases this is as far as it goes. A LOT of the learning analytics world is based on click stream data, especially with an extreme focus on predictive analytics. But in a growing number of examples, there are also more invasive forms of surveillance added that rely on video recordings, eye and motion detection, bio-metric scans, and health monitoring devices. The surveillance is getting more invasive.

I would also point out that none of this is accidental. People in the LA and AI fields like to say that digital things “generate” data, as if it is some unintentional by-product of being digital: “We turned on this computer, and to our surprise, all this data magically appeared!”

Data has to be intentionally created, extracted, and stored to exist in the first place. In fact, there usually is no data in any program until programmers decide they need it. They will then create a variable to store that data for use within the program. And this moment is where bias is introduced. The reason why certain data – like names, for example – are collected and others aren’t has to do with a bias towards controlling who has access and who doesn’t. Then that variable is given a name – it could be “XD4503” for all the program cares. But to make it easier for programmers to work together, they create variable names that can be understood by everyone on the team: “firstName,” “lastName,” etc.

Of course, this designation process introduces more bias. What about cultures that have one name, or four names? What about those that have two-part names, like the “al” that is common in Arabic names but isn’t really used for alphabetizing purposes? What about cultures that use their surname as their first name? What about random outliers? When I taught eighth grade, I had two students that were twins, and their parents gave them both nearly identical sets of five names. The only difference between the two was that the third name was “Jevon” for one and “Devon” for the other. So much of the data that is created – as well as how it is named, categorized, stored, and sorted – is biased towards certain cultures over others.
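
To make that concrete, here is a minimal sketch in Python of how the bias gets baked in at the variable level. The field names and example people are hypothetical:

```python
# A minimal sketch of bias introduced at variable-creation time.
# The schema and example people are hypothetical illustrations.
from dataclasses import dataclass

@dataclass
class StudentRecord:
    # The programmers' assumptions are already encoded in these fields:
    # everyone is presumed to have exactly a "first" and a "last" name.
    firstName: str
    lastName: str

# Works fine for the names the schema was designed around...
record_a = StudentRecord(firstName="Jane", lastName="Smith")

# ...but what goes in lastName for someone with a single name?
record_b = StudentRecord(firstName="Suharto", lastName="")

# And sorting "by last name" silently mis-handles particles like "al-"
# and cultures where the family name comes first.
roster = sorted([record_a, record_b], key=lambda r: r.lastName)
print([r.firstName for r in roster])  # the empty lastName sorts first
```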

Also note here that there is usually nothing that causes this data to leave the program utilizing it. In order for some outside process or person to see this data, programmers have to create a method for displaying and/or storing that data in a database. Additionally, any click-stream, video, or bio-metric data that is stored has to be specifically and intentionally captured in ways that can be stored. For example, a click in itself is really just an action that makes a website execute some function. It disappears after that function happens – unless someone creates a mechanism for recording what was clicked on, when it was clicked, what user was logged in to do the click, and so on.
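
Here is a minimal sketch of that choice in code. The function names and logged fields are hypothetical, but the pattern is the same everywhere: the click only becomes data because someone wrote the lines that capture it:

```python
# A minimal sketch: a click does nothing durable until a programmer
# deliberately writes code to capture it. All names here are
# hypothetical, not any real LMS's API.
import json
from datetime import datetime, timezone

click_log = []  # stands in for the analytics database table

def open_lesson(lesson_id: str) -> str:
    """What the click actually does: run a function, then disappear."""
    return f"<html>lesson {lesson_id}</html>"

def open_lesson_instrumented(lesson_id: str, user_id: str) -> str:
    """The same action, plus a deliberate choice to record who/what/when."""
    click_log.append({
        "user": user_id,      # chosen: tie the click to a person
        "target": lesson_id,  # chosen: record what was clicked
        "at": datetime.now(timezone.utc).isoformat(),  # chosen: when
    })
    return open_lesson(lesson_id)

open_lesson_instrumented("week-3-quiz", "student-42")
print(json.dumps(click_log, indent=2))  # this data exists only by choice
```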

All of this to say that none of this is coincidental, accidental, or unplanned. There is a specific plan and purpose for every piece of data that is created and collected outside of the program utilizing it. None of the data had to be collected just because it was magically “there” when the digital systems were turned on. The choice was made to create the data through surveillance, and then store it in a way that it could be used – perpetually if needed.

Therefore, different choices could be made to not create and collect data if the people in control wanted it that way. It is not inevitable that data has to be generated and collected.

Of course, most of the few people that will read this blog already know all of this. The reason I state it all here is for anybody that might still be thinking that the problems with analytics and AI are created during the design of the end-user products. For example, some believe that the problems that AI proctoring has with prejudice and discrimination started when the proctoring software was created… but really this part is only the continuation of problems that started when the data that these AI systems utilize was intentionally created and stored.

I think that the basic fundamental lens or mindset or whatever you want to call it for publishing research or presenting at conferences about anything from Learning Analytics to AI has to be a critical one rooted in justice. We know that surveillance and algorithms can be racist, sexist, ableist, transphobic – and the list of prejudices goes on. Where people are asking the hard questions about these issues, that is great. Where the hard questions seem to be missing, or people are not digging deep enough to see the underlying biases, I want to blog about it. I have also noted that the implementation of LA/ML/AI tools in education too often lacks input from the instructional design / learning sciences / etc fields – so that will probably be in the posts as well.

While this series of posts is not connected to the Teach-In Against Surveillance, I was inspired to get started on this project based on reflecting on why I am against surveillance. Hopefully you will join the Teach-In tomorrow, and hopefully I will get the next post on the Empowering Learners for the Age of AI conference written in this lifetime. :)

People Don’t Like Online Proctoring. Are Institutional Admins Getting Why?

You might have noticed a recent increase in the complaints and issues being leveled against online proctoring companies. From making students feel uncomfortable and/or violated, to data breaches and CEOs possibly sharing private conversations online, to a growing number of student and faculty/staff petitions against the tools, to lawsuits being leveled against dissenters for no good reason, the news has not been kind to the world of Big Surveillance. I hear the world’s tiniest violin playing somewhere.

It seems that the leadership at Washington State University decided to listen to concerns… uhhh… double down and defend their position to use proctoring technology during the pandemic. While there are great threads detailing different problems with the letter, I do want to focus on a few statements specifically – not to pick on this one school, but because WSU’s response is typical of what you hear from too many Higher Ed administrations. For example, when they say…

violations of academic integrity call into question the meaningfulness of course grades

That is actually a true statement… but not in the way it was intended. The intention was to say that cheating hurts academic integrity because it messes up the grade structures, but it could also be taken to say that cheating highlights the problem with the meaningfulness of grades, because cheating really doesn’t affect anyone else.

Think about it: someone else cheats, and it casts doubt on the meaning of my grade if I don’t cheat? How does that work exactly? Of course, this is a nonsense statement that really highlights how cheating doesn’t change the meaning of grades for anyone else. It’s like the leaders at this institution are right there, but don’t see the forest for the trees: what exactly does a grade mean if the cheaters that get away with it don’t end up hurting anyone but themselves? Or does cheating only cause problems for non-cheaters when the cheaters get caught? How does that one work?

But let’s focus here: grades are the core problem. Yes, many people feel they are arbitrary and even meaningless. Still others say they are unfair, while some look at them as abusive. At the very least, you really should realize grades are problematic. Students can guess and get a higher grade than what they really actually know. Tests can be gamed. Questions have bias and discrimination built in too many times. And so on. Online proctoring is just an attempted fix for a problem that existed long before “online” was even an option.

But let’s see if the writers of the letter explain exactly how one person cheating harms someone else… because maybe I am missing something:

when some students violate academic integrity, it’s unfair for the rest. Not only will honest students’ hard work not be properly reflected…. Proctoring levels the playing field so that students who follow the rules are not penalized in the long run by those who don’t.

As someone that didn’t cheat in school, I am confused as to how this exactly works. I really never spent a single minute caring about other students’ cheating. You knew it happened, but it didn’t affect you, so it was their loss and not yours. In fact, you never lost anything in the short or long run from other students’ cheating. I have no clue how my hard work was not “properly reflected” because of other students’ cheating.

(I would also note that this “level the playing field” line means they assume proctoring services catch all “cheaters” online, just like having instructors in the classroom on campus supposedly meant that all of the “cheaters” in those classes were caught. But we all know that is not the case.)

I have never heard a good answer for how this supposed “penalization” works. Most of the penalization I know of in classes comes from systemic issues against BIPoC students – issues that happen in ways proctoring never deals with. You sometimes wish institutions would put as much money into fighting that as they put into spying through student cameras….

But what about the specific concerns with how these services operate?

Per WSU’s contract, the recorded session is managed by an artificial intelligence “bot” and no human is on the other end at ProctorU watching the student. Only the WSU instructor can review the recorded session.

A huge portion of the concern about proctoring has been about the AI bots – which are here presented as an “it’s all okay because” solution…? Much of the real concern many have expressed is with the algorithms themselves and how they are usually found to be based on racist, sexist, and ableist norms. Additionally, the other main concern is what the instructor might see when they do review a recording of a student’s private room. No part of the letter in question addresses any of the real concerns with the bigger picture.

(It is probably also confusing for people to be told that no one is watching on the other side of the camera when there are so many complaints online from students that have had issues with human proctors, especially ones that were “insulting me by calling my skin too dark,” as one complaint states.)

The response then goes on to talk about getting computers that will work with the proctoring service to students that need them, or having students come to campus for in-person proctoring if they just refuse to use the online tool. None of this addresses the concerns of AI bias, home privacy, or safety during a pandemic.

The moral of the point I am making here is this: if you are going to respond to concerns that your faculty and staff have, make sure you are responding to the actual concerns and not some imaginary set of concerns that few have expressed. There is a bigger picture as to why people are objecting to these services, which – yes – may start with feeling like they are being spied on by people and/or machines. But just saying “look – no people! (kind of)” is not really addressing the core concerns.