The threat to the integrity of educational assessments is not from ‘essay mills’ but from Artificial Intelligence (AI)

The threat to the integrity of educational assessments is no longer from ‘essay mills’ and contract cheating but from Artificial Intelligence (AI).

It is not so long ago that academics complained that essay mills, ‘contract cheating’ services, and commercial companies piecing together ‘bespoke’ answers to standard essay questions were undermining the integrity of higher education’s assessment processes. The outputs of these less than ethically justifiable endeavours tried to cheat the plagiarism detection software (such as Turnitin and Urkund) that so many institutions have come to rely on. This reliance, in part the result of rising student-tutor ratios, the use of adjunct markers and poor assessment design, worked for a while. It no longer works particularly well.




Many institutions sighed with relief when governments began outlawing these commercial operations (in April 2022 the UK passed the ‘Skills and Post-16 Education Act 2022’, following New Zealand and Australian examples) and went back to business as usual. For the less enlightened, this meant a return to setting generic, decontextualised, knowledge-recitation essay tasks. Some have learnt at least to require a degree of contextualisation in their students’ work, to introduce internal self-justification and self-referencing, to require ‘both sides’ arguments rather than declared positions, and to apply the ‘could this already have been written’ test in advance. Banning essay mills, or ‘contract cheating’, is necessary, but it is not enough to secure the integrity of assessment regimes.

Why students plagiarise is worthy of its own post, but suffice it to say it varies greatly depending on the student. A very capable student may simply be terrible at time management and fear running out of time, or feel the assessment is unworthy of them. Another may be fearful of their ability to express complex arguments and, in pursuit of the best possible grade, plagiarise. Some may simply not have learnt to cite and reference, or to appreciate that rewording someone else’s thoughts without attributing them also constitutes plagiarism. And there is that category of students whose cultural reference point, deference to ‘the words of the master’, makes plagiarism conceptually difficult for them to understand.

I remember receiving my most blatant example of plagiarism and academic malpractice back in 2006. A student submitted a piece of work that included 600 words copied wholesale from Wikipedia, complete with internal bookmarks and hyperlinks. I suspect the majority of students are now sufficiently digitally literate not to make that mistake, but how many are also now in a position to do what the essay mills used to do for them: stitch together, paraphrase and redraft existing material using freely available AI text generation tools?

As we encourage our students to search the web for sources, how easy is it for them to access some of the readily available, and often free, online tools? These include https://app.inferkit.com/demo, which allows you to enter a few sentences and then generate longer texts on the basis of that seed. You can enter merely a title of at least five words, or a series of sentences, into https://smodin.io/writer and have it generate a short essay, free of references. Professional writing tools aimed at marketers, such as https://ai-writer.com, require a subscription to be effective, but would allow students to generate passable work. This last tool actually tells you the sources from which its abstractions have been drawn, including academic journals.

You might find it enlightening to take something you have published and put it through one of these tools and evaluate the output.

It is insufficient to ask the student to generate their own question, or even to contextualise their own work: some of the emergent AI tools can take account of context. There is a need to move away from the majority of long-form text assessments. With the exception of those disciplines where writing more than a thousand words at once is justifiable (journalism, policy studies, and some humanities subjects), assessments need to be as close to real-world experience as possible, and evidently the product of an individual.

Paraphrasing is a skill, and a valuable one in a world where most professions do not lack for information. The challenge is to evaluate the quality of that information and then reduce it to a workable volume.

I’ve worked recently with an institution reviewing its postgraduate politics curriculum. I suggested that, rather than try to stop students from ‘cheating’ by paraphrasing learned texts, they should encourage the students to learn what they need to do to enhance the output of these AI tools. Using one of these tools to paraphrase, and essentially rewrite, a WHO report for health policymakers made it more readable, but it also left out certain details that would be essential for effective policy responses. Knowing how to read the original, use a paraphrasing tool, and then identify and correct the deficiencies of its output was a useful skill for these students.

We cannot stop the encroachment of these kinds of AI text manipulation tools in higher education, but we can make their contemporary use more meaningful to the student.


If you are interested in reviewing your programme or institutional assessment strategy and approaches please get in touch. This consultancy service can be done remotely. Contact me.


Image was generated by DALL·E



Evaluation, Assessment and Feedback (Guidance to Educators)

Transcript

Welcome all. Please feel free to share this video if you think it would be of interest to your colleagues.

I want to talk today about some of the terminological differences that we have across the English-language teaching world, particularly the terms evaluation, assessment, and feedback. In North America, the word evaluation is very often used to describe the way we measure students’ performance. In the United Kingdom, and in Australia and New Zealand, we generally use the term assessment. So evaluation has a different meaning in those parts of the English-speaking world than it does in North America. Likewise, assessment and evaluation are sometimes used as near synonyms in the North American context. You need to be aware of that when you read the literature: if you read any of the journals, you will find that sometimes those terms are used differently from the way they are used in your own context. So it’s worth being aware of that.

There’s also a distinction between evaluation and feedback, which is more conceptual than definitional: feedback is always what we give to the student. We should always be focusing on the feedback that is given to students on their learning, while evaluation, in the UK, in Canada to some extent, but certainly in Australia and New Zealand, is used to describe what students tell us about our own performance as tutors, or about the course or the institution. So, they provide evaluative comment, and we provide them with feedback.

I think it’s important that we try and stick to that use of language, if only because students need to value feedback in everything they do, and it’s much easier to label things as feedback for the benefit of your students if you’re consistent in the language that you use. So, feedback is given to students, evaluation is provided by students, and evaluation in North America is sometimes synonymous with assessment. I hope that’s of interest.

Please feel free to like, share, and follow.

Be well.

Designing Courses: Thinking about Programming Assessments (5’52”)

In this short video (5’52”), Simon touches on three basic principles of programming assessments. The first is that assessment should be programme-wide, the second that assessing outcomes, not content, provides future flexibility, and the third that summative (or credit-bearing) assessments do not have to be final or terminal assessments. Assessment is one of the most difficult areas for faculty to become comfortable with. Most will have experienced badly designed assessment themselves, and their expectations of their academic managers, programme leaders and their students are often low. This is a shame, because well-designed assessment can be a pleasure for both students and faculty.

These resources from 2013-2017 are being shared to support colleagues new to teaching online in the face of the COVID-19 pandemic.

Designing Courses: Assessment – First Principles (9’38”)

In this short lecture (9’38”), Simon outlines the basic structure of sound assessment, describing reliability, validity, and fairness and exploring a range of different assessment forms, from diagnostic to synoptic (capstone), and from formative to summative. Being familiar with some of the language around assessment is important in order to get the most out of the literature and others’ experiences. I believe that well-designed assessment is something all faculty will want to be involved in grading and marking, rather than trying to pass those duties on to others. Assessing your own students should be a fulfilling experience, and well-designed assessment enables that to happen.

These resources from 2013-2017 are being shared to support colleagues new to teaching online in the face of the COVID-19 pandemic.

Developing a Meaningful Assessment Strategy (5/8-SLDF)

Student expectations are serious constraints on imaginative assessment processes. Students are taught through their educational experience that ‘final grades matter’ so it is natural that they become fixated on the final assessment. A transparent design that closely reflects the ILOs within the teaching activity and assessment will necessarily engage students in a broader and deeper understanding of their learning journey.

See Workshops or the page associated with this stage of the 8-SLDF.

Following the 8-Stage Learning Design Framework, we know at this stage what our intended learning outcomes (ILOs) are. This enables us to design meaningful assessment that provides opportunities for students to evidence their learning against those ILOs. It is important initially to identify which outcomes, across different domains of learning, can be combined through assessment. This allows us to manage the assessment load, for both faculty and students, while ensuring all ILOs are assessed (a simple coverage check is sketched below). Using taxonomy circles, we can then draft marking rubrics for the appropriate level that represent all the guidance individual assessors and students need to guide their practice.
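
For those who like to see the idea made concrete, here is a minimal sketch of that coverage check. The ILO labels and assessment tasks are hypothetical, invented purely for illustration; the point is simply that combining outcomes within tasks keeps the assessment load down while a quick check confirms nothing is left unassessed.

```python
# Hypothetical ILOs and assessment mapping, for illustration only.
ilos = {"ILO1", "ILO2", "ILO3", "ILO4", "ILO5"}

# Each task deliberately combines outcomes from more than one domain,
# which keeps the number of tasks (and the marking load) down.
assessment_map = {
    "Policy briefing":      {"ILO1", "ILO3"},
    "Group presentation":   {"ILO2", "ILO4"},
    "Reflective portfolio": {"ILO4", "ILO5"},
}

# Which ILOs are evidenced by at least one assessment task?
assessed = set().union(*assessment_map.values())
unassessed = sorted(ilos - assessed)

if unassessed:
    print(f"Not yet assessed: {unassessed}")
else:
    print("Every ILO is evidenced by at least one assessment task.")
```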

There is also the challenge of meeting the requirements for assessment, or the outcomes, sometimes even the content, dictated by external bodies. In many programmes there is pressure to assess a range of skills and behaviours beyond subject knowledge. The challenge is to design assessments that allow students to demonstrate a range of skills (across various domains) through a single assessment. Rather than abdicating our responsibility as learning designers, this is a call to better articulate the relationship between what the intended learning outcomes of a course are, how it is being assessed, and what is being experienced as learning by the students.

 

Drafting an assessment framework is an iterative process. Ideally one designs the assessment at the same time as one writes the module ILOs, ‘tweaking’ both to give them depth and flexibility. If students are properly guided to generate well-structured evidence, it can be a fascinating and engaging process to assess them.

Explore the 8-SLDF here for the fuller details of assessment and all other stages.

 

 

Four contractual agreements to make with your students about feedback (3’30”)

There are social conventions, unwritten rules, around feedback in a formal education setting. Most students think of feedback as coming from the voice of authority, in the form of red marks on a written script! It is important to redefine feedback for university and professional learners.

In this short overview video (3’30”) Simon outlines four ‘contractual’ arrangements all faculty should establish at the outset of their course or module with respect to feedback for learning.

These are:
1) ensuring that students know WHERE feedback is coming from
2) WHEN to expect feedback
3) WHAT you mean by feedback
4) WHAT to DO with the feedback when it’s received.

  1. Feedback is undoubtedly expected from the tutor or instructor, but there are numerous feedback channels available to students if only they are conscious of them. These include feedback from their peers, but most importantly from self-assessment and from the learning activities designed into the class.
  2. Knowing where feedback is coming from as part of the learning process relieves the pressure on the tutor and in effect makes feedback a constant ‘loop’. Knowing what to look out for, and possibly having students document the feedback they receive, supports their metacognitive development.
  3. Being clear with students as to what you regard as feedback is an effective way of ensuring that students take ownership of their own learning. My own definition is extremely broad: from the follow-up comments one receives on anything shared in an online environment, to the nods and vocal agreement in class in response to things you say. These are all feedback. Knowing that also encourages participation!
  4. What you suggest students do with feedback will depend a little on the nature of the course and the formal assessment processes. Students, naturally enough, don’t do things for the sake of it, so it has to be of discernible benefit to them. If there is some form of portfolio-based coursework assessment, you could ask for an annotated ‘diary’ of feedback received through the course. If it’s a course with strong professional interpersonal outcomes (nursing or teaching, for example), you might ask students to identify their favourite and least favourite pieces of feedback from the course, with a commentary on how each affected their subsequent actions.

What’s important is to recognise that there are social conventions around feedback in a formal education setting, normally associated with red marks on a written script! It is important to redefine feedback for university and professional learners.

Simon Paul Atkinson (PFHEA)
https://www.sijen.com
SIJEN: Consultancy for International Higher Education

Four Types of Assessment (Video 4’42”)

In response to a question from a client, I put together this short video outlining four types of assessment used in higher education: formative, summative, ipsative and synoptic. It’s produced as an interactive H5P video. Please feel free to link to this short video (under 5 minutes) as a resource if you think your students would find it of use.

Book links:

https://amzn.to/2INGIgq

Pokorny, H., & Warren, D. (Eds.). (2016). Enhancing Teaching Practice in Higher Education. SAGE Publications Ltd.

https://amzn.to/2INh4sq

Irons, A. (2007). Enhancing Learning through Formative Assessment and Feedback (New edition). Routledge.

https://amzn.to/2IKdzD3

Hauhart, R. C. (2014). Designing and Teaching Undergraduate Capstone Courses (1st ed.). San Francisco: Jossey-Bass.

https://amzn.to/2sgnTaz

Boud, D., Ajjawi, R., Dawson, P., & Tai, J. (Eds.). (2018). Developing Evaluative Judgement in Higher Education: Assessment for Knowing and Producing Quality Work (1st ed.). Abingdon, UK: Routledge.

Simplifying the Alignment of Assessment

Some recent work with programme designers in other UK institutions suggests to me that quality assurance and enhancement measures continue to be appended to the policies and practices of UK HEIs, rather than prompting a revitalising redesign of the entire design and approval process.

This is a shame, because it has produced a great deal of work for faculty in designing and administering programmes and modules, not least when it comes to assessment. Whatever you feel about intended learning outcomes (ILOs) and their constraints or structural purpose, there is nearly universal agreement that the purpose of assessment is not to assess students’ ‘knowledge of the content’ of a module. Rather, the intention of assessment is to demonstrate higher learning skills, most commonly codified in the intended learning outcomes. I have written elsewhere about the paucity of effectively written ILOs and the tendency to focus them almost entirely on the cognitive domain (intellectual skills), omitting other skill domains, notably the affective (professional skills) and the psychomotor (transferable skills). Here I want to identify the need for close proximity between ILOs and assessment criteria.

It seems to me that well-designed intended learning outcomes lead to cogent assessment design, and that a transparent marking rubric, used by both markers and students, creates a simpler process.

To illustrate this I wanted to share two alternative approaches to aligning assessment to the outcomes of a specific module. In order to preserve the confidentiality of the module in question some elements have been omitted but hopefully the point will still be clearly made.

A Complex Attempt at Assessment Alignment

I have experienced this process in several universities.

  1. Intended Learning Outcomes are written (normally at the end of the ‘design’ process)
  2. ILOs are mapped to different categorisations of domains: Knowledge & Understanding, Intellectual Skills, Professional Skills and Attitudes, and Transferable Skills.
  3. ILOs are mapped against assessments, sometimes even mapped to subject topics or weeks.
  4. Students get first sight of the assessment.
  5. Assessment Criteria are written for students using different categories of judgement: Organisation, Implementation, Analysis, Application, Structure, Referencing, etc.
  6. Assessment Marking Schemes are then written for assessors, often with guidance as to what might be expected at specific threshold stages in the marking scheme.
  7. General Grading Criteria are then developed to map the scheme’s outcomes back to the ILOs.

A Streamlined Version of Aligned Assessment

streamlined marking rubric

I realise that this proposed structure is not suitable for all contexts, all educational levels and all disciplines. Nonetheless I would advocate that this is the optimal approach.

  1. ILOs are written using a clear delineation of domains: Knowledge, Cognitive (Intellectual), Affective (Values), Psychomotor (Skills) and Interpersonal. These use appropriate verb structures tied directly to appropriate levels. This process is explained in this earlier post.
  2. A comprehensive marking rubric is then shared with both students and assessors. It identifies all of the ILOs that are being assessed. In principle, in UK higher education, we should only be assessing the ILOs, not the content. The rubric will differentiate the types of response expected to achieve the various grading levels.
    • There is an option to automatically sum grades given against specific outcomes, or to take a more holistic view.
    • It is possible to weight specific ILOs as being worth more marks than others (a simple weighting sketch follows this list).
    • This approach works for portfolio assessment, but also for a model where there are perhaps two or three separate pieces of assessment, assuming each piece is linked to two or three ILOs.
    • Feedback is given against each ILO on the same rubric (I use Excel workbooks).
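
To make the summing and weighting concrete, here is a minimal sketch of how grades given against specific outcomes could be combined into a weighted module mark. The ILO names, weights and marks are hypothetical, invented purely for illustration; my own rubrics live in Excel workbooks, but the arithmetic is the same.

```python
# Hypothetical ILOs, each with a weight (its share of the module mark)
# and the mark awarded against it on a 0-100 scale. Illustration only.
rubric = {
    "ILO1 (Knowledge)":   {"weight": 0.20, "mark": 72},
    "ILO2 (Cognitive)":   {"weight": 0.40, "mark": 65},
    "ILO3 (Affective)":   {"weight": 0.20, "mark": 58},
    "ILO4 (Psychomotor)": {"weight": 0.20, "mark": 80},
}

def weighted_module_mark(rubric: dict) -> float:
    """Sum each ILO mark multiplied by its weight; weights should total 1.0."""
    total_weight = sum(item["weight"] for item in rubric.values())
    assert abs(total_weight - 1.0) < 1e-9, "ILO weights must sum to 1.0"
    return sum(item["mark"] * item["weight"] for item in rubric.values())

# Feedback (and a mark) is recorded against each ILO on the same rubric.
for ilo, item in rubric.items():
    print(f"{ilo}: {item['mark']} (weight {item['weight']:.0%})")

print(f"Weighted module mark: {weighted_module_mark(rubric):.1f}")  # 68.0
```

A more holistic marker could of course override the computed total; the sketch shows only the mechanical option.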

I would suggest that it makes sense to use this streamlined process even if it means rewriting your existing ILOs. I’d be happy to engage in debate with anyone about how best to use the streamlined process in their context.

Will Lightwork make a Mark?

Why is it that whenever we want to reward academic staff, the incentive is to “buy yourself out of teaching”, or at the very least “offload some marking”? Of course, the answer is often that the alternatives are to remove yourself from service or administration (and the place grinds to a halt) or, God forbid, let up on the research outputs. So teaching is the malleable element, and assessment all the more so.

Shame. How do you really know if your teaching is effective if you don’t see the results? How can you revise and improve your paper if you don’t complete that feedback loop for students?

Of course, marking can be a fairly tedious process, even a favourite movie gets tiresome after the twentieth viewing, but it’s a necessary process and anything that makes it a little easier has to be a good thing.

So I picked up this application here at Massey University called Lightwork. A development project led by Dr Eva Heinrich, this desktop client integrates with Moodle and its gradebook. Once ‘paired’, Lightwork downloads student details and allows the creation of marking rubrics and the assignment of markers; these are then synchronised back to Moodle, so that approved grades in Lightwork are uploaded into the gradebook along with a PDF of the completed marking rubric. Well worth a look. I confess I’m trialling it on a paper with only 10 students, but the admin time saved by not having to save feedback forms under different student names, etc., must be worth it.

Screenshots of Lightwork Assessment Tool
Lightwork: Rubrics and Student PDF Feedback form generated in Moodle