8. Evaluation

In-course and Post Course Evaluation Strategies

Workshop aligned to UKPSF A5, K5, K6, V3

It may seem strange to design our evaluation structures before we have even recruited students onto our programmes. We need first to understand the distinction between assessment, feedback and evaluation. It is then important to explore both the evaluation of learning experiences, including the robustness of your course design, and evaluation for-learning, which I will refer to as in-class evaluation for the sake of consistency.

Basic Concepts

This stage explores five basic concepts that underpin the evaluation of learning.

  1. Distinguishing between Evaluation, Feedback and Assessment
  2. Measuring Student Performance versus Teacher Performance
  3. In-Class evaluation versus Post-Completion evaluation
  4. Learning Gain
  5. Progression: Access, Retention, Pass Rates, Grades, Completion and Destination

Distinguishing between Evaluation, Feedback and Assessment

The language in Higher Education is often inconsistent and less than helpful. American literature often refers to the evaluation of learning in place of the British English use of assessment for learning. It is important to watch out for these linguistic differences as you review international literature. Although it is rare to conflate the terms evaluation and assessment in a UK context, it is not uncommon to confuse evaluation and feedback. See the table below for clarification.

Table: Distinguishing between evaluation, feedback and assessment

In addition to the table above, here are three working definitions.

  • Evaluation is all communication designed to elicit evaluative comments provided by students on their learning experiences, and/or from academic peer review and self-reflection on our teaching practices.
  • Feedback is all communication designed to support the future learning capabilities of a student.
  • Assessment is all communication designed to enable the student to evidence their ability to meet a defined learning outcome.

We should try to be consistent in our language and use feedback only for the communication that students receive, designed to advance their learning, rather than for what they give. This would greatly enhance students’ comprehension of assessment and feedback, their 'academic literacy'.

Measuring Student Performance versus Teacher Performance

Module evaluation reports are a common feature of higher education, often produced as an annual programme or module review. It is important to distinguish the sources of data used to establish ‘performance’, whether that of students or that of staff. Both will feature in a comprehensive evaluation of a course or programme.

Staff and student performance are generally measured differently. There is evidence captured during the course itself, evidence captured at the end of the course delivery, and evidence collected later, after the course has been completed. Some of these data sources are detailed in the table below.

Table: Student versus staff performance

Staff performance is measurable during the course itself. Individuals can organise in-class evaluation activities with their own students (see below) and routinely record personal reflections on their own performance. This could take the form of a commentated micro-teaching recording or a learning journal. In addition, a range of external perspectives on teacher performance is possible. The most common are peer observations, whether for developmental or management purposes, and teaching scores.

At the end of the course, or before and after marking assessments, it is a good idea to make a teaching reflection. Take the four questions featured in the in-class evaluation process (see below) as a basis.

After a course has been completed, it is common to have students score faculty performance. Such scores are almost ubiquitous despite being a somewhat flawed mechanism, one which fails to take account of the cohort's prior learning and competence, or of the various factors that affect student evaluation, such as gender and ethnic bias. Nonetheless, most institutions persist and have moved to standardise and centralise data collection, using centrally administered evaluation software, so that adjustments and interpretations can be made. Better mechanisms for quality enhancement are to review marking moderation and to routinely peer review the feedback offered to students in the course of their learning as well as on final credit-bearing assessments. Peer reviewing our ability to assess and give feedback is important if we are to continuously enhance our capability to ‘teach’ in the classroom. The final external measurement is provided by external examiners and reviewers. Their reports are normally written in incredibly diplomatic language and are unlikely to ‘name names’, but they do serve to contextualise other data sources. External examiners are also a common data source for measuring student performance.

Student Performance can also be evaluated during, at the end of, and after, a course.

Student Performance during a course is ordinarily measured using the techniques for feedback throughout, dealt with in stage 7 of the 8-SLDF. Faculty make use of this feedback as an evaluative instrument too. Increasingly, there is also a range of learning analytics available for those studying entirely online. The picture is more mixed for those undertaking blended provision. Whilst in-class evaluation is designed to guide the teaching practitioner towards enhancing the student experience, it also provides insights into student performance.

Student performance can also be evaluated through the lens of the assessment tasks designed to support students to evidence their learning against the outcomes. Self-declared performance indicators are usually included in end-of-module questionnaires too.

After a course has been completed, evaluations based on students' perspectives are also possible through analysis of the grades awarded, the progression of individual students and cohorts, and the number of students that withdraw from their studies.

None of these factors on its own tells anyone anything about the quality of learning design or its delivery.

In-Class evaluation vs Post-Completion evaluation

Course designers need to decide in advance where we anticipate the enhancement opportunities for our course will be, and design in-class and post-course evaluation instruments to capture them. Most institutions' NSS and end-of-module evaluation processes do not generate actionable data. We can design in some of our own.

Most of you will be familiar with the concept of ‘end-of-module’ evaluations. They provide data on the prior experience of students on a specific module. Aggregated with other module results, they allow one to build a picture of the student experience across a specific programme. Most UK institutions collect data based on the current national survey instrument (the National Student Survey, or NSS). Most Universities also centralise the data collection processes in order to provide consistency and allow common adjustments to be made. Increasingly, data is collected online and immediately processed so that it can be re-presented back to students and faculty.

It is also important to ensure that we have efficient and effective in-course evaluation techniques already in mind, so that there is an opportunity to enhance the course while it is underway. We need to avoid making knee-jerk adjustments to a module that appears not to be working. In-course evaluation needs to be appropriately positioned within a course, with the correct amount of time and preparation allowed. This invaluable evaluative process allows faculty to capture students' experience of their learning whilst the module is still running. This has various advantages:

  • It allows faculty to make minor adjustments to suit specific cohorts
  • It gives students the feeling that they are listened to
  • It transfers some of the responsibility for learning back to the student

This technique is described as Small Group Instructional Diagnosis, or SGID for short (Seldin, 1999). Here we describe it simply as In-Class Evaluation. There are two models:

  • A colleague who does not teach the current students undertakes an in-class discussion eliciting responses to four set questions and then writes a short report for those students and their tutor. This means sacrificing 30 minutes of your class time.
  • The Tutor distributes a paper-based questionnaire with the same four questions (or a digital equivalent), collects the responses and feeds back to the cohort at their next encounter.


The timing of such an intervention is important. Ideally, it will be early enough in a module for both students and tutors to benefit from the intervention. However, it also has to be grounded in ‘genuine’ learning and teaching activities, so there is no point in doing it immediately after a typical induction week, for example. I’d suggest about week 3 or 4 of a 10-week module; aim roughly for a third of the way in.


The order and structure of the questions are important. They ensure that the students’ evaluative comments are focussed on their learning rather than on your teaching. You also want students to provide actionable responses.

The set questions are:

  1. What is helping and supporting my learning on this module?
  2. What is stopping me from learning on this module?
  3. What could the tutor do differently to improve my learning on the module?
  4. What could I do differently to improve my learning on this module?


What is vital is that students know and understand that their evaluative comments have been read and acted upon. That does not mean that everything a student says must be honoured or implemented, but it should be acknowledged. The easiest way of doing this is to summarise the points made against each of the questions, aggregate the data and make sure it is anonymised, and then return it to the students with your considered responses.
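The summarising step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `summarise_responses`, the tuple layout and the example themes are all hypothetical, and it assumes the tutor has already coded each free-text comment into a short theme. It is one possible way to aggregate and anonymise, not a prescribed method.

```python
from collections import Counter

# The four set questions, keyed by their number.
QUESTIONS = {
    1: "What is helping and supporting my learning on this module?",
    2: "What is stopping me from learning on this module?",
    3: "What could the tutor do differently to improve my learning on the module?",
    4: "What could I do differently to improve my learning on this module?",
}


def summarise_responses(responses):
    """Count themes per question, discarding who said what.

    `responses` is a list of (student_name, question_number, theme) tuples
    (a hypothetical layout). The student name is dropped, so the summary
    that goes back to the cohort is anonymised.
    """
    summary = {number: Counter() for number in QUESTIONS}
    for _student, number, theme in responses:
        summary[number][theme.strip().lower()] += 1
    return summary


# Hypothetical coded comments, invented purely for illustration.
responses = [
    ("Student A", 1, "Weekly web-seminars"),
    ("Student B", 1, "weekly web-seminars"),
    ("Student C", 2, "Readings released late"),
    ("Student A", 3, "Share readings in advance"),
    ("Student B", 4, "Prepare before seminars"),
]

summary = summarise_responses(responses)
for number, question in QUESTIONS.items():
    print(question)
    for theme, count in summary[number].most_common():
        print(f"  {count} x {theme}")
```

The tutor's considered response to each theme can then be written alongside the counts before the summary is shared at the next encounter.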

For example, I ran an online module (focussed on reflective and professional practice) and a small but vocal minority was insistent that I should provide the PowerPoint slides used to illustrate my web-seminars in advance. I declined, but it did give me an opportunity to show that the slides were designed not as a stand-alone learning resource but as a visual stimulus for debate.

On another occasion (a contemporary issues module), there was a majority of a cohort who wanted to have access to all of the readings in advance. I consented to provide those but made it clear that I reserved the right to substitute literature later in the course if something changed.

What is important is that you see the responses as valuable evaluative data about your practice without actually asking students to grade you or directly comment on your performance.

Design Tip: Build this process into your module design and allow time in-class for students to complete it. Normally a paper-based activity should take 3-5 minutes to explain and 10 minutes for students to complete.

Learning Gain

A highly disputed term in educational circles currently is that of ‘Learning Gain’. It is hardly a new concept and dates back even before the contemporary fascination with ipsative assessment (Stage 5) and progressive models of education (Hughes, 2017). In brief, learning gain is a measure of the progress made, or distance travelled, between a student’s 'command of a discipline' at one point in time and their command of it at a later point. By way of example:

  1. An experienced legal secretary, Chris, with 10 years' practical experience undertakes some learning activities in order to be able to complete a conveyancing legal process. In reality, they have already done this process dozens of times over the previous years, and so all the 10-week module has done is to reconfirm prior knowledge. They scored 90% in the assessment based on this task. Their learning gain is minimal.
  2. A student, Sam, with an undergraduate law degree and theoretical knowledge but no practical experience of conveyancing at all, undertakes the same learning. They find it challenging, facing lots of new terminology and practical contexts. They acquire significant new knowledge and new skills. They scored 60% in the assessment. Their learning gain is significant.

Traditionally, the way that we have assessed students is based on their ability to perform within set assessment tasks. On that basis Chris, above, clearly outperforms Sam. However, if we were to ask ‘who learnt the most?’ or 'who achieved the greatest learning gain?', we would come to a different conclusion.

Increasingly, University funders and quality assurance agencies are concerned not only with the ability of excellent students who enter University to leave with excellent results, but also with how below-average students are supported to become above-average graduates.

Learning gain as a concept is now forcing its way onto the design agenda. How can we design learning that allows students themselves to measure their own metacognitive and skills progression throughout a module and, importantly, through a programme? I believe all well-designed learning necessarily facilitates the analysis of learning gain.

Some design suggestions that a design team might want to consider:

  • Benchmarking or diagnostic assessments placed early on in a module and ‘repeated’ at the end.
  • Support the reflective processes through e-portfolio completion with structured reflections on learning gain throughout.
  • Design synoptic assessment that includes statements of prior knowledge and learning gain integrated into reflections. Some of the US literature refers to this as ‘Capstone’ learning or assessment.

Design Tip: Coordinate the assessment models between all modules within a programme and draw synoptic assessments across modules. Is there room in your programme for a ‘synoptic’ module?
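To make the benchmarking suggestion concrete, here is a minimal Python sketch of comparing a diagnostic score with a repeated end-of-module score. It uses a simple difference in percentage points, which is only one of several possible ways of expressing learning gain. The final scores for Chris and Sam come from the example above; the diagnostic scores (85 and 20) are invented purely for illustration, and the `Learner` class and `raw_gain` method are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Learner:
    name: str
    diagnostic: float  # hypothetical score (0-100) on the early benchmark
    final: float       # score (0-100) when the assessment is repeated

    def raw_gain(self) -> float:
        """Distance travelled, in percentage points."""
        return self.final - self.diagnostic


# Final scores are taken from the Chris and Sam example;
# the diagnostic scores are assumptions made up for this sketch.
learners = [
    Learner("Chris", diagnostic=85, final=90),
    Learner("Sam", diagnostic=20, final=60),
]

highest_grade = max(learners, key=lambda learner: learner.final)
highest_gain = max(learners, key=lambda learner: learner.raw_gain())

for learner in learners:
    print(f"{learner.name}: final {learner.final}%, gain {learner.raw_gain()} points")
print(f"Highest grade: {highest_grade.name}")
print(f"Greatest learning gain: {highest_gain.name}")
```

Under these assumed baselines, ranking by final grade and ranking by gain point to different students, which is exactly the distinction a design team needs its data to capture.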

Progression: Access, Retention, Pass Rates, Grades, Completion & Destination.

Another inconsistent use of language is in the area of the different quantitative measurements that serve to evaluate the effectiveness of learning. Falling broadly under the category of ‘Progression’, these denote the relative success of individual students. They are relative in that they are frequently measured against internal institutional norms and external benchmarks.

The language is contested and you will see various representations of these inter-dependent issues. The important thing is that a design team should discuss the kind of data that is expected to be generated later in the reporting process and ensure that their design facilitates the easy capturing of this data wherever possible.

Table: Progression language and data

Next Steps

Here are a few questions that you should consider in your design team. I suggest you review these each time you meet, rather than only at the end of the process.

  • Do the Design Team (and delivering Tutors) share a common language with respect to feedback and evaluation?
  • Do the Design Team (and delivering Tutors) share a common frame of reference with respect to the performance mechanisms in place for your module and programme?
  • Have you made provision for meaningful in-class evaluations in your design? Have tutors been adequately prepared to respond to evaluative comments?
  • Have the Design Team allowed for the prospect of being asked to report on Learning Gain for their students?
  • Have the Design Team and the delivering Tutors been fully briefed on the likely progression measurement environment into which your module or programme will emerge?


Ferrell, G. (2013). Changing assessment and feedback practice. Retrieved August 29, 2018, from

Hughes, G. (2017). Ipsative Assessment and Personal Learning Gain: Exploring International Case Studies. Springer.

Nicol, D. J., & Macfarlane‐Dick, D. (2006). Formative assessment and self‐regulated learning: a model and seven principles of good feedback practice. Studies in Higher Education, 31(2), 199–218.

Seldin, P. (1999). Changing Practices in Evaluating Teaching: A Practical Guide to Improved Faculty Performance and Promotion/Tenure Decisions. Wiley.

