Victoria Hedlund, 18th Jan 2025
Both Aila and Teachmate offer lesson planning tools and will quickly and easily provide a lesson plan. Aila is free, whereas Teachmate is a subscription service (for the lesson plan tool). Here we put lesson plans generated by Aila and Teachmate through our Lesson Inspector and discuss the reports. We conclude that our Lesson Inspector should be further explored as a content alignment tool and as a prompt for critical reflective practice. The Inspector reports begin to indicate, and consequently quantify, the diversity of GenAI-produced lesson plans.
Even before GenAI tools burst into our pedagogically-informed lives, we pondered how to judge the 'goodness' of a lesson plan. There is no single definition of a 'good' lesson plan, as the concept can broadly be thought of as dependent upon:
Context (the needs of your specific learners)
Teacher identity (your epistemological beliefs about knowledge etc.)
School ethos (what values does your school promote?)
Policy (what frameworks do you have to work within?)
When asked to define a 'good' lesson plan, most teachers struggle to condense their thoughts into a formula or set of beliefs. This is something that educationalists intuitively understand but may find challenging to operationalise. Yet in my experience, it is something student teachers and ECTs commonly yearn for.
Enter the Lesson Inspector: a tool we've developed to evaluate lesson plans based on the themes present in the ITT Core Content Framework (CCF), the Initial Teacher Training and Early Career Framework (ITTECF), and the Teachers' Standards. Our Lesson Inspector offers two metrics: a quantitative score and a qualitative evaluation/analysis by criteria. The qualitative analysis offers potential for content alignment of the LLM with the aforementioned statutory frameworks.
In this discussion, we use two lesson plan generators (Aila and Teachmate) to produce lesson plans on the same topic of KS4 waves, with the same objective (below). We then upload these to Lesson Inspector and create Inspection reports for them both. We will discuss:
User Process: The steps and methodology each tool uses and how the user inputs their lesson information.
Report content: Such as prior knowledge and other relevant themes included in the Lesson Inspector report.
For a fair comparison, both tools were given the same objective from the AQA (2015) GCSE Physics Specification: "Students should be able to describe wave motion in terms of their amplitude, wavelength, frequency, and period."
You could be forgiven for having missed the introduction of the term 'content alignment' in the GenAI context. We are using the definition from AI for Education, as can be seen in the screenshot adjacent. The terms 'correctly calibrated' and 'curriculum standard' found within this definition could (and should) be debated. Here, we will introduce a 'curriculum standard' (the CCF/ITTECF and Teachers' Standards - although the latter is an assessment framework and not a curriculum) and discuss the notion of being 'correctly calibrated'.
But first, let's look at how these two interfaces are used to produce the lesson plans, what is produced, and what Lesson Inspector provides as reports.
Aila is free and government-funded. These are immediate reasons to consider using Aila. One positive about the process of using Aila is that it is designed to involve the teacher at every step. After each input and generation cycle, the user has the opportunity to refine the generated output. This makes it versatile for tailoring a plan to specific needs, but it comes at the expense of requiring more active user input, meaning that it takes a bit longer to generate a plan. We will explore this user-centric functionality in another post.
When considering the timescales, though, do keep in mind that although it is quicker, Teachmate is NOT free. It cost me £6.99 at the time of publishing this. However, most non-government-backed generators also charge, so this is not uncommon.
The interfaces are completely different: Teachmate has minimal inputs (see screenshot), whereas Aila seeks verification at each step of the generation process. Watch the Aila video below to see how this process looked for our lesson plan generation on waves. The video is sped up by a factor of 6, so keep this in mind (although this kind of duration is not uncommon in GenAI tools). In the video, all that was entered into the interface was the topic, subject, phase and objective, the same as for Teachmate.
A screenshot of what was entered to produce the Teachmate lesson plan on Waves.
A video (x6) of the process of entering inputs into Aila for the generation of the Waves lesson plan.
You'll probably want to see what was produced. See below for the .pdfs. It's easy to see how comparing these two can start to sketch out a skeleton for the diversity of GenAI-produced lesson plans.
See what was produced from the video of the Aila input process (above).
See what was produced from the Teachmate input process in the screenshot (previous section).
Both lesson plans were put through our Lesson Inspector. See below for the .pdf versions of the reports (we are working on making these look a lot prettier!). There are several pages to each.
Starting from the same objective, it's interesting to see the paths that each tool has taken. This echoes what anyone in teacher ed will have noticed: there can be a huge diversity in planning, given the same initial boundary conditions. In some sense, it could be argued that lesson plan generation tools should mimic this. Certainly the probabilistic nature of GenAI means that a vast array of lesson plans should be expected.
That brings us on to considering how useful, or correct, this diversity of plans could be.
Many will wonder if the Inspector could score a perfect zero. We'd invite you to consider the (very brief) lesson plan below, and amuse yourself with the Inspector's report...
Once you've recovered from the hilarious analysis of that final lesson plan, it's clear that a perfect zero can indeed be scored. This brings us back to the initial idea of content alignment involving correct calibration of curriculum standards.
If we assume a static curriculum standard based on the CCF/ITTECF and the Teachers' Standards, we need to consider the ability of the Lesson Inspector to guide users towards the correct calibration of these GenAI tools when creating lesson plans. Our scoring metric and evaluation/analysis framework provide a teacher with a quick and easy sense check against the frameworks they work within. These two metrics have very different impacts. The quantitative score could be used as an immediate quality indicator, although we have designed it essentially as a 'hook' to ensure the report reader reflects upon the qualitative analysis, which is a means to promote reflective practice upon curriculum goals.
If we return to our initial understanding of the values that contribute to a 'good' lesson plan, we can argue that content alignment is at least partially achieved if the Lesson Inspector reports facilitate and promote reflective practice. These reports provide a space for teachers to critically consider the output and reflect on how it aligns with their values, beliefs, and ideas about the curriculum.
Consider an example: a teacher who adheres to a very teacher-led approach and does not favour student-led practice. The Lesson Inspector report suggests student-led practice as an area for improvement and notes the reliance on teacher-led methods. Does this mean the content is not aligned? The teacher reflects on these comments from the report and critically thinks about their ideal use of teacher-led versus student-led practice. They ultimately decide they are satisfied with their initial plan. Does this mean the report is not useful? Does it mean the content is misaligned?
The Lesson Inspector cannot answer these questions of pedagogical choice and teacher identity. However, it can provide a set of structured prompts to help users explore and solidify their concepts of a 'correctly calibrated' learning experience.
Future developments of the Inspector will include creating different 'curriculum standards' based on instructional frameworks such as Rosenshine (see our other blogs) or other learning strategies like UDL and enquiry-based learning. There is even potential for custom frameworks tailored to an institution's own set of values.
It's clear that Lesson Inspector is a valuable tool with which to begin exploring the content alignment of the lesson plans generated by GenAI tools. Our forthcoming development will centre on:
Diverse testing of other content generated by Aila, Teachmate and other GenAI content generators.
Discussion of each thematic strand in the reports in detail and with reference to specific CCF/ITTECF criteria.
Development of an assessment framework that can be used for QA.
Watch this space. 🐉🔥