How can AI help Tennessee evaluators write TEAM and Project COACH evaluations?

June 24, 2026 • 12 min read

The short answer

Tennessee evaluators face a documentation problem that's really two problems. If your district uses TEAM (the Tennessee Educator Acceleration Model), you're writing across four domains and twenty-three indicators on a five-level performance rubric. If your district uses Project COACH, you're scoring across six domains and sixty criteria on a four-level rubric, with a percentage-based summative score that converts automatically to Tennessee's five-level state scale for TNCompass reporting. EvalScribe handles both frameworks natively. The app captures observation evidence the way you already work, maps it to the indicators or criteria your framework actually uses, and drafts performance-level ratings and feedback language in your framework's voice. You review and finalize every document. Documentation time drops from roughly forty-five minutes to about ten.

Tennessee uses more than one framework — and that matters

Under State Board Educator Evaluation Policy 5.201, Tennessee districts and charters select their observation model from a list of state-board-approved frameworks. Most pick TEAM. It's the default, the most widely used, and the most familiar across the state. A meaningful subset choose Project COACH, the six-domain Kim Marshall framework. Two other state-board-approved models — TEM and TIGER — are used by additional Tennessee districts.

For an evaluator who's worked under the same framework for ten years, this might sound like inside-baseball detail. For an evaluator who just transferred into a Tennessee district from out of state, or who works across multiple buildings under different models, the dual-framework reality matters a great deal. Each framework has its own structure, its own rubric, its own language. The work of writing a TEAM evaluation and the work of writing a Project COACH evaluation are not the same work.

This is also where generic AI tools fall down for Tennessee evaluators. ChatGPT and Claude have no built-in knowledge of which framework your district uses. They'll produce text that sounds polished, but the indicators they cite may not exist, and the rating scale they apply may not match anything Tennessee actually uses. Polished isn't the bar. Defensible is.

What TEAM actually looks like, in structure

TEAM is built around four observable domains:

Instruction (12 indicators)
Planning (3 indicators)
Environment (4 indicators)
Professionalism (4 indicators)

That's twenty-three indicators total. Each one gets rated on a five-level performance rubric:

Significantly Below Expectations
Below Expectations
At Expectations
Above Expectations
Significantly Above Expectations

The observation component generates a Level of Overall Effectiveness — the LOE — and that LOE combines with student growth data and an achievement measure to produce the final TEAM evaluation. Observation accounts for fifty percent of the score. TVAAS contributes thirty-five percent. The teacher-selected achievement measure makes up the remaining fifteen percent.
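As a rough illustration of the weighting described above, the sketch below combines three component scores using the 50/35/15 split. It assumes, purely for illustration, that each component is already expressed on the same 1-to-5 scale. The function name, the dictionary keys, and the sample scores are hypothetical, and the state's actual calculation and rounding rules may differ.

```python
# Weights come from the paragraph above; everything else is illustrative.
WEIGHTS = {"observation": 0.50, "growth_tvaas": 0.35, "achievement": 0.15}


def weighted_composite(scores: dict) -> float:
    """Weighted average of component scores (assumed to share a 1-5 scale)."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("expected exactly these keys: " + ", ".join(WEIGHTS))
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, not real data.
example = {"observation": 4.0, "growth_tvaas": 3.0, "achievement": 5.0}
print(round(weighted_composite(example), 2))
```

The point of the sketch is only that the observation score carries half the weight, so a one-level change there moves the composite more than the same change in either of the other two components.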

The rubric distinction that most often trips evaluators up in TEAM is the At-Expectations-to-Above-Expectations transition. At Expectations is solid, expected practice — the teacher is doing the work of teaching well. Above Expectations starts to require evidence of student ownership: students driving the lesson, making meaningful choices, demonstrating the thinking rather than receiving it. The difference can be hard to capture in writing. It's especially hard when you're writing your fourth evaluation of the day and you've been in three different classrooms.

What Project COACH actually looks like, in structure

Project COACH is built around the Kim Marshall framework. Six domains, ten criteria each, sixty criteria total:

Planning & Preparation for Learning (10 criteria)
Classroom Management (10 criteria)
Delivery of Instruction (10 criteria)
Monitoring, Assessment & Follow-Up (10 criteria)
Family & Community Outreach (10 criteria)
Professional Responsibilities (10 criteria)

Each criterion is rated on a four-level rubric, with explicit percentage thresholds attached to each level:

Highly Effective (90% and above)
Effective (75–89%)
Improvement Necessary (60–74%)
Does Not Meet (59% and below)

That percentage piece matters more than it might look. Project COACH calculates a summative percentage across all scored criteria — and that percentage is what converts, automatically, to Tennessee's five-level state rating scale for TNCompass reporting. An evaluator working in COACH thinks simultaneously in two registers: the four-level rubric used for individual criteria, and the percentage threshold that determines the summative rating.
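The threshold table above can be sketched as a small lookup. This is a minimal illustration, not EvalScribe's implementation: the function name is hypothetical, and it assumes fractional percentages below 90 fall into the lower band (so 89.9 counts as Effective). The conversion to the five-level state scale is left out because the table for it isn't listed here.

```python
def coach_level(percent: float) -> str:
    """Map a percentage to the four-level label from the list above."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent >= 90:
        return "Highly Effective"
    if percent >= 75:
        return "Effective"
    if percent >= 60:
        return "Improvement Necessary"
    return "Does Not Meet"


for value in (92, 75, 74, 40):
    print(value, coach_level(value))
```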

Like TEAM, the observation component represents fifty percent of the final score. TVAAS and the achievement measure account for the other fifty percent.

The rubric distinction that most often trips COACH evaluators up is the Effective-to-Highly-Effective transition. The leap from Effective to Highly Effective isn't about doing the same things harder. It's about evidence of intentional design: the lesson is built so students can demonstrate mastery, and they do. Three or four pieces of evidence usually do the work, provided they are the right ones and are written in the rubric's own language.

TEM and TIGER — what's on the roadmap

EvalScribe currently supports TEAM and Project COACH natively. The other two state-board-approved observation models — TEM (the Memphis-developed model) and TIGER — are on the development roadmap and coming in a future build.

For a Tennessee district running TEM or TIGER today, EvalScribe isn't the right tool yet. But the architecture that already handles TEAM and Project COACH treats new frameworks as additions, not rewrites. Tennessee evaluators who adopt EvalScribe now for TEAM or COACH will not be left behind when TEM and TIGER support ships.

This kind of detail matters more to a district evaluating long-term software than to an individual administrator deciding whether to try a free app. Worth saying clearly either way.

The work that happens after the walkthrough

In conversations with Tennessee administrators, the same problem comes up again and again. It's not the observing. Most evaluators are good at observing. They know what they want to see. They know what a strong lesson feels like and what one in trouble feels like, often within the first five minutes of walking in.

The problem is what comes after.

You walk back to your office with a page of notes — typed, handwritten, dictated, or some combination. The notes are good. They contain the evidence. But translating those notes into formal, rubric-aligned, indicator-mapped evaluation language is a different job, and it's the job that eats hours. You have to pick which evidence supports which indicator. You have to decide which performance level fits. You have to write a comment that's defensible if it ever ends up in front of a hearing officer six months from now. And then you have to do it all again next week for the next teacher.

Most evaluators put this work off until evenings or weekends, which is how a documentation problem becomes a quality-of-life problem.

For a Tennessee evaluator working across forty teachers with three observations apiece, that's 120 evaluations a year. At forty-five minutes per evaluation (a conservative number; ask anyone who's done a full Project COACH write-up), that's ninety hours of after-hours writing. More than two full forty-hour work weeks, lost to the keyboard.
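The arithmetic behind that figure is easy to rerun with your own caseload. A minimal sketch, with a hypothetical function name:

```python
def annual_writing_hours(teachers: int, observations_each: int, minutes_each: float) -> float:
    """Total write-up time per year, in hours."""
    return teachers * observations_each * minutes_each / 60


# Numbers from the paragraph above: 40 teachers, 3 observations, 45 minutes each.
print(annual_writing_hours(40, 3, 45))  # 90.0
```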

Where AI helps (and where it doesn't)

This is the part where claims about AI tend to get oversold. So let's be precise.

AI is good at translation. It's good at taking raw observation notes and converting them into the formal language of a rubric, in the voice of that rubric, mapped to the indicator or criterion the evidence supports. That translation work is mechanical — important work, but not the work that requires your professional judgment.

AI is also good at consistency. Applied to the same rubric across the same evaluator's notes, framework-aware AI can produce more internally consistent ratings than the same evaluator writing in batches across a long week. The human gets tired. The framework doesn't.

AI is not good at — and shouldn't replace — the judgment. The judgment is yours. AI doesn't know whether the teacher you observed last Tuesday was on her best day or a hard day. It doesn't know that the class had two new students that morning. It doesn't know whether what you saw was the norm or an exception. That context, and the rating that depends on it, belongs to you.

The way this plays out in practice: AI drafts the evaluation. You read it, edit it, change ratings where they don't match what you saw, rewrite comments where the language doesn't capture your meaning, and finalize the document. The work of being an evaluator is preserved. The work of translating notes into rubric-aligned paperwork is automated.

That's where the thirty to sixty minutes per evaluation comes back to you.

How EvalScribe handles Tennessee evaluations

A few things matter most for Tennessee evaluators specifically.

Both TEAM and Project COACH are built in natively. The structures of both frameworks live inside the app — all four TEAM domains, all twenty-three indicators, the five-level performance rubric. All six COACH domains, all sixty criteria, the four-level rubric with percentage-based summative scoring. When EvalScribe drafts an evaluation, it's drafting against the rubric your district actually uses, not a generic approximation.

Three ways to enter notes. You can type. You can dictate. Or you can snap a photo of your handwritten notes mid-walkthrough — Smart Scan converts handwriting to text, even messy handwriting, even pedagogical terms. No retraining required. Whatever your existing observation style is, the app accepts it and flows it into the same framework-aware drafting pipeline.

Evidence mapping that respects framework structure. When you submit your notes, EvalScribe identifies which indicators (in TEAM) or criteria (in Project COACH) each piece of evidence supports. You can see the mapping before you finalize anything. Nothing gets generated against an indicator that doesn't exist in the actual framework.
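One way to picture that last guarantee: keep a table of each framework's domains and item counts, and reject any reference that falls outside it. The sketch below is a hypothetical illustration, not EvalScribe's actual internals. The domain names and counts come from the structures described earlier; because individual indicator names aren't listed in this article, items are referred to by position within a domain.

```python
# Domain names and counts as described earlier in this article.
FRAMEWORKS = {
    "TEAM": {
        "Instruction": 12,
        "Planning": 3,
        "Environment": 4,
        "Professionalism": 4,
    },
    "Project COACH": {
        "Planning & Preparation for Learning": 10,
        "Classroom Management": 10,
        "Delivery of Instruction": 10,
        "Monitoring, Assessment & Follow-Up": 10,
        "Family & Community Outreach": 10,
        "Professional Responsibilities": 10,
    },
}


def total_items(framework: str) -> int:
    """Number of indicators or criteria in a framework."""
    return sum(FRAMEWORKS[framework].values())


def is_valid_reference(framework: str, domain: str, position: int) -> bool:
    """True only if the domain exists and the position is within its count."""
    count = FRAMEWORKS.get(framework, {}).get(domain, 0)
    return 1 <= position <= count


print(total_items("TEAM"), total_items("Project COACH"))  # 23 60
print(is_valid_reference("TEAM", "Planning", 4))  # False: Planning has 3 indicators
```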

Rubric-aware draft ratings. EvalScribe suggests performance-level ratings using each framework's own rubric logic — distinguishing between adjacent levels the way TEAM or COACH actually intends. The At-vs-Above distinction for TEAM. The Effective-vs-Highly-Effective distinction for COACH. The percentage thresholds for COACH summatives and the automatic conversion to Tennessee's five-level state scale for TNCompass reporting.

Draft comments in the framework's voice. Every comment EvalScribe writes is anchored to the evidence you captured and the indicator or criterion it supports. The language reads as the language of your framework, not generic feedback language imported from somewhere else.

You stay in control. Every comment, every rating, every piece of mapped evidence is editable before you export. You are always the last set of eyes on the evaluation. EvalScribe drafts. The evaluator decides.

Privacy by design. Observation notes are processed transiently on Microsoft Azure and aren't retained on EvalScribe's servers. Evaluation records sit locally on your device. Your data is never used to train public AI models. For districts that need a Data Processing Agreement or a FERPA attestation, those are available on request.

What the time savings actually look like

Beta testers report saving thirty to sixty minutes per evaluation compared to writing them traditionally. Across forty teachers and three observations per teacher per year, that's 120 evaluations. At thirty-five minutes saved per evaluation, a figure near the conservative end of that range, that's 4,200 minutes annually. Seventy hours back. About nine workdays.
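To see how sensitive that estimate is to the per-evaluation figure, here is a small sketch across the reported range. The function names are hypothetical, and the eight-hour workday is an assumption used only to convert hours into days.

```python
def hours_saved(evaluations: int, minutes_saved_each: float) -> float:
    """Total time saved per year, in hours."""
    return evaluations * minutes_saved_each / 60


def workdays(hours: float, hours_per_day: float = 8.0) -> float:
    """Convert hours to workdays (assumes an 8-hour day by default)."""
    return hours / hours_per_day


# 120 evaluations per year, at the low end, the 35-minute figure, and the high end.
for minutes in (30, 35, 60):
    h = hours_saved(120, minutes)
    print(f"{minutes} min saved -> {h:.0f} hours, {workdays(h):.2f} workdays")
```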

For a school with three evaluators all working at this pace, that's roughly twenty-six days of reclaimed time across the building. Days that can go into walkthroughs that aren't required, coaching conversations that aren't crammed between meetings, and instructional leadership that doesn't end at 4:30 p.m. because there are still seven evaluations to write.

The math isn't magic. It's just the work of paperwork removed.

Try it on your next evaluation

EvalScribe is available on iOS and macOS through the Apple App Store. The first three evaluations are free, no credit card required.

The state-specific page for Tennessee — built around both TEAM and Project COACH, with the actual domain structures, indicator counts, rubric levels, and TNCompass conversion logic visible at a glance — lives at evalscribe.com/tennessee. Tennessee evaluators are encouraged to start there.

If your district is running TEM or TIGER and the app isn't there yet, the contact page is still the right place to reach out. The roadmap is shaped partly by which districts are asking for what.

Frequently asked questions

Does EvalScribe support both TEAM and Project COACH?

Yes — both frameworks are built in natively. TEAM uses its five-level performance rubric across four domains and twenty-three indicators. Project COACH uses its four-level rubric across six domains and sixty criteria, with percentage-based summative scoring and automatic conversion to Tennessee's five-level state scale for TNCompass.

Does EvalScribe support TEM or TIGER?

Not yet. Both TEM and TIGER are on the development roadmap and coming in a future update. Tennessee evaluators using EvalScribe today for TEAM or COACH will not be left behind when TEM and TIGER support ships.

Does EvalScribe handle the full Tennessee evaluation, including TVAAS and student growth data?

No. EvalScribe addresses the observation component of the evaluation — fifty percent of the final TEAM or COACH score. The remaining fifty percent — TVAAS at thirty-five percent and the teacher-selected achievement measure at fifteen percent — is handled outside the app, typically through TNCompass and other district systems.

Does EvalScribe replace evaluator judgment?

No. EvalScribe drafts evaluations. The evaluator reviews, edits, and finalizes every document. The professional judgment is always yours.

Can the evaluations actually be edited before export?

Yes, fully. Every comment, rating, and piece of mapped evidence inside a draft evaluation is editable before you export the final PDF. You are always the last set of eyes on the document.

How is EvalScribe different from ChatGPT or Claude?

Generic AI tools aren't built around TEAM or Project COACH, and they don't know which framework your district uses. They can invent indicators, score inconsistently, and produce vague language that misses the level distinctions Tennessee's frameworks actually require. EvalScribe is built around each framework's actual structure and rating scale, and the evaluator workflow that surrounds it, from evidence capture through final export.

Where can I see the Tennessee landing page?

The Tennessee-specific page lives at evalscribe.com/tennessee. It covers both TEAM and Project COACH with the actual domain structures, indicator counts, rubric levels, and TNCompass conversion logic visible.

Want to try EvalScribe on your next Tennessee evaluation? Download free on iOS and macOS from the Apple App Store — your first three evaluations are on us, no credit card required. Or visit evalscribe.com/tennessee to see how the app handles TEAM and Project COACH specifically.

Anthony D. Neely, Ph.D.

Anthony Neely is the founder of EvalScribe, a veteran educator, an AI integration consultant for teaching and learning, a researcher, and an author.
