A YouTube Video QA Checklist: What an Agent Can Do, and What Still Needs a Human

Every YouTube channel we work with — ours included — eventually hits the same wall. The videos are good. The production is clean. The ideas are sharp. And yet YouTube doesn’t seem to notice. The usual response is to blame the algorithm, but in our experience the real culprit is almost always something more boring: the mechanical, easy-to-skip parts of publishing aren’t getting done consistently. Titles drift. Descriptions get thin. Thumbnails get rushed. Cards and end screens get forgotten. None of it is hard. All of it adds up.

So we built a QA checklist we run on every video before and after it goes live, and we tagged each item by who should own it: an AI agent, an agent with human review, or a human outright. We’re publishing it here so you can run it on your channel too.

How to read the ownership tags

Each item is tagged one of three ways:

[Agent] — An AI agent or automation can do this end-to-end with high reliability. A human only needs to spot-check.

[Agent-Assist] — An agent can draft, score, or surface options, but a human has to make the final call because taste, brand voice, legal exposure, or strategic context is involved.

[Human] — Requires a person. Either it depends on judgment the agent doesn’t have, access the agent shouldn’t have (billing, account settings, publishing rights), or accountability that needs to sit with a named human.

1. Pre-publish: the video file itself

Before the video ever touches YouTube Studio, there’s a layer of QA that most channels skip. This is where the cheapest mistakes get caught.

  • Audio levels are consistent, with peaks between -6 and -3 dBFS and no clipping. [Agent] An agent can run the file through a loudness analyzer and flag anything out of range.
  • No dead air longer than two seconds, no unintentional jump cuts, no orphaned B-roll. [Agent-Assist] Automated scene and silence detection can surface candidates; a human confirms which are intentional.
  • Captions and subtitles are accurate, not just auto-generated. [Agent-Assist] An agent can produce a transcript and flag low-confidence segments; a human (ideally one familiar with the speaker) corrects names, jargon, and brand terms.
  • On-screen text is legible on mobile. [Human] Someone has to actually watch it on a phone. Pixel-counting doesn’t substitute for “can I read this on the subway.”
  • No copyrighted music, footage, or third-party logos used without clearance. [Agent-Assist] Content ID-style fingerprinting can flag risk; a human signs off on usage rights.
  • Intro hook earns its first 15 seconds. [Human] This is judgment. An agent can summarize what happens in the opening, but whether it hooks is a taste call.
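As a rough illustration of the audio-level item, here is a minimal sketch of a peak check. It assumes the audio has already been decoded into floating-point samples between -1.0 and 1.0 (the decoding step is not shown), and the function names are hypothetical. It measures sample peak only. A real loudness analyzer would also report integrated loudness, which this sketch does not attempt.

```python
import math


def peak_dbfs(samples):
    """Return the sample peak in dBFS for floats in the range [-1.0, 1.0]."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20 * math.log10(peak)


def check_peak(samples, low_db=-6.0, high_db=-3.0):
    """Classify the peak against the checklist's -6 to -3 dB window."""
    level = peak_dbfs(samples)
    if level >= 0.0:
        return "clipping"   # at or above full scale
    if level > high_db:
        return "too hot"    # louder than the target window
    if level < low_db:
        return "too quiet"  # quieter than the target window
    return "ok"
```

Anything other than "ok" would be flagged for the editor rather than fixed automatically.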

2. Title

The title is the single highest-leverage piece of metadata on the video.

  • Title is 60 characters or fewer so it doesn’t truncate in search results and suggested videos. [Agent]
  • Primary keyword appears in the first half of the title. [Agent]
  • Title matches a real query people type, not just one that sounds keyword-rich. [Agent-Assist] An agent can pull search volume and related queries from a keyword tool; a human picks the angle that fits the brand.
  • Title doesn’t overpromise relative to the content. [Human] Clickbait that the video doesn’t deliver on tanks session metrics. A person who has watched the video has to make this call.
  • Title is consistent with the channel’s voice and naming conventions. [Agent-Assist] An agent can compare against the last 20 titles; a human approves deviations.
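The two fully automatable title checks (length and keyword position) could be sketched as follows. The function name and the issue strings are hypothetical, and "first half" is interpreted here as the keyword starting before the midpoint of the title, which is an assumption.

```python
def check_title(title, primary_keyword, max_len=60):
    """Return a list of issues for the length and keyword-position checks."""
    issues = []
    if len(title) > max_len:
        issues.append(f"title is {len(title)} characters (limit {max_len})")
    # Case-insensitive search for the keyword's starting position.
    pos = title.lower().find(primary_keyword.lower())
    if pos == -1:
        issues.append("primary keyword missing")
    elif pos >= len(title) / 2:
        issues.append("primary keyword starts in second half")
    return issues
```

An empty list means both checks pass. The remaining title items still need a person who has watched the video.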

3. Description

  • First two lines (the part visible above the fold) summarize the video and include the primary keyword. [Agent]
  • Full description is at least 150 words and includes secondary keywords naturally. [Agent]
  • Timestamps and chapters are present for any video over four minutes. [Agent] Chapters can be generated from the transcript.
  • Chapter titles are descriptive, not just “Intro,” “Part 1,” or “Part 2.” [Agent-Assist]
  • Links to related videos, playlists, and the channel’s primary CTA are present and correct. [Agent] Link validity can be auto-checked.
  • UTM parameters are applied to outbound links so traffic shows up correctly in analytics. [Agent]
  • Affiliate disclosures, sponsorship disclosures, and any required legal language are present. [Human] This is a compliance call and needs a named owner.
  • Hashtags are relevant and not banned; at most three are visible above the title. [Agent]
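Several of the [Agent] description items reduce to text checks. The sketch below covers four of them: keyword above the fold, word count, chapter timestamps, and UTM parameters on outbound links. The function name, the `internal_hosts` default, and the choice of `utm_source` as the marker for "has UTM parameters" are all assumptions made for illustration.

```python
import re
from urllib.parse import parse_qs, urlparse

URL_RE = re.compile(r"https?://\S+")
# A line that starts with a timestamp such as 0:00, 12:34, or 1:02:03.
TIMESTAMP_RE = re.compile(r"^\s*(?:\d{1,2}:)?\d{1,2}:\d{2}\b", re.MULTILINE)


def check_description(text, primary_keyword, needs_chapters=True,
                      internal_hosts=("youtube.com", "youtu.be")):
    """Return a list of issues found in a video description."""
    issues = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if primary_keyword.lower() not in " ".join(lines[:2]).lower():
        issues.append("primary keyword missing from first two lines")
    if len(text.split()) < 150:
        issues.append("description is under 150 words")
    if needs_chapters and not TIMESTAMP_RE.search(text):
        issues.append("no chapter timestamps found")
    for url in URL_RE.findall(text):
        parsed = urlparse(url)
        host = parsed.netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host in internal_hosts:
            continue  # links back to YouTube are treated as internal
        if "utm_source" not in parse_qs(parsed.query):
            issues.append(f"outbound link lacks utm_source: {url}")
    return issues
```

Pass `needs_chapters=False` for videos of four minutes or less. Link validity (whether each URL actually resolves) would need a network request and is left out here.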

4. Tags and topic metadata

  • Primary keyword is the first tag. [Agent]
  • Tags include a mix of broad, specific, and long-tail variants. [Agent]
  • No irrelevant or misleading tags (this can suppress reach). [Agent-Assist]
  • Category is set correctly. [Agent]
  • Language and “recorded in” location are set. [Agent]

5. Thumbnail

  • Thumbnail is 1280 by 720 pixels, under 2 MB, and saved as JPG, PNG, or GIF. [Agent]
  • Readable at 120-pixel width (the size at which it appears in mobile search). [Agent-Assist] An agent can render the thumbnail at small size and run a contrast and legibility check; a human eyeballs the result.
  • Face, if present, has a clear emotion that matches the video’s promise. [Human]
  • Text on thumbnail is three to five words maximum and doesn’t duplicate the title verbatim. [Agent-Assist]
  • Thumbnail is visually distinct from the last 10 thumbnails on the channel (unless intentionally part of a series). [Agent-Assist]
  • A/B test variants exist for any video expected to drive significant traffic. [Human] Deciding whether a video warrants A/B testing is a strategy call.
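The file-spec item and the contrast part of the legibility item can be sketched as below. The spec check takes the width, height, file size, and extension as plain arguments, so reading them from the image file is assumed to happen elsewhere. For contrast, the sketch uses the WCAG contrast-ratio formula between a text color and a background color. That is a swapped-in measure chosen for illustration, not something the checklist prescribes, and picking the two representative colors from the thumbnail is not shown.

```python
MAX_BYTES = 2 * 1024 * 1024
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}


def check_thumbnail_spec(width, height, size_bytes, extension):
    """Return a list of issues for the size, weight, and format checks."""
    issues = []
    if (width, height) != (1280, 720):
        issues.append(f"dimensions are {width}x{height}, expected 1280x720")
    if size_bytes >= MAX_BYTES:
        issues.append("file is not under 2 MB")
    if extension.lower().lstrip(".") not in ALLOWED_EXTENSIONS:
        issues.append(f"unsupported format: {extension}")
    return issues


def _linear(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light (WCAG definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def _luminance(rgb):
    r, g, b = rgb
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(rgb_a, rgb_b):
    """WCAG contrast ratio between two RGB colors, from 1.0 up to 21.0."""
    la, lb = _luminance(rgb_a), _luminance(rgb_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)
```

A low ratio between the text color and the area behind it is a reason to send the thumbnail to a human, not a verdict on its own.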

6. End screens, cards, and CTAs

  • End screen includes a “subscribe” element and at least one “next video” element. [Agent]
  • Recommended next video is genuinely the best follow-up, not just the most recent upload. [Human] This is a content-strategy call.
  • Cards are placed at moments where a viewer is likely to want more, not randomly. [Human]
  • Pinned comment includes the primary CTA and a conversation-starter question. [Agent-Assist]

7. Publishing settings

  • Visibility is set correctly (Public, Unlisted, Private, or Scheduled). [Human] Mistakes here are expensive, and reversing them carries a reputational cost.
  • “Made for kids” toggle is correct. [Human] Legal and COPPA exposure.
  • Premiere is configured if applicable, with a countdown thumbnail. [Human]
  • Video is added to all relevant playlists. [Agent]
  • Video is scheduled for the channel’s best-performing day and time, based on the last 90 days of analytics. [Agent-Assist]
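For the scheduling item, one simple way an agent might surface a recommendation is to average early views by weekday and hour of publication. The input shape below (a list of publish timestamps paired with a view count) is hypothetical, as is the function name, and the caller is assumed to have already filtered the list to the last 90 days.

```python
from collections import defaultdict
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def best_slot(uploads):
    """Return the (weekday, hour) slot with the highest average views.

    uploads: iterable of (published_at, views) pairs, where published_at
    is a datetime in the channel's reporting time zone.
    """
    by_slot = defaultdict(list)
    for published_at, views in uploads:
        slot = (WEEKDAYS[published_at.weekday()], published_at.hour)
        by_slot[slot].append(views)
    if not by_slot:
        raise ValueError("no uploads to analyze")
    # Pick the slot whose uploads averaged the most views.
    return max(by_slot, key=lambda slot: mean(by_slot[slot]))
```

With only a handful of uploads per slot the averages are noisy, which is one reason this item stays [Agent-Assist] rather than [Agent].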

8. Cross-channel and distribution

  • Short-form clip (Reels, Shorts, TikTok) is queued from the strongest 30 to 60 seconds. [Agent-Assist] An agent can identify candidate clips by transcript salience; a human picks the one with the best delivery.
  • Newsletter or blog post links to the video within 24 hours of publish. [Agent]
  • Community tab post is scheduled. [Agent-Assist]
  • Internal Slack or email notification goes to the team and (where appropriate) to the client. [Agent]

9. Post-publish (first 48 hours)

  • CTR is being tracked against the channel’s median; thumbnail is swapped if CTR underperforms by more than 30 percent after 24 hours. [Agent-Assist] An agent can monitor and recommend; a human approves the swap.
  • Average view duration is tracked against the channel’s median. [Agent]
  • Comments are being responded to within four hours during the first 24 hours. [Human] Voice and judgment matter here, especially for anything critical or off-topic.
  • Any spam, scam, or impersonation comments are removed. [Agent-Assist] Filters catch most; a human reviews edge cases.
  • Notes are written back into the channel playbook for the next video: what worked, what didn’t. [Human]
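The thumbnail-swap rule in the first item of this section is plain arithmetic, sketched below. It reads "underperforms by more than 30 percent" as a relative shortfall against the channel median, which is an assumption. The function and parameter names are hypothetical.

```python
from statistics import median


def should_recommend_swap(video_ctr, channel_ctrs, hours_live,
                          threshold=0.30, min_hours=24):
    """Return True if the video's CTR trails the channel median by more
    than `threshold` (relative) once it has been live for `min_hours`."""
    if hours_live < min_hours or not channel_ctrs:
        return False  # too early, or nothing to compare against
    baseline = median(channel_ctrs)
    if baseline <= 0:
        return False
    shortfall = (baseline - video_ctr) / baseline
    return shortfall > threshold
```

For example, with a channel median CTR of 5.0 percent, a video at 3.0 percent has a shortfall of 40 percent and would be recommended for a swap, while one at 4.0 percent (a 20 percent shortfall) would not. The function only recommends. Approval stays with a human.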

10. Quarterly channel-level review

The video-by-video checklist above keeps individual uploads sharp. But a channel needs a wider QA pass too, and this is almost entirely human work — because it’s about strategy, not execution.

  • Content pillars still match what the audience is engaging with. [Human]
  • The top 10 videos whose main traffic source is “YouTube search” are still ranking for their target queries; if not, check whether the search results have shifted. [Agent-Assist]
  • Channel trailer reflects current positioning. [Human]
  • Banner, handle, “About” section, and contact links are current. [Human]
  • Old videos with strong evergreen traffic get titles, thumbnails, and descriptions refreshed. [Agent-Assist]

What this checklist is not

It is not a replacement for a producer who watches the video and asks, “Would I share this?” It is not a substitute for a creator who knows their audience, or for the strategic judgment of someone who understands the channel and where it is going.

What it is: a way to make sure that the boring, mechanical, easy-to-skip parts of publishing get done every single time — so the human time gets spent where it actually matters, on the parts of the video that a checklist can never replace.

If you want a copy of this as a Notion template or a Google Sheet with the [Agent], [Agent-Assist], and [Human] tags as columns you can sort by owner, contact us and we’ll send it over.

Dennis Yu
Dennis Yu is the CEO of Local Service Spotlight, a platform that amplifies the reputations of contractors and local service businesses using the Content Factory process. He is a former search engine engineer who has spent a billion dollars on Google and Facebook ads for Nike, Quiznos, Ashley Furniture, Red Bull, State Farm, and other brands. Dennis has achieved 25% of his goal of creating a million digital marketing jobs by partnering with universities, professional organizations, and agencies. Through Local Service Spotlight, he teaches the Dollar a Day strategy and Content Factory training to help local service businesses enhance their existing local reputation and make the phone ring. Dennis coaches young adult agency owners serving plumbers, AC technicians, landscapers, roofers, and electricians, and believes there should be a standard for measuring local marketing efforts, much as doctors and plumbers must be certified.