Hello,

In Issue 2 (https://aigrcdesk.com/p/the-policy-management-deep-dive-ai-use-cases-across-all-four-tiers), I shared the prompt I use to run a policy gap analysis at Tier 2.

A prompt is a good starting point. It is not a system.

The next step, the one that actually changes how your team works, is to move from a one-off prompt to a configured agent with a defined role, connected sources, grounded standards, and explicit boundaries. That is the difference between using AI like a better search box and using it as a working operational tool.

This week I want to walk you through how I built mine: the GRC Policy Reviewer and Gap Analysis agent, a Workspace Agent I configured on ChatGPT and connected to the systems where policy work actually lives. I will cover what it does, why it works, and the seven design decisions that made it useful instead of generic.

TLDR

  • A prompt is not a system. To move beyond ad hoc AI use, you need a configured agent with a defined role, connected sources, and explicit boundaries.

  • I built mine on ChatGPT's Workspace Agents, available on Business, Enterprise, Edu, and Teachers plans.

  • The agent follows a real review workflow: scope, source, review, identify, return structured output.

  • It is anchored to ISO 27001, NIST, and the user's own checklist, never to "best practices."

  • It is connected to where policy work actually lives: SharePoint, Confluence, Jira, uploaded files, pasted text.

  • The output is forced into a usable structure with a Low/Medium/High risk rating and recommended edits.

  • Guardrails matter as much as capability: the agent is explicitly told what it must not do.

Where does your GRC team sit?

GRC Automation Maturity Model

Before you read how I built a GRC Policy Reviewer agent, find out where your GRC team actually sits on the maturity model. The self-assessment covers all five workstreams, including this one. This post breaks down the GRC Automation Maturity Model: https://aigrcdesk.com/p/the-framework-the-grc-automation-maturity-model

The Problem I Was Trying to Solve

Policy review is one of those GRC workflows that sounds simple until you actually do it.

In practice it usually means some combination of: reading a draft line by line, comparing it to an internal checklist or template, checking whether the expected control areas are covered, spotting vague wording that will create problems later, catching old references to systems or teams or standards, and deciding what matters enough to fix now.

None of that is hard in isolation. It is repetitive, uneven, and heavily dependent on reviewer memory. A lot of policy review quality comes down to whether the reviewer remembers what good looks like, what the organisation usually requires, and which gaps are actually material. That makes the process hard to scale and hard to standardise.

The goal was not to automate compliance. The goal was to build an agent that could do a strong first-pass review the way a good GRC reviewer would: systematically, consistently, and with output a practitioner could act on.

Before You Start, What You Need (A note on the platform)

I built this on ChatGPT's Workspace Agents, the enterprise agent-building feature OpenAI launched in April 2026 as the successor to custom GPTs. It is available on Business, Enterprise, Edu, and Teachers plans only; consumer Plus and Pro plans cannot create agents. Credit-based pricing now applies. If your org still runs custom GPTs, plan to migrate: OpenAI has signalled they are being phased out.

The feature remains in research preview, which matters for your governance posture: if your AI tool approval process does not yet cover Workspace Agents, sort that before you connect SharePoint, Confluence, or Jira to anything.

The build itself is accessible: no code, no API keys, just plain-language configuration, step by step. You start from a blank template or one of OpenAI's pre-built starting points, connect your sources (SharePoint, OneDrive, Confluence, Jira, Slack, Google Drive, Notion, and Salesforce, among others), upload reference files, add skills if needed, set your output structure and guardrails, test against a known case, then share.

The whole build took me a couple of hours. The thinking took longer than the building. That is the point.

How to create an agent

OpenAI has made the build process genuinely accessible, much more so than the custom GPT flow it replaces. The steps:

1. Go to chatgpt.com/agents: Alternatively, click the Agents option in the ChatGPT sidebar, which is visible to anyone on an eligible plan.

2. Click "Create agent." You will be offered two starting points:

  • Blank template: start from scratch with full control over the role, workflow, sources, and guardrails. This is the path I used for the Policy Reviewer.

  • OpenAI templates: pre-built starting points for common workflows like software request triage, meeting prep, reporting, and team communications. Useful for orienting yourself before you build your own.

3. Describe the workflow in plain language: ChatGPT guides you through the configuration step by step, prompting you for the role, the task, the sources to connect, and the output you want. You do not need any technical setup: no code, no API keys.

4. Connect the sources: This is where the agent stops being a chatbot and becomes operational. Workspace Agents support connections to SharePoint, OneDrive, Confluence, Jira, Slack, Google Drive, Salesforce, Notion, and a growing list of enterprise systems. Connect only the sources the agent actually needs.

5. Add reference files: Upload checklists, templates, internal frameworks, or anything that gives the agent organisational context. For the Policy Reviewer I uploaded a review checklist and a SharePoint policy locations reference. Two files. That is enough.

6. Add skills: Workspace Agents can be given specific skills, discrete capabilities the agent can call on as part of its workflow. Skills extend what the agent can do beyond reading and writing text: running calculations, generating structured outputs, querying connected systems in specific ways, or executing repeatable sub-tasks. For the Policy Reviewer I kept the skills minimal, because the heavy lifting is in the configured workflow and the connected sources. But the option to expand later is there, and that matters as your use case matures.

7. Set the output structure and guardrails: Define what the agent should return and, more importantly, what it must not do. Configure starter prompts so the workflow is visible to whoever uses it.

8. Test and share: Run the agent against a known case before sharing it with the team. If it produces output you would not stand behind, refine the configuration before anyone else uses it.

A Note Before You Build

Everything in this issue is specific to my workflow, my environment, and the policy review problem I was trying to solve.

The platform I used, the sources I connected, the standards I anchored to, and the reference files I uploaded all reflect how policy work actually lives in my organisation. Yours will be different. Your policies might live in Notion instead of SharePoint. Your team might review against a different framework. Your audit expectations might push the output structure in a different direction.

That is not a problem. That is the point.

The seven decisions that follow are not a blueprint to copy. They are a thinking framework for designing your own. The pattern (narrow role, real workflow, connected sources, grounded standards, shaped output, written guardrails) transfers. The specifics do not.

Build for your context. The agent that works for your team is the one designed around your workflow, not mine.

The Seven Design Decisions

1. I gave the agent a narrow job

The first design choice was keeping the role tight.

I did not build a general "GRC assistant." I built a GRC Policy Reviewer and Gap Analysis agent. Its role is specific: review policies, standards, procedures, and related governance documents, then identify weaknesses, missing coverage, and alignment gaps against the review criteria provided.

That narrowness matters. The broader the role, the easier it is for the output to become generic. The narrower the role, the easier it is to make the output operational. If you want useful AI in GRC, you usually get more value from an agent with a well-defined workflow than from a broad agent that can talk about compliance in general.

2. I built it around a workflow, not just prompts

This is the real difference between treating AI like search and treating it like a working system.

I configured the agent to follow a clear workflow:

  1. Determine the review scope

  2. Gather the relevant source material

  3. Review the content against the requested or default standards

  4. Identify the issues

  5. Return a structured review

Instead of asking the model to "look at this policy and tell me what you think," I defined what good review behaviour actually looks like. The agent is told to look for missing clauses, control mapping gaps, ambiguous or weak wording, outdated references, notable risk areas, and recommended edits.

That is what makes the output usable. It is not just reacting to text. It is performing a review task.
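To make the five steps concrete, here is a minimal Python sketch of the same sequence. It is an illustration only: Workspace Agents are configured in plain language, not code, and this is not how the agent is implemented. Every name in it (ReviewRequest, Review, run_review, required_topics) is hypothetical, and the substring check in step 3 is a toy stand-in for the actual review.

```python
from dataclasses import dataclass, field

# Issue categories the agent is told to look for (from the paragraph above).
ISSUE_CATEGORIES = [
    "missing clauses",
    "control mapping gaps",
    "ambiguous or weak wording",
    "outdated references",
    "notable risk areas",
]


@dataclass
class ReviewRequest:
    """Hypothetical input: what to review and against which standard."""
    policy_name: str
    policy_text: str
    standard: str = "ISO 27001"


@dataclass
class Review:
    """Hypothetical structured result, one list of findings per category."""
    scope: str
    standard: str
    issues: dict = field(default_factory=dict)


def run_review(request, required_topics):
    # 1. Determine the review scope.
    scope = request.policy_name
    # 2. Gather the source material (here, only the pasted text).
    text = request.policy_text.lower()
    # 3. Review against the standard. Toy check: is each required topic mentioned?
    missing = [topic for topic in required_topics if topic.lower() not in text]
    # 4. Identify the issues, keeping every category visible even when empty.
    issues = {category: [] for category in ISSUE_CATEGORIES}
    issues["missing clauses"] = missing
    # 5. Return a structured review.
    return Review(scope=scope, standard=request.standard, issues=issues)
```

The design point the sketch tries to capture is that the steps and the issue categories are fixed in advance, so every review has the same shape regardless of who asks for it.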

3. I connected it to where policy work actually lives

Policy review rarely happens in one clean document. The policy might be in SharePoint. The implementation notes might be in Confluence. Open questions or remediation work might be sitting in Jira.

If you want AI to help with real GRC workflows, it needs access to the environment where the work actually lives.

So I did not make the agent depend only on pasted text. I connected it to:

  • SharePoint: for policies, supporting documents, and shared files stored in SharePoint or OneDrive

  • Atlassian Rovo: for Confluence and Jira content containing policy text, requirements, review notes, or related governance context. (Rovo is Atlassian's enterprise AI layer. It gives external agents access to your Confluence and Jira content without direct API configuration.)

  • Uploaded files in the current session

  • Text pasted directly into chat

That coverage matters. An agent restricted to copy-paste is a curiosity. An agent connected to your actual systems is operational.

4. I anchored the review to standards

A policy reviewer that is not anchored to a standard will produce vague advice.

I configured this one to review against the user's internal policy template or review checklist when provided, ISO 27001 expectations by default, and NIST-based expectations when the request calls for NIST or a related framework.

That reflects a practical reality: most policy review work is not framework-agnostic. Even when teams are not formally doing a certification exercise, they are still reviewing policies against some combination of internal requirements, audit expectations, and external control frameworks.

The agent can prioritise a narrower framework when the request is specific. But the review is always grounded in something more concrete than "best practices."
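The selection rule described above can be sketched as a small function. This is a hedged illustration, not the agent's configuration: the function name select_standards is hypothetical, matching on the word "nist" is a simplification of "calls for NIST or a related framework", and treating the internal checklist as additive to the framework (rather than replacing it) is my assumption.

```python
def select_standards(request_text, checklist_provided):
    """Return the standards a review should be anchored to.

    Assumptions (not from the agent itself): a provided checklist is used
    alongside a framework, and a request "calls for NIST" if it mentions it.
    """
    standards = []
    if checklist_provided:
        standards.append("internal checklist")
    if "nist" in request_text.lower():
        standards.append("NIST")
    else:
        # ISO 27001 is the default when nothing more specific is requested.
        standards.append("ISO 27001")
    return standards
```

Whatever the exact rule, the function never returns an empty list: there is always a named standard behind the review.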

5. I gave it organisational context through reference files

This is where a lot of AI projects go wrong.

People assume the model should somehow infer the organisation's policy environment from general knowledge. It cannot. Policy review quality improves fast when you give the agent specific reference material.

I added two reference files: a policy review checklist, and a SharePoint policy locations reference. That gives the agent a more concrete frame for how policy review should work in this environment. It is not drawing from generic governance knowledge. It has reference material that tells it what to compare against and where policy material lives.

That is a better pattern for enterprise AI: do not ask the model to guess your operating context when you can provide it.

6. I forced the output into a structure people can use

If the output is just "here are some thoughts," it creates more work for the reviewer instead of less.

Unless the user asks for something else, the agent returns a structured response:

  • Scope reviewed

  • Standards used

  • Executive summary

  • Missing clauses

  • Control mapping gaps

  • Ambiguous or weak wording

  • Outdated references

  • Risk rating (Low / Medium / High)

  • Recommended edits

  • Open questions or evidence still needed

I also told the agent to assign a qualitative risk rating based on likely governance, compliance, operational, or audit impact. Things that push ratings higher: a required policy area missing entirely, a key control area not addressed, missing ownership or approval or enforcement or review cadence, wording that creates material interpretation risk, clearly outdated references.

That does not make the agent an auditor. It makes it useful. Good GRC work is not just spotting issues. It is helping people understand which issues matter most.
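For readers who think in code, the rating logic can be approximated as follows. The five factors come from the list above; the thresholds (no factors is Low, one is Medium, two or more is High) are purely my assumption for illustration, since the agent itself makes a qualitative judgement rather than counting. The names ESCALATING_FACTORS and risk_rating are hypothetical.

```python
# Factors that push a rating higher (from the text above).
ESCALATING_FACTORS = {
    "required_policy_area_missing",
    "key_control_area_not_addressed",
    "missing_ownership_approval_enforcement_or_review_cadence",
    "material_interpretation_risk",
    "clearly_outdated_references",
}


def risk_rating(factors_present):
    """Map the factors found in a review to Low / Medium / High.

    The count-based thresholds are an illustrative assumption only.
    """
    found = set(factors_present)
    unknown = found - ESCALATING_FACTORS
    if unknown:
        raise ValueError(f"Unknown factors: {sorted(unknown)}")
    if len(found) == 0:
        return "Low"
    if len(found) == 1:
        return "Medium"
    return "High"
```

In practice you may want some factors, such as a required policy area missing entirely, to force a High rating on their own. The point is to write the rule down so the rating can be explained.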

7. I was explicit about what it should not do

This is the most important design decision.

If you are building AI for GRC, guardrails are not optional polish. They are part of the product.

The agent is explicitly told not to:

  • Invent policy text, approvals, ownership assignments, or control coverage

  • Present legal or certification conclusions as definitive from partial evidence

  • Overstate what it knows when source material is incomplete

That last one is especially important. A lot of bad AI output in governance comes from false confidence. The answer sounds polished, but it quietly crosses the line from analysis into unsupported conclusion. In GRC, that is dangerous.

So the agent is instructed to say when evidence is incomplete and to be clear about what additional material is needed for a reliable review. That is not a limitation. That is what responsible output looks like.
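One way to picture the "say when evidence is incomplete" rule is as a check that every claim about what a policy says is backed by text that actually appears in the source. The sketch below is an analogy under stated assumptions, not the agent's mechanism: the guardrails are written as plain-language instructions, and Finding, source_quote, and render_finding are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """A claim about the policy plus the verbatim source text backing it."""
    claim: str
    source_quote: Optional[str]


def render_finding(finding, source_text):
    # A claim counts as supported only if its quote appears in the source.
    supported = bool(finding.source_quote) and finding.source_quote in source_text
    if supported:
        return f'{finding.claim} (source: "{finding.source_quote}")'
    # Otherwise flag the gap instead of presenting the claim as settled.
    return f"{finding.claim} [evidence incomplete: additional input required]"
```

A claim with no quote, or with a quote that does not appear in the source, is returned with the flag rather than silently dropped, so the reviewer can see what still needs evidence.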

What the Output Actually Looks Like

The Governance Check

Before any AI-generated review goes into your workflow, confirm all four:

1. Is the role narrow enough? A "GRC assistant" produces generic content. A "Policy Reviewer" produces operational output. If your agent's job description could apply to ten different workstreams, narrow it.

2. Is the agent anchored to a real standard? "Best practices" is not a standard. ISO 27001, NIST CSF, an internal checklist: these are standards. Without one, the review cannot be defended.

3. Are the source connections governed? SharePoint, Confluence, and Jira connections are powerful, and they pull data the agent will use to produce output. Confirm data classification clearance before connecting any system that contains restricted information.

4. Are the guardrails written in? What the agent must not do is more important than what it can do. If you cannot list three things the agent is explicitly forbidden from doing, the guardrails are not real.

In My Notebook

The first version of this agent did not have the "what it should not do" section. The output was good. The structure was right. The connections worked.

But every now and then it would produce a review with a confident-sounding finding that was not actually supported by the source material. Not often. But enough that I could not trust it without re-checking every claim against the policy.

Adding the explicit "do not invent, do not present partial evidence as definitive, do not overstate" instructions changed the output materially. The agent started flagging when evidence was incomplete rather than papering over the gaps with plausible-sounding text. It started saying "the policy does not address this and the source material does not indicate where ownership sits, additional input required" instead of inventing a reasonable-sounding owner.

That shift from confident-sounding to honest is the difference between an AI tool you trust and one whose every claim you have to re-check. The guardrails were not a limitation. They were what made the agent usable.

The Bottom Line

A prompt is a good starting point. An agent is an operational tool. The difference is design, not capability. The pattern that works: start with a workflow that already exists. Narrow the job. Define the review standard. Connect the right sources. Shape the output. Add the guardrails.

Do not start with "how do I use AI for compliance?" Start with "where do I already have repeated judgment work that would benefit from consistency, speed, and structure?"

That is how you get something teams can actually use.

Next Issue

Issue 4 (Notebook): What Tier 3 looks like in Policy Management. Continuous policy monitoring against live regulatory feeds and the governance work that has to be in place before it becomes safe.

Until Next Tuesday,
Princess
