Human-in-the-Loop AI for ISO 27001 Auditing

AI can accelerate parts of ISO 27001 auditing, but it should not be described as an autonomous auditor. Large language models are useful for reading documents, locating candidate evidence, organizing questions and drafting summaries. They can also hallucinate, miss context, overvalue polished policies and express uncertain interpretations with unjustified confidence.

FANG is being built around a human-in-the-loop model in which automation assists the reviewer without inheriting the reviewer's authority. The design goal is not simply to add an approval button after an AI response. It is to create a workflow where the human can see what happened, test it against structured criteria and change the outcome.

Key takeaways

LLMs are well suited to extraction and drafting, not unsupported audit opinions.
Control status should follow explicit criteria outside generated prose.
Human reviewers need source access, context, competence and rejection authority.
Every material change should leave a traceable review record.
FANG is an evolving prototype, not an accredited audit or certification service.

Why AI-assisted auditing needs stronger boundaries

Audit work combines repeatable procedure with contextual judgment. A reviewer may examine a policy, interview an owner, sample records and test whether the process operated during the period under review. An LLM can summarize the policy but cannot independently establish that the described process happened, that the sample is representative or that the remaining risk is acceptable.

The risk increases when AI output is presented as a score without provenance. Users may trust the interface rather than inspect the evidence. That creates automation bias: a plausible answer becomes the default, and human review turns into a rubber stamp.

Human-in-the-loop is more than final approval

A meaningful human-in-the-loop control has several parts. The reviewer knows which task the model performed. The source material is visible. Extracted facts are distinguishable from generated interpretation. The criteria for the structured status are documented. The reviewer can reject, correct or request more evidence. The decision and the reason for it are retained.

If the reviewer sees only a green score and an approve button, the human is not genuinely governing the process. Oversight requires enough information and authority to disagree.
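The parts listed above can be sketched as a small record type that refuses to store a decision unless the oversight preconditions are present. This is a minimal illustration, not FANG's actual data model: `ReviewItem`, `ReviewRecord`, `Decision` and `record_decision` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    CORRECT = "correct"
    REJECT = "reject"
    REQUEST_EVIDENCE = "request_evidence"


@dataclass(frozen=True)
class ReviewItem:
    # All field names are illustrative assumptions.
    model_task: str         # which task the model performed
    source_refs: tuple      # citations the reviewer can open
    extracted_facts: tuple  # text copied from the sources
    interpretation: str     # model-generated, kept separate from the facts
    criteria_id: str        # pointer to the documented status criteria


@dataclass(frozen=True)
class ReviewRecord:
    item: ReviewItem
    reviewer: str
    decision: Decision
    reason: str


def record_decision(item, reviewer, decision, reason):
    """Return a ReviewRecord, or raise if an oversight precondition is missing."""
    if not item.source_refs:
        raise ValueError("source material must be visible to the reviewer")
    if not item.criteria_id:
        raise ValueError("status criteria must be documented")
    if not reason.strip():
        raise ValueError("a reason is required for every decision")
    return ReviewRecord(item, reviewer, decision, reason.strip())
```

Requiring a reason for acceptance as well as rejection is a design choice in this sketch: it makes a one-click approval impossible to record silently.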

The FANG separation of responsibilities

What the LLM can assist with

The language-model layer can identify candidate passages, summarize long documents, extract roles and dates, classify material, draft evidence requests and turn approved structured findings into readable narratives. These tasks reduce repetitive reading and writing while leaving room for verification.
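One way to leave room for verification is to check that every passage the model claims to have found actually appears in the cited document. The sketch below assumes the extraction step returns a document identifier and a verbatim quote; `CandidatePassage` and `split_verified` are hypothetical names, and the whitespace-insensitive substring match is a deliberately simple stand-in for whatever matching a real system would need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePassage:
    document_id: str
    quote: str  # text the model claims to have copied from the document


def _normalize(text):
    # Collapse whitespace and case so line wrapping does not cause false mismatches.
    return " ".join(text.split()).lower()


def split_verified(passages, documents):
    """Split passages into (found_in_source, not_found).

    `documents` maps a document identifier to its full text.
    """
    found, missing = [], []
    for passage in passages:
        source = documents.get(passage.document_id, "")
        quote = _normalize(passage.quote)
        if quote and quote in _normalize(source):
            found.append(passage)
        else:
            missing.append(passage)
    return found, missing
```

Passages in the second list are not necessarily wrong, but they should reach the reviewer flagged as unverified rather than mixed in with quotes that were located in the source.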

What structured logic should control

The deterministic layer should hold the control identifier, assessment question, allowed statuses, scoring criteria, required fields, evidence relationships, workflow state and validation rules. The same inputs should not produce a different definition of "partial" merely because a prompt was worded differently.
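A minimal sketch of that idea follows: the status is a pure function of reviewer-verified facts, so prompt wording cannot change what "partial" means. The fact names in `REQUIRED_FACTS` and the rule that any missing fact yields "unknown" are assumptions chosen for illustration, not FANG's documented criteria.

```python
from enum import Enum


class Status(Enum):
    UNKNOWN = "unknown"
    NOT_IMPLEMENTED = "not_implemented"
    PARTIAL = "partial"
    IMPLEMENTED = "implemented"


# Hypothetical criteria for a single control; real criteria would be
# documented per control and versioned alongside the assessment question.
REQUIRED_FACTS = (
    "policy_approved",
    "owner_assigned",
    "operating_evidence_in_period",
)


def derive_status(verified_facts):
    """Map verified facts (name -> True, False or None) to a status."""
    values = [verified_facts.get(name) for name in REQUIRED_FACTS]
    if any(value is None for value in values):
        return Status.UNKNOWN  # missing information is never guessed
    if all(values):
        return Status.IMPLEMENTED
    if any(values):
        return Status.PARTIAL
    return Status.NOT_IMPLEMENTED
```

Because the function has no model call inside it, the same facts always give the same status, and the rule itself can be unit-tested and challenged by a reviewer.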

What the reviewer must decide

The human determines whether the evidence is authentic, relevant and sufficient; whether the control is appropriate to the organization's risk; whether contradictory information changes the conclusion; and how a finding should be communicated. Leadership, not software, accepts organizational risk.

A practical review workflow

1. Define the review question. State what is being evaluated and which scope applies.
2. Collect the source material. Record file identity, version, owner and period.
3. Extract candidate facts. Let AI locate relevant passages while preserving citations.
4. Apply structured criteria. Compare verified facts with the defined review logic.
5. Surface uncertainty. Flag missing, conflicting or stale information.
6. Review and decide. Require a competent person to accept or change the status.
7. Generate the narrative. Draft language from the approved finding and evidence references.
8. Retain the trail. Record the final decision, reviewer and changes.
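The steps above can be sketched as an ordered workflow that refuses to skip ahead and refuses to let the model complete the decision step. This is an illustration under stated assumptions: the `Step` and `ReviewWorkflow` names are hypothetical, and identifying the model by the actor string "model" is a simplification of real identity and role checks.

```python
from enum import Enum


class Step(Enum):
    DEFINE_QUESTION = 1
    COLLECT_SOURCES = 2
    EXTRACT_FACTS = 3
    APPLY_CRITERIA = 4
    SURFACE_UNCERTAINTY = 5
    REVIEW_AND_DECIDE = 6
    GENERATE_NARRATIVE = 7
    RETAIN_TRAIL = 8


class ReviewWorkflow:
    def __init__(self):
        self.completed = []  # steps finished so far, in order
        self.trail = []      # (step name, actor, note) entries

    def next_step(self):
        """Return the next step to perform, or None when all are done."""
        if len(self.completed) == len(Step):
            return None
        return Step(len(self.completed) + 1)

    def complete(self, step, actor, note=""):
        expected = self.next_step()
        if step is not expected:
            raise ValueError(f"expected {expected}, got {step}")
        # Assumption: the literal actor name "model" marks automated work.
        if step is Step.REVIEW_AND_DECIDE and actor == "model":
            raise PermissionError("the decision step needs a human reviewer")
        self.completed.append(step)
        self.trail.append((step.name, actor, note))
```

Keeping narrative generation after the decision step means the drafted text can only describe a status a person has already accepted or changed.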

Guardrails for AI-generated audit narratives

Generated narratives should be constrained to the approved finding. They should name what was reviewed, state what the evidence supports, avoid legal or certification conclusions and explain missing evidence without inventing facts. A narrative should not quietly upgrade "unknown" to "implemented" because optimistic language sounds more professional.
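A crude automated check can catch some of these problems before a reviewer reads the draft. The sketch below is keyword matching only and will miss paraphrases; the phrase lists, the `[EV-n]` citation format and the `check_narrative` function are all hypothetical and would need to be maintained and tested in a real system.

```python
import re

# Hypothetical phrase lists, chosen for illustration only.
STATUS_PHRASES = {
    "implemented": ("fully implemented", "is implemented", "operating effectively"),
    "partial": ("partially implemented",),
    "unknown": ("could not be determined",),
}
FORBIDDEN = ("certified", "compliant with iso 27001", "guarantee")


def check_narrative(narrative, approved_status, evidence_ids):
    """Return a list of problems; an empty list means no rule fired."""
    text = narrative.lower()
    problems = []
    # Wording that implies a status other than the approved one.
    for status, phrases in STATUS_PHRASES.items():
        if status != approved_status and any(p in text for p in phrases):
            problems.append(
                f"wording implies '{status}' but approved status is '{approved_status}'"
            )
    # Legal or certification conclusions.
    for phrase in FORBIDDEN:
        if phrase in text:
            problems.append(f"forbidden conclusion: '{phrase}'")
    # Citations must point at evidence attached to the finding.
    cited = set(re.findall(r"\[(EV-\d+)\]", narrative))
    for unknown_id in sorted(cited - set(evidence_ids)):
        problems.append(f"cites evidence not attached to the finding: {unknown_id}")
    if not cited:
        problems.append("no evidence reference cited")
    return problems
```

An empty result does not make the narrative correct. It only means none of these simple rules fired, so the reviewer still reads the draft against the approved finding.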

AI assistance should also be transparent inside the review process. Reviewers need to know which text was generated and what was changed afterward. The final published report remains the responsibility of the professional or organization issuing it.

Testing the oversight control itself

Human review can fail. Reviewers may approve too quickly, lack context or become anchored to the model's first answer. FANG should therefore make the oversight process measurable. Useful signals include rejection rates, correction reasons, evidence requests, recurring extraction errors and controls where generated narratives are frequently rewritten.

These signals can improve prompts, rules and reviewer training. They can also show whether the human-in-the-loop claim works in practice or exists only in policy.
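As a rough sketch, several of these signals can be computed from stored review records. The record shape (a dict with `decision`, `reason` and `seconds_spent`) and the 30-second threshold for a "fast accept" are assumptions made for this example, not measured values or FANG's schema.

```python
from collections import Counter

FAST_ACCEPT_SECONDS = 30  # assumed threshold; tune against real review data


def oversight_signals(records):
    """Summarize review records into simple oversight indicators."""
    total = len(records)
    if total == 0:
        return {"total": 0}
    decisions = Counter(r["decision"] for r in records)
    # Reasons are only counted where the reviewer changed or rejected the output.
    reasons = Counter(
        r["reason"] for r in records if r["decision"] in ("correct", "reject")
    )
    fast_accepts = sum(
        1
        for r in records
        if r["decision"] == "accept" and r["seconds_spent"] < FAST_ACCEPT_SECONDS
    )
    return {
        "total": total,
        "rejection_rate": decisions["reject"] / total,
        "correction_rate": decisions["correct"] / total,
        "evidence_request_rate": decisions["request_evidence"] / total,
        "fast_accept_rate": fast_accepts / total,
        "top_reasons": reasons.most_common(3),
    }
```

A rejection rate near zero combined with a high fast-accept rate is one possible sign of rubber-stamping, although it could also mean the extraction step is performing well, so the numbers are prompts for investigation rather than verdicts.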

Security, privacy and model-provider risk

Audit evidence may expose internal architecture, suppliers, incidents, personnel and control weaknesses. AI-assisted review must consider where documents are processed, whether prompts or files are retained, whether data is used for model training, who can access outputs and how records are deleted.

These questions belong in the system design and vendor review, not in a footer added after deployment. The current FANG prototype does not claim to have solved every production security and privacy requirement.

What responsible acceleration looks like

The best outcome is not the highest automation percentage. It is a review process that becomes faster while remaining explainable and challengeable. FANG should reduce document handling and drafting effort, preserve evidence relationships and help reviewers focus on risk and anomalies.

That is the practical meaning of accelerating an ISMS: better structure around human work, not the removal of professional judgment.

FAQ

Can an LLM be an ISO 27001 auditor?

An LLM can assist with defined tasks. It does not possess accreditation, organizational authority or independent professional accountability.

Does human approval make every AI conclusion reliable?

No. The reviewer needs competence, context, source access and genuine authority to reject the output.

Can FANG guarantee audit readiness?

No. FANG is an evolving prototype and cannot guarantee certification or audit outcomes.

Why keep deterministic rules outside the LLM?

Explicit rules make scoring and workflow behavior consistent, testable and easier to challenge.
