Trust & Methodology

How we build recommendations

This page explains exactly how our engine works — what data it uses, how recommendations are generated, what confidence scores mean, and where we deliberately stop short of making claims we cannot support.

What data we collect

We collect structured responses from employees via a tap-based survey. Each survey covers one repetitive task and includes: the employee role and department, the task type, source and destination systems, the trigger event, frequency, time per occurrence, action type, and variability level.

There is an optional free-text notes field. No files, screens, emails, or system access are involved. Responses are linked to a company but not to a named individual in the report output.

Normalisation

Raw responses are first normalised — field values are mapped to a consistent internal vocabulary (e.g. "Gmail", "Outlook", and "email" are all mapped to the email system type). Weekly minutes are calculated from frequency and time-per-occurrence inputs. A variability score is assigned (1 = low, 2 = medium, 3 = high), which affects confidence scoring downstream.
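The normalisation step can be sketched as follows. This is a minimal illustration, not the production code: the field names, the synonym table, and the fallback to "other" for unrecognised systems are all hypothetical.

```python
# Illustrative sketch of normalisation. Field names and the synonym
# table are assumptions, not the real schema.

SYSTEM_SYNONYMS = {
    "gmail": "email",
    "outlook": "email",
    "email": "email",
}

VARIABILITY_SCORES = {"low": 1, "medium": 2, "high": 3}

def normalise(response: dict) -> dict:
    """Map raw survey values to the internal vocabulary and derive weekly minutes."""
    # Unrecognised systems fall back to "other" (an assumed convention)
    source = SYSTEM_SYNONYMS.get(response["source_system"].lower(), "other")
    # Weekly minutes = occurrences per week x minutes per occurrence
    weekly_minutes = response["frequency_per_week"] * response["minutes_per_occurrence"]
    return {
        "source_system": source,
        "weekly_minutes": weekly_minutes,
        "variability": VARIABILITY_SCORES[response["variability"]],
    }
```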

Clustering

Normalised workflows are grouped into clusters using a composite key of task type, source system, and destination system. This means that if multiple employees report doing the same structural task — even if described differently — they are treated as a single cluster. ROI estimates are scaled by the number of people in each cluster.
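The composite-key grouping described above amounts to the following sketch (field names assumed, as before):

```python
from collections import defaultdict

def cluster(workflows: list[dict]) -> dict[tuple, list[dict]]:
    """Group normalised workflows by (task type, source system, destination system).

    The size of each cluster — len(clusters[key]) — is what ROI
    estimates are later scaled by.
    """
    clusters = defaultdict(list)
    for wf in workflows:
        key = (wf["task_type"], wf["source_system"], wf["destination_system"])
        clusters[key].append(wf)
    return dict(clusters)
```

Because the key ignores free-text descriptions, two employees who describe the same structural task in different words land in the same cluster.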

Pattern matching

Each cluster is scored against a library of pre-vetted automation patterns. Scoring is field-level:

  • 30 pts Source system match
  • 25 pts Destination system match
  • 25 pts Trigger type match
  • 20 pts Action type match

A workflow must achieve a minimum match score of 40 to be considered for a recommendation. Partial credit is given for “other” field values to avoid false negatives.
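The field-level scoring can be sketched like this. The exact partial-credit fraction for "other" values is not stated above, so the 50% used here is an assumption for illustration.

```python
FIELD_WEIGHTS = {
    "source_system": 30,
    "destination_system": 25,
    "trigger_type": 25,
    "action_type": 20,
}
MIN_MATCH_SCORE = 40
PARTIAL_CREDIT = 0.5  # assumed fraction awarded for "other" values

def match_score(workflow: dict, pattern: dict) -> int:
    """Score one workflow cluster against one library pattern, field by field."""
    score = 0
    for field, weight in FIELD_WEIGHTS.items():
        if workflow[field] == pattern[field]:
            score += weight
        elif workflow[field] == "other":
            # Partial credit avoids false negatives on unrecognised values
            score += int(weight * PARTIAL_CREDIT)
    return score
```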

Confidence scoring

The raw match score is adjusted by confidence rules:

  • −30 points — highly variable task (hard to automate reliably)
  • −15 points — somewhat variable task
  • +10 points — high frequency task (5+ hours/week)
  • +5 points — moderate frequency (2.5+ hours/week)
  • −10 points — very low frequency (under 30 minutes/week)

Resulting scores are categorised as:

  • High (75–100) — strong match, reliable starting point
  • Medium (40–74) — solid lead, some investigation recommended
  • Low (<40) — weak match, expert review recommended before acting
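The adjustment rules and bands above can be sketched as follows. The variability codes (1–3) follow the normalisation step; clamping the final score to 0–100 is an assumption for illustration, not a documented rule.

```python
def adjust_confidence(match_score: int, variability: int, weekly_minutes: float) -> int:
    """Apply the confidence rules to a raw match score."""
    score = match_score
    if variability == 3:        # highly variable task
        score -= 30
    elif variability == 2:      # somewhat variable task
        score -= 15
    if weekly_minutes >= 300:   # 5+ hours/week
        score += 10
    elif weekly_minutes >= 150: # 2.5+ hours/week
        score += 5
    elif weekly_minutes < 30:   # very low frequency
        score -= 10
    return max(0, min(100, score))  # clamping is an assumed convention

def confidence_band(score: int) -> str:
    """Map an adjusted score to its reported band."""
    if score >= 75:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"
```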

Recommendation classification

Each recommendation is classified into one of five types:

  • AI Automation — tasks where AI can genuinely reduce manual effort (classification, summarisation, drafting)
  • Workflow Automation — trigger-based automations using tools like Zapier, Make, or n8n
  • Software Upgrade — replacing manual workarounds with purpose-built software
  • Process Standardisation — the process needs cleaning up before automation is feasible
  • Expert Review — confidence is too low to automate safely without specialist input

ROI estimation

ROI is estimated using a simple, transparent formula: weekly minutes saved × 52 weeks ÷ 60 (converting minutes to hours) × hourly rate. The hourly rate defaults to $50/hr but can be configured by the company admin to reflect actual employee cost including salary, benefits, and overhead. Time savings are capped at 85% of total task time to avoid overstating the benefit. Results are indicative estimates, not guarantees.
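A minimal sketch of the formula, including the 85% cap. The `automatable_fraction` parameter is hypothetical — it stands in for whatever share of the task the matched pattern is expected to remove.

```python
DEFAULT_HOURLY_RATE = 50.0
SAVINGS_CAP = 0.85  # at most 85% of task time is counted as saved

def annual_roi(weekly_minutes: float,
               automatable_fraction: float = 1.0,
               hourly_rate: float = DEFAULT_HOURLY_RATE) -> float:
    """Annual dollar value of time saved: weekly minutes x 52 / 60 x rate."""
    saved_minutes = weekly_minutes * min(automatable_fraction, SAVINGS_CAP)
    return saved_minutes * 52 * hourly_rate / 60
```

For example, a task taking 60 minutes per week yields roughly $2,210/year at the default rate, because the cap limits savings to 51 of the 60 minutes.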

What we deliberately do not do

  • Generate recommendations without a matching pattern in the vetted library
  • Present low-confidence items as high-confidence
  • Record screens, track keystrokes, or monitor activity
  • Collect or process individual employee performance data
  • Use LLMs to generate automation recommendations (all recommendations are pattern-matched)
  • Guarantee implementation timelines or outcomes