
2026-08 · Essay

How we scope a Discovery.

Two weeks, fixed-fee, against your real systems and real operating people. What we ask, what we deliver at the end, and what makes us walk away. Doubles as a procurement document.

Most AI engagements that go badly went badly in the first two weeks. Either the wrong workflow got picked, or the right one got picked but with the wrong people, or the constraints that would later kill the project (data access, regulatory, organizational) never got named at the start. By the time the build phase exposes any of this, you're six months and a million dollars in.

Discovery is the opposite of that. It's two weeks where the only goal is to know enough about your operation to either propose a serious build or tell you we're not the right fit. It's fixed-fee. It's standalone: you're under no obligation to engage us beyond it. And it produces three artifacts that are useful even if you never work with us again.

What it is and isn't

Discovery is not:

  • A sales process. We will not pitch products. We will not show you a demo of what someone else built.
  • A workshop. There are no whiteboard exercises, no “AI strategy” deliverables, no maturity-model heat maps.
  • An audit. We're not grading your stack. We're mapping it.
  • A free pilot. Discovery is paid up front. Free-pilot terms exist, but only inside a 6+ month build commitment, not before one.

Discovery is:

  • A two-week, fixed-fee engagement against your real systems and real operating people.
  • A small team (one principal + one operator) inside your operation for a few hours each day.
  • A concrete output: an opportunity map, a stack inventory, and a recommended next step (or a recommendation to do nothing with us).

Week one: get inside the operation

We spend the first week with the people who actually run the workflow we're scoping. Not their managers, not the executive who hired us: the people doing the day-to-day work. We sit with them, we watch what they do, we ask questions. We are not there to suggest improvements. We're there to understand.

The questions we ask:

  • What does this workflow look like end-to-end? What are the inputs, the systems, the decisions, the outputs?
  • Where does it currently break? What requires a Slack message to clear? What gets settled in a 30-minute meeting that shouldn't have been necessary?
  • How long do the steps actually take? Not the SLA, the median.
  • What gets dropped when volume spikes? Which corners get cut, by whom, with what authority?
  • If you could change one thing about the work, what would it be?

Most of what we learn in week one is not what was in the brief. The brief described the symptom. The work-along surfaces the actual constraint, which is almost always upstream of where the buyer thought it was.

In parallel: stack inventory

While the operating sessions are happening, the engineering side of the team is mapping the stack. What systems hold what data. Which systems can be read from, written to, and on what cadence. Where authentication lives. Where data residency constraints live. What integrations already exist and where they break.

This is fast work because we've done it many times. By the end of week one we usually have: a one-page stack diagram, a list of integration points (with feasibility ratings), a flagged list of data and regulatory boundaries, and a short list of risks we can already name.

Week two: size the opportunity, draft the architecture

Week two is where we figure out where AI + HITL (human-in-the-loop) meaningfully changes throughput or accuracy, sized in dollars or hours, with confidence bands. We don't hand-wave (“significant productivity gains”). We pick the two or three workflows where we believe we can move a number, we estimate the lift, and we say what we'd build to capture it.

We also draft a thin architecture: where the AI sits, what systems it reads, where the human-in-the-loop touchpoints are, what gets logged, what the operate cadence looks like. This is not the full Architecture phase; that comes next, as a separate four-week engagement. But it's enough that anyone reading the Discovery output knows what we're actually proposing.

What you get at the end

Three artifacts, plus a conversation:

  • Opportunity map. Three to six workflows ranked by impact. For each: the current operational pain, the AI + HITL solution, the estimated lift (range, with assumptions named), and an integration feasibility flag.
  • Stack inventory. One-page diagram of the systems involved, the read/write paths, the authentication and data-residency constraints, and the integration risks.
  • Recommended next step. Either a proposed Architecture phase scope (four weeks, fixed-fee, named deliverables) or an honest “we're not the right fit and here's what we'd do if we were you” document.

Plus a 90-minute readout with the team, where we walk through all of the above and answer questions. Recorded if you want it.

What it costs and what it commits you to

Discovery is fixed-fee. We share the range on the qualifying call. The fee is paid up front. It is non-refundable, because the work is real and we're putting senior people on it.

It commits you to nothing beyond it. We do not write Discovery contracts that include “options” on later phases or any other lock-in trick. If at the end of two weeks you want to take the artifacts to a different vendor, that is a perfectly fine outcome. The artifacts are yours.

What you commit

Three things, in order of importance:

  • Operating-team time. 4 to 6 hours, total, across the people doing the workflow. Not the executive sponsor: the actual operators. We need them in the room or on the call.
  • Read access to relevant systems. We need to see what the data actually looks like, not what the brief said it looks like. Read-only is fine; we don't need write access during Discovery.
  • Honest answers. About what's broken, what's political, what's been tried before, what didn't work and why. We don't share these outside the engagement, and Discovery goes badly when buyers spend the two weeks managing the impression their own team is making.

What makes us walk away

We end Discovery with “don't hire us” in three situations:

  • The constraint is organizational. The workflow can be improved with AI, but the political will or operating ownership to deploy it doesn't exist. No technical fix solves a sponsorship problem.
  • The scale doesn't justify the build. What we'd propose costs more than the lift it would capture, even with generous assumptions. An off-the-shelf tool would serve you better, and we'll point you to one.
  • The data isn't there. The decisions we'd want the AI to support depend on data that doesn't exist, isn't captured, or sits behind a system we can't safely read. That's a different project, sometimes a worthy one, but not the one you hired us to scope.

We'd rather lose the engagement than ship a bad one. That sounds like a marketing line. It's actually the only way the model works: we get paid in Operate, and we don't reach Operate on bad fits.

Why two weeks specifically

Less than two weeks isn't enough to get inside an operation honestly. More than two weeks and we're pretending the discovery work is bigger than it is. Two weeks is what it actually takes; we've sized the work many times now across very different operations, and the shape doesn't change much.

If your operation needs a longer process, that means it's not Discovery; it's a different engagement. We'll say so on the qualifying call.

If this sounds like the right shape

Book the qualifying call.

30 minutes, no slides. We'll ask about your operation, you'll ask about ours, and we'll either schedule Discovery or be honest about why we shouldn't.