Anomaly narratives — the WHY behind every flag
Live · Pro+
What this is. When Tally flags a document as anomalous, the AI generates a short plain-language explanation grounded in the vendor's history: "Stripe usually charges $12–$15/month; this $250 charge is roughly 17× higher. The last similar spike was March (refund reversal)." The explanation factors in the vendor's baselines, recent pattern shifts, and (on Business tier) firm-level history.
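As a rough illustration of how the example sentence above could be assembled from a vendor baseline, here is a minimal sketch. The `VendorBaseline` class and `describe_anomaly` function are hypothetical names invented for this example, not Tally's actual implementation, and the choice to compare against the top of the usual range is an assumption.

```python
from dataclasses import dataclass


@dataclass
class VendorBaseline:
    """Hypothetical summary of what a vendor usually charges per month."""
    vendor: str
    typical_low: float
    typical_high: float


def describe_anomaly(baseline: VendorBaseline, amount: float) -> str:
    """Build a one-sentence explanation comparing a charge to the usual range."""
    # Assumption: compare against the top of the usual range,
    # so the reported multiple is the conservative one.
    multiple = amount / baseline.typical_high
    return (
        f"{baseline.vendor} usually charges "
        f"${baseline.typical_low:,.0f}-${baseline.typical_high:,.0f}/month; "
        f"this ${amount:,.0f} charge is roughly {multiple:.0f}x higher."
    )


stripe = VendorBaseline(vendor="Stripe", typical_low=12, typical_high=15)
print(describe_anomaly(stripe, 250))
```

With the numbers from the example, 250 / 15 is about 16.7, which rounds to the "roughly 17×" in the narrative.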
What you see in Tally.
- An "Anomaly" badge on the document, expanding to show the narrative inline.
- Bookkeeper drill-down views render the same narrative in context.
- The narrative becomes the body of any outbound vendor reply email when the bookkeeper drafts one.
Why it helps you. Bare anomaly flags ("amount unusual") are noise, and reviewers ignore them. Narratives turn that noise into an actionable signal: you see the WHY in seconds and decide faster. That makes narratives the trust accelerant for autonomy progression.
Ask Tally — natural language
Live · All tiers
What this is. You type questions in plain English — "show me last 5 imports," "what did we spend on AWS this quarter?", "which invoices over $1,000 haven't been paid?" — and Tally answers using the AI brain grounded on your actual data. No fabricated numbers, no made-up document IDs, every answer cited to real records.
What you see in Tally.
- A chat interface on the Memory page.
- An ask box at the top of the Financials document list (returns a filtered table + chart).
- Multi-turn conversations preserved in a sidebar.
Why it helps you. Filter dropdowns scale to about 10 dimensions; real questions have hundreds. Natural language is the only interface that scales to "what about that vendor I was looking at last week?" without you pre-building every report.
Audit trail
Live · All tiers
What this is. Every AI extraction, every autonomy decision, every reviewer action lands in an unchangeable audit log. The raw AI output, the reasoning behind it, the model used, the cost, the human who reviewed it — all preserved. If a decision is ever questioned, Tally can replay it exactly as it happened.
What you see in Tally.
- The activity feed on Financials, showing every action chronologically.
- The "Tally explains" footer on each document, showing why a decision was made.
- Bookkeeper-side history queries against any client tenant.
Why it helps you. Trust requires accountability. When Tally auto-approves and you disagree, you need to know why — and have a permanent record. The audit trail is what makes "Tally is a colleague" a defensible claim instead of a tagline.
Autonomy economics — Tally's saved-time report
Live · Pro+
What this is. A nightly job tallies how many documents Tally auto-approved this month, multiplies by the time a human bookkeeper would normally spend reviewing each (around 4 minutes for an invoice, 2 minutes for a receipt), and converts that to dollars at the firm's labor rate. The result is a concrete monthly figure: "Tally saved 14 hours and $980 of bookkeeper time this month."
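The arithmetic described above can be sketched in a few lines. The per-document minutes come from the description; the document counts and the $70/hour labor rate are hypothetical values chosen so the result matches the "14 hours and $980" example, and `saved_time_report` is an illustrative name only.

```python
# Review minutes per document type, taken from the description above.
REVIEW_MINUTES = {"invoice": 4, "receipt": 2}


def saved_time_report(auto_approved: dict, hourly_rate: float) -> tuple:
    """Return (hours_saved, dollars_saved) for one month of auto-approvals."""
    minutes = sum(
        REVIEW_MINUTES[doc_type] * count
        for doc_type, count in auto_approved.items()
    )
    hours = minutes / 60
    return hours, hours * hourly_rate


# Hypothetical month: 150 invoices and 120 receipts at a $70/hour labor rate.
hours, dollars = saved_time_report({"invoice": 150, "receipt": 120}, hourly_rate=70.0)
print(f"Tally saved {hours:.0f} hours and ${dollars:,.0f} of bookkeeper time this month.")
```

Here 150 × 4 + 120 × 2 = 840 minutes, which is 14 hours, and 14 × $70 = $980.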
What you see in Tally.
- "Hours Saved" and "Labor Saved" KPI tiles on the bookkeeper Command Center.
- A printable per-client ROI report at /bookkeeper/clients/[id]/roi.
- An ROI email the firm can send directly to each SMB client.
Why it helps you. Tally's value used to be qualitative ("AI helps"). Now firms can show every client exactly how many hours and dollars Tally saved their bookkeeping team this month — closing the loop on "is Tally worth it?" with concrete numbers.
Close readiness
Live · Business
What this is. A daily job computes a 0–100 readiness score for each client's current month, weighted across five signals: percentage of documents reviewed, percentage of fields complete, percentage synced to QuickBooks, percentage of contractor 1099 profiles complete, percentage of bank transactions reconciled. Each client gets bucketed: green (ready to close), yellow (needs attention), or red (significant gaps).
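A weighted score of this kind can be sketched as follows. The five signal names mirror the description, but the equal weights and the green/yellow cut-offs (85 and 60) are assumptions made for illustration; the actual weighting and thresholds are not stated here.

```python
# Assumed equal weights; the real weighting is not published in this document.
WEIGHTS = {
    "reviewed": 0.2,
    "fields_complete": 0.2,
    "synced_to_quickbooks": 0.2,
    "contractor_1099_complete": 0.2,
    "bank_reconciled": 0.2,
}


def readiness_score(signals: dict) -> float:
    """Weighted average of five percentages (each 0-100), giving a 0-100 score."""
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)


def bucket(score: float, green_at: float = 85.0, yellow_at: float = 60.0) -> str:
    """Map a score to green / yellow / red using assumed cut-offs."""
    if score >= green_at:
        return "green"
    if score >= yellow_at:
        return "yellow"
    return "red"


# Hypothetical client with gaps in 1099 profiles and reconciliation.
client = {
    "reviewed": 90,
    "fields_complete": 80,
    "synced_to_quickbooks": 70,
    "contractor_1099_complete": 50,
    "bank_reconciled": 60,
}
score = readiness_score(client)
print(round(score), bucket(score))
```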
What you see in Tally.
- "8 of 42 clients close-ready" KPI tile on the bookkeeper Command Center.
- A top-blockers panel showing the most common cross-firm gaps.
- Click any client to drill into their specific blockers.
Why it helps you. Without this, a firm partner has to open every client tenant individually to assess month-end status. Now the entire portfolio is visible at a glance, and attention can route only to yellow and red clients — the killer feature for the Business tier.
Confidence calibration
Live · Pro+
What this is. When the AI says "I'm 85% sure," what does that actually mean? Sometimes 85% turns out to be 95% accurate, sometimes only 70%. Tally watches its own predictions against your reviews and learns the actual accuracy at each confidence level — per vendor, per document type. Over time, "85% sure" becomes a meaningful, calibrated number.
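One simple way to picture calibration is a lookup from raw confidence to the accuracy reviewers actually observed at that confidence. The sketch below assumes a history of (raw confidence in percent, reviewer agreed) pairs for a single vendor; the function name and data shape are hypothetical, and a per-vendor, per-document-type version would simply keep one such table per key.

```python
from collections import defaultdict


def observed_accuracy(history: list) -> dict:
    """Map each raw confidence (whole percent) to the share of predictions reviewers confirmed."""
    totals = defaultdict(int)
    confirmed = defaultdict(int)
    for confidence_pct, reviewer_agreed in history:
        totals[confidence_pct] += 1
        confirmed[confidence_pct] += int(reviewer_agreed)
    return {pct: confirmed[pct] / totals[pct] for pct in totals}


# Hypothetical history for one vendor: 20 predictions at a raw 85%, 19 confirmed.
history = [(85, True)] * 19 + [(85, False)]
print(observed_accuracy(history))
```

In this made-up history, a raw "85% sure" turned out to be right 95% of the time, which is the kind of gap calibration is meant to capture.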
What you see in Tally.
- Indirect — calibrated confidence is what feeds the autonomy decision on each document.
- Vendors whose AI extractions consistently match your reviews graduate to autonomy faster.
- Vendors where AI confidence overstates reality get held back even at high raw scores.
Why it helps you. Auto-approval can't trust raw model confidence. Calibration turns "the AI thinks 85%" into "for this user and this vendor, 85% historically meant 95% accurate," which is what safe autonomy actually needs.
Conversation memory
Live · All tiers
What this is. When you chat with Tally, it remembers the prior turns. "What did Stripe charge us last month?" → "$487." → "What about the month before?" — Tally knows the second question is still about Stripe.
What you see in Tally.
- "Recent Conversations" sidebar on the Memory page — every prior multi-turn chat preserved.
- Click any conversation to replay it with sources, confidence levels, and follow-up suggestions.
- Up to ~100 exchanges held per conversation.
Why it helps you. Without memory, chat is just a search box. With memory, chat is a conversation — and conversations are how humans actually think out loud.
Documents as facts
Live · All tiers
What this is. Every document Tally processes — invoice, receipt, bank transaction, CSV row — becomes a single record of facts: who the vendor is, how much, on what date, what category, what status. This is Tally's permanent record of what happened. When you search, filter, sort, or chart your documents, you're reading from this record.
What you see in Tally.
- The Financials page document table.
- Every search result, filter, and chart.
- The bookkeeper drill-down view of a client's books.
Why it helps you. Your documents are never re-read or re-guessed. Once Tally extracts a fact, it's stored cleanly and instantly retrievable, so questions are answered in milliseconds, not seconds.
Firm memory — patterns across your client portfolio
Live · Business
What this is. A bookkeeper with 10 SMB clients learns patterns across their portfolio: their firm's preferred way to map "Adobe Creative Cloud" → "Software Subscriptions," their typical handling of meal expenses, their convention for splitting credit card payments. Tally captures these firm-level preferences and applies them to new clients automatically — but only within that bookkeeper's own portfolio (never shared across firms).
What you see in Tally.
- New invoices in new clients arrive pre-categorized when the firm has prior history with that vendor.
- "Tally noticed your firm typically maps Adobe → Software Subscriptions."
- Cross-client patterns visible on the bookkeeper Command Center.
Why it helps you. Onboarding a new SMB client should not mean starting from zero. With firm memory, the firm's existing decisions propagate to the new client on day one.
How Intelligence Digests keep users informed
Live · All tiers (opt-in)
What this is. Tally sends Intelligence Digests so owners and bookkeepers can see what was handled, what changed, and what needs attention. A digest is a short, recurring email — daily, weekly, or monthly — built from Tally's own activity during the period. Two audiences are supported: a business-owner digest for the workspace owner and a firm digest for a bookkeeping firm's owner or admin.
What you see in Tally.
- A digest settings page (Settings → Intelligence Digests) where the cadence, send time, and time zone are configured.
- A short summary email at the cadence you picked: what Tally handled, what's still open, what stands out, and a one-click unsubscribe link.
- Toggling the digest off in Settings stops new sends immediately and preserves your preferences for easy re-enable.
Why it helps you. Tally does the work in the background. Digests are how the owner of the books — or the bookkeeper running a firm — knows what's happened without having to log in to look. They are deliberately short, intentionally read-only, and sent only when the user has opted in.
This document is the canonical buyer-facing source. The engineering rationale (peer-platform survey, ML literature foundations, exact thresholds, code paths, test coverage) is in docs/architecture/two-track-learning-internal.md — internal-only and not indexed for public Help Center retrieval.
How Tally learns — context first, authority second
Status: Live across every UppaGo customer. The full engineering rationale (peer-platform survey, ML literature backing, exact thresholds) lives in docs/architecture/two-track-learning-internal.md.
How Tally learns from past resolutions
Live · All tiers (broader rollout pending pilot evidence)
What this is. Tally remembers how similar issues were resolved before, so future reviews can move faster and stay consistent. When a reviewer fixes a vendor mapping, sets a category for a new kind of receipt, or marks a duplicate, Tally treats that decision as a small reusable pattern. The next time the same kind of issue shows up, Tally surfaces the same resolution as a suggestion the reviewer can accept, edit, or reject.
What you see in Tally.
- A "Past resolutions" view in the Memory tab — the patterns Tally has noticed across your own review history.
- A suggestion card on similar future documents with Apply, Edit, or Reject options. The reviewer always decides; Tally never resolves on its own.
- A short, plain explanation of why a particular resolution is being suggested — usually grounded in how often this pattern has been resolved the same way.
Why it helps you. Most review work is repetitive — the same vendor flagged the same way, the same fix every month. By remembering how the resolution was done last time, Tally turns the second, third, fourth occurrence into a one-click action while keeping the human in the loop. The result is faster reviews and more consistent books over time.
How Tally's business memory works
Tally builds business memory across documents, vendors, review decisions, anomalies, close readiness, seasonality, and firm-level patterns. Each layer is described below in plain language: what it is, what you see in Tally, and why it helps you.
These are not separate products you turn on. They work together — every document Tally processes touches several of them at once. The compounding effect is the moat: Tally is more useful after 60 days than on day one, and more useful again after 6 months. We describe the layers one at a time so you can see the pieces clearly.
How Tally speaks about your business
Live · All tiers
"Tally now has a typed voice across its 10 business intelligence surfaces — facts, vendors, anomalies, open loops, missing-doc alerts, cash flow, close readiness, autonomy economics, resolution playbooks, and seasonal patterns. Each surface speaks for itself or honestly admits when it has nothing to say yet. Conversations are durable: Tally remembers prior turns and distills durable facts about how you work — your vocabulary, your focus areas, your patterns. Over time, the conversation becomes increasingly personal because Tally's awareness compounds."
What this means in practice. When you ask Tally something — whether through Ask Tally or by clicking through your data — the right business intelligence surfaces are queried as typed evidence. If a surface is empty (no anomalies flagged yet, no close-readiness signal yet, no seasonal cadence established), Tally says so plainly instead of inventing an answer. This is the architectural honesty layer: Tally cannot fabricate an "anomaly" from your largest invoice because the anomaly surface is its own typed slot — it is either present or it is not, and Tally never confuses one signal for another.
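The "typed slot" idea can be illustrated with a small sketch in which each surface is its own optional field, and an empty field produces an abstention instead of a guess. The `Evidence` class and its field names are hypothetical; they only show the principle that the anomaly answer never reads from a different surface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Evidence:
    """Hypothetical container: one typed slot per intelligence surface."""
    largest_invoice: Optional[float] = None
    anomalies: Optional[List[str]] = None  # None or empty means nothing flagged yet


def answer_anomaly_question(evidence: Evidence) -> str:
    """Answer only from the anomaly slot; never substitute another signal."""
    if not evidence.anomalies:
        return "I don't see any flagged anomalies in your data yet."
    return f"{len(evidence.anomalies)} flagged: " + "; ".join(evidence.anomalies)


# A large invoice exists, but the anomaly surface is empty, so Tally abstains.
print(answer_anomaly_question(Evidence(largest_invoice=12500.0)))
```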
What you see in Tally.
- Ask Tally answers that name what they know AND what they don't know with equal clarity.
- Honest abstentions like "I don't see any flagged anomalies in your data yet." — rather than confidently-worded guesses.
- Compound answers when you ask compound questions — the data half grounded, the product-knowledge half explained, with no fabrication on either.
- Increasingly personal follow-up suggestions as Tally learns your vocabulary, your typical questions, and your business focus across conversations.
Why it helps you. Most AI assistants confidently fabricate when the answer isn't available. Tally is built the opposite way: honest absence is engraved at the architectural level, not asked-for in the prompt. You can trust the answer because Tally has been designed to refuse rather than guess. Over time, as Tally's awareness of your work compounds, the conversation becomes a working partnership — Tally knows your vendors, your patterns, and the way you actually talk about your business.
How the two tracks work for you
Tally has two distinct memories of every vendor, intentionally kept separate:
- Recognition memory — has Tally ever seen this vendor in any form? (Imports populate this; live reviews also populate it.)
- Authority memory — have you reviewed enough of this vendor's documents inside Tally's UI for Tally to start acting on its own?
Imports populate recognition only. Live reviews populate both. That asymmetry is the whole point: an import of 5,000 historical invoices tells Tally who exists in your books, but it doesn't tell Tally how you'd like each vendor handled going forward. To learn the latter, Tally has to watch you make real decisions.
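The asymmetry between the two tracks can be sketched with two counters per vendor. The `VendorMemory` class, its method names, and the threshold of 5 live reviews are all assumptions for illustration; the real thresholds live in the internal engineering document.

```python
from dataclasses import dataclass


@dataclass
class VendorMemory:
    """Two separate counters for one vendor (hypothetical structure)."""
    imported_documents: int = 0  # feeds recognition only
    live_reviews: int = 0        # feeds recognition and authority

    def record_import(self, count: int = 1) -> None:
        self.imported_documents += count

    def record_live_review(self) -> None:
        self.live_reviews += 1

    @property
    def recognized(self) -> bool:
        # Either track is enough for the vendor to count as known.
        return self.imported_documents > 0 or self.live_reviews > 0

    def may_act_alone(self, required_live_reviews: int = 5) -> bool:
        # Only live reviews count here; the threshold of 5 is an assumption.
        return self.live_reviews >= required_live_reviews


vendor = VendorMemory()
vendor.record_import(5000)
print(vendor.recognized, vendor.may_act_alone())  # True False
```

Importing 5,000 historical invoices makes the vendor recognized, but authority stays at zero until live reviews accumulate.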
Meaning-based search
Live · Pro+
What this is. Tally understands when two things mean the same thing even when the words differ. "AWS hosting" and "Amazon Web Services Compute" are recognized as the same concept; "MS Office 365" and "Microsoft Office 365 Business" are flagged as duplicates worth merging. This is meaning-based matching, not keyword matching.
What you see in Tally.
- "Similar documents" panel on the document detail sheet (3–5 conceptually similar docs).
- Vendor merge suggestions when two name variants likely refer to the same vendor.
- Smarter search that catches paraphrases (searching "cloud hosting" surfaces AWS and Azure docs).
Why it helps you. Keyword matching misses what humans see at a glance. Meaning-based search closes that gap, so Tally finds what you mean even when you can't remember the exact wording.
Missing-document detection
Live · Pro+
What this is. Tally notices expected recurring documents that haven't arrived. When a vendor that usually sends an invoice on the 5th is silent on the 10th, Tally surfaces a gentle nudge so the bookkeeper or business owner can follow up before late fees, service disruption, or a close-period gap. Tally only fires after it has seen enough of the vendor's pattern to be confident; irregular vendors stay quiet.
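A minimal sketch of this kind of check, assuming the only input is a vendor's past arrival dates. The function name and every threshold (minimum history of 4 documents, at most 3 days of jitter between gaps, a 3-day grace period) are hypothetical choices for illustration, not Tally's actual settings.

```python
from datetime import date
from statistics import mean, pstdev
from typing import List, Optional


def overdue_days(arrivals: List[date], today: date, min_history: int = 4,
                 max_jitter_days: float = 3.0, grace_days: int = 3) -> Optional[int]:
    """Return how many days late the next document is, or None if no nudge is warranted."""
    if len(arrivals) < min_history:
        return None  # not enough of a pattern yet
    gaps = [(later - earlier).days for earlier, later in zip(arrivals, arrivals[1:])]
    if pstdev(gaps) > max_jitter_days:
        return None  # irregular vendor: stay quiet
    expected_gap = round(mean(gaps))
    days_late = (today - arrivals[-1]).days - expected_gap
    return days_late if days_late >= grace_days else None


# Vendor that bills on the 5th of each month; today is the 10th with nothing received.
arrivals = [date(2024, 1, 5), date(2024, 2, 5), date(2024, 3, 5), date(2024, 4, 5)]
print(overdue_days(arrivals, today=date(2024, 5, 10)))
```

The two early returns are what keep the surface quiet for new or irregular vendors.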
What you see in Tally.
- An "Overdue vendors" view in the Memory tab's Open Loops sub-tab — vendors that usually appear by now and haven't.
- A short reason on each row: which vendor, how many days late, the typical interval Tally has learned, and the estimated amount.
- Four actions on each row — Mark received (you got it through another channel), Expected later (defer the next check), Not expected (vendor stopped billing), Dismiss (acknowledge for now; Tally re-surfaces a week later if it's still missing). Every action is logged for audit.
- The same per-client signal flows into close readiness so a firm partner can see which clients have follow-up work before close.
Why it helps you. Recurring obligations are the easiest things to miss, particularly the ones that show up monthly without much fanfare. Catching them before late fees, service termination, or compliance gaps is the SMB version of preventive maintenance. Because Tally fires only after it has seen enough of the pattern to be confident and stays quiet on irregular vendors, the surface stays high-signal.
Open loops — your single attention queue
Live · All tiers
What this is. Across your books at any moment, dozens of unresolved threads exist: documents pending review, anomalies awaiting decisions, missing 1099 contractor info, rejected documents needing follow-up, vendor questions waiting on a reply. Tally pulls every one of these into a single prioritized queue.
What you see in Tally.
- The "Open Loops" tab on the Memory page.
- Items sorted by severity (critical → warning → info).
- One-click navigation from any item to its source document.
Why it helps you. "What needs my attention this morning?" should not require visiting five different surfaces. Open Loops is your single starting point — every unresolved thread, prioritized, in one place.
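The severity ordering described above (critical, then warning, then info) amounts to a simple ranked sort. The item shape and the `prioritize` function below are hypothetical and only illustrate the ordering.

```python
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


def prioritize(open_loops: list) -> list:
    """Sort unresolved items so the most severe come first; ties keep their original order."""
    return sorted(open_loops, key=lambda item: SEVERITY_RANK[item["severity"]])


queue = prioritize([
    {"title": "Document pending review", "severity": "info"},
    {"title": "Anomaly awaiting decision", "severity": "critical"},
    {"title": "Missing 1099 contractor info", "severity": "warning"},
])
print([item["severity"] for item in queue])
```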
Quality safety net
Internal infrastructure (no user-facing surface)
What this is. Every change to Tally's AI — a new instruction version, an upgraded model, a tweaked decision threshold — is checked against real review outcomes before reaching production. Tally's quality is measured against documents you and other reviewers have already labeled as correct, not guessed. If accuracy regresses, the change is blocked.
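The gating idea can be sketched as a comparison of two versions against the same reviewer-labeled documents. Everything here is illustrative: the function names, the toy categorizers, and the three labeled documents are hypothetical, and the sketch assumes accuracy is the only metric being checked.

```python
def accuracy(predict, labeled_docs: list) -> float:
    """Share of reviewer-labeled documents where the output matches the label."""
    correct = sum(1 for text, label in labeled_docs if predict(text) == label)
    return correct / len(labeled_docs)


def allow_release(current, candidate, labeled_docs: list, tolerance: float = 0.0) -> bool:
    """Block the change if the candidate scores lower than the current version."""
    return accuracy(candidate, labeled_docs) >= accuracy(current, labeled_docs) - tolerance


# Hypothetical labeled set and two toy categorizers.
labeled = [("AWS invoice", "Software"), ("Office rent", "Rent"), ("Stripe fee", "Software")]


def current_version(text: str) -> str:
    return "Rent" if "rent" in text.lower() else "Software"


def candidate_version(text: str) -> str:
    return "Software"  # regressed: no longer recognizes rent


print(allow_release(current_version, candidate_version, labeled))  # False
```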
What you see in Tally.
- Nothing directly — this is internal safety infrastructure.
- The indirect benefit: Tally improvements ship without you ever seeing a "the AI got worse this week" regression.
Why it helps you. AI systems regress silently. Without measuring against real review outcomes, a small wording tweak could quietly degrade extraction accuracy by 20% and nobody would notice for a week. This is the safety net that lets Tally evolve without breaking — and it's why "Tally is getting better" is a measurable claim, not a marketing promise.
Real-time intelligence
Live · Pro+
What this is. When something important happens — a high-severity anomaly, a sudden vendor pattern shift, a close-readiness blocker — Tally reacts immediately rather than waiting for a daily batch. Notifications, alerts, and autonomy adjustments fire within seconds.
What you see in Tally.
- "Tally noticed something" notifications on the dashboard.
- Email, SMS, or Slack alerts (depending on your notification preferences).
- Bell badge updates and tongue messages on the persistent UI.
Why it helps you. Daily batches are too slow when your largest vendor charges 17× their normal amount at 9am — you want to know by 9:15am, not at the end of the day. Real-time intelligence is what makes Tally feel awake.
Relationships between things
Live · Pro+
What this is. Tally remembers connections between things: which category each vendor usually falls into, which accounting account they post to, which vendors are subsidiaries of which parent companies, which client belongs to which bookkeeper. After three Walmart invoices coded as Office Supplies, Tally stops asking and starts proposing.
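The "after three invoices, start proposing" behavior can be pictured as a per-vendor tally of reviewer choices. The names below are hypothetical, and treating three matching examples as the switch-over point is taken from the Walmart example rather than from any published threshold.

```python
from collections import Counter, defaultdict
from typing import Optional

# vendor -> how often each category was chosen by a reviewer
history = defaultdict(Counter)


def record_coding(vendor: str, category: str) -> None:
    history[vendor][category] += 1


def proposed_category(vendor: str, min_examples: int = 3) -> Optional[str]:
    """Propose the most common category once it has been chosen min_examples times."""
    if not history[vendor]:
        return None
    category, count = history[vendor].most_common(1)[0]
    return category if count >= min_examples else None


record_coding("Walmart", "Office Supplies")
record_coding("Walmart", "Office Supplies")
print(proposed_category("Walmart"))  # None: still asking
record_coding("Walmart", "Office Supplies")
print(proposed_category("Walmart"))  # Office Supplies: now proposing
```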
What you see in Tally.
- Pre-filled category on every new invoice ("Tally suggests: Office Supplies — based on 12 prior Walmart invoices").
- Vendor-merge prompts when Tally spots that "AWS" and "Amazon Web Services" are probably the same vendor.
- Auto-routed accounting entries that mirror your historical bookkeeping.
Why it helps you. Without learned relationships, every document is a fresh decision. With them, the routine 80% of bookkeeping flows through Tally with one click — your attention free for the 20% that's actually new.
Reviewer memory
Live · Business
What this is. Different reviewers approve differently. Andre might wave through 94% of Stripe charges; Maria might reject 24%. Tally tracks each reviewer individually — their typical approval rate, the categories they're stricter on, the vendors they consistently correct. When Andre approves a document, that's a different signal than when Maria approves the same document.
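A drift check of this kind can be sketched by comparing each reviewer's approval rate with the average of their peers. Andre's 94% and Maria's 76% (a 24% rejection rate) come from the example above; the third reviewer "Sam" and the 15-point gap threshold are hypothetical additions for illustration.

```python
from statistics import mean


def drifting_reviewers(approval_rates: dict, max_gap: float = 0.15) -> list:
    """Return reviewers whose approval rate differs from their peers' average by more than max_gap."""
    flagged = []
    for reviewer, rate in approval_rates.items():
        peers = [r for name, r in approval_rates.items() if name != reviewer]
        if peers and abs(rate - mean(peers)) > max_gap:
            flagged.append(reviewer)
    return flagged


rates = {"Andre": 0.94, "Maria": 0.76, "Sam": 0.90}
print(drifting_reviewers(rates))
```

Maria's rate sits about 16 points below the average of the other two, so she is the only one flagged under the assumed threshold.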
What you see in Tally.
- A "Reviewer Insights" panel on the Memory page (firm partner only).
- Per-reviewer approval rates and category preferences.
- Drift alerts when one reviewer is calibrated very differently from peers.
Why it helps you. Firm partners can spot calibration drift between team members before it becomes a quality problem. Tally's autonomy gets cleaner signal too — instead of averaging across reviewers, it knows whose approval it's listening to.
Seasonal intelligence
Live · Pro+
What this is. A weekly job detects three classes of timing patterns per vendor: monthly cadence (Stripe on the 5th), quarterly cycles (estimated taxes in March / June / September / December), and seasonal spikes (insurance renewal in January, Black Friday spend in November). The patterns power downstream proactive surfaces.
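As an illustration of cadence detection, the sketch below classifies a vendor by the median number of days between documents. The day ranges used for "monthly", "quarterly", and "annual" are assumptions, and spike detection on amounts is left out; this is not a description of the actual weekly job.

```python
from datetime import date
from statistics import median
from typing import List


def classify_cadence(arrivals: List[date]) -> str:
    """Label a vendor's timing pattern from the median gap between documents."""
    gaps = [(later - earlier).days for earlier, later in zip(arrivals, arrivals[1:])]
    if not gaps:
        return "unknown"
    typical = median(gaps)
    # Assumed windows around 30, 91, and 365 days.
    if 26 <= typical <= 35:
        return "monthly"
    if 80 <= typical <= 100:
        return "quarterly"
    if 350 <= typical <= 380:
        return "annual"
    return "irregular"


estimated_taxes = [date(2024, 3, 15), date(2024, 6, 15), date(2024, 9, 15), date(2024, 12, 15)]
print(classify_cadence(estimated_taxes))
```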
What you see in Tally.
- The cash flow forecast extends beyond month-by-month projection into quarterly and annual cycles.
- Anomaly narratives gain temporal context ("usually arrives by the 5th, today is the 10th").
- The foundation for missing-document detection.
Why it helps you. Without seasonal intelligence, every proactive feature is structurally limited to monthly cycles — exactly the cycles that are easy to track manually. Quarterly and annual cycles, where SMBs lose track most often (insurance renewals, estimated taxes), now have systematic awareness.
The everyday layer — how Tally feels like a colleague
These are the layers that make Tally feel like a colleague rather than a database.
The foundations — what Tally remembers about every document
These are the bedrock layers. Every other capability reads from these.
The full agentic loop — Watch · Learn · Speak · Act · Remember · Catch · Replay
Tally's intelligence layers are not a checklist of features. They are a single continuous loop that compounds with every human decision you make. Each layer described above is one phase in that loop:
- Tally watches. Every document, every transaction, every correction is captured as durable facts. (Documents as facts · Vendor patterns)
- Tally learns. Vendor baselines, reviewer behavior, relationships, semantic memory — Tally builds patterns specific to your business from the decisions you and your team make every day. (Trust and autonomy · Reviewer memory · Relationships · Meaning-based search)
- Tally speaks. Ask Tally questions in natural language and get answers grounded in the typed evidence — or an honest abstention when there's nothing to say yet. Conversations compound across turns. (Ask Tally · Conversation memory · How Tally speaks about your business)
- Tally acts. Per vendor, Tally moves through the 4-stage Autonomy Journey (Watching → Recommending → Acting with guardrails → Autonomous), earning trust through your real review decisions. The autonomy economics surface quantifies hours saved and labor saved every month. (Trust and autonomy · Autonomy economics)
- Tally remembers. Open Loops tracks every unresolved thread. The audit trail preserves every decision permanently. Conversation memory carries context across days. (Open loops · Audit trail · Conversation memory)
- Tally catches. Missing-doc detection, anomaly narratives, close readiness, seasonal patterns — the proactive surfaces fire when something is about to slip. (Missing-document detection · Anomaly narratives · Close readiness · Seasonal intelligence)
- Tally replays. When the same kind of issue is resolved by your team the same way repeatedly, Tally captures the resolution as a reusable playbook. Future similar issues surface the playbook as a one-click suggestion — Apply, Edit, or Reject. (How Tally learns from past resolutions)
Resolution Playbooks is the capstone of the agentic loop. Every other pillar feeds it: facts, baselines, trust, reviewer memory, autonomy. The playbook layer takes the firm's hardest-won institutional knowledge (the way your senior reviewer handles a tricky vendor, the category your firm always picks for a specific receipt type, the correction your team has applied twelve times) and replays it as a one-click suggestion on the next review, which the reviewer can apply, edit, or reject. That is the line between "AI tool" and "agentic co-pilot."
Each iteration of the loop deepens the moat. By month six of using Tally, your firm has a compounding asset competitors cannot replicate in a feature sprint: the operational intelligence of every approval, correction, anomaly resolution, and reviewer decision — replayable on demand, scoped to your tenant, never shared across firms.
The proactive layer — when Tally speaks up
These are the surfaces where Tally stops waiting to be asked and starts speaking up.
The short version
Tally treats two kinds of signal differently:
- Imported history helps Tally recognize patterns. When you import a CSV, a QuickBooks export, a Xero export, or a bank feed, Tally now knows this vendor exists, what their typical amount looks like, what category they usually map to. Imports unblock vendor recognition. They do not earn the vendor a "trusted" auto-approve gate.
- Live human reviews teach Tally what it is allowed to trust. When you click Approve, Reject, or Correct inside Tally, that's the trust signal. Every approve, reject, correction, escalation, and resolution teaches Tally how this business actually works. Tally watches first, compares itself against your decisions, then earns more autonomy only after repeated agreement.
Imports give Tally context. Live reviews give Tally authority. Tally does not act just because it saw old data — it earns trust from live review history.
The trade-off, in plain language
When you import a 200-row history (CSV from QuickBooks, an export from Xero, a bank feed), it's reasonable to ask: "Should Tally treat all of that as if I'd reviewed and approved it inside the product?"
The answer most peer platforms have settled on, and the answer Tally uses, is no — for two reasons:
- Imports were not actively reviewed in the product. They're a data dump. Treating them as live approvals would mean a single bad export could silently auto-approve a vendor at the wrong category forever.
- The trust signal is the act of reviewing. When you click Approve or Reject, Tally learns your preferences — how strict you are on certain vendors, which categories you tend to correct, which exceptions you let pass. Imports can't carry that signal.
So Tally splits the difference: imports give the AI context (vendor recognition, baseline patterns, typical amounts); live reviews give the AI authority (the right to act on its own).
Trust and autonomy
Live · Pro+
What this is. Tally tracks how confident it is in each vendor with a 0–100 health score that blends approval history, document maturity, anomaly rate, and reviewer alignment. As trust grows, Tally's autonomy progresses through four levels: Manual (Tally watches, you decide), Suggest (Tally proposes, you confirm), Auto-Review (Tally acts, flags only edge cases), Fully Auto (hands-off for that vendor). Trust progresses per vendor: Stripe might be Fully Auto while a new vendor is still Manual.
The Vendor Autonomy Journey makes that progression visible inside every document. A small four-stage stepper in the document's TALLY'S INSIGHTS section shows exactly where Tally stands with that vendor:
- Watching — Tally is establishing a baseline; you decide every document.
- Recommending — Tally proposes the right answer; you confirm with one click.
- Acting with guardrails — Tally is eligible to auto-process clean documents; final checks happen.
- Autonomous — auto-review active for clean documents; exceptions still come back to you.
Side states are separate from stages. If a vendor is paused, disabled, or the pack is off, Tally narrates that in its tongue at the bottom of the section in plain language ("Autonomy paused for this vendor — manual review required") rather than regressing the stage. Per-document holds (suspected duplicate, anomaly, amount over cap) keep the vendor at its earned stage but route this specific document back for manual review.
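The separation between stages, side states, and per-document holds can be sketched as a routing function that reads the vendor's stage but never modifies it. The `Stage` enum and `route_document` function are hypothetical names, and the returned strings are illustrative labels rather than Tally's actual microcopy.

```python
from enum import IntEnum
from typing import Tuple


class Stage(IntEnum):
    WATCHING = 1
    RECOMMENDING = 2
    ACTING_WITH_GUARDRAILS = 3
    AUTONOMOUS = 4


def route_document(stage: Stage, vendor_paused: bool = False,
                   holds: Tuple[str, ...] = ()) -> str:
    """Decide how one document is handled without ever changing the vendor's stage."""
    if vendor_paused:
        return "manual review (autonomy paused for this vendor)"
    if holds:
        return "manual review (hold: " + ", ".join(holds) + ")"
    if stage >= Stage.ACTING_WITH_GUARDRAILS:
        return "auto-process"
    if stage == Stage.RECOMMENDING:
        return "one-click confirm"
    return "manual review"


stage = Stage.AUTONOMOUS
print(route_document(stage, holds=("suspected duplicate",)))
print(stage.name)  # still AUTONOMOUS: the hold affected only this document
```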
What you see in Tally.
- The 4-stage Vendor Autonomy Journey indicator on every document detail sheet (SMB owners and bookkeepers both see it on Pro+ tiers).
- A vendor health bar on the Memory page (color-coded).
- "Tally would auto-approve this" suggestion box on the document detail sheet.
- An autonomy progression chart on the bookkeeper drill-down.
- An "Autonomy details" expandable inside the journey card showing the per-document decision reasoning when Tally has acted (level, factors, confidence, decision id).
Why it helps you. Auto-approval without earned trust is dangerous; manual review forever is exhausting. The trust ladder gives Tally a graceful way to take over routine work while still asking for help on the unusual cases. The journey indicator turns that ladder into something you can read at a glance during a review — you know which vendors need your attention, which Tally is recommending on, and which are flowing through autonomously, before you finish reading the document title. See the dedicated Vendor Autonomy Journey help article for stage-by-stage examples and what each microcopy means.
Vendor patterns
Live · All tiers
What this is. Tally remembers what's normal for every vendor you have. For Stripe, that might be "$12–$15/month, paid by the 5th, always categorized as Software." For your office landlord, it might be "$3,500/month, paid by the 1st, always Rent." Each new document quietly updates Tally's running picture of "normal" without re-reading your full history. The picture grows sharper with every document you approve.
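Updating a picture of "normal" without re-reading history is the kind of thing an online statistic does well. The sketch below uses Welford's online algorithm for a running mean and standard deviation; this is a standard technique chosen here for illustration, not a statement of how Tally computes its baselines, and the 3-standard-deviation limit is an assumption.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class RunningBaseline:
    """Welford's online algorithm: mean and spread updated one document at a time."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def add(self, amount: float) -> None:
        self.count += 1
        delta = amount - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (amount - self.mean)

    @property
    def stdev(self) -> float:
        return sqrt(self.m2 / (self.count - 1)) if self.count > 1 else 0.0

    def is_far_outside_normal(self, amount: float, z_limit: float = 3.0) -> bool:
        if self.count < 2 or self.stdev == 0:
            return False  # not enough history to judge
        return abs(amount - self.mean) > z_limit * self.stdev


baseline = RunningBaseline()
for charge in [12, 13, 14, 15]:
    baseline.add(charge)
print(round(baseline.mean, 2), baseline.is_far_outside_normal(250))
```

Each new document costs one small update, regardless of how many documents came before it.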
What you see in Tally.
- Vendor cards on the Memory page: "$2,400 average · 47 documents · 90% approved."
- Anomaly flags when a document is far outside normal.
- "47 from imports" provenance line distinguishing seeded history from live reviews.
Why it helps you. Without baselines, every document is judged in isolation. With baselines, Tally compares each new document against the vendor's track record — the foundation of every smart suggestion downstream.
What's live today
The two-track learning model is fully live. What this means for you:
- Imported vendors are recognized immediately. When you import history, Tally immediately treats those vendors as known, so they're not flagged as "new." The "Needs Attention" inflation that used to happen on a freshly imported tenant is gone.
- Tally surfaces import history as a first-class signal. Vendor briefings and the document detail sheet show "N from imports" alongside live-review counts, so you can see at a glance how Tally knows about a vendor.
- Tally only auto-approves after enough live reviews. A high approval rate from imports alone is not enough. Tally tells you exactly how many more live reviews are needed before the next level of trust unlocks.
- Every ingestion path treats imports the same way. CSV imports, QuickBooks/Xero exports, bank-feed syncs, and migration imports all populate the recognition track with the same shape.
- Historical imports were retroactively brought up to date. Documents you imported before the two-track model shipped now show the same "N from imports" provenance as new imports. No re-import required.
- Bulk imports work both backward and forward. When you import a large history, Tally doesn't just file it away. Backward: it recognizes every vendor in that history, learns the category each usually maps to, and brings the whole corpus into what Tally can reason over — so the value isn't limited to documents you add from today on. Forward: the same intelligence checks then run on every new document, no matter how it arrives — uploaded, emailed, pulled from your bank feed, or imported. As Tally builds up each vendor's typical-amount baseline from the documents it processes, it flags unusually large amounts with the reasoning, catches duplicates, and lets trusted vendors move faster. Nothing gets a different level of scrutiny just because of where it came from.
What's coming next:
- International compliance standards. As more firms expand cross-border, Tally is being designed to align with international audit and data-handling frameworks (SOC 2, ISO 27001, GDPR-aligned controls, region-specific accounting/tax handling). The architectural pieces — append-only audit trail, customer-facing deletion lifecycle, audience-boundary serialization — are already in place; certification and region-specific wording are the next layer.
- Import-warmth discount. Vendors with strong, consistent imported history will earn a faster path to autonomy once we have operational evidence the default is too conservative for high-fidelity import sources.
- Historical trust decay. Imported priors will fade with age, so a 5-year-old export doesn't dominate forever.
- Deeper agentic AI capabilities. Tally's typed-evidence pipeline is the foundation for richer agentic behaviors — multi-step plans across signals (e.g., "reconcile this month, then flag what's still missing for close"), proactive playbook suggestion across review sessions, and tighter feedback loops between live human decisions and the next document's autonomy decision. Each capability lands behind the same engraved abstention contract: Tally never speaks beyond its evidence.
- Continued model upgrades. The capability-layer LLM registry means model improvements (latency, accuracy, cost) ship through configuration, not code. As frontier models evolve, Tally's reasoning improves while the learned business memory stays intact.
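To make the historical trust decay item above concrete, here is one way an age-based fade could work. The feature is not live, and this page does not say which formula Tally will use. Exponential decay with a one-year half-life is purely an assumption for illustration, as are the function and parameter names.

```python
def import_prior_weight(age_days: float, half_life_days: float = 365.0) -> float:
    """Weight in (0, 1] for an imported document of the given age.

    Hypothetical formula: the weight halves every `half_life_days`.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return 0.5 ** (age_days / half_life_days)


if __name__ == "__main__":
    for years in (0, 1, 5):
        print(years, import_prior_weight(years * 365))
```

Under these assumed numbers, a five-year-old export carries about 3% of the weight of a document imported today, so it still provides context without dominating.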
What Tally promises and where the boundaries are
- Tally helps prepare 1099s; Tally does NOT file taxes.
- Tally complements QuickBooks; Tally does NOT replace a general ledger.
- Tally assists bookkeepers; Tally does NOT replace a qualified bookkeeper, accountant, or tax advisor.
- QuickBooks Online, Plaid, and Inbox are live integrations. Xero, NetSuite, Sage, and IIF are not live integrations; data from these sources comes in through CSV import and is used as historical context only.
- Tally does NOT claim country-specific compliance certifications without verified documentation. Some connected features may vary by region, setup, plan, bank-feed availability, accounting integration, and local tax/accounting requirements.
- Tally surfaces an explainable review queue; the human stays responsible for final judgment on tax / legal / accounting decisions, unusual transactions, novel vendors, unclear source documents, and anything outside verified integrations.
What this looks like in practice
| Scenario | Two-track behavior |
|---|---|
| Day 1 — fresh CSV import of 200 documents across 11 vendors | Only documents with real signals get flagged (duplicates, anomalies, missing fields). The "new vendor" flag is suppressed because the import seeded recognition for every vendor. |
| Live review of 5 documents from Vendor X | Tally's authority track for that vendor accrues toward auto-approve. After enough consistent live approvals, Tally graduates to auto-approving future documents from that vendor. |
| Import a vendor that's never been seen anywhere | Flagged as new (correct). Imports recognize; they don't fabricate. |
| Import a vendor that's already been auto-approved before | Not flagged — Tally already knows this one. |
| Import a 5-year-old QBO export with stale category coding | Coding seeds recognition; Tally treats it as background context, and your recent live reviews dominate the trust signal. |
The honest pitch: Tally watches you for the first few reviews before it acts on its own. Imports give Tally context; live reviews give Tally authority.
What this means for your business
The two-track design is the reason Tally is a defensible, trustworthy product:
- For SMB owners: "Tally recognizes your vendors immediately when you import your QuickBooks or Xero history. It doesn't auto-approve them yet; it watches you for the first few reviews to learn YOUR preferences. After that, similar invoices auto-process and you get back to running your business."
- For bookkeepers: "Tally won't take action on a client's books until your firm has reviewed enough documents that we know your standards. Imports give us the context (vendors, typical amounts, category patterns); your reviews give us the authority. You're never out of the loop until you choose to be."
- For evaluators: Tally treats imports as context and live reviews as authority. Autonomy is calibrated, not granted. Every level of trust is earned vendor by vendor, with the human still in the loop on anything Tally hasn't seen enough of.
If Tally instead trusted imports as if they were live approvals, the obvious objection — "what if my import has bad data?" — would have no good answer. The two-track design has a defensible answer at every layer.