parsr.

Language support

Documents in any major language

Vision LLMs handle 100+ languages out of the box. Here's where we've validated and where we're still calibrating. Tell us yours; we'll confirm within 48 hours.

How we tier

Three honesty tiers

Tier 1 is production-validated against our own fixtures, with 95%+ measured accuracy. Tier 2 is beta: we've confirmed 90%+ accuracy on representative samples, but edge cases are still possible. Tier 3 is experimental: accuracy reflects the current capabilities of the underlying foundation models (Gemini 2.0 Flash + Claude Sonnet 4), but we haven't run our own validation yet, so use the generic /v1/extract endpoint and validate the output yourself. We move languages between tiers as customers request validation.
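As a rough illustration of the tiering rule, the sketch below maps a language name to its tier and to the endpoint this page recommends for it. The language sets are transcribed from the lists on this page; the helper names (tier_for, recommended_endpoint) are hypothetical and not part of any parsr SDK. How to treat Tier 2 is an assumption: this page only states the endpoint guidance for Tier 1 and Tier 3, so the sketch keeps the typed extractor for Tier 2 and flags the output for your own spot checks.

```python
# Language lists transcribed from this page. The lookup itself is a
# hypothetical helper for illustration, not a parsr data structure.
TIER_1 = {"english", "french", "german", "dutch", "italian", "spanish", "portuguese"}
TIER_2 = {
    "polish", "czech", "hungarian", "romanian", "greek",
    "japanese", "korean", "simplified chinese", "traditional chinese",
    "turkish", "arabic",
    "swedish", "norwegian", "danish", "finnish",
}

TYPED_EXAMPLE = "/v1/parse/bank-statement"  # one of the typed extractors named on this page
GENERIC = "/v1/extract"                     # generic endpoint named on this page


def tier_for(language):
    """Return 1, 2, or 3; anything not listed in Tier 1 or 2 is treated as Tier 3."""
    name = language.strip().lower()
    if name in TIER_1:
        return 1
    if name in TIER_2:
        return 2
    return 3


def recommended_endpoint(language, typed_endpoint=TYPED_EXAMPLE):
    """Return (endpoint, validate_yourself) for a language."""
    tier = tier_for(language)
    if tier == 1:
        return typed_endpoint, False
    if tier == 2:
        # Assumption: Tier 2 keeps the typed extractor, but edge cases are
        # possible, so the caller should spot-check the output.
        return typed_endpoint, True
    # Tier 3: generic endpoint, and validation is on the caller.
    return GENERIC, True


if __name__ == "__main__":
    for lang in ("German", "Korean", "Thai"):
        print(lang, tier_for(lang), recommended_endpoint(lang))
```

Because tiers change as customers request validation, a hard-coded table like this would need to be refreshed from the current version of this page.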

Tier 1 · Production validated · 95%+ accuracy

Validated against real fixtures

These languages have a fixture library, measured accuracy, and regression tests. Use the typed extractors — /v1/parse/bank-statement, /v1/parse/payslip, and friends — with confidence.

  • English: US · UK · AU
  • French: FR · BE · CH · CA
  • German: DE · AT · CH
  • Dutch: NL · BE
  • Italian: IT · CH
  • Spanish: ES · MX · AR
  • Portuguese: PT · BR

Tier 2 · Beta · 90%+ accuracy

Confirmed on samples, edge cases possible

We've tested representative documents and confirmed 90%+ field-extraction accuracy. The fixture library is smaller than Tier 1; if you ship one of these in production, send us a sample and we'll move it to Tier 1.

Central + Eastern European

  • Polish
  • Czech
  • Hungarian
  • Romanian
  • Greek

CJK

  • Japanese
  • Korean
  • Simplified Chinese
  • Traditional Chinese

Other

  • Turkish
  • Arabic

Nordic

  • Swedish
  • Norwegian
  • Danish
  • Finnish

Tier 3 · Experimental

Use generic /v1/extract, validate yourself

Foundation models read these languages — Gemini 2.0 Flash and Claude Sonnet 4 both have demonstrated capability. We just haven't run our own validation, so we're honest about not claiming a number. Use the generic /v1/extract endpoint with a JSON schema and validate accuracy on your side. Send us a sample to move it to Tier 2.

Latin-script

  • All other Latin-script languages

South + Southeast Asian

  • Hindi
  • Bengali
  • Vietnamese
  • Thai

Middle Eastern

  • Hebrew
  • Persian

Cyrillic

  • Russian
  • Ukrainian
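For Tier 3 languages, "validate accuracy on your side" can be as simple as comparing extracted fields against a handful of hand-labelled documents. The sketch below is one possible way to measure field-level accuracy offline. It assumes you have already collected the /v1/extract output as Python dicts; the request format is not described on this page, so no API call is shown. The field names and sample values are made up, and the normalization rule (trim, collapse whitespace, ignore case) is an assumption you may want to tighten for amounts and dates.

```python
def normalize(value):
    """Compare values as whitespace-collapsed, case-insensitive strings."""
    return " ".join(str(value).split()).casefold()


def field_accuracy(expected, extracted):
    """Fraction of labelled fields whose extracted value matches the label.

    expected  -- list of dicts with hand-labelled values, one per document
    extracted -- list of dicts with extraction output, in the same order
    """
    if len(expected) != len(extracted):
        raise ValueError("expected and extracted must have the same length")
    total = 0
    correct = 0
    for truth, got in zip(expected, extracted):
        for field, want in truth.items():
            total += 1
            # A missing field counts as a miss.
            if field in got and normalize(got[field]) == normalize(want):
                correct += 1
    if total == 0:
        raise ValueError("no labelled fields to score")
    return correct / total


if __name__ == "__main__":
    # Made-up sample data: two documents, two labelled fields each.
    labels = [
        {"employer": "Acme Ltd", "net_pay": "1200.50"},
        {"employer": "Globex", "net_pay": "980.00"},
    ]
    output = [
        {"employer": "acme ltd ", "net_pay": "1200.50"},
        {"employer": "Globex", "net_pay": "908.00"},  # transposed digits
    ]
    print(f"field accuracy: {field_accuracy(labels, output):.2%}")  # 75.00%
```

A score computed this way is only as representative as the sample set, so it helps to include the worst scans and layouts you expect to see in production.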

Mixed-language documents

Yes, supported.

Belgian payslips with French + Dutch + English headers parse cleanly. The vision LLM doesn't “switch language” — it just reads the document. The same applies to Swiss invoices (DE + FR + IT), Canadian statements (EN + FR), and any other multilingual layout you're likely to encounter in EU finance.

Special handling

Scripts and reading order

RTL · Arabic · Hebrew · Persian

Tested. Reading order is right-to-left for native scripts; embedded Latin characters (e.g. amounts, IBANs, dates) read left-to-right. Field extraction preserves the directionality the document was written in.
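To make the mixed-direction behaviour concrete, the sketch below splits a string into right-to-left and left-to-right runs using the bidirectional class that Python's standard unicodedata module reports for each character. It is a deliberate simplification for illustration, not the full Unicode Bidirectional Algorithm and not a description of how parsr works internally. The helper name directional_runs is hypothetical, and the sample string combines two Hebrew words with a commonly used example IBAN and a made-up amount.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}        # Hebrew letters, Arabic letters
LTR_CLASSES = {"L", "EN", "AN"}  # Latin letters, European and Arabic-Indic digits


def directional_runs(text):
    """Split text into (direction, substring) runs.

    Neutral characters (spaces, punctuation) are attached to the run that
    precedes them; this is a simplification of the real bidi rules.
    """
    runs = []  # list of [direction, characters]
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            direction = "rtl"
        elif bidi in LTR_CLASSES:
            direction = "ltr"
        else:
            direction = runs[-1][0] if runs else "neutral"
        if runs and runs[-1][0] == direction:
            runs[-1][1] += ch
        else:
            runs.append([direction, ch])
    return [(d, s.strip()) for d, s in runs if s.strip()]


if __name__ == "__main__":
    # Hebrew words written as escapes so the source reads the same in any editor.
    account = "\u05d7\u05e9\u05d1\u05d5\u05df"  # "account"
    amount = "\u05e1\u05db\u05d5\u05dd"         # "amount"
    sample = f"{account} DE89370400440532013000 {amount} 1200"
    for direction, chunk in directional_runs(sample):
        print(direction, chunk)
```

On the sample string this yields alternating rtl and ltr runs, with the IBAN and the amount each forming their own left-to-right run.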

CJK · Japanese · Korean · Chinese

Tested. Accuracy depends on document quality: printed CJK is very reliable, but accuracy on handwritten content drops by 10–20%. Vertical layouts (rare in finance) are read in their native order.

Non-Latin scripts

Tested. Accuracy reflects the current capability of the foundation models (Gemini 2.0 Flash + Claude Sonnet 4). When the models improve, we get the improvement automatically, with no re-training cycle on our end.

Improvement model

Driven by customer requests

We add language validation when a customer requests it. Email support@tryparsr.dev with the language and a sample document; we'll test, confirm accuracy, and move it to Tier 1 within 48 hours if the fixtures pass. Same SLA we apply to bank-statement format requests — see how parsr ships.

Try parsr on a language we don't list yet.

Send us a sample document — anonymized is fine. We'll come back within 48 hours with measured accuracy and, if it passes, promotion to Tier 1.