Now booking — agent builds & Ascend deployments

Ship agents. Iterate forever.

We build production-grade applications and AI agents for teams who can't wait six months to find out if their idea works. Specialists in deploying and fine-tuning large language models on the Ascend (昇腾) AI stack.

agent · trace · 14:32:08.221
1 · User input
Find unpaid invoices > 30d
2 · Tool: SQL
SELECT * FROM invoices ...
3 · Tool: Email
Draft 14 reminders
4 · Output
Awaiting approval ✓
latency · 1.4s tokens · 3,142 cost · $0.018 eval · passing 47/48
Recent products include
Series-B fintech · State-owned utility · Logistics platform · Edtech (350k MAU) · Manufacturing OEM · Retail chain
01 · why us

A small senior team, optimized for speed.

Four advantages we trade everything else for. Every project gets all four — they're not tiers.

2 wk
first
01

Fast delivery

First working build in 2 weeks. Production in 6 to 10. We bring a senior team and a starter kit; you bring the problem.

weekly
shipping
02

Continuous iteration

Weekly drops, real users in the loop, telemetry from day one. The product gets sharper every sprint instead of every release.

70%
avg
03

Lean deployment

Right-sized infra. Domestic chips when you need them, commodity GPUs when you don't. We size for your traffic, not for a slide deck.

24/7
response
04

Long-term support

After launch we stay on call: monitoring, model updates, on-prem upgrades, fine-tuning runs. One team, one number to call.

02 · what we build

Three lines of work.
One team behind all of them.

AI · core focus

Agent applications

Tool-using LLM agents that read your data, talk to your systems, and do real work. Built with evals, guardrails, and a human-in-the-loop UI from day one.

RAG, tool-use, multi-agent orchestration Eval suites + telemetry built-in Open-source models or hosted APIs
Application engineering

Custom apps

Web and mobile, internal tools, customer portals. TypeScript, React Native, Go, Python — picked to fit, not to flex.

Discovery → design → ship Native + web from one team Owned by you, hosted by you
Infra · Ascend stack

Ascend LLM deployment

Deploy and fine-tune open-source LLMs on the Ascend (昇腾) ecosystem. We've hit the gotchas already so your team doesn't.

MindSpore + CANN + MindIE pipelines LoRA / QLoRA / full fine-tunes On-prem inference benchmarks included
Ongoing

Tech support & advisory

A retainer team for after the build. Architecture reviews, model upgrades, performance tuning, on-call.

Quarterly model refreshes Capacity & cost reviews Direct Slack / WeChat line
03 · process

Six weeks. Working software.

01
Week 0

Scope

Half-day workshop. We leave with a one-page spec, a budget range, and a yes/no on technical risk.

02
Week 1

Prototype

A live prototype your stakeholders can click. Real auth, fake data. Used to settle scope before we burn build time.

03
Week 2 – 6

Build

Two-week sprints. Demo every Friday. You see velocity in working software, not Gantt charts.

04
Week 6+

Ship & iterate

Production launch, then a permanent retainer team measured on outcomes — not tickets closed.

04 · faq

The questions that come up first.

Anything else? Just ask.

Do you work fixed-price or T&M? +
Both. Discovery is fixed-price. Builds are usually a fixed scope per sprint with a not-to-exceed cap. Long-term support is a monthly retainer.
Why Ascend (昇腾) specifically? +
Domestic chip availability matters for a lot of our clients — finance, gov-adjacent, regulated SaaS. We have production deployments on Ascend 910B/310P and know where the rough edges are: operator coverage, MindIE conversion, distributed training quirks.
Can you fine-tune on our private data? +
Yes. LoRA / QLoRA on open-source bases (Qwen, DeepSeek, Llama, GLM) is our default. Full SFT and continued pre-training when the data warrants it. Everything stays inside your VPC or on your hardware.
How small is too small? +
Our smallest engagement is a 2-week prototype sprint. Below that, you probably don't need us — you need a weekend.
Do you do offshore / nearshore? +
We're a single team in one timezone. We sync to yours during overlap hours and hand off cleanly the rest of the day.
Let's build

Tell us what you're
trying to ship.

30-min discovery call. We come back with a one-page spec and a budget range — within 48 hours.

We reply within one working day.
Also reach us via
© crabagent 2026 Beijing · Remote-first