engineering report · 100% internal · engineering team

The impact of AI on our code, measured.

We compared the team before and after adopting Claude — deliveries, bugs and code volume, normalized per day, across three adoption phases.

P1 → P3 · 3 adoption phases
98 days · Feb 25 → Jun 2, 2026
5 devs · before and after
eng-metrics — bash
$ eng stats --compare P1..P3 --per-day
bug:task ratio     1.05 → 0.31     −70%
bugs / day         2.07 → 0.84     −59%
critical / day     0.57 → 0.20     −65%
deliveries / day   1.97 → 2.67     +36%
loc / day          996 → 1,958     +97%
# 4/5 devs with positive throughput
// 01 · context

From zero AI to a whole team on Claude

Adoption happened across three phases of different lengths. That is why everything is measured per day — the axis below is proportional to each period's days.

P1 · pre-claude · No AI
02/25 → 03/26 · 30 days

Baseline of the team working without any AI tooling.

P2 · solo-claude · Build
03/27 → 04/14 · 19 days

One dev using Claude alone to build the internal tool.

P3 · team-claude · Full team
04/15 → 06/02 · 49 days

All 5 developers operating with Claude in their workflow.
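
The per-day normalization depends only on each phase's length. A minimal sketch, assuming the phase boundaries above are inclusive calendar dates in 2026; `PHASES`, `phase_days` and `per_day` are illustrative names, not part of the team's tooling:

```python
from datetime import date

# Phase boundaries as listed in the report, assumed inclusive on both ends.
PHASES = {
    "P1": (date(2026, 2, 25), date(2026, 3, 26)),
    "P2": (date(2026, 3, 27), date(2026, 4, 14)),
    "P3": (date(2026, 4, 15), date(2026, 6, 2)),
}

def phase_days(start: date, end: date) -> int:
    """Number of calendar days in a phase, counting both endpoints."""
    return (end - start).days + 1

def per_day(count: float, phase: str) -> float:
    """Normalize a raw count (tasks, bugs, lines) by the phase length."""
    start, end = PHASES[phase]
    return count / phase_days(start, end)

days = {name: phase_days(*bounds) for name, bounds in PHASES.items()}
print(days)                # {'P1': 30, 'P2': 19, 'P3': 49}
print(sum(days.values()))  # 98
```

Dividing by the phase length is what makes a 19-day phase comparable with a 49-day one; raw totals would favor P3 simply because it is the longest.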

// 02 · the number that matters
−70% in the bug : task ratio

The most robust signal of effective work: it combines productivity and quality in a single number. The team that used to create about 1 bug per delivered task (1.05) now creates 0.31, while delivering 36% more tasks per day.

P1 · no AI: 1.05
P3 · with Claude: 0.31
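
As a rough check, the ratio can be recomputed from the published per-day figures. The sketch below assumes the ratio is defined as bugs created divided by Jira tasks done, which the report implies but does not state. The function name is illustrative.

```python
# Per-day values for P1 and P3 as published in the report (already rounded).
bugs_per_day = {"P1": 2.07, "P3": 0.84}
done_per_day = {"P1": 1.97, "P3": 2.67}

def bug_task_ratio(phase: str) -> float:
    """Bugs created per delivered task; the day counts cancel out."""
    return bugs_per_day[phase] / done_per_day[phase]

p1 = round(bug_task_ratio("P1"), 2)
p3 = round(bug_task_ratio("P3"), 2)
change = (p3 - p1) / p1  # relative change from P1 to P3

print(p1, p3, f"{change:+.0%}")  # 1.05 0.31 -70%
```

Because the inputs are already rounded, recomputing from them will not always match a published ratio to the last digit; the underlying raw counts would be the better source.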
// 03 · team metrics

Six numbers, one direction

Per-day comparison across the three phases. P2 is the build phase — context, not feature delivery.

Metric                   P1       P2       P3       P1 → P3
Jira Done / day          1.97     1.10     2.67     +36%
Bugs created / day       2.07     1.87     0.84     −59%
High / Critical / day    0.57     0.74     0.20     −65%
Bug : Task ratio ●       1.05     1.67     0.31     −70%
Lines of code / day      996      2,555    1,958    +97%
Lines per PR             393      1,184    959      +144%

● most robust signal

// note: in P2 (build) the largest code volume of all phases was written — without opening Jira tasks. Hence few tasks/day (1.10) and the inflated Bug:Task ratio (1.67): P2's volume is infrastructure, not feature delivery.
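
The P1 → P3 percentages follow from the P1 and P3 values with the usual relative-change formula. A short sketch, assuming that formula is what the dashboard uses; `METRICS` and `pct_change` are illustrative names:

```python
# (P1, P3) values taken from the team metrics in this section.
METRICS = {
    "Jira Done / day": (1.97, 2.67),
    "Bugs created / day": (2.07, 0.84),
    "High / Critical / day": (0.57, 0.20),
    "Bug : Task ratio": (1.05, 0.31),
    "Lines of code / day": (996, 1958),
    "Lines per PR": (393, 959),
}

def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100

changes = {name: round(pct_change(p1, p3)) for name, (p1, p3) in METRICS.items()}
for name, change in changes.items():
    print(f"{name:<24}{change:+d}%")
```

P2 is left out on purpose: as the note above says, it was a build phase, so comparing it against P1 would mix infrastructure work with feature delivery.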

// 04 · quality

Fewer bugs — and less severe ones

The drop in volume comes with an even larger drop in critical bugs: the team not only introduces fewer defects, it introduces less dangerous ones.

Bugs created / day (total volume of bugs introduced per day)
P1 · no AI: 2.07 → P3 · Claude: 0.84 · −59% in bug volume

High / Critical / day (high and critical severity bugs only)
P1 · no AI: 0.57 → P3 · Claude: 0.20 · −65% in severe bugs
// 05 · throughput

More delivery, much more code

With fewer bugs, the team also produced more: daily deliveries rose by a third and code volume nearly doubled, in PRs 2.4× larger.

Jira Done / day (tasks completed per day)
P1: 1.97 → P3: 2.67 · +36% in deliveries

Lines of code / day (daily volume of code produced)
P1: 996 → P3: 1,958 · +97% in code

Lines per PR (average size of each pull request)
P1: 393 → P3: 959 · +144% per PR
// 06 · individual performance

The gain is distributed across the team

Jira Done per day, per developer. The jump shows up in almost everyone — it is not down to a single person.

Developer   P1 · no AI   P3 · with Claude   Change   Note
Dev 1       0.83         1.43               +72%
Dev 2       0.60         0.43               −28%     partial* · adj. +3%
Dev 3       0.33         0.31               −6%      built the tool**
Dev 4       0.13         0.27               +108%
Dev 5       0.07         0.24               +243%

// * Dev 2 was present for 34 of the 49 days of P3 (left on 05/18).
// ** Dev 3's focus in P2 was infrastructure: code without associated tasks.
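
The "adj. +3%" figure for Dev 2 is consistent with rescaling the P3 rate to the days actually worked. The sketch below assumes that is how the adjustment was made, which the report does not spell out. `presence_adjusted` is a hypothetical helper.

```python
P3_DAYS = 49            # length of P3
DEV2_DAYS_PRESENT = 34  # Dev 2 left on 05/18

def presence_adjusted(rate: float, phase_days: int, days_present: int) -> float:
    """Rescale a rate computed over the whole phase to the days actually worked."""
    return rate * phase_days / days_present

dev2_p1 = 0.60  # Jira Done / day in P1
dev2_p3 = 0.43  # Jira Done / day in P3, over all 49 days

adjusted = presence_adjusted(dev2_p3, P3_DAYS, DEV2_DAYS_PRESENT)
change = (adjusted - dev2_p1) / dev2_p1

print(f"{adjusted:.2f} {change:+.0%}")  # 0.62 +3%
```

The unadjusted −28% mostly reflects 15 days of absence rather than a drop in output while present.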

// 07 · origin of bugs in P3

The team spends energy cleaning the present

Of the 42 bugs closed in P3, the largest share (45.2%) was born in the phase itself, a sign of active flow rather than accumulated debt.

45.2% · Born in P3 · Team triaging incoming work fast: healthy, active flow.
38.1% · P2 debt · Inherited from the tool-building phase.
14.3% · Old bugs · Predating the project (pre-baseline).
2.4% · Came from P1 · Minimal residue from the no-AI phase.

Intra-phase resolution: 95% in P1 · 94% in P2 · 46% in P3 (phase still ongoing)
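
The origin shares can be reproduced from bug counts. The counts below are not stated in the report; they are back-computed from the published percentages and the total of 42, so treat them as an assumption.

```python
# Counts inferred from the published shares of the 42 bugs closed in P3
# (assumption: not given explicitly in the report).
ORIGIN_COUNTS = {
    "Born in P3": 19,
    "P2 debt": 16,
    "Old bugs": 6,
    "Came from P1": 1,
}

total = sum(ORIGIN_COUNTS.values())
shares = {origin: round(100 * n / total, 1) for origin, n in ORIGIN_COUNTS.items()}

print(total)   # 42
print(shares)  # {'Born in P3': 45.2, 'P2 debt': 38.1, 'Old bugs': 14.3, 'Came from P1': 2.4}
```

On these counts, 23 of the 42 closed bugs (54.8%) originated before P3, so bugs born in P3 are the largest single category but not a majority.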
// 08 · conclusion

AI does not replace the team. It multiplies a team that knows the product.

The biggest gain did not come from writing more code, but from a team with product context and history driving the tool. AI amplifies that knowledge — domain, legacy, prior decisions. Without that context, no tool delivers −70% bugs per task with +36% deliveries. It is the team that knows what it is building that makes AI effective.

+36% tasks delivered per day, with code volume nearly doubled.
−70% bugs per task: productivity and quality rising together.
4 / 5 developers with positive throughput growth in P3 (counting Dev 2 on the presence-adjusted +3%).
// methodological caveats
  • Phases of different lengths (30d / 19d / 49d) — all metrics normalized per day.
  • Analysis limited to the internal engineering team, over the AI repository built by the team itself — no external consulting.
  • P2 was an infrastructure-building phase, not feature delivery.
  • The growth coincides with the natural maturing of the team and the project, so not all of the gain can be attributed to AI alone.
  • Dev 2 left the project on 05/18 — partial presence in P3 (34 of 49 days).