Question 1

How do you make sure a generated test reflects what the system actually does?

Accepted Answer

Grounding, not generation, is the hard part. We reconstruct real call chains from the source with static analysis, then a second independent agent re-reads the code and verifies each feature description before it is trusted. Fabricated claims are caught before they ever reach a test.
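As a rough illustration of what "reconstructing call chains with static analysis" can look like, here is a minimal sketch using Python's standard `ast` module. It is not the pipeline described above: the function names `build_call_graph` and `call_chains`, and the `SAMPLE` source, are hypothetical. It only follows direct calls to plain function names within one module.

```python
import ast

def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the plain function names it calls."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = graph.setdefault(node.name, set())
            for inner in ast.walk(node):
                # Only direct calls such as f(x); method calls are ignored here.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
    return graph

def call_chains(graph: dict[str, set[str]], entry: str, path=None) -> list[list[str]]:
    """List chains from entry down to functions that call nothing defined in the module."""
    path = (path or []) + [entry]
    # Skip builtins (not in graph) and anything already on the path (recursion).
    callees = sorted(c for c in graph.get(entry, ()) if c in graph and c not in path)
    if not callees:
        return [path]
    chains = []
    for callee in callees:
        chains.extend(call_chains(graph, callee, path))
    return chains

# Hypothetical source under analysis.
SAMPLE = '''
def checkout(cart):
    total = price(cart)
    return charge(total)

def price(cart):
    return sum(cart)

def charge(amount):
    return log(amount)

def log(amount):
    return amount
'''

chains = call_chains(build_call_graph(SAMPLE), "checkout")
```

Each chain is something a feature description can be checked against: a claim that `checkout` writes a log entry is supported by the chain `checkout`, `charge`, `log`, while a claim about a function that appears in no chain is not.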

Question 2

Is a model involved when the tests run?

Accepted Answer

No. Models help generate and verify tests, but at execution time there is no model in the loop. Generated tests run against the live system and record a plain verdict, so results stay deterministic and reproducible instead of varying with a model's output.
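A minimal sketch of what "a plain verdict with no model in the loop" could mean in practice. Everything here is an assumption for illustration: `run_test`, `apply_discount`, and the two checks are hypothetical, and a local function stands in for the live system.

```python
from typing import Callable

def run_test(name: str, check: Callable[[], None]) -> dict[str, str]:
    """Run one generated check and record a plain pass/fail verdict."""
    try:
        check()
    except AssertionError as exc:
        return {"test": name, "verdict": "fail", "detail": str(exc)}
    return {"test": name, "verdict": "pass", "detail": ""}

# Hypothetical stand-in for the live system; amounts are integer cents.
def apply_discount(total: int, percent: int) -> int:
    return total * (100 - percent) // 100

def check_discount() -> None:
    assert apply_discount(200, 10) == 180, "10% off 200 should be 180"

def check_wrong_expectation() -> None:
    assert apply_discount(200, 10) == 170, "expected 170"

results = [
    run_test("discount", check_discount),
    run_test("wrong_expectation", check_wrong_expectation),
]
```

The verdict comes only from an assertion against observed behaviour, so running the same check against the same system state gives the same record each time.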

Question 3

What happens to test coverage as the code keeps changing?

Accepted Answer

Because tests are derived from the system's reconstructed behaviour rather than written by hand, coverage tracks the code instead of trailing it. When call chains change, the analysis picks up the change, so the thinly covered areas that usually matter most are no longer left behind.
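One way to picture "the analysis picks up the change" is a diff between the call chains found in two revisions. This is a sketch under assumptions, not the actual mechanism: `changed_chains` and the example chains are hypothetical, and chains are represented as tuples of function names.

```python
Chain = tuple[str, ...]

def changed_chains(previous: list[Chain], current: list[Chain]) -> dict[str, list[Chain]]:
    """Report chains that appeared or disappeared between two revisions."""
    before, after = set(previous), set(current)
    return {
        "added": sorted(after - before),    # new behaviour that needs tests
        "removed": sorted(before - after),  # tests that no longer match the code
    }

# Hypothetical chains from two revisions of the same entry point.
previous = [("checkout", "price"), ("checkout", "charge", "log")]
current = [("checkout", "price"), ("checkout", "charge", "audit")]

diff = changed_chains(previous, current)
```

Chains under "added" are candidates for new tests, and chains under "removed" point at tests that should be regenerated or retired.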

Question 4

How does the two-agent check reduce risk?

Accepted Answer

One agent describes a feature; a second, independent agent re-derives that description from the code and must agree before it is used. This adversarial step is the main defence against confidently wrong tests, the failure mode that makes automated test generation untrustworthy.
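The agreement gate can be sketched as a filter over claims. The answer does not say how agreement is judged, so this example assumes it means an exact match between the described call chain and the one the second agent re-derives. `Claim`, `verified_claims`, and the stub verifier are hypothetical; in practice the second agent would be a model reading the code, not a dictionary lookup.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class Claim:
    feature: str
    call_chain: tuple[str, ...]

def verified_claims(
    claims: list[Claim],
    rederive: Callable[[str], Optional[tuple[str, ...]]],
) -> list[Claim]:
    """Keep only claims whose call chain the second agent reproduces."""
    accepted = []
    for claim in claims:
        independent = rederive(claim.feature)
        # Assumed agreement rule: exact match; None means nothing was found.
        if independent is not None and independent == claim.call_chain:
            accepted.append(claim)
    return accepted

# Stub second agent: what re-reading the code is assumed to yield.
FROM_CODE = {"charge customer": ("checkout", "charge", "log")}

claims = [
    Claim("charge customer", ("checkout", "charge", "log")),
    Claim("send receipt email", ("checkout", "send_email")),  # fabricated
]

trusted = verified_claims(claims, FROM_CODE.get)
```

Only the first claim survives. The fabricated one is dropped before any test is generated from it.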

Tests that prove what software actually does

Common questions
