
snapshot testing: the cure in the AI software generation era




Background

A typical dev workflow:

  1. Know your goal
  2. (Ask an AI Agent to) Implement the code
  3. Make sure the code works
  4. Fix bugs, add features
  5. Keep making sure the code works

The challenge is steps 3 and 5: continuously verifying that the code still works is tedious, especially when most of the code was written by an AI.

So we become afraid of changing code.

Proposed solution

Every snapshot testing workflow should have a single bash file as its entry point; we don't want to remember arguments.

Example:

scripts/test_1.sh


#!/usr/bin/env bash
set -euo pipefail

export REPO_DIR=$PWD
export RUNTIME_DIR=$REPO_DIR/local_data/test_1
export SNAPSHOT_DIR=$REPO_DIR/snapshots/test_1

# Make sure both directories exist before using them.
mkdir -p "$RUNTIME_DIR" "$SNAPSHOT_DIR"
cd "$RUNTIME_DIR"

cat << EOF > config.json
{
    "repo_dir": "$REPO_DIR",
    "runtime_dir": "$RUNTIME_DIR"
}
EOF

"$REPO_DIR/build/cpp_executable" --config ./config.json | rg "Results:" -A 100 > "$SNAPSHOT_DIR/test_1.received.txt"

diff "$SNAPSHOT_DIR/test_1.received.txt" "$SNAPSHOT_DIR/test_1.approved.txt"
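When the output changes intentionally, the received snapshot has to be promoted to the new approved baseline. A minimal sketch of such a helper, assuming the same `NAME.received.txt` / `NAME.approved.txt` naming as the script above (the `approve_snapshot` function name is my own):

```shell
#!/usr/bin/env bash
set -euo pipefail

# approve_snapshot NAME DIR
# Promote DIR/NAME.received.txt to DIR/NAME.approved.txt,
# showing the diff against the old baseline first.
approve_snapshot() {
    local name="$1" dir="$2"
    local received="$dir/$name.received.txt"
    local approved="$dir/$name.approved.txt"

    if [ ! -f "$received" ]; then
        echo "no received snapshot to approve: $received" >&2
        return 1
    fi

    # Show what is about to change (diff exits non-zero on differences).
    if [ -f "$approved" ]; then
        diff "$approved" "$received" || true
    fi

    cp "$received" "$approved"
    echo "approved: $approved"
}
```

With this, a red `diff` in `test_1.sh` is fixed either by fixing the code or by consciously running the approve helper.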

Dev projects are workflows: pipeline scaffolding and node-oriented vibe coding

For any dev project, there are only a limited number of workflows, such as compiling or running tests.

If you look at this from a higher level, you can see they consist of Nodes.

And you will find that the number of workflows is quite limited, roughly O(number of Nodes).

So I propose a way of setting up your project, especially if you want to vibe code:

./scripts/
        compile.sh
        run_test.sh
./workflows/
        1.sh -> compile.sh
        2.sh -> compile.sh + run_test.sh

Then instruct the AI to run workflows/*.sh every time it finishes a task.
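The arrows in the layout above just mean "call the underlying scripts in order". A minimal sketch of what workflows/2.sh could look like, written as a function for illustration (in practice the body lives in the script file, and the paths are assumptions matching the layout above):

```shell
#!/usr/bin/env bash
set -euo pipefail

# workflow_2 REPO_DIR
# Chain two Nodes: compile, then run tests. set -e aborts on the
# first failing Node, so the AI gets an immediate, unambiguous signal.
workflow_2() {
    local repo_dir="$1"
    "$repo_dir/scripts/compile.sh"   # Node 1: build
    "$repo_dir/scripts/run_test.sh"  # Node 2: snapshot tests
}
```

Because each workflow is a zero-argument composition of scripts, "run the workflows" is one instruction the AI cannot get wrong.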

The core idea here is that, in the workflow, you can easily define input data generation and output verification using Python etc., and you can leverage a pre-seeded random data generator. This part of the code might be 200 lines; let the AI worry about the 20K lines in the middle.
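The pre-seeded generator matters because snapshots only stay stable if the input bytes are identical on every run. A sketch of one way to do this in the workflow script itself, using awk's seedable srand() (the record format is made up; note that the rand() sequence is implementation-defined, so determinism holds per awk implementation, not across them):

```shell
#!/usr/bin/env bash
set -euo pipefail

# gen_input SEED N
# Emit N pseudo-random records. Same seed => same bytes => stable snapshots.
gen_input() {
    awk -v seed="$1" -v n="$2" 'BEGIN {
        srand(seed)
        for (i = 1; i <= n; i++)
            printf "id=%d value=%d\n", i, int(rand() * 1000)
    }'
}
```

A workflow would call something like `gen_input 42 100 > input.txt` before invoking the executable, so the test data never has to be checked in.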

Defining a DAG or a pipeline is not a new idea whatsoever.