There’s a common belief that testing LLM-based apps requires throwing out the whole testing playbook. Because outputs are non-deterministic, the thinking goes, traditional testing just doesn’t apply. I get it. But what I’ve seen happen in practice is teams falling back on manual spot-checking and calling it done. At one company I worked at we...
