Automating Software Test Case Creation with AI Agents

Test Implementation Generation

To streamline test creation, the DriveOS team at NVIDIA developed Hephaestus (HEPH), an internal generative AI framework for automatic test generation. HEPH automates the design and implementation of various tests, including integration and unit tests. It uses large language models (LLMs) for input analysis and code generation, significantly reducing the time spent creating test cases.

Procedure

Open a connection to the thermal zone using NvThermmonOpen().
Query the temperature for the thermal zone using NvThermmonGetZoneTemp() multiple times.
Compare temperature values over successive calls.

Expected Result: The temperature values should show a difference of >0 and should vary with the resolution supported by the temperature sensor.
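
As a rough illustration of the expected result, the check below reads "vary with the resolution" as "every nonzero change between successive readings is a whole multiple of the sensor resolution." That reading, the function name readings_vary_by_resolution, and the millidegree units are assumptions made for this sketch, not part of the NvThermmon API or of HEPH's output.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: returns true when at least one pair of successive
// readings differs and every nonzero difference is a whole multiple of the
// sensor resolution. Units (for example millidegrees Celsius) are assumed.
bool readings_vary_by_resolution(const int32_t *samples, size_t count,
                                 int32_t resolution) {
    if (samples == NULL || count < 2 || resolution <= 0) {
        return false;
    }
    bool varied = false;
    for (size_t i = 1; i < count; ++i) {
        // Widen to 64 bits so the subtraction cannot overflow.
        int64_t diff = (int64_t)samples[i] - (int64_t)samples[i - 1];
        if (diff == 0) {
            continue;  // no change between these two readings
        }
        if (diff % resolution != 0) {
            return false;  // step size inconsistent with the resolution
        }
        varied = true;
    }
    return varied;
}
```

With an assumed resolution of 250, the readings 45000, 45500, 45250 would pass, while three identical readings would not.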

Test Implementation Generation

Based on the test specifications and interface control document (ICD) details, HEPH generates the tests in C.

#include <stdio.h>
#include <stdint.h>
#include "NvThermmonAPI.h"

// Test Case TC_DOSBSP60-REQ-3874_001
// Verify successful temperature query for a supported thermal zone
void TC_DOSBSP60_REQ_3874_001(void) {
    NvThermmonHandle handle;
    zone_temp_t zoneTemp;

    // Step 1: Open a connection to the thermal zone using NvThermmonOpen()
    NvThermmonErrCode status = NvThermmonOpen("CPU-therm", &handle);
    if (status != NV_THERMMON_ERR_CODE_NO_ERROR) {
        printf("TC_DOSBSP60-REQ-3874_001: NvThermmonOpen failed\n");
        return;
    }

    // Step 2: Query the temperature for the thermal zone using NvThermmonGetZoneTemp()
    status = NvThermmonGetZoneTemp(handle, &zoneTemp);
    if (status != NV_THERMMON_ERR_CODE_NO_ERROR) {
        printf("TC_DOSBSP60-REQ-3874_001: NvThermmonGetZoneTemp failed\n");
        return;
    }

    // Step 3: Retrieve two consecutive temperature values
    int32_t delta = zoneTemp.temp_2 - zoneTemp.temp_1;
    if (delta > 100) { // Assume 100 is the delta determined by the thermal characteristics of the board
        printf("TC_DOSBSP60-REQ-3874_001: Temperature delta is too large\n");
    } else {
        printf("TC_DOSBSP60-REQ-3874_001: Passed\n");
    }
}

// Test Case TC_DOSBSP60-REQ-3874_002
// Verify accompanying timestamp for temperature queries
void TC_DOSBSP60_REQ_3874_002(void) {
    NvThermmonHandle handle;
    zone_temp_t zoneTemp;

    // Step 1: Open a connection to the thermal zone using NvThermmonOpen()
    NvThermmonErrCode status = NvThermmonOpen("CPU-therm", &handle);
    if (status != NV_THERMMON_ERR_CODE_NO_ERROR) {
        printf("TC_DOSBSP60-REQ-3874_002: NvThermmonOpen failed\n");
        return;
    }

    // Step 2: Query the temperature for the thermal zone using NvThermmonGetZoneTemp()
    status = NvThermmonGetZoneTemp(handle, &zoneTemp);
    if (status != NV_THERMMON_ERR_CODE_NO_ERROR) {
        printf("TC_DOSBSP60-REQ-3874_002: NvThermmonGetZoneTemp failed\n");
        return;
    }

    // Step 3: Retrieve two consecutive temperature values along with two corresponding timestamps
    uint64_t deltaTs = zoneTemp.ts_2 - zoneTemp.ts_1;
    if (deltaTs > 1000) { // Assume 1000 microseconds is the maximum allowed delta
        printf("TC_DOSBSP60-REQ-3874_002: Timestamp delta is too large\n");
    } else {
        printf("TC_DOSBSP60-REQ-3874_002: Passed\n");
    }
}

// Test Case TC_DOSBSP60-REQ-3874_003
// Verify temperature variation over successive calls
void TC_DOSBSP60_REQ_3874_003(void) {
    NvThermmonHandle handle;
    zone_temp_t zoneTemp1, zoneTemp2;

    // Step 1: Open a connection to the thermal zone using NvThermmonOpen()
    NvThermmonErrCode status = NvThermmonOpen("CPU-therm", &handle);
    if (status != NV_THERMMON_ERR_CODE_NO_ERROR) {
        printf("TC_DOSBSP60-REQ-3874_003: NvThermmonOpen failed\n");
        return;
    }

    // Step 2: Query the temperature for the thermal zone using NvThermmonGetZoneTemp() multiple times
    status = NvThermmonGetZoneTemp(handle, &zoneTemp1);
    if (status != NV_THERMMON_ERR_CODE_NO_ERROR) {
        printf("TC_DOSBSP60-REQ-3874_003: NvThermmonGetZoneTemp (1st call) failed\n");
        return;
    }

    status = NvThermmonGetZoneTemp(handle, &zoneTemp2);
    if (status != NV_THERMMON_ERR_CODE_NO_ERROR) {
        printf("TC_DOSBSP60-REQ-3874_003: NvThermmonGetZoneTemp (2nd call) failed\n");
        return;
    }

    // Step 3: Compare temperature values over successive calls
    int32_t temperatureDifference = zoneTemp2.temp_1 - zoneTemp1.temp_1;
    if (temperatureDifference <= 0) {
        printf("TC_DOSBSP60-REQ-3874_003: Temperature did not vary as expected\n");
    } else {
        printf("TC_DOSBSP60-REQ-3874_003: Passed\n");
    }
}

Future Enhancements

Two considerations are central when designing a test generation framework:

Supporting different test workflows
Integrating real-time human feedback

Supporting Different Test Workflows

HEPH is designed to support most test-generation use cases for software teams. Still, there are instances when teams require a custom test framework or an unsupported test creation workflow.

To address these challenges, potential future improvements to HEPH could include a modular design, enabling software teams to define custom modules for non-standard workflows.
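
One way to picture such a modular design is a small registry of workflow modules, each supplying its own generation callback. The sketch below is purely illustrative: the workflow_module type, register_workflow, find_workflow, and the stub_generate example module are hypothetical names invented here and do not describe HEPH's actual architecture.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

// Hypothetical interface: a module turns a requirement ID into generated
// test text written to `out`. Returns 0 on success, -1 on failure.
typedef int (*workflow_generate_fn)(const char *requirement_id,
                                    char *out, size_t out_size);

typedef struct {
    const char *name;
    workflow_generate_fn generate;
} workflow_module;

#define MAX_MODULES 8

static workflow_module g_modules[MAX_MODULES];
static size_t g_module_count = 0;

// Looks up a registered module by name; returns NULL when not found.
const workflow_module *find_workflow(const char *name) {
    if (name == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < g_module_count; ++i) {
        if (strcmp(g_modules[i].name, name) == 0) {
            return &g_modules[i];
        }
    }
    return NULL;
}

// Registers a module; returns -1 on bad arguments, a full table, or a
// duplicate name, and 0 otherwise.
int register_workflow(const char *name, workflow_generate_fn generate) {
    if (name == NULL || generate == NULL || g_module_count >= MAX_MODULES) {
        return -1;
    }
    if (find_workflow(name) != NULL) {
        return -1;
    }
    g_modules[g_module_count].name = name;
    g_modules[g_module_count].generate = generate;
    ++g_module_count;
    return 0;
}

// Example custom module: emits a placeholder test declaration.
int stub_generate(const char *requirement_id, char *out, size_t out_size) {
    int written = snprintf(out, out_size, "void test_%s(void);", requirement_id);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

A team would call register_workflow("custom-unit", stub_generate) once at startup, then look the module up by name whenever a request targets that workflow. Keeping the interface to a single callback keeps the core unaware of any team-specific test framework.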

Integrating Real-Time Human Feedback

While the latest LLMs perform well in understanding the development context and generating high-quality code, there are cases where generated test sets may require improvement.

Possible future enhancements to HEPH could include the introduction of an interactive mode alongside the current automatic mode. In this interactive mode, you’d interact with the HEPH agent at each step of the test generation process, reviewing results, providing feedback, and refining outputs before proceeding.
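
The interactive mode described above could be sketched as a loop that pauses after each step for review and reruns the step with feedback until it is accepted. Everything here is an assumption for illustration: run_interactive, the step_fn and review_fn callback types, the fixed buffer sizes, and the two demo callbacks are hypothetical and not taken from HEPH.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef enum { REVIEW_ACCEPT, REVIEW_REVISE } review_decision;

// A step produces output; `feedback` is NULL on the first attempt.
typedef int (*step_fn)(const char *feedback, char *out, size_t out_size);
// A reviewer inspects the output and may write feedback for a retry.
typedef review_decision (*review_fn)(const char *output, char *feedback,
                                     size_t feedback_size);

// Runs the steps in order. Each step is retried with reviewer feedback up
// to max_rounds attempts; the pipeline stops at the first step that fails
// or is never accepted. Returns the number of accepted steps.
size_t run_interactive(const step_fn *steps, size_t step_count,
                       review_fn review, unsigned max_rounds) {
    char output[256];
    char feedback[256];
    size_t accepted = 0;
    if (steps == NULL || review == NULL) {
        return 0;
    }
    for (size_t i = 0; i < step_count; ++i) {
        const char *pending = NULL;
        int ok = 0;
        for (unsigned round = 0; round < max_rounds; ++round) {
            if (steps[i](pending, output, sizeof output) != 0) {
                break;  // the step itself failed
            }
            if (review(output, feedback, sizeof feedback) == REVIEW_ACCEPT) {
                ok = 1;
                break;
            }
            pending = feedback;  // retry with the reviewer's feedback
        }
        if (!ok) {
            break;  // do not proceed past an unaccepted step
        }
        ++accepted;
    }
    return accepted;
}

// Demo step: writes "draft" first, "revised" once feedback is supplied.
int draft_step(const char *feedback, char *out, size_t out_size) {
    int n = snprintf(out, out_size, "%s", feedback ? "revised" : "draft");
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

// Demo reviewer: asks for one revision of any first draft.
review_decision demo_review(const char *output, char *feedback,
                            size_t feedback_size) {
    if (strcmp(output, "draft") == 0) {
        snprintf(feedback, feedback_size, "add error-path checks");
        return REVIEW_REVISE;
    }
    return REVIEW_ACCEPT;
}
```

In a real tool the reviewer callback would prompt a person rather than apply a fixed rule; the cap on rounds simply keeps a step from looping indefinitely.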

Start HEPHing Your Automatic Test Generation

Hephaestus (HEPH) automates test generation in software development by using LLMs to create comprehensive and context-aware tests. This automation reduces manual effort, accelerates development, and improves the quality and reliability of the final product.

Build Your AI Agent Application

For more information about using NVIDIA generative AI technologies and tools to create your own AI agents and applications, see ai.nvidia.com or try out NVIDIA NIM APIs.

Acknowledgments

Thanks to the DriveOS QNX BSP team for piloting HEPH.
