human agent: add task quit command for giving up on tasks (#1426)
* human agent: add `task quit` command for giving up on tasks

* callout for dev version

---------

Co-authored-by: jjallaire <[email protected]>
jjallaire-aisi and jjallaire authored Feb 28, 2025
1 parent 2e2969d commit 977044f
Showing 4 changed files with 106 additions and 42 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@
- OpenAI: Tolerate `None` for assistant content (can happen when there is a refusal).
- Google: Retry requests on more HTTP status codes (selected 400 errors and all 500 errors).
- Event Log: Add `working_start` attribute to events and `completed` and `working_time` to model, tool, and subtask events.
- Human Agent: Add `task quit` command for giving up on tasks.
- Human Agent: Don't emit sandbox events for human agent
- Inspect View: Improve rendering of JSON within logging events.
- Inspect View: Improve virtualized rendering of Sample List, Sample Transcript, and Sample Messages.
31 changes: 22 additions & 9 deletions docs/human-agent.qmd
@@ -78,14 +78,15 @@

The Human agent solver installs agent task tools in the default sandbox and presents the user with both task instructions and documentation for the various tools (e.g. `task submit`, `task start`, `task stop`, `task instructions`, etc.). By default, the following commands are available:

| Command             | Description                                  |
|---------------------|----------------------------------------------|
| `task submit`       | Submit your final answer for the task.       |
| `task quit`         | Quit the task without submitting an answer.  |
| `task note`         | Record a note in the task transcript.        |
| `task status`       | Print task status (clock, scoring, etc.).    |
| `task start`        | Start the task clock (resume working).       |
| `task stop`         | Stop the task clock (pause working).         |
| `task instructions` | Display task commands and instructions.      |

: {tbl-colwidths=\[40,60\]}
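
For orientation, here is a minimal sketch (not part of this commit) of how the human agent solver might be wired into an eval task. The task name, dataset sample, and scorer are illustrative, and the imports assume the usual `inspect_ai` packages:

``` python
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import includes
from inspect_ai.solver import human_agent


@task
def ctf_baseline():
    # human baseliners work inside the sandbox and finish with
    # `task submit` (or `task quit`)
    return Task(
        dataset=[
            Sample(
                input="Find the flag on the target host.",
                target="picoCTF{example}",  # hypothetical flag
            )
        ],
        solver=human_agent(),
        scorer=includes(),
        sandbox="docker",
    )
```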

@@ -95,7 +96,7 @@

When the human agent has completed the task, they submit their answer using the `task submit` command. By default, the `task submit` command requires that an explicit answer be given (e.g. `task submit picoCTF{73bfc85c1ba7}`).

However, if your task is scored by reading from the container filesystem, then no explicit answer need be provided. Indicate this by passing `answer=False` to `human_agent()`:

``` python
solver=human_agent(answer=False)
```

@@ -113,6 +114,18 @@

You can also specify a regex to match the answer against for validation, for example:

``` python
solver=human_agent(answer=r"picoCTF{\w+}")
```
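
For a quick sense of what that pattern accepts, the snippet below applies it directly with Python's `re` module. Whether the validator uses full-match or search semantics is not shown in this diff, so the full-match here is an assumption:

``` python
import re

ANSWER_PATTERN = r"picoCTF{\w+}"  # same pattern as above

# assumed full-match semantics, for illustration only
print(bool(re.fullmatch(ANSWER_PATTERN, "picoCTF{73bfc85c1ba7}")))  # True
print(bool(re.fullmatch(ANSWER_PATTERN, "just a guess")))           # False
```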

### Quitting

::: {.callout-note appearance="simple"}
The `task quit` command described below is currently available only in the development version of Inspect. To install the development version from GitHub:

``` bash
pip install git+https://github.com/UKGovernmentBEIS/inspect_ai
```
:::

If the user is unable to complete the task in the allotted time, they can give up using the `task quit` command. This results in `answer` being an empty string (which will presumably then be scored as incorrect).
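
Downstream scoring can make that behaviour explicit. The sketch below is illustrative rather than part of this commit: it assumes the submitted (or empty) answer is surfaced as the sample's output completion, and the scorer name is made up:

``` python
from inspect_ai.scorer import CORRECT, INCORRECT, Score, Target, accuracy, scorer
from inspect_ai.solver import TaskState


@scorer(metrics=[accuracy()])
def flag_scorer():
    async def score(state: TaskState, target: Target) -> Score:
        answer = state.output.completion.strip()
        if not answer:
            # `task quit` leaves an empty answer, so mark it incorrect
            return Score(
                value=INCORRECT,
                answer=answer,
                explanation="Task was quit without an answer.",
            )
        return Score(
            value=CORRECT if target.text in answer else INCORRECT,
            answer=answer,
        )

    return score
```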

### Intermediate Scoring

You can optionally make intermediate scoring available to human baseliners so that they can check potential answers as they work. Use the `intermediate_scoring` option (which defaults to `False`) to do this:
10 changes: 7 additions & 3 deletions src/inspect_ai/solver/_human_agent/commands/__init__.py
Expand Up @@ -6,7 +6,7 @@
from .note import NoteCommand
from .score import ScoreCommand
from .status import StatusCommand
from .submit import SubmitCommand, ValidateCommand
from .submit import QuitCommand, SubmitCommand, ValidateCommand


def human_agent_commands(
@@ -15,8 +15,12 @@ def human_agent_commands(
intermediate_scoring: bool,
record_session: bool,
) -> list[HumanAgentCommand]:
# base submit and validate
commands = [SubmitCommand(record_session), ValidateCommand(answer)]
# base submit, validate, and quit
commands = [
SubmitCommand(record_session),
ValidateCommand(answer),
QuitCommand(record_session),
]

# optional intermediate scoring
if intermediate_scoring:
106 changes: 76 additions & 30 deletions src/inspect_ai/solver/_human_agent/commands/submit.py
@@ -16,22 +16,89 @@
logger = getLogger(__name__)


class SubmitCommand(HumanAgentCommand):
class SessionEndCommand(HumanAgentCommand):
def __init__(self, record_session: bool):
super().__init__()
self._record_session = record_session

@property
def group(self) -> Literal[1, 2, 3]:
return 1

async def _read_session_logs(self) -> dict[str, str]:
# retrieve session logs (don't fail)
sessions_dir = PurePosixPath(RECORD_SESSION_DIR)
result = await sandbox().exec(["ls", "-1", sessions_dir.as_posix()])
if not result.success:
logger.warning(f"Error listing human agent session logs: {result.stderr}")
return {}

# read logs
session_logs: dict[str, str] = {}
for session_log in result.stdout.strip().splitlines():
try:
session_logs[session_log] = await sandbox().read_file(
(sessions_dir / session_log).as_posix()
)
except Exception as ex:
logger.warning(f"Error reading human agent session log: {ex}")

return session_logs


class QuitCommand(SessionEndCommand):
@property
def name(self) -> str:
return "submit"
return "quit"

@property
def description(self) -> str:
return "Submit your final answer for the task."
return "Quit the task without submitting an answer."

def cli(self, args: Namespace) -> None:
# verify that the user wants to proceed
action = "quit the task without submitting an answer (ending the exercise)"
while True:
response = (
input(
f"\nDo you definitely want to {action}?\n\nThis will disconnect you from the task environment and you won't be able to reconnect.\n\nYes (y) or No (n): "
)
.lower()
.strip()
)
if response in ["yes", "y"]:
break
elif response in ["no", "n"]:
return
else:
print("Please enter yes or no.")

# thank the user!
print(
"\nThank you for working on this task!\n\n"
+ "Your task will now be scored and you will be disconnected from this container.\n"
)

call_human_agent("quit")

def service(self, state: HumanAgentState) -> Callable[..., Awaitable[JsonValue]]:
async def submit() -> None:
if self._record_session:
state.logs = await self._read_session_logs()
state.running = False
state.answer = ""

return submit


class SubmitCommand(SessionEndCommand):
@property
def group(self) -> Literal[1, 2, 3]:
return 1
def name(self) -> str:
return "submit"

@property
def description(self) -> str:
return "Submit your final answer for the task."

@property
def cli_args(self) -> list[HumanAgentCommand.CLIArg]:
@@ -55,10 +122,12 @@ def cli(self, args: Namespace) -> None:
# verify that the user wants to proceed
answer = call_args.get("answer", None)
answer_text = f" '{answer}'" if answer else ""
action = f"end the task and submit{answer_text}"

while True:
response = (
input(
f"\nDo you definitely want to end the task and submit{answer_text}?\n\nThis will disconnect you from the task environment and you won't be able to reconnect.\n\nYes (y) or No (n): "
f"\nDo you definitely want to {action}?\n\nThis will disconnect you from the task environment and you won't be able to reconnect.\n\nYes (y) or No (n): "
)
.lower()
.strip()
@@ -76,40 +145,17 @@ def cli(self, args: Namespace) -> None:
+ "Your task will now be scored and you will be disconnected from this container.\n"
)

# submit the task
call_human_agent("submit", **call_args)

def service(self, state: HumanAgentState) -> Callable[..., Awaitable[JsonValue]]:
async def submit(
answer: str | None, session_logs: dict[str, str] | None = None
) -> None:
async def submit(answer: str) -> None:
if self._record_session:
state.logs = await self._read_session_logs()
state.running = False
state.answer = answer

return submit

async def _read_session_logs(self) -> dict[str, str]:
# retrieve session logs (don't fail)
sessions_dir = PurePosixPath(RECORD_SESSION_DIR)
result = await sandbox().exec(["ls", "-1", sessions_dir.as_posix()])
if not result.success:
logger.warning(f"Error listing human agent session logs: {result.stderr}")
return {}

# read logs
session_logs: dict[str, str] = {}
for session_log in result.stdout.strip().splitlines():
try:
session_logs[session_log] = await sandbox().read_file(
(sessions_dir / session_log).as_posix()
)
except Exception as ex:
logger.warning(f"Error reading human agent session log: {ex}")

return session_logs


class ValidateCommand(HumanAgentCommand):
def __init__(self, answer: bool | str) -> None:
