human agent: add task quit command for giving up on tasks (#1426)
* human agent: add `task quit` command for giving up on tasks

* callout for dev version

---------

Co-authored-by: jjallaire <[email protected]>
jjallaire-aisi and jjallaire authored Feb 28, 2025
1 parent 2e2969d commit 977044f
Showing 4 changed files with 106 additions and 42 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@
- OpenAI: Tolerate `None` for assistant content (can happen when there is a refusal).
- Google: Retry requests on more HTTP status codes (selected 400 errors and all 500 errors).
- Event Log: Add `working_start` attribute to events and `completed` and `working_time` to model, tool, and subtask events.
- Human Agent: Add `task quit` command for giving up on tasks.
- Human Agent: Don't emit sandbox events for human agent
- Inspect View: Improve rendering of JSON within logging events.
- Inspect View: Improve virtualized rendering of Sample List, Sample Transcript, and Sample Messages.
31 changes: 22 additions & 9 deletions docs/human-agent.qmd
@@ -78,14 +78,15 @@

The Human agent solver installs agent task tools in the default sandbox and presents the user with both task instructions and documentation for the various tools (e.g. `task submit`, `task start`, `task stop`, `task instructions`, etc.). By default, the following commands are available:

| Command             | Description                                  |
|---------------------|----------------------------------------------|
| `task submit`       | Submit your final answer for the task.       |
| `task quit`         | Quit the task without submitting an answer.  |
| `task note`         | Record a note in the task transcript.        |
| `task status`       | Print task status (clock, scoring, etc.).    |
| `task start`        | Start the task clock (resume working).       |
| `task stop`         | Stop the task clock (pause working).         |
| `task instructions` | Display task commands and instructions.      |

: {tbl-colwidths=\[40,60\]}
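
For orientation, here is a minimal sketch (not part of this commit) of how the human agent solver might be wired into an eval task. The task name, dataset sample, and scorer are illustrative, and the imports assume the usual `inspect_ai` packages:

``` python
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import includes
from inspect_ai.solver import human_agent


@task
def ctf_baseline():
    # human baseliners work inside the sandbox and finish with
    # `task submit` (or `task quit`)
    return Task(
        dataset=[
            Sample(
                input="Find the flag on the target host.",
                target="picoCTF{example}",  # hypothetical flag
            )
        ],
        solver=human_agent(),
        scorer=includes(),
        sandbox="docker",
    )
```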

@@ -95,7 +96,7 @@

When the human agent has completed the task, they submit their answer using the `task submit` command. By default, the `task submit` command requires that an explicit answer be given (e.g. `task submit picoCTF{73bfc85c1ba7}`).

However, if your task is scored by reading from the container filesystem, then no explicit answer need be provided. Indicate this by passing `answer=False` to `human_agent()`:

``` python
solver=human_agent(answer=False)
```

@@ -113,6 +114,18 @@

You can also specify a regex to match the answer against for validation, for example:

``` python
solver=human_agent(answer=r"picoCTF{\w+}")
```
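
For a quick sense of what that pattern accepts, the snippet below applies it directly with Python's `re` module. Whether the validator uses full-match or search semantics is not shown in this diff, so the full-match here is an assumption:

``` python
import re

ANSWER_PATTERN = r"picoCTF{\w+}"  # same pattern as above

# assumed full-match semantics, for illustration only
print(bool(re.fullmatch(ANSWER_PATTERN, "picoCTF{73bfc85c1ba7}")))  # True
print(bool(re.fullmatch(ANSWER_PATTERN, "just a guess")))           # False
```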

### Quitting

::: {.callout-note appearance="simple"}
The `task quit` command described below is currently available only in the development version of Inspect. To install the development version from GitHub:

``` bash
pip install git+https://github.com/UKGovernmentBEIS/inspect_ai
```
:::

If the user is unable to complete the task in the allotted time, they can give up using the `task quit` command. This results in `answer` being an empty string (which will presumably then be scored as incorrect).
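
Downstream scoring can make that behaviour explicit. The sketch below is illustrative rather than part of this commit: it assumes the submitted (or empty) answer is surfaced as the sample's output completion, and the scorer name is made up:

``` python
from inspect_ai.scorer import CORRECT, INCORRECT, Score, Target, accuracy, scorer
from inspect_ai.solver import TaskState


@scorer(metrics=[accuracy()])
def flag_scorer():
    async def score(state: TaskState, target: Target) -> Score:
        answer = state.output.completion.strip()
        if not answer:
            # `task quit` leaves an empty answer, so mark it incorrect
            return Score(
                value=INCORRECT,
                answer=answer,
                explanation="Task was quit without an answer.",
            )
        return Score(
            value=CORRECT if target.text in answer else INCORRECT,
            answer=answer,
        )

    return score
```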

### Intermediate Scoring

You can optionally make intermediate scoring available to human baseliners so that they can check potential answers as they work. Use the `intermediate_scoring` option (which defaults to `False`) to do this:
10 changes: 7 additions & 3 deletions src/inspect_ai/solver/_human_agent/commands/__init__.py
Expand Up @@ -6,7 +6,7 @@
from .note import NoteCommand
from .score import ScoreCommand
from .status import StatusCommand
from .submit import SubmitCommand, ValidateCommand
from .submit import QuitCommand, SubmitCommand, ValidateCommand


def human_agent_commands(
@@ -15,8 +15,12 @@ def human_agent_commands(
intermediate_scoring: bool,
record_session: bool,
) -> list[HumanAgentCommand]:
# base submit and validate
commands = [SubmitCommand(record_session), ValidateCommand(answer)]
# base submit, validate, and quit
commands = [
SubmitCommand(record_session),
ValidateCommand(answer),
QuitCommand(record_session),
]

# optional intermediate scoring
if intermediate_scoring:
106 changes: 76 additions & 30 deletions src/inspect_ai/solver/_human_agent/commands/submit.py
@@ -16,22 +16,89 @@
logger = getLogger(__name__)


class SubmitCommand(HumanAgentCommand):
class SessionEndCommand(HumanAgentCommand):
def __init__(self, record_session: bool):
super().__init__()
self._record_session = record_session

@property
def group(self) -> Literal[1, 2, 3]:
return 1

async def _read_session_logs(self) -> dict[str, str]:
# retrieve session logs (don't fail)
sessions_dir = PurePosixPath(RECORD_SESSION_DIR)
result = await sandbox().exec(["ls", "-1", sessions_dir.as_posix()])
if not result.success:
logger.warning(f"Error listing human agent session logs: {result.stderr}")
return {}

# read logs
session_logs: dict[str, str] = {}
for session_log in result.stdout.strip().splitlines():
try:
session_logs[session_log] = await sandbox().read_file(
(sessions_dir / session_log).as_posix()
)
except Exception as ex:
logger.warning(f"Error reading human agent session log: {ex}")

return session_logs


class QuitCommand(SessionEndCommand):
@property
def name(self) -> str:
return "submit"
return "quit"

@property
def description(self) -> str:
return "Submit your final answer for the task."
return "Quit the task without submitting an answer."

def cli(self, args: Namespace) -> None:
# verify that the user wants to proceed
action = "quit the task without submitting an answer (ending the exercise)"
while True:
response = (
input(
f"\nDo you definitely want to {action}?\n\nThis will disconnect you from the task environment and you won't be able to reconnect.\n\nYes (y) or No (n): "
)
.lower()
.strip()
)
if response in ["yes", "y"]:
break
elif response in ["no", "n"]:
return
else:
print("Please enter yes or no.")

# thank the user!
print(
"\nThank you for working on this task!\n\n"
+ "Your task will now be scored and you will be disconnected from this container.\n"
)

call_human_agent("quit")

def service(self, state: HumanAgentState) -> Callable[..., Awaitable[JsonValue]]:
async def submit() -> None:
if self._record_session:
state.logs = await self._read_session_logs()
state.running = False
state.answer = ""

return submit


class SubmitCommand(SessionEndCommand):
@property
def group(self) -> Literal[1, 2, 3]:
return 1
def name(self) -> str:
return "submit"

@property
def description(self) -> str:
return "Submit your final answer for the task."

@property
def cli_args(self) -> list[HumanAgentCommand.CLIArg]:
@@ -55,10 +122,12 @@ def cli(self, args: Namespace) -> None:
# verify that the user wants to proceed
answer = call_args.get("answer", None)
answer_text = f" '{answer}'" if answer else ""
action = f"end the task and submit{answer_text}"

while True:
response = (
input(
f"\nDo you definitely want to end the task and submit{answer_text}?\n\nThis will disconnect you from the task environment and you won't be able to reconnect.\n\nYes (y) or No (n): "
f"\nDo you definitely want to {action}?\n\nThis will disconnect you from the task environment and you won't be able to reconnect.\n\nYes (y) or No (n): "
)
.lower()
.strip()
@@ -76,40 +145,17 @@ def cli(self, args: Namespace) -> None:
+ "Your task will now be scored and you will be disconnected from this container.\n"
)

# submit the task
call_human_agent("submit", **call_args)

def service(self, state: HumanAgentState) -> Callable[..., Awaitable[JsonValue]]:
async def submit(
answer: str | None, session_logs: dict[str, str] | None = None
) -> None:
async def submit(answer: str) -> None:
if self._record_session:
state.logs = await self._read_session_logs()
state.running = False
state.answer = answer

return submit

async def _read_session_logs(self) -> dict[str, str]:
# retrieve session logs (don't fail)
sessions_dir = PurePosixPath(RECORD_SESSION_DIR)
result = await sandbox().exec(["ls", "-1", sessions_dir.as_posix()])
if not result.success:
logger.warning(f"Error listing human agent session logs: {result.stderr}")
return {}

# read logs
session_logs: dict[str, str] = {}
for session_log in result.stdout.strip().splitlines():
try:
session_logs[session_log] = await sandbox().read_file(
(sessions_dir / session_log).as_posix()
)
except Exception as ex:
logger.warning(f"Error reading human agent session log: {ex}")

return session_logs


class ValidateCommand(HumanAgentCommand):
def __init__(self, answer: bool | str) -> None:
