Commit

zygi's site
mikesklar committed Jan 13, 2024
1 parent 52d0023 commit f84ed72
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions posts/TDC2023.md
@@ -174,6 +174,9 @@ Assume we are trying to find triggers for some payload $s_2$. Take a completely

Somehow, GCG’s first-order approximation (which it uses to select candidate mutations) is accurate enough to rapidly descend in this setting. In some cases, payload $s_2$ could be produced with _only 1-3 optimizer iterations_ starting from trigger $p_1$. We were very surprised by this. Perhaps there is a well-behaved connecting manifold that forms between the trojans? **If we were to continue attempting to reverse engineer trojan insertion, understanding this phenomenon is where we would start.**

#### 5.
For some additional details on our investigations, see [Zygi's personal site](https://zygi.me/blog/adventures-in-trojan-detection/#open-questions)

# Red Teaming Track Takeaways

#### First, a note on terminology

