Regarding the python interface #309

Open
thebabush opened this issue Aug 28, 2018 · 6 comments

Comments

@thebabush

thebabush commented Aug 28, 2018

Hi,

First of all, thanks for the tool. During the last DEFCON finals I realized once again how few tools there are for scriptable, multi-arch userspace emulation.
Usercorn looks like it might be THE tool if it matures enough.

That being said, I noticed you removed the python interface in favor of #184.
Like everyone in security, I love Python, so I did a small PoC of how a Go/Python integration could work (thebabush/usercorn).

If you want to try it, just make && make py && ./make.sh.

So the idea is to use opaque handles + ref counting as a way around the garbage collector.
I think that most (all?) of go/models/usercorn.go could be easily mapped to C or FFI.
It's a PITA to do the mapping manually, but after some tests with parsing Go, I would say that doing it automatically in the proper way would be a whole project by itself (maybe a regex-based approach would suffice, which is what z3 bindings for python do AFAIK).
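The opaque-handle idea can be sketched roughly as follows. This is only a Python illustration of the bookkeeping, not the code from thebabush/usercorn; in a real binding the table would presumably live on the Go side so that Go pointers never cross the FFI boundary. `HandleTable` and its method names are hypothetical.

```python
import itertools
import threading


class HandleTable:
    """Maps opaque integer handles to live objects, with manual ref counts.

    Hypothetical helper: the foreign side only ever sees the integers,
    so the garbage collector is free to manage the objects themselves.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._next = itertools.count(1)  # 0 is reserved as "invalid handle"
        self._entries = {}               # handle -> [object, refcount]

    def new(self, obj):
        """Register obj and return a fresh handle with a ref count of 1."""
        with self._lock:
            handle = next(self._next)
            self._entries[handle] = [obj, 1]
            return handle

    def get(self, handle):
        """Resolve a handle back to its object (KeyError if already freed)."""
        with self._lock:
            return self._entries[handle][0]

    def retain(self, handle):
        """Add one reference to an existing handle."""
        with self._lock:
            self._entries[handle][1] += 1

    def release(self, handle):
        """Drop one reference; return True if the entry was freed."""
        with self._lock:
            entry = self._entries[handle]
            entry[1] -= 1
            if entry[1] == 0:
                del self._entries[handle]
                return True
            return False

    def __len__(self):
        with self._lock:
            return len(self._entries)
```

The table keeps the object alive for as long as any handle holder has not released it, which is the part that works around the garbage collector.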

Still, why not create a bare-bones plugin mechanism instead of exposing usercorn as a shared object? Like usercorn --whatever whatever.so ./my_binary.

My stuff actually uses CFFI at compile time, which should be faster than a shared plugin and should support Python 2/3/pypy.

I'm opening this issue to see what you think about it. It's just a hack for now but it looks like a viable way of implementing scripting (or a general C API).

@lunixbochs
Owner

Usercorn already has a scripting engine (luaish), which is lua modified to be more pythonic, and it autobinds the whole API.

See some basic examples here: https://github.com/Caesurus/usercorn_examples

@lunixbochs
Owner

lunixbochs commented Aug 28, 2018

Due to limitations in Go, usercorn definitely needs to be the script host. It doesn't make sense to compile usercorn to a shared object at all.

Another option is to generate an RPC layer and run the scripts in a separate process. Go's reflection is fairly capable. Neovim's msgpack-rpc system has a lot of flexibility and is able to service many programming languages and embedding styles easily, so it might be worth looking in that direction.
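As a rough sketch of the RPC shape described here, the following dispatches method calls by name over a socket. Several things are assumptions made for illustration: it uses newline-delimited JSON rather than msgpack-rpc (to stay within the Python standard library), it runs the host in a thread over a `socketpair` rather than in a separate process, and `FakeEmulator`, `reg_read`, and `reg_write` are made-up names, not usercorn's API.

```python
import json
import socket
import threading


class FakeEmulator:
    """Hypothetical stand-in for the script host; not usercorn's API."""

    def __init__(self):
        self.regs = {"pc": 0x1000}

    def reg_read(self, name):
        return self.regs[name]

    def reg_write(self, name, value):
        self.regs[name] = value


def serve(conn, target):
    """Answer newline-delimited JSON requests by calling public methods on target."""
    reader = conn.makefile("r")
    writer = conn.makefile("w")
    with conn, reader, writer:
        for line in reader:
            req = json.loads(line)
            try:
                if req["method"].startswith("_"):
                    raise AttributeError(req["method"])
                # Name-based lookup plays the role reflection would play in Go.
                result = getattr(target, req["method"])(*req["params"])
                resp = {"id": req["id"], "result": result, "error": None}
            except Exception as exc:
                resp = {"id": req["id"], "result": None, "error": repr(exc)}
            writer.write(json.dumps(resp) + "\n")
            writer.flush()


class Client:
    """Script-side proxy: one blocking request/response per call."""

    def __init__(self, conn):
        self._conn = conn
        self._reader = conn.makefile("r")
        self._writer = conn.makefile("w")
        self._next_id = 0

    def call(self, method, *params):
        self._next_id += 1
        req = {"id": self._next_id, "method": method, "params": list(params)}
        self._writer.write(json.dumps(req) + "\n")
        self._writer.flush()
        resp = json.loads(self._reader.readline())
        if resp["error"] is not None:
            raise RuntimeError(resp["error"])
        return resp["result"]

    def close(self):
        self._writer.close()
        self._reader.close()
        self._conn.close()


def demo():
    host_end, script_end = socket.socketpair()
    worker = threading.Thread(
        target=serve, args=(host_end, FakeEmulator()), daemon=True
    )
    worker.start()
    client = Client(script_end)
    client.call("reg_write", "pc", 0x2000)
    value = client.call("reg_read", "pc")
    client.close()           # EOF ends the serve loop
    worker.join(timeout=5)
    return value
```

Every `call` here is a full round trip, which is why this style suits debugger-like control much better than per-instruction hooks.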

Honestly the first step is going to be collecting goals of the scripting system. For example, I want to make it very easy to write a custom binary loader, or extend an existing binary loader using a script.

@thebabush
Author

I do know about the scripting engine, thanks.

RPC sounds slow to me, but I may well be wrong.

Yeah that's a good point. Being able to extend the loader would be awesome. Still, I think that a common use case might be using usercorn as an advanced/scriptable debugger or as a multiarch (slow-ish) instrumentation tool. AFAIK the current offerings (PIN/qbdi/panda/etc...) don't cover that space. Like, as of now there's no easy way (that I know of) to trace an ARM linux binary on x86. Yes you can use QEMU user and its logging feature, but it is far from ideal.

@lunixbochs
Owner

lunixbochs commented Aug 29, 2018

RPC is mostly going to be slow only if you're doing something at the per-instruction level, and there's no fast way to do that in Python with or without RPC.

There are also some pretty big performance improvements I can do to unicorn/usercorn at some point for read-only hooks (if you don't need to modify anything during the hook callback).

If you're using usercorn as a scriptable debugger or doing things like basic binary loading, RPC should be plenty fast enough. If you're doing read-only tracing, you can trace to a file and parse it in whatever language you want, which will be way faster than trying to do it in-process due to FFI overhead.

RPC would also be able to inject simple luaish snippets/hooks to modify behavior that wouldn't need a round trip through the RPC interface.

@lunixbochs
Owner

It would help me think about this quite a bit if you describe some main specific APIs you want + what sorts of things about the emulator you want to be able to script.

@thebabush
Author

Stuff like:

  • Breakpoint/callbacks on complex conditions
  • Modify registers/memory
  • Save/restore CPU state
  • Tracing

All of this could be used to write quick dynamic analysis scripts to help reversing in IDA/binja. Taint tracking, tracing memory operations, stuff like that. Also, it could be used to prototype all sorts of tools.
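To make the wishlist above concrete, here is a minimal sketch of what such a scripting surface might look like. Everything in it is hypothetical: `ToyEmulator` is a made-up three-instruction machine, not usercorn or unicorn, and the method names (`hook_when`, `save`, `restore`) are invented purely to show conditional callbacks, register/memory modification, state snapshots, and tracing side by side.

```python
import copy


class ToyEmulator:
    """Tiny made-up machine used only to illustrate the scripting surface."""

    def __init__(self, program):
        self.program = program      # list of (op, dst, value) tuples
        self.regs = {"pc": 0, "r0": 0, "r1": 0}
        self.mem = bytearray(16)
        self.hooks = []             # (condition, callback) pairs
        self.trace = []             # executed (pc, op) records

    def hook_when(self, condition, callback):
        """Call callback(emu) before each instruction where condition(emu) holds."""
        self.hooks.append((condition, callback))

    def save(self):
        """Snapshot registers and memory."""
        return copy.deepcopy((self.regs, self.mem))

    def restore(self, state):
        """Roll registers and memory back to a snapshot."""
        self.regs, self.mem = copy.deepcopy(state)

    def step(self):
        for condition, callback in self.hooks:
            if condition(self):
                callback(self)
        op, dst, value = self.program[self.regs["pc"]]
        self.trace.append((self.regs["pc"], op))
        if op == "mov":
            self.regs[dst] = value
        elif op == "add":
            self.regs[dst] += value
        elif op == "store":         # mem[dst] = low byte of register `value`
            self.mem[dst] = self.regs[value] & 0xFF
        self.regs["pc"] += 1

    def run(self):
        while self.regs["pc"] < len(self.program):
            self.step()


def demo():
    emu = ToyEmulator([("mov", "r0", 5), ("add", "r0", 3), ("store", 0, "r0")])

    def about_to_store_large_value(e):
        # Compound condition: next instruction is a store AND r0 exceeds 7.
        return e.program[e.regs["pc"]][0] == "store" and e.regs["r0"] > 7

    def patch_register(e):
        e.regs["r0"] = 0x41         # modify state from inside the callback

    emu.hook_when(about_to_store_large_value, patch_register)
    checkpoint = emu.save()
    emu.run()
    return emu, checkpoint
```

After `demo()`, `emu.trace` holds the executed instructions, `emu.mem[0]` reflects the patched register, and `emu.restore(checkpoint)` rewinds to the initial state.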
