This script sends text chunks to an ollama
server running a Large Language Model (LLM).
It performs pre- and post-processing and creates a diff between the original text and the edited version.
Not perfect, but helpful.
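The diff step can be sketched with Python's `difflib`; this is just an illustration of the idea, not the script's actual implementation:

```python
import difflib

def make_diff(original: str, edited: str) -> str:
    """Build a unified diff between the original and the LLM-edited text."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        edited.splitlines(keepends=True),
        fromfile="original",
        tofile="edited",
    )
    return "".join(diff)
```

Writing the result to a file like `inout/diff.txt` then lets any diff viewer render it.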
- You need an ollama instance running. Adjust the endpoint if necessary.
- Make sure you have the desired model installed.
- Save the text you want to check in `inout/input.txt`. (See `inout/input.example.txt`.)
- If it is a long text, add lines starting with `#` to break the text into chunks. (E.g. prepend headings with `#`.)
- Run the script (see `python main.py --help`).
- E.g. run diff-so-fancy on the output diff file (`cat inout/diff.txt | diff-so-fancy`).
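The `#`-based chunking could look roughly like this (a sketch of the idea, not necessarily how `main.py` implements it): each line starting with `#` begins a new chunk.

```python
def split_into_chunks(text: str) -> list[str]:
    """Split text into chunks at lines starting with '#'."""
    chunks: list[list[str]] = [[]]
    for line in text.splitlines():
        if line.startswith("#"):
            chunks.append([])  # a heading line starts a new chunk
        chunks[-1].append(line)
    # Drop the empty leading chunk if the text starts with a heading
    return ["\n".join(c) for c in chunks if c]
```

Smaller chunks keep each request well within the model's context window and make the resulting diff easier to review.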
Different models and corresponding system prompts can be configured in `llm_config.json`.
The most concise results so far have been achieved with `karen`.
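A configuration entry might look like this (the field names below are illustrative assumptions; check the shipped `llm_config.json` for the actual schema):

```json
{
  "karen": {
    "model": "llama3",
    "system_prompt": "You are a strict copy editor. Fix grammar and spelling, keep the author's voice, and do not add content."
  }
}
```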
If you are writing a LaTeX article and want to process its text, you need to somehow extract it first. I haven't found a good way yet.
My current approach is the following:
- Create a PDF that is as "clean" as possible (no figures, no equations, single-column). See `/latex` for this.
- Use `extract_pdf.py` to get the text.
- Do some manual fine-tuning.
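The manual fine-tuning mostly means repairing PDF extraction artifacts. A sketch of the kind of cleanup involved (a hypothetical helper, not part of `extract_pdf.py`): re-joining words hyphenated across line breaks and merging hard-wrapped lines back into paragraphs.

```python
import re

def clean_extracted_text(text: str) -> str:
    """Undo common PDF extraction artifacts."""
    # Re-join words split across lines, e.g. "implemen-\ntation"
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Merge hard-wrapped lines into paragraphs; blank lines stay as breaks
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    return text
```

Anything the heuristics miss (stray page numbers, running headers) still needs a quick manual pass.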