Commit

edit README.md
joheli committed Feb 19, 2024
1 parent 69d1d5c commit b16dfdb
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions README.md
@@ -53,6 +53,14 @@ This tells `rosinenpicker` to look in `/path/to/documents` for PDF files contain

Extracting the literal term "apple pie" from documents is of course not very useful by itself, but you can do much more. Instead of "apple pie" you can enter a regular expression, e.g. "\d{8}" to extract a sequence of eight digits. Furthermore, if an expression contains "@@@" (which stands for "variable string"), only the part matching "@@@" is returned. E.g. "Name: @@@" will return whatever follows "Name:"!
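
To make the "@@@" behaviour more tangible, here is a minimal Python sketch of how such a term could be evaluated. It is not `rosinenpicker`'s actual implementation: the helper name `extract` and the idea of translating "@@@" into the capture group `(.*)` are assumptions made purely for illustration.

```python
import re
from typing import Optional


def extract(term: str, text: str) -> Optional[str]:
    """Hypothetical helper (not part of rosinenpicker) that mimics the
    documented behaviour of plain terms, regular expressions and "@@@"."""
    if "@@@" in term:
        # Assumption: "@@@" behaves like a capture group; only its match is returned.
        pattern = term.replace("@@@", "(.*)")
        match = re.search(pattern, text)
        return match.group(1).strip() if match else None
    # Without "@@@", the whole match of the term or regular expression is returned.
    match = re.search(term, text)
    return match.group(0) if match else None


document = "Name: Jane Doe\nInvoice 12345678 issued"
print(extract("Name: @@@", document))  # Jane Doe
print(extract(r"\d{8}", document))     # 12345678
```

Because `.` does not match line breaks by default, the sketch stops at the end of the line that contains "Name:".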

#### Even further fine-grained control
You can even add more fine-grained control by appending characters after the string '===' (three equal signs):
- `m` (**m**ultiline) will allow multiline pattern matching
- `l` (**l**inebreak to space) will replace line breaks with spaces (only applies to multiline matching)
- `c(x)` (**c**rop length to x) will crop the length of the returned string to x
- `?` (optional term) will mark the term as optional, i.e. not required; the optional key `move_to_directory` (see [sample configuration file](configs/config.yml)) will ignore these terms

So, e.g., a term "start@@@finish===mc(100)l?" will search for text between the patterns "start" and "finish" over multiple lines, replace line breaks with spaces, crop the returned text to 100 characters, and mark the term as optional (not required).
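
The option syntax described above can be illustrated with a short Python sketch that splits a term at "===" and decodes the trailing characters. The names `TermOptions` and `split_term` are hypothetical and the parsing logic is an assumption; `rosinenpicker` may process terms differently internally.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TermOptions:
    """Hypothetical container for the options that follow '==='."""
    multiline: bool = False           # m
    linebreak_to_space: bool = False  # l
    crop_length: Optional[int] = None # c(x)
    optional: bool = False            # ?


def split_term(term: str) -> Tuple[str, TermOptions]:
    """Split 'pattern===options' into the pattern and its decoded options.
    Assumption: the first '===' in the term separates pattern and options."""
    pattern, separator, flags = term.partition("===")
    options = TermOptions()
    if not separator:
        return pattern, options
    # Read c(x) first and remove it, so that only single-character flags remain.
    crop = re.search(r"c\((\d+)\)", flags)
    if crop:
        options.crop_length = int(crop.group(1))
        flags = flags.replace(crop.group(0), "")
    options.multiline = "m" in flags
    options.linebreak_to_space = "l" in flags
    options.optional = "?" in flags
    return pattern, options


pattern, options = split_term("start@@@finish===mc(100)l?")
print(pattern)  # start@@@finish
print(options)
```

For the example term, the sketch yields the pattern "start@@@finish" with all four options set and a crop length of 100.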

## Using `rosinenpicker`

With your `config.yml` ready, go back to the command line and run `rosinenpicker` with the `-c` and `-d` arguments as shown above.