updated the README
Signed-off-by: Peter Staar <[email protected]>
PeterStaar-IBM committed Jan 6, 2025
1 parent a57832f commit 1c33c50
Showing 8 changed files with 191 additions and 383 deletions.
98 changes: 98 additions & 0 deletions README.md
@@ -243,6 +243,104 @@ The final result (struct only here) can be visualised as,
| 0.95 | 1 | 27.53 | 72.47 | 27.53 | 2862 |
</details>

### Pub1M

Using a single command (loading the dataset from Huggingface: [Pub1M_OTSL](https://huggingface.co/datasets/ds4sd/Pub1M_OTSL)),

```sh
poetry run python ./docs/examples/benchmark_p1m.py
```

<details>
<summary><b>Table evaluations for Pub1M</b></summary>
<br>

👉 Evaluate the dataset,

```sh
poetry run evaluate -t evaluate -m tableformer -b Pub1M -i ./benchmarks/Pub1M-dataset/tableformer -o ./benchmarks/Pub1M-dataset/tableformer
```

👉 Visualise the dataset,

```sh
poetry run evaluate -t visualize -m tableformer -b Pub1M -i ./benchmarks/Pub1M-dataset/tableformer -o ./benchmarks/Pub1M-dataset/tableformer
```

| x0<=TEDS | TEDS<=x1 | prob [%] | acc [%] | 1-acc [%] | total |
|------------|------------|------------|-----------|-------------|---------|
| 0 | 0.05 | 1.3 | 0 | 100 | 13 |
| 0.05 | 0.1 | 0.8 | 1.3 | 98.7 | 8 |
| 0.1 | 0.15 | 0.2 | 2.1 | 97.9 | 2 |
| 0.15 | 0.2 | 0.2 | 2.3 | 97.7 | 2 |
| 0.2 | 0.25 | 0 | 2.5 | 97.5 | 0 |
| 0.25 | 0.3 | 0 | 2.5 | 97.5 | 0 |
| 0.3 | 0.35 | 0.3 | 2.5 | 97.5 | 3 |
| 0.35 | 0.4 | 0 | 2.8 | 97.2 | 0 |
| 0.4 | 0.45 | 0.1 | 2.8 | 97.2 | 1 |
| 0.45 | 0.5 | 0.3 | 2.9 | 97.1 | 3 |
| 0.5 | 0.55 | 0.8 | 3.2 | 96.8 | 8 |
| 0.55 | 0.6 | 1.6 | 4 | 96 | 16 |
| 0.6 | 0.65 | 1.6 | 5.6 | 94.4 | 16 |
| 0.65 | 0.7 | 2.3 | 7.2 | 92.8 | 23 |
| 0.7 | 0.75 | 4.6 | 9.5 | 90.5 | 46 |
| 0.75 | 0.8 | 10.8 | 14.1 | 85.9 | 108 |
| 0.8 | 0.85 | 15.3 | 24.9 | 75.1 | 153 |
| 0.85 | 0.9 | 21.6 | 40.2 | 59.8 | 216 |
| 0.9 | 0.95 | 22.9 | 61.8 | 38.2 | 229 |
| 0.95 | 1 | 15.3 | 84.7 | 15.3 | 153 |
</details>
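
The columns of the table above can be rebuilt from the per-bin `total` counts alone. The following is a minimal sketch, assuming that `prob` is each bin's share of all samples and that `acc` is the cumulative share of the bins strictly below the current one. Both assumptions agree with the numbers shown, but the function name `teds_table` is hypothetical and this is not taken from docling-eval's own code.

```python
def teds_table(counts, bin_width=0.05):
    """Rebuild (x0, x1, prob, acc, 1-acc, total) rows from per-bin counts."""
    total = sum(counts)
    rows = []
    below = 0  # samples in all bins strictly below the current one
    for i, count in enumerate(counts):
        x0 = round(i * bin_width, 2)
        x1 = round((i + 1) * bin_width, 2)
        prob = round(100.0 * count / total, 2)  # share of this bin, in percent
        acc = round(100.0 * below / total, 2)   # cumulative share below x0
        rows.append((x0, x1, prob, acc, round(100.0 - acc, 2), count))
        below += count
    return rows


# Per-bin totals copied from the Pub1M table above (1000 samples overall).
pub1m_counts = [13, 8, 2, 2, 0, 0, 3, 0, 1, 3,
                8, 16, 16, 23, 46, 108, 153, 216, 229, 153]

for row in teds_table(pub1m_counts):
    print(row)
```

Read this way, `1-acc [%]` in a given row is the share of tables whose TEDS is at or above that row's `x0`.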

### PubTabNet

Using a single command (loading the dataset from Huggingface: [Pubtabnet_OTSL](https://huggingface.co/datasets/ds4sd/Pubtabnet_OTSL)),

```sh
poetry run python ./docs/examples/benchmark_pubtabnet.py
```

<details>
<summary><b>Table evaluations for Pubtabnet</b></summary>
<br>

👉 Evaluate the dataset,

```sh
poetry run evaluate -t evaluate -m tableformer -b Pubtabnet -i ./benchmarks/pubtabnet-dataset/tableformer -o ./benchmarks/pubtabnet-dataset/tableformer
```

👉 Visualise the dataset,

```sh
poetry run evaluate -t visualize -m tableformer -b Pubtabnet -i ./benchmarks/pubtabnet-dataset/tableformer -o ./benchmarks/pubtabnet-dataset/tableformer
```

The final result (struct only here) can be visualised as,

| x0<=TEDS | TEDS<=x1 | prob [%] | acc [%] | 1-acc [%] | total |
|------------|------------|------------|-----------|-------------|---------|
| 0 | 0.05 | 0 | 0 | 100 | 0 |
| 0.05 | 0.1 | 0.01 | 0 | 100 | 1 |
| 0.1 | 0.15 | 0.01 | 0.01 | 99.99 | 1 |
| 0.15 | 0.2 | 0.02 | 0.02 | 99.98 | 2 |
| 0.2 | 0.25 | 0 | 0.04 | 99.96 | 0 |
| 0.25 | 0.3 | 0 | 0.04 | 99.96 | 0 |
| 0.3 | 0.35 | 0 | 0.04 | 99.96 | 0 |
| 0.35 | 0.4 | 0 | 0.04 | 99.96 | 0 |
| 0.4 | 0.45 | 0.02 | 0.04 | 99.96 | 2 |
| 0.45 | 0.5 | 0.1 | 0.06 | 99.94 | 10 |
| 0.5 | 0.55 | 0.1 | 0.15 | 99.85 | 10 |
| 0.55 | 0.6 | 0.24 | 0.25 | 99.75 | 25 |
| 0.6 | 0.65 | 0.47 | 0.49 | 99.51 | 49 |
| 0.65 | 0.7 | 1.04 | 0.96 | 99.04 | 108 |
| 0.7 | 0.75 | 2.44 | 2 | 98 | 254 |
| 0.75 | 0.8 | 4.65 | 4.44 | 95.56 | 483 |
| 0.8 | 0.85 | 13.71 | 9.09 | 90.91 | 1425 |
| 0.85 | 0.9 | 21.2 | 22.8 | 77.2 | 2204 |
| 0.9 | 0.95 | 28.48 | 43.99 | 56.01 | 2961 |
| 0.95 | 1 | 27.53 | 72.47 | 27.53 | 2862 |
</details>
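
A histogram like the one above is often read as "what share of tables reaches at least a given TEDS". A small sketch of that lookup, assuming 0.05-wide bins and a threshold that falls on a bin edge; the helper name `share_at_or_above` is hypothetical and not part of docling-eval.

```python
def share_at_or_above(counts, threshold, bin_width=0.05):
    """Percentage of samples in bins whose lower edge is >= threshold."""
    # Index of the bin whose lower edge equals the threshold
    # (the threshold is assumed to be a multiple of bin_width).
    first_bin = round(threshold / bin_width)
    return 100.0 * sum(counts[first_bin:]) / sum(counts)


# Per-bin totals copied from the PubTabNet table above.
pubtabnet_counts = [0, 1, 1, 2, 0, 0, 0, 0, 2, 10,
                    10, 25, 49, 108, 254, 483, 1425, 2204, 2961, 2862]

print(f"{share_at_or_above(pubtabnet_counts, 0.9):.2f} %")
```

For a threshold of 0.9 the result agrees with the `1-acc [%]` entry of the row starting at 0.9.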

## Contributing

Please read [Contributing to Docling](https://github.com/DS4SD/docling/blob/main/CONTRIBUTING.md) for details.
Empty file.
239 changes: 0 additions & 239 deletions docling_eval/benchmarks/fintabnet/create.py

This file was deleted.

14 changes: 9 additions & 5 deletions docling_eval/cli/main.py
@@ -138,9 +138,11 @@ def visualise(
     with open(filename, "r") as fd:
         table_evaluation = DatasetTableEvaluation.parse_file(filename)

-    figname = odir / f"evaluation_{benchmark.value}_{modality.value}-delta_row_col.png"
+    figname = (
+        odir / f"evaluation_{benchmark.value}_{modality.value}-delta_row_col.png"
+    )
     table_evaluation.save_histogram_delta_row_col(figname=figname)

     data, headers = table_evaluation.TEDS.to_table()
     logging.info(
         "TEDS table: \n\n" + tabulate(data, headers=headers, tablefmt="github")
@@ -153,10 +155,12 @@ def visualise(
     logging.info(
         "TEDS table: \n\n" + tabulate(data, headers=headers, tablefmt="github")
     )
-
-    figname = odir / f"evaluation_{benchmark.value}_{modality.value}-struct-only.png"
+
+    figname = (
+        odir / f"evaluation_{benchmark.value}_{modality.value}-struct-only.png"
+    )
     table_evaluation.TEDS_struct.save_histogram(figname=figname, name="struct")

 elif modality == EvaluationModality.CODEFORMER:
     pass
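
Both hunks only re-wrap the `figname` assignment in parentheses so the line fits a shorter width; the resulting path is the same. A self-contained sketch of that equivalence follows. The `Benchmark` and `Modality` enums, their values, and the `figure_path` helper are hypothetical stand-ins for illustration, not the names used in `docling_eval/cli/main.py`.

```python
from enum import Enum
from pathlib import Path


class Benchmark(str, Enum):  # hypothetical stand-in for the CLI's benchmark enum
    PUB1M = "Pub1M"


class Modality(str, Enum):  # hypothetical stand-in for the CLI's modality enum
    TABLEFORMER = "tableformer"


def figure_path(odir: Path, benchmark: Benchmark, modality: Modality, suffix: str) -> Path:
    # The parentheses only let the expression span several lines;
    # the value is identical to the one-line form.
    return (
        odir / f"evaluation_{benchmark.value}_{modality.value}-{suffix}.png"
    )


odir = Path("./benchmarks/Pub1M-dataset/tableformer")
one_line = odir / f"evaluation_{Benchmark.PUB1M.value}_{Modality.TABLEFORMER.value}-struct-only.png"
wrapped = figure_path(odir, Benchmark.PUB1M, Modality.TABLEFORMER, "struct-only")
print(wrapped, wrapped == one_line)
```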
