Merge pull request #3959 from guardian/an/mass-deletion-script
Commit mass-deletion script
andrew-nowak authored Jan 13, 2025
2 parents 940160f + 1bcc139 commit 56edbc8
Showing 2 changed files with 96 additions and 0 deletions.
43 changes: 43 additions & 0 deletions scripts/mass-deletion/README.md
@@ -0,0 +1,43 @@
# mass-deletion

The Grid services and UI have no way to perform mass deletion of images. When
this is required, you will need to perform it by interacting with the Grid API.
This script may help you to run these deletions.

## input

### todelete.txt

File containing the list of image IDs to be deleted, one per line.

e.g.
```
abcdef0123456789abcdef0123456789abcdef01
10fedcba9876543210fedcba9876543210fedcba
...
```
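
Because the script reads this file line by line and resumes by matching an ID against `progress.txt`, it may be worth cleaning the list before a run. The following is a minimal sketch, assuming IDs are 40 lowercase hex characters as in the example above; `raw.txt` is a hypothetical name for an unchecked export, and the scratch directory exists only so the sample does not touch real files.

```shell
# Work in a scratch directory so real files are not touched.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Sample input (hypothetical): two valid IDs, a blank line, a duplicate, a malformed entry.
printf '%s\n' \
  abcdef0123456789abcdef0123456789abcdef01 \
  10fedcba9876543210fedcba9876543210fedcba \
  '' \
  abcdef0123456789abcdef0123456789abcdef01 \
  not-an-id > raw.txt

# Keep only lines that are exactly 40 lowercase hex characters (assumed ID format),
# then drop repeated IDs while preserving the original order.
grep -Ex '[0-9a-f]{40}' raw.txt | awk '!seen[$0]++' > todelete.txt

# Show how many IDs survived the clean-up.
wc -l < todelete.txt
```

If your image IDs follow a different format, adjust or drop the `grep` pattern; the de-duplication step works regardless.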

### GRIDKEY

Environment variable containing [Grid API key](/docs/03-apis/01-authentication.md#api-keys)
to use for the request (should be Internal tier)

### GRIDDOMAIN

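Environment variable containing the Grid domain (the kahuna domain). The script
sends its requests to `https://api.$GRIDDOMAIN`, so do not include an `api.`
prefix.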
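
As a rough illustration of how the two variables are used, the sketch below sets them and prints the URL pattern the script requests. The key and domain values are placeholders, not real settings, and `<image-id>` stands in for an entry from `todelete.txt`.

```shell
# Placeholder values (assumptions): substitute your own key and domain.
export GRIDKEY="my-internal-tier-api-key"
export GRIDDOMAIN="media.example.com"

# The script prefixes the domain with "api." when building each request URL.
url="https://api.$GRIDDOMAIN/images/<image-id>/hard-delete"
echo "requests will go to: $url"

# Then, from the directory containing todelete.txt:
# ./mass-deletion.sh
```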

## output

### progress.txt

File containing the ID of the most recently deleted image. This allows the
script to be cancelled and resumed if necessary.

### complete.txt

File containing the list of successfully deleted Image IDs.

### errors.txt

File containing the list of Image IDs which could not be deleted (for any
reason).
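
After a run, these files can be combined to see what is still outstanding. The sketch below uses made-up IDs and a hypothetical `remaining.txt` to list every entry in `todelete.txt` that has no exact match in `complete.txt`; it is an illustration, not part of the script itself.

```shell
# Work in a scratch directory so real files are not touched.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Sample files (made-up IDs) standing in for a partially completed run.
printf '%s\n' id-a id-b id-c id-d > todelete.txt
printf '%s\n' id-a id-c > complete.txt
printf '%s\n' id-b > errors.txt

# -F: fixed strings, -x: whole-line match, -v: invert, -f: read patterns from a file.
# Result: IDs in todelete.txt that do not appear in complete.txt.
grep -vxFf complete.txt todelete.txt > remaining.txt

cat remaining.txt
```

One way to retry failures would be to use the remaining IDs as a new `todelete.txt` and remove `progress.txt` so the script starts from the top of the list. Since `errors.txt` is appended to on every run, you may also want to clear it first.
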
53 changes: 53 additions & 0 deletions scripts/mass-deletion/mass-deletion.sh
@@ -0,0 +1,53 @@
#!/bin/bash

# abort unless both required environment variables are set
if [[ -z "$GRIDKEY" || -z "$GRIDDOMAIN" ]]; then
  echo "make sure to set env vars GRIDKEY and GRIDDOMAIN to run this script"
  exit 1
fi

# input file, stores all the ids that must be deleted
del=todelete.txt
# progress file, stores the id most recently successfully deleted
prog=progress.txt
# completion file, stores all ids successfully deleted
com=complete.txt
# errors file, stores all ids that could not be deleted
errs=errors.txt

touch $prog
touch $com
touch $errs

all="$(wc -l $del | awk '{ print $1 }' )"
ndone=0

# while skipping is "yes", we'll fastforward through the input file "todelete.txt"
# until we get to the id that was in the progress file. This way we won't reattempt
# any deletions that completed in a previous run, and we can quickly resume our
# previous status.
# (If instead you do want to start again from the beginning, simply remove the progress file!)
skipping=yes
last="$(cat $prog)"



while IFS= read -r id; do
  ndone=$((ndone + 1))
  if [[ $skipping = "yes" && -n "$last" && "$id" != "$last" ]]; then
    # still fast-forwarding: this id precedes the resume point
    continue
  elif [[ $skipping = "yes" && -n "$last" ]]; then
    # reached the id recorded in the progress file; resume from the next line
    skipping=no
    continue
  else
    echo -n "deleting $id ($ndone / $all)... "
    if curl -fLso /dev/null -XDELETE -H "X-Gu-Media-Key: $GRIDKEY" "https://api.$GRIDDOMAIN/images/$id/hard-delete"; then
      echo "$id" > "$prog"
      echo "$id" >> "$com"
      echo "done"
    else
      echo "error $id"
      echo "$id" >> "$errs"
    fi
    sleep 0.2
  fi
done < "$del"
