https://github.com/Lejo1/facebook_ad_library
wget https://fastdl.mongodb.org/tools/db/mongodb-database-tools-amazon2-x86_64-100.10.0.tgz
tar -xzvf mongodb-database-tools-amazon2-x86_64-100.10.0.tgz
export PATH="$(pwd)/mongodb-database-tools-amazon2-x86_64-100.10.0/bin:$PATH"
zcat ads.bson.gz | bsondump -vvvvv --type=json | gzip > ads.json.gz
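To peek at the converted file without decompressing it to disk, a minimal Python sketch is given here. It assumes bsondump wrote one JSON document per line, which is its usual behaviour without `--pretty`. The helper name `iter_records` and its `limit` parameter are illustrative and not part of the repository's scripts.

```python
import gzip
import json
from itertools import islice


def iter_records(path, limit=None):
    """Yield parsed documents from a gzipped file holding one JSON object per line.

    Assumption: each non-empty line is a complete JSON document.
    `limit` (hypothetical parameter) stops after that many documents.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        # Skip blank lines so they do not count towards the limit.
        lines = (line for line in handle if line.strip())
        for line in islice(lines, limit):
            yield json.loads(line)
```

For instance, `for record in iter_records("ads.json.gz", limit=5): print(sorted(record))` would list the field names of the first five documents.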
(Ideally in a dedicated Python environment)
pip install -r requirements.txt
python analyze_data_structure.py ads.json.gz
Or add an extra integer argument to run only on the first N rows, for instance:
python analyze_data_structure.py ads.json.gz 1000000
Returns 2 files (possibly named ads_first_N_lines
when relevant):
ads.json.gz.datastructure.json
: a JSON file of all useful fields with their counts and natures
ads.json.gz.regions.csv
: a CSV file of all delivery regions with their counts
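The kind of summary described above (fields with counts and natures) can be approximated as follows. This is only a sketch of the idea, not the code of analyze_data_structure.py. The function name `summarize_fields` and the shape of its output are assumptions, and only top-level fields are inspected.

```python
from collections import Counter, defaultdict


def summarize_fields(records):
    """Count how often each top-level field appears and which Python types it holds.

    Hypothetical helper: the real script may inspect nested fields or
    report different statistics.
    """
    summary = defaultdict(lambda: {"count": 0, "types": Counter()})
    for record in records:
        for field, value in record.items():
            entry = summary[field]
            entry["count"] += 1
            entry["types"][type(value).__name__] += 1
    # Convert to plain dicts so the result can be dumped with json.dump.
    return {
        field: {"count": entry["count"], "types": dict(entry["types"])}
        for field, entry in summary.items()
    }
```

Passing any iterable of dictionaries works, so the function can be fed directly from a line-by-line reader of ads.json.gz.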
python convert_json_to_csv.py ads.json.gz | gzip > ads.csv.gz
Or add an extra integer argument to run only on the first N rows, for instance:
python convert_json_to_csv.py ads.json.gz 1000000 > ads_first_1000000_lines.csv
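For readers curious about what such a conversion involves, here is a rough sketch under stated assumptions: the column list is chosen by the caller, list values are joined with `|`, and missing fields become empty cells. None of these choices is confirmed by convert_json_to_csv.py, and `records_to_csv` is a hypothetical name.

```python
import csv


def records_to_csv(records, out, fields, list_sep="|"):
    """Write dictionaries as CSV rows to the text stream `out`.

    Assumptions (not taken from the repository): `fields` fixes the
    column order, lists are flattened with `list_sep`, and absent
    fields are written as empty strings.
    """
    writer = csv.writer(out)
    writer.writerow(fields)
    for record in records:
        row = []
        for field in fields:
            value = record.get(field, "")
            if isinstance(value, list):
                value = list_sep.join(str(item) for item in value)
            row.append(value)
        writer.writerow(row)
```

Writing to `sys.stdout` keeps the sketch compatible with the `| gzip > ads.csv.gz` pattern used above.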
xan search -s languages "fr" ads.csv.gz | gzip > ads-fr.csv.gz
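If xan is not available, the same filtering can be sketched in Python with the standard library. The example assumes a plain substring match on the `languages` column, so a value such as `en|fr` would also be kept. Whether that matches the exact semantics of `xan search` here is an assumption. `filter_rows` is an illustrative name.

```python
import csv
import gzip


def filter_rows(src, dst, column, needle):
    """Copy rows of a gzipped CSV whose `column` contains `needle`; return the number kept.

    Assumption: substring matching is the intended behaviour.
    """
    kept = 0
    with gzip.open(src, "rt", encoding="utf-8", newline="") as fin, \
         gzip.open(dst, "wt", encoding="utf-8", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(
            fout, fieldnames=reader.fieldnames or [], extrasaction="ignore"
        )
        writer.writeheader()
        for row in reader:
            # Treat a missing cell as an empty string.
            if needle in (row.get(column) or ""):
                writer.writerow(row)
                kept += 1
    return kept
```

A call such as `filter_rows("ads.csv.gz", "ads-fr.csv.gz", "languages", "fr")` mirrors the command above.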