Panjiva Data Methods

Code used to create the tables and figures in our FEDS note: "Bill of Lading Data in International Trade Research with an Application to the Covid-19 Pandemic" (Flaaen et al.).

database

All files used to build the database are written in Python and need a large server to process the raw data files. For our project, we used a server with 24 cores, 200 GB of RAM, and over 3 TB of disk space. Our version of the Panjiva Impala database is served on a Hadoop cluster. Your setup may differ depending on the resources available.

Downloading data from FTP server:
- Coordinator script with Ray local parallelization: data_ftppull_coordinator_ray.py
- Worker script: data_ftppull_worker.py

Processing data and uploading to Hadoop:
- Script with Dask local parallelization: process_and_load_dask.py
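
The coordinator/worker split used for the FTP pull can be illustrated with a minimal sketch. It is not the code in data_ftppull_coordinator_ray.py or data_ftppull_worker.py: it uses the standard library's concurrent.futures as a stand-in for Ray so that it runs without extra dependencies, the worker only simulates a download, and the names `pull_one`, `coordinate`, and the file names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def pull_one(remote_name):
    """Worker: stand-in for downloading one file from the FTP server."""
    # Assumption: a real worker would open an FTP connection here
    # (for example with ftplib) and write the file to local disk.
    # This stub only reports success so the sketch runs offline.
    return remote_name, "ok"


def coordinate(remote_names, max_workers=4):
    """Coordinator: fan the file list out to workers and collect statuses."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(pull_one, name) for name in remote_names]
        # Collect results as workers finish, in whatever order they complete.
        for future in as_completed(futures):
            name, status = future.result()
            results[name] = status
    return results


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = [f"shipments_part_{i:03d}.csv" for i in range(8)]
    statuses = coordinate(files)
    print(f"{len(statuses)} files pulled")
```

Keeping the worker as a plain function that handles one file makes it straightforward to swap the executor for another local parallelization backend, since only the coordinator needs to change.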

Data Manipulation

All files used to query/manipulate data from our Panjiva Hadoop database are located in code_for_paper/data_manipulation or code_for_paper/stata:

Figs 1 + 2: pull_teu_shpt_val.R
Fig 3: pull_transport_2019.R
Fig 4: pull_mode_of_transport.R
Fig 5: pull_walmart_redaction.R
Fig 6: pull_weighted_hist_shp_con.R
Fig 7: pull_hist_con_shp_shpt_per_year.R
Fig 8: pull_yoy_shp_per_con.R
Fig 9: pull_intramonth.R
Fig 10a: query_furniture_decomp.R, pull_furniture_decomp_step1.do, pull_furniture_decomp_step2.do
Fig 10b: query_all_decomp.R, pull_all_decomp_step1.do, pull_all_decomp_step2.do
Figs 11+12: query_port_to_port_data.R, step1_vessel_clean.do, step2_vessel_clean.do, port_analysis.do
Fig 13: pull_ports_teu.R
Fig 14: pull_teu_ports_all.R
Fig 15: panjiva-days-count-teu-port-data.ipynb
Figs 16 + 17: pull_trading_partners.R

Each file outputs a csv of data that can be used as the input for the charting scripts listed in the next section.

Charting

All charts are created using ggplot2. All files used to create charts are in code_for_paper/figures:

Figs 1 + 2: fig_teu_shpt_val.R
Fig 3: fig_transport_2019.R
Fig 4: fig_mode_of_transport.R
Fig 5: fig_walmart_redaction.R
Fig 6: fig_weighted_hist_shp_con.R
Fig 7: fig_hist_con_shp_shpt_per_year.R
Fig 8: fig_yoy_shp_per_con.R
Fig 9: fig_intramonth.R
Fig 10a: fig_furn_decomp.R
Fig 10b: fig_all_decomp.R
Fig 11: fig_avg_days_between_shipments_ec_la.R
Fig 12: fig_inbound_reroute_la_lb.R
Fig 13: fig_ports_teu.R
Fig 14: fig_teu_ports_all.R
Fig 15: fig_teu_delay.R
Figs 16 + 17: fig_trading_partners.R

Tables

All files used to create tables are in tables_for_paper:

Tab 2: tab_transport_2019_by_country_levels.R
Tab 3: tab_missing_data.R
Tab 4: tab_top_consignees.R
Tab 5: tab_top_shippers.R
Tab 6: tab_census_panjiva_weight.R
