Reading, Extracting, and Converting an Mbox File into a Tibble in R

jooyoungseo/mboxr

mboxr

License: GPL v3

The goal of mboxr is to let R users conveniently import an mbox file into an R tibble for hands-on analysis in the R environment.

Installation

Python Dependencies

mboxr requires an Anaconda Python environment on your system PATH.

If you have not installed a Conda environment on your system, please download and install Anaconda (Python 3.6 or later is recommended).

For this package, I have used the Python components mailbox.mbox, email.header.decode_header, email.utils, and pandas.DataFrame from R via reticulate.
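
Before calling mboxr, it can help to confirm that reticulate finds a Python installation and the modules named above. The following is a minimal sketch, not part of mboxr itself; it assumes the reticulate package is installed, and the variable names `required_modules` and `status` are made up for illustration.

```r
# Sketch: check the Python pieces mboxr relies on (assumption: reticulate is installed).
# The top-level module names below are taken from the components listed above.
required_modules <- c("mailbox", "email", "pandas")

if (requireNamespace("reticulate", quietly = TRUE)) {
  if (reticulate::py_available(initialize = TRUE)) {
    # TRUE/FALSE per module, named by module
    status <- vapply(required_modules, reticulate::py_module_available, logical(1))
    print(status)
  } else {
    message("reticulate could not find a usable Python installation.")
  }
} else {
  message("Install reticulate first: install.packages('reticulate')")
}
```

If any entry is FALSE, the corresponding module is probably missing from the Python environment that reticulate selected.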

R Package Installation

Development Version

You can install the latest development version as follows:

if (!require(remotes)) {
  install.packages("remotes")
}

remotes::install_github("jooyoungseo/mboxr")

Stable Version

You can install the released version of mboxr from CRAN with:

install.packages('mboxr')

Usage

Load the mboxr library, then call the read_mbox() function as shown below:

library(mboxr)
# Import your mbox file into R:
test <- system.file("extdata", "test1.mbox", package = "mboxr")
data <- read_mbox(test)
data
#> # A tibble: 2 x 6
#>   date                from      to         cc    subject  content          
#>   <dttm>              <chr>     <chr>      <chr> <chr>    <chr>            
#> 1 2011-07-08 12:08:34 Author <~ Recipient~ <NA>  Sample ~ "This is the bod~
#> 2 2011-07-08 12:08:34 Author <~ Recipient~ <NA>  Sample ~ "This is the sec~

# Alternatively, save the result as an RDS file while assigning the tibble to a variable at the same time:
data <- read_mbox(mbox = test, file = "output.rds")
data
#> # A tibble: 2 x 6
#>   date                from      to         cc    subject  content          
#>   <dttm>              <chr>     <chr>      <chr> <chr>    <chr>            
#> 1 2011-07-08 12:08:34 Author <~ Recipient~ <NA>  Sample ~ "This is the bod~
#> 2 2011-07-08 12:08:34 Author <~ Recipient~ <NA>  Sample ~ "This is the sec~

# Merge all mbox files in your current directory, or in any specified path, into one tibble and save the merged result as an RDS file:
test_path <- system.file("extdata", package = "mboxr")
all_data <- merge_mbox_all(path = test_path, file = "all_merged_mbox.rds")
## Find your "all_merged_mbox.rds" file saved in your working directory, and keep using the imported tibble freely in your R session!

all_data
#> # A tibble: 4 x 6
#>   date                from      to         cc    subject   content         
#>   <dttm>              <chr>     <chr>      <chr> <chr>     <chr>           
#> 1 2011-07-08 12:08:34 Author <~ Recipient~ <NA>  Sample m~ "This is the bo~
#> 2 2011-07-08 12:08:34 Author <~ Recipient~ <NA>  Sample m~ "This is the se~
#> 3 2011-07-09 12:09:35 Author <~ Recipient~ <NA>  Another ~ "R is the best!~
#> 4 2011-07-10 10:03:32 Author <~ Recipient~ <NA>  The last~ "This is the la~
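
Once the mbox data is in a tibble, ordinary R tools apply to it. As one possible illustration of the "hands-on analysis" step, the sketch below counts messages per calendar day using base R only. The helper `messages_per_day()` is hypothetical and not part of mboxr; it assumes only that the input has a date-time column named `date`, as in the output above.

```r
# Hypothetical helper (not part of mboxr): count messages per calendar day.
# Assumes `mbox_tbl` has a date-time column called `date`, like the tibbles above.
messages_per_day <- function(mbox_tbl) {
  days <- as.Date(mbox_tbl$date)   # drop the time of day
  counts <- table(days)            # tally messages for each distinct day
  data.frame(
    day = as.Date(names(counts)),
    n = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

# Usage, assuming `all_data` from the example above exists in your session:
# messages_per_day(all_data)
```

The function returns a plain data frame with one row per day, so it can be passed on to plotting or further summarising without extra dependencies.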