logistica.Rmd

---
title: "Modelando o Número Total de Casos de COVID-19 para o Brasil"
output: github_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Introdução

Uma estratégia bastante comum para modelar curvas de crescimento é o emprego da função logística ou sigmóide. Neste caso:


![\frac{\phi_1}{1 + \exp\left\{\frac{\phi_2 - x}{\phi_3}\right\}}](https://render.githubusercontent.com/render/math?math=%5Cfrac%7B%5Cphi_1%7D%7B1%20%2B%20%5Cexp%5Cleft%5C%7B%5Cfrac%7B%5Cphi_2%20-%20x%7D%7B%5Cphi_3%7D%5Cright%5C%7D%7D)

Nesta expressão, o parâmetro phi1 é a assíntota da curva (número máximo de casos), phi2 é o tempo em que atinge-se a metade dos casos e 1/phi3 é a velocidade de crescimento da função logística.

## Ajuste para Brasil

```{r config, include=FALSE}
library(tidyverse)
library(datacovidbr)
library(investr)
prepData = function(input)
  input %>% ungroup() %>% arrange(date) %>%
      mutate(day = date - lag(date, default = date[1]),
             d = cumsum(as.integer(day))) %>% 
      select(-day)

fitLogistic = function(input, days_ahead=14){
  fit = nls(confirmed ~ SSlogis(d, phi1, phi2, phi3), data=input,
            control=nls.control(minFactor = 1e-12))
  preds = predFit(fit, newdata=input, interval = "confidence")
  input = input %>% mutate(pred = preds[, 'fit'],
                           lb = preds[, 'lwr'],
                           ub = preds[, 'upr'],
                           status = "presente")
  rm(preds)
  futuro = tibble(d=max(input$d) + (1:days_ahead))
  futuro = futuro %>%
    mutate(date = head(input, 1)$date + futuro$d,
           confirmed = NA_integer_, deaths = NA_integer_)
  preds = predFit(fit, newdata=futuro, interval="prediction")
  futuro$pred = preds[,1]
  futuro$lb = preds[, 'lwr']
  futuro$ub = preds[, 'upr']
  futuro$status = "futuro"
  rm(preds)
  futuro = futuro %>%
    select(date, confirmed, deaths, d, pred, lb, ub, status)
  list(fit=fit, input=input, futuro=futuro)
}

## phi1/(1+exp((phi2-d)/phi3))
d1f = function(x, pars){
  expterm = exp((pars[2]-x)/pars[3])
  num = pars[1]*expterm
  den = pars[3]*((1+expterm)^2)
  num/den
}

d2f = function(x, pars){
  expterm = exp((pars[2]-x)/pars[3])
  num = pars[1]*expterm*(expterm-1)
  den = (pars[3]^2)*((expterm+1)^3)
  num/den
}

```

```{r dados_brasil, echo=FALSE, warning=FALSE, message=FALSE}
indata = CSSEGISandData() %>%
  ungroup() %>% 
  filter(Country.Region == "Brazil", casosAcumulados > 0) %>% 
  select(-Country.Region, -recuperadosAcumulado) %>% 
  rename(date=data, confirmed=casosAcumulados, deaths=obitosAcumulado) %>% 
  arrange(date) %>% prepData()
indata %>% head() %>% knitr::kable()
```

```{r, echo=FALSE, warning=FALSE, message=FALSE}
model = fitLogistic(indata)
alldata = model$input %>% bind_rows(model$futuro)
alldata %>% 
  ggplot(aes(date, confirmed)) +
  geom_ribbon(aes(ymin=lb, ymax=ub), fill="grey70") +
  geom_line(aes(y=pred, colour=status)) +
  geom_point() +
  theme_bw() +
  xlab("Data") + ylab("Casos Confirmados")

alldata = alldata %>%
  mutate(d1 = d1f(d, coef(model$fit)),
         d2=d2f(d, coef(model$fit)))

alldata %>% ggplot(aes(date, d1, color=status)) + geom_line() + theme_bw() + xlab("Data") + ylab("Derivada 1")
alldata %>% ggplot(aes(date, d2, color=status)) + geom_line() + theme_bw() + xlab("Data") + ylab("Derivada 2")
coef(model$fit) %>% knitr::kable()
```

Estimativa do pico: `r indata[1,1] + coef(model$fit)[2]`.


## Estado de São Paulo

```{r, message=FALSE}
indata = brasilio() %>%
  filter(place_type == "state", state=="SP") %>%
  select(date, confirmed, deaths) %>%
  ungroup() %>% prepData()
indata %>% head() %>% knitr::kable()
```


```{r, echo=FALSE, warning=FALSE, message=FALSE}
model = fitLogistic(indata)
alldata = model$input %>% bind_rows(model$futuro)
alldata %>% 
  ggplot(aes(date, confirmed)) +
  geom_ribbon(aes(ymin=lb, ymax=ub), fill="grey70") +
  geom_line(aes(y=pred, colour=status)) +
  geom_point() +
  theme_bw() +
  xlab("Data") + ylab("Casos Confirmados")

alldata = alldata %>%
  mutate(d1 = d1f(d, coef(model$fit)),
         d2=d2f(d, coef(model$fit)))

alldata %>% ggplot(aes(date, d1, color=status)) + geom_line() + theme_bw() + xlab("Data") + ylab("Derivada 1")
alldata %>% ggplot(aes(date, d2, color=status)) + geom_line() + theme_bw() + xlab("Data") + ylab("Derivada 2")
coef(model$fit) %>% knitr::kable()
```

Estimativa do pico: `r indata[1,1] + coef(model$fit)[2]`.

## Cidade de Campinas

```{r, message=FALSE}
indata = brasilio() %>%
  filter(place_type == "city", state=="SP", city=="Campinas") %>%
  select(date, confirmed, deaths) %>%
  ungroup() %>% prepData()
indata %>% head() %>% knitr::kable()
```


```{r, echo=FALSE, warning=FALSE, message=FALSE, eval=FALSE}
model = fitLogistic(indata)
alldata = model$input %>% bind_rows(model$futuro)
alldata %>% 
  ggplot(aes(date, confirmed)) +
  geom_ribbon(aes(ymin=lb, ymax=ub), fill="grey70") +
  geom_line(aes(y=pred, colour=status)) +
  geom_point() +
  theme_bw() +
  xlab("Data") + ylab("Casos Confirmados")

alldata = alldata %>%
  mutate(d1 = d1f(d, coef(model$fit)),
         d2=d2f(d, coef(model$fit)))

alldata %>% ggplot(aes(date, d1, color=status)) + geom_line() + theme_bw() + xlab("Data") + ylab("Derivada 1")
alldata %>% ggplot(aes(date, d2, color=status)) + geom_line() + theme_bw() + xlab("Data") + ylab("Derivada 2")
coef(model$fit) %>% knitr::kable()
```

Estimativa do pico: `r indata[1,1] + coef(model$fit)[2]`.


## Observação

O modelo precisa ser melhorado, pois as estimativas de pico não estão apropriadas.