diff --git a/LICENSE.html b/LICENSE.html
index 6ba8e78..454aa38 100644
--- a/LICENSE.html
+++ b/LICENSE.html
@@ -28,7 +28,7 @@
diff --git a/authors.html b/authors.html
index f50947d..73f4354 100644
--- a/authors.html
+++ b/authors.html
@@ -41,7 +41,7 @@
Authors
Citation
-Source: DESCRIPTION
+Source: DESCRIPTION
Turner S (2024). biorecap: Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama.
diff --git a/news/index.html b/news/index.html
index d1c743f..4d57cf1 100644
--- a/news/index.html
+++ b/news/index.html
@@ -28,7 +28,7 @@
diff --git a/pkgdown.yml b/pkgdown.yml
index e2330da..672a365 100644
--- a/pkgdown.yml
+++ b/pkgdown.yml
@@ -2,7 +2,7 @@ pandoc: 3.1.11
pkgdown: 2.1.1
pkgdown_sha: ~
articles: {}
-last_built: 2024-09-25T20:56Z
+last_built: 2024-09-27T08:38Z
urls:
reference: https://stephenturner.github.io/biorecap/reference
article: https://stephenturner.github.io/biorecap/articles
diff --git a/reference/add_prompt.html b/reference/add_prompt.html
index cf2545f..731acd3 100644
--- a/reference/add_prompt.html
+++ b/reference/add_prompt.html
@@ -28,7 +28,7 @@
diff --git a/reference/add_prompt_subject.html b/reference/add_prompt_subject.html
index 8c3fc9f..957de2f 100644
--- a/reference/add_prompt_subject.html
+++ b/reference/add_prompt_subject.html
@@ -28,7 +28,7 @@
diff --git a/reference/add_summary.html b/reference/add_summary.html
index 93e0ef5..8f3e46a 100644
--- a/reference/add_summary.html
+++ b/reference/add_summary.html
@@ -28,7 +28,7 @@
diff --git a/reference/biorecap-package.html b/reference/biorecap-package.html
index 2c92736..3493473 100644
--- a/reference/biorecap-package.html
+++ b/reference/biorecap-package.html
@@ -30,7 +30,7 @@
diff --git a/reference/biorecap_report.html b/reference/biorecap_report.html
index a6f6f94..8c1be49 100644
--- a/reference/biorecap_report.html
+++ b/reference/biorecap_report.html
@@ -28,7 +28,7 @@
diff --git a/reference/build_prompt_preprint.html b/reference/build_prompt_preprint.html
index 4a712e3..242b231 100644
--- a/reference/build_prompt_preprint.html
+++ b/reference/build_prompt_preprint.html
@@ -28,7 +28,7 @@
diff --git a/reference/build_prompt_subject.html b/reference/build_prompt_subject.html
index 6db5ad2..fd87189 100644
--- a/reference/build_prompt_subject.html
+++ b/reference/build_prompt_subject.html
@@ -28,7 +28,7 @@
diff --git a/reference/example_preprints.html b/reference/example_preprints.html
index 4efcf16..e1de3fe 100644
--- a/reference/example_preprints.html
+++ b/reference/example_preprints.html
@@ -28,7 +28,7 @@
diff --git a/reference/get_preprints.html b/reference/get_preprints.html
index 0fb3d23..f7cb104 100644
--- a/reference/get_preprints.html
+++ b/reference/get_preprints.html
@@ -28,7 +28,7 @@
diff --git a/reference/reexports.html b/reference/reexports.html
index 90ec5c9..ce126b4 100644
--- a/reference/reexports.html
+++ b/reference/reexports.html
@@ -42,7 +42,7 @@
diff --git a/reference/subjects.html b/reference/subjects.html
index 9353a13..a1dddb9 100644
--- a/reference/subjects.html
+++ b/reference/subjects.html
@@ -28,7 +28,7 @@
diff --git a/reference/tt_preprints.html b/reference/tt_preprints.html
index a0e8097..c769efc 100644
--- a/reference/tt_preprints.html
+++ b/reference/tt_preprints.html
@@ -28,7 +28,7 @@
diff --git a/search.json b/search.json
index db6b16b..864d10e 100644
--- a/search.json
+++ b/search.json
@@ -1 +1 @@
-[{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":null,"dir":"","previous_headings":"","what":"Contributing to biorecap","title":"Contributing to biorecap","text":"outlines propose change biorecap. detailed discussion contributing tidyverse packages, please see development contributing guide code review principles.","code":""},{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":"fixing-typos","dir":"","previous_headings":"","what":"Fixing typos","title":"Contributing to biorecap","text":"can fix typos, spelling mistakes, grammatical errors documentation directly using GitHub web interface, long changes made source file. generally means ’ll need edit roxygen2 comments .R, .Rd file. can find .R file generates .Rd reading comment first line.","code":""},{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":"bigger-changes","dir":"","previous_headings":"","what":"Bigger changes","title":"Contributing to biorecap","text":"want make bigger change, ’s good idea first file issue make sure someone team agrees ’s needed. ’ve found bug, please file issue illustrates bug minimal reprex (also help write unit test, needed). See guide create great issue advice.","code":""},{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":"pull-request-process","dir":"","previous_headings":"Bigger changes","what":"Pull request process","title":"Contributing to biorecap","text":"Fork package clone onto computer. haven’t done , recommend using usethis::create_from_github(\"stephenturner/biorecap\", fork = TRUE). Install development dependencies devtools::install_dev_deps(), make sure package passes R CMD check running devtools::check(). R CMD check doesn’t pass cleanly, ’s good idea ask help continuing. Create Git branch pull request (PR). recommend using usethis::pr_init(\"brief-description--change\"). Make changes, commit git, create PR running usethis::pr_push(), following prompts browser. 
title PR briefly describe change. body PR contain Fixes #issue-number. user-facing changes, add bullet top NEWS.md (.e. just first header). Follow style described https://style.tidyverse.org/news.html.","code":""},{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":"code-style","dir":"","previous_headings":"Bigger changes","what":"Code style","title":"Contributing to biorecap","text":"New code follow tidyverse style guide. can use styler package apply styles, please don’t restyle code nothing PR. use roxygen2, Markdown syntax, documentation. use testthat unit tests. Contributions test cases included easier accept.","code":""},{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":"code-of-conduct","dir":"","previous_headings":"","what":"Code of Conduct","title":"Contributing to biorecap","text":"Please note biorecap project released Contributor Code Conduct. contributing project agree abide terms.","code":""},{"path":"https://stephenturner.github.io/biorecap/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2024 Stephen Turner Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. 
EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"https://stephenturner.github.io/biorecap/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Stephen Turner. Author, maintainer.","code":""},{"path":"https://stephenturner.github.io/biorecap/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Turner S (2024). biorecap: Retrieve summarize bioRxiv medRxiv preprints local LLM using ollama. R package version 0.2.0, https://stephenturner.github.io/biorecap/.","code":"@Manual{, title = {biorecap: Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama}, author = {Stephen Turner}, year = {2024}, note = {R package version 0.2.0}, url = {https://stephenturner.github.io/biorecap/}, }"},{"path":"https://stephenturner.github.io/biorecap/index.html","id":"biorecap-","dir":"","previous_headings":"","what":"Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama","title":"Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama","text":"Retrieve summarize bioRxiv medRxiv preprints using local LLM Ollama via ollamar. Turner, S. D. (2024). biorecap: R package summarizing bioRxiv preprints local LLM. arXiv, 2408.11707. 
https://doi.org/10.48550/arXiv.2408.11707.","code":""},{"path":"https://stephenturner.github.io/biorecap/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama","text":"Install biorecap GitHub (keep dependencies=TRUE get Suggests packages needed create HTML report):","code":"# install.packages(\"remotes\") remotes::install_github(\"stephenturner/biorecap\", dependencies=TRUE)"},{"path":[]},{"path":"https://stephenturner.github.io/biorecap/index.html","id":"quick-start","dir":"","previous_headings":"Usage","what":"Quick start","title":"Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama","text":"First, load biorecap library. Let’s make sure Ollama running can talk R: Next can list available models: Write HTML report containing summaries recent preprints select subject areas current working directory. can include bioRxiv medRxiv subjects, biorecap know RSS feed use. 
Example HTML report generated bioRxiv (bioinformatics) infectious diseases (medRxiv) subjects September 25, 2024:","code":"library(biorecap) test_connection() #> Ollama local server running #> #> GET http://localhost:11434/ #> Status: 200 OK #> Content-Type: text/plain #> Body: In memory (17 bytes) list_models() name size parameter_size quantization_level modified 1 gemma2:latest 5.4 GB 9.2B Q4_0 2024-08-07T07:35:15 3 llama3.1:70b 40 GB 70.6B Q4_0 2024-07-24T10:57:08 4 llama3.1:latest 4.7 GB 8.0B Q4_0 2024-07-31T09:38:38 5 llama3.2:latest 2 GB 3.2B Q4_K_M 2024-09-25T14:54:23 6 phi3:latest 2.2 GB 3.8B Q4_0 2024-08-28T04:37:58 biorecap_report(output_dir=\".\", subject=c(\"bioinformatics\", \"infectious_diseases\"), model=\"llama3.1\")"},{"path":"https://stephenturner.github.io/biorecap/index.html","id":"details","dir":"","previous_headings":"Usage","what":"Details","title":"Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama","text":"get_preprints() function retrieves preprints RSS feed either bioRxiv medRxiv, based subject provided. pass one subjects subject argument. add_prompt() function adds prompt preprint used prompt model. Let’s take look one prompts: giving paper’s title abstract. Summarize paper many sentences instruct. include preamble text. Just give summary. Number sentences summary: 2 Title: SeuratExtend: Streamlining Single-Cell RNA-Seq Analysis Integrated Intuitive Framework Abstract: Single-cell RNA sequencing (scRNA-seq) revolutionized study cellular heterogeneity, rapid expansion analytical tools proven blessing curse, presenting researchers significant challenges. , present SeuratExtend, comprehensive R package built upon widely adopted Seurat framework, streamlines scRNA-seq data analysis integrating essential tools databases. SeuratExtend offers user-friendly intuitive interface performing wide range analyses, including functional enrichment, trajectory inference, gene regulatory network reconstruction, denoising. 
package seamlessly integrates multiple databases, Gene Ontology Reactome, incorporates popular Python tools like scVelo, Palantir, SCENIC unified R interface. SeuratExtend enhances data visualization optimized plotting functions carefully curated color schemes, ensuring aesthetic appeal scientific rigor. demonstrate SeuratExtend’s performance case studies investigating tumor-associated high-endothelial venules autoinflammatory diseases, showcase novel applications pathway-Level analysis cluster annotation. SeuratExtend empowers researchers harness full potential scRNA-seq data, making complex analyses accessible wider audience. package, along comprehensive documentation tutorials, freely available GitHub, providing valuable resource single-cell genomics community. add_summary() function uses locally running LLM available Ollama summarize preprint. Let’s add summary. Notice can single pipeline. takes minutes! Let’s take look results: Let’s look one summaries. ’s summary SeuratExtend paper (abstract ): SeuratExtend R package integrates essential tools databases single-cell RNA sequencing (scRNA-seq) data analysis, streamlining process user-friendly interface. package offers various analyses, including functional enrichment gene regulatory network reconstruction, seamlessly integrates multiple databases popular Python tools. biorecap_report() function runs code RMarkdown template, writing resulting HTML CSV file results current working directory. built-subjects list vectors containing available bioRxiv medRxiv subjects. 
create report subjects like (note, take time):","code":"pp <- get_preprints(subject=c(\"bioinformatics\", \"infectious_diseases\")) head(pp) tail(pp) #> # A tibble: 6 × 5 #> source subject title url abstract #> #> 1 bioRxiv bioinformatics MedGraphNet: Leveraging Multi-Relationa… http… Genetic… #> 2 bioRxiv bioinformatics High-throughput bacterial aggregation a… http… The com… #> 3 bioRxiv bioinformatics scParadise: Tunable highly accurate mul… http… scRNA-s… #> 4 bioRxiv bioinformatics Camera Paths, Modeling, and Image Proce… http… The enh… #> 5 bioRxiv bioinformatics dScaff - an automatic bioinformatics fr… http… Rapid e… #> 6 bioRxiv bioinformatics Jaeger: an accurate and fast deep-learn… http… Abstrac… #> # A tibble: 6 × 5 #> source subject title url abstract #> #> 1 medRxiv infectious_diseases Reactogenicity and immunogenicity … http… \"The re… #> 2 medRxiv infectious_diseases A next generation CRISPR diagnosti… http… \"The WH… #> 3 medRxiv infectious_diseases Hospital-onset bacteraemia and fun… http… \"Backgr… #> 4 medRxiv infectious_diseases Co-circulating pathogens of humans… http… \"Histor… #> 5 medRxiv infectious_diseases Integration of Group A Streptococc… http… \"The Ca… #> 6 medRxiv infectious_diseases Deep Learning Models for Predictin… http… \"The Nu… pp <- pp |> add_prompt() pp #> # A tibble: 60 × 6 #> source subject title url abstract prompt #> #> 1 bioRxiv bioinformatics MedGraphNet: Leveraging Multi-R… http… Genetic… I am … #> 2 bioRxiv bioinformatics High-throughput bacterial aggre… http… The com… I am … #> 3 bioRxiv bioinformatics scParadise: Tunable highly accu… http… scRNA-s… I am … #> 4 bioRxiv bioinformatics Camera Paths, Modeling, and Ima… http… The enh… I am … #> 5 bioRxiv bioinformatics dScaff - an automatic bioinform… http… Rapid e… I am … #> 6 bioRxiv bioinformatics Jaeger: an accurate and fast de… http… Abstrac… I am … #> 7 bioRxiv bioinformatics AI-Augmented R-Group Exploratio… http… Efficie… I am … #> 8 bioRxiv bioinformatics 
OPLS-based Multiclass Classific… http… Multicl… I am … #> 9 bioRxiv bioinformatics STANCE: a unified statistical m… http… A signi… I am … #> 10 bioRxiv bioinformatics AsaruSim: a single-cell and spa… http… Motivat… I am … #> # ℹ 50 more rows pp <- get_preprints(subject=c(\"bioinformatics\", \"infectious_diseases\")) |> add_prompt() |> add_summary(model=\"llama3.2\") pp #> # A tibble: 60 × 7 #> source subject title url abstract prompt summary #> #> 1 bioRxiv bioinformatics MedGraphNet: Leveraging… http… Genetic… I am … MedGra… #> 2 bioRxiv bioinformatics High-throughput bacteri… http… The com… I am … The co… #> 3 bioRxiv bioinformatics scParadise: Tunable hig… http… scRNA-s… I am … scAdam… #> 4 bioRxiv bioinformatics Camera Paths, Modeling,… http… The enh… I am … ArtiaX… #> 5 bioRxiv bioinformatics dScaff - an automatic b… http… Rapid e… I am … dScaff… #> 6 bioRxiv bioinformatics Jaeger: an accurate and… http… Abstrac… I am … Jaeger… #> 7 bioRxiv bioinformatics AI-Augmented R-Group Ex… http… Efficie… I am … The pa… #> 8 bioRxiv bioinformatics OPLS-based Multiclass C… http… Multicl… I am … OPLS-D… #> 9 bioRxiv bioinformatics STANCE: a unified stati… http… A signi… I am … STANCE… #> 10 bioRxiv bioinformatics AsaruSim: a single-cell… http… Motivat… I am … AsaruS… #> # ℹ 50 more rows biorecap_report(output_dir=\".\", subject=c(\"bioinformatics\", \"infectious_diseases\"), model=\"llama3.2\") subjects$biorxiv #> [1] \"all\" #> [2] \"animal_behavior_and_cognition\" #> [3] \"biochemistry\" #> [4] \"bioengineering\" #> [5] \"bioinformatics\" #> [6] \"biophysics\" #> [7] \"cancer_biology\" #> [8] \"cell_biology\" #> [9] \"clinical_trials\" #> [10] \"developmental_biology\" #> [11] \"ecology\" #> [12] \"epidemiology\" #> [13] \"evolutionary_biology\" #> [14] \"genetics\" #> [15] \"genomics\" #> [16] \"immunology\" #> [17] \"microbiology\" #> [18] \"molecular_biology\" #> [19] \"neuroscience\" #> [20] \"paleontology\" #> [21] \"pathology\" #> [22] 
\"pharmacology_and_toxicology\" #> [23] \"plant_biology\" #> [24] \"scientific_communication_and_education\" #> [25] \"synthetic_biology\" #> [26] \"systems_biology\" #> [27] \"zoology\" subjects$medrxiv #> [1] \"all\" #> [2] \"addiction_medicine\" #> [3] \"allergy_and_immunology\" #> [4] \"anesthesia\" #> [5] \"cardiovascular_medicine\" #> [6] \"dentistry_and_oral_medicine\" #> [7] \"dermatology\" #> [8] \"dermatology\" #> [9] \"endocrinology\" #> [10] \"epidemiology\" #> [11] \"ecology\" #> [12] \"epidemiology\" #> [13] \"forensic_medicine\" #> [14] \"gastroenterology\" #> [15] \"genetic_and_genomic_medicine\" #> [16] \"geriatric_medicine\" #> [17] \"health_economics\" #> [18] \"health_informatics\" #> [19] \"health_policy\" #> [20] \"health_systems_and_quality_improvement\" #> [21] \"hematology\" #> [22] \"hivaids\" #> [23] \"infectious_diseases\" #> [24] \"intensive_care_and_critical_care_medicine\" #> [25] \"medical_education\" #> [26] \"medical_ethics\" #> [27] \"nephrology\" #> [28] \"neurology\" #> [29] \"nursing\" #> [30] \"nutrition\" #> [31] \"obstetrics_and_gynecology\" #> [32] \"occupational_and_environmental_health\" #> [33] \"oncology\" #> [34] \"ophthalmology\" #> [35] \"orthopedics\" #> [36] \"otolaryngology\" #> [37] \"pain_medicine\" #> [38] \"palliative_medicine\" #> [39] \"pathology\" #> [40] \"pediatrics\" #> [41] \"pharmacology_and_therapeutics\" #> [42] \"primary_care_research\" #> [43] \"psychiatry_and_clinical_psychology\" #> [44] \"public_and_global_health\" #> [45] \"radiology_and_imaging\" #> [46] \"rehabilitation_medicine_and_physical_therapy\" #> [47] \"respiratory_medicine\" #> [48] \"rheumatology\" #> [49] \"sexual_and_reproductive_health\" #> [50] \"sports_medicine\" #> [51] \"surgery\" #> [52] \"toxicology\" #> [53] \"transplantation\" #> [54] \"urology\" biorecap_report(output_dir=\".\", subject=subjects, 
model=\"llama3.2\")"},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt.html","id":null,"dir":"Reference","previous_headings":"","what":"Add prompt to a data frame of preprints — add_prompt","title":"Add prompt to a data frame of preprints — add_prompt","text":"Add prompt data frame preprints","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Add prompt to a data frame of preprints — add_prompt","text":"","code":"add_prompt(preprints, ...)"},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Add prompt to a data frame of preprints — add_prompt","text":"preprints Result get_preprints(). ... Additional arguments build_prompt_preprint().","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Add prompt to a data frame of preprints — add_prompt","text":"data frame preprints prompt added.","code":""},{"path":[]},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Add prompt to a data frame of preprints — add_prompt","text":"","code":"preprints <- get_preprints(subject=c(\"bioinformatics\", \"genomics\")) preprints <- add_prompt(preprints) preprints #> # A tibble: 60 × 6 #> source subject title url abstract prompt #> #> 1 bioRxiv bioinformatics SeaMoon: Prediction of molecula… http… How pro… \"I am… #> 2 bioRxiv bioinformatics Clustering individuals using IN… http… Motivat… \"I am… #> 3 bioRxiv bioinformatics INSPIRE: interpretable, flexibl… http… Recent … \"I am… #> 4 bioRxiv bioinformatics MedGraphNet: Leveraging Multi-R… http… Genetic… \"I am… #> 5 bioRxiv bioinformatics High-throughput bacterial 
aggre… http… The com… \"I am… #> 6 bioRxiv bioinformatics scParadise: Tunable highly accu… http… scRNA-s… \"I am… #> 7 bioRxiv bioinformatics Camera Paths, Modeling, and Ima… http… The enh… \"I am… #> 8 bioRxiv bioinformatics dScaff - an automatic bioinform… http… Rapid e… \"I am… #> 9 bioRxiv bioinformatics Jaeger: an accurate and fast de… http… Abstrac… \"I am… #> 10 bioRxiv bioinformatics AI-Augmented R-Group Exploratio… http… Efficie… \"I am… #> # ℹ 50 more rows"},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt_subject.html","id":null,"dir":"Reference","previous_headings":"","what":"Add prompts for an entire subject — add_prompt_subject","title":"Add prompts for an entire subject — add_prompt_subject","text":"Add prompts entire subject","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt_subject.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Add prompts for an entire subject — add_prompt_subject","text":"","code":"add_prompt_subject(preprints, ...)"},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt_subject.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Add prompts for an entire subject — add_prompt_subject","text":"preprints Output get_preprints() followed add_prompt() followed add_summary(). ... 
Additional arguments build_prompt_subject().","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt_subject.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Add prompts for an entire subject — add_prompt_subject","text":"tibble subject prompt column.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt_subject.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Add prompts for an entire subject — add_prompt_subject","text":"","code":"subjects <- example_preprints |> dplyr::group_by(subject) |> add_prompt_subject() #> Warning: Expecting a tibble of class 'preprints_prompt' returned from get_preprints() |> add_prompt(). subjects #> # A tibble: 2 × 2 #> subject prompt #> #> 1 bioinformatics \"I am giving you information about recent bioRxiv/medRxiv… #> 2 infectious_diseases \"I am giving you information about recent bioRxiv/medRxiv…"},{"path":"https://stephenturner.github.io/biorecap/reference/add_summary.html","id":null,"dir":"Reference","previous_headings":"","what":"Generate a summary from a data frame of prompts — add_summary","title":"Generate a summary from a data frame of prompts — add_summary","text":"Generate summary data frame prompts","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_summary.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Generate a summary from a data frame of prompts — add_summary","text":"","code":"add_summary(preprints, model = \"llama3.2\")"},{"path":"https://stephenturner.github.io/biorecap/reference/add_summary.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Generate a summary from a data frame of prompts — add_summary","text":"preprints Output get_preprints() followed add_prompt(). 
model model available Ollama (run ollamar::list_models()) see available.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_summary.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Generate a summary from a data frame of prompts — add_summary","text":"tibble, response column added.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_summary.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Generate a summary from a data frame of prompts — add_summary","text":"","code":"if (FALSE) { # \\dontrun{ # Individual papers preprints <- get_preprints(c(\"genomics\", \"bioinformatics\")) |> add_prompt() |> add_summary() preprints } # }"},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap-package.html","id":null,"dir":"Reference","previous_headings":"","what":"biorecap: Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama — biorecap-package","title":"biorecap: Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama — biorecap-package","text":"Retrieve summarize bioRxiv medRxiv preprints local LLM using ollama.","code":""},{"path":[]},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"biorecap: Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama — biorecap-package","text":"Maintainer: Stephen Turner vustephen@gmail.com (ORCID)","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap_report.html","id":null,"dir":"Reference","previous_headings":"","what":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","title":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","text":"Create report bioRxiv/medRxiv 
preprints","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap_report.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","text":"","code":"biorecap_report( output_dir = \".\", subject = NULL, nsentences = 2L, model = \"llama3.1\", use_example_preprints = FALSE, ... )"},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap_report.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","text":"output_dir Directory save report. subject Character vector subjects include report. nsentences Number sentences summarize paper . model model use generating summaries. See ollamar::list_models(). use_example_preprints Use example preprints data included package instead fetching new data bioRxiv/medRxiv. diagnostic/testing purposes . ... arguments passed rmarkdown::render().","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap_report.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","text":"Nothing; called side effects produce report.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap_report.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","text":"","code":"if (FALSE) { # \\dontrun{ output_dir <- tempdir() biorecap_report(use_example_preprints=TRUE, output_dir=output_dir) biorecap_report(subject=c(\"bioinformatics\", \"genomics\", \"synthetic_biology\"), output_dir=output_dir) } # }"},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_preprint.html","id":null,"dir":"Reference","previous_headings":"","what":"Construct a prompt to summarize a paper — 
build_prompt_preprint","title":"Construct a prompt to summarize a paper — build_prompt_preprint","text":"Construct prompt summarize paper","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_preprint.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Construct a prompt to summarize a paper — build_prompt_preprint","text":"","code":"build_prompt_preprint( title, abstract, nsentences = 2L, instructions = c(\"I am giving you a paper's title and abstract.\", \"Summarize the paper in as many sentences as I instruct.\", \"Do not include any preamble text to the summary\", \"just give me the summary with no preface or intro sentence.\") )"},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_preprint.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Construct a prompt to summarize a paper — build_prompt_preprint","text":"title title paper. abstract abstract paper. nsentences number sentences summarize paper . instructions Instructions prompt. can character vector gets collapsed single string.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_preprint.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Construct a prompt to summarize a paper — build_prompt_preprint","text":"string containing prompt.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_preprint.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Construct a prompt to summarize a paper — build_prompt_preprint","text":"","code":"build_prompt_preprint(title=\"A great paper\", abstract=\"This is the abstract.\") #> [1] \"I am giving you a paper's title and abstract. Summarize the paper in as many sentences as I instruct. 
Do not include any preamble text to the summary just give me the summary with no preface or intro sentence.\\nNumber of sentences in summary: 2\\nTitle: A great paper\\nAbstract: This is the abstract.\""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_subject.html","id":null,"dir":"Reference","previous_headings":"","what":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","title":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","text":"Construct prompt summarize set papers subject","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_subject.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","text":"","code":"build_prompt_subject( subject, title, summary, nsentences = 5L, instructions = c(\"I am giving you information about recent bioRxiv/medRxiv preprints.\", \"I'll give you the subject, preprint titles, and short summary of each paper.\", \"Please provide a general summary new advances in this subject/field in general.\", \"Provide this summary of the field in as many sentences as I instruct.\", \"Do not include any preamble text to the summary\", \"just give me the summary with no preface or intro sentence.\") )"},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_subject.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","text":"subject name subject. title character vector titles subject summary character vector summaries paper provided get_preprints() followed add_prompt() followed add_summary(). nsentences number sentences summarize subject . instructions Instructions prompt. 
can character vector gets collapsed single string.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_subject.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","text":"string containing prompt.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_subject.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","text":"","code":"title <- example_preprints |> dplyr::filter(subject==\"bioinformatics\") |> dplyr::pull(title) summary <- example_preprints |> dplyr::filter(subject==\"bioinformatics\") |> dplyr::pull(summary) build_prompt_subject(subject=\"bioinformatics\", title=title, summary=summary) #> [1] \"I am giving you information about recent bioRxiv/medRxiv preprints. I'll give you the subject, preprint titles, and short summary of each paper. Please provide a general summary new advances in this subject/field in general. Provide this summary of the field in as many sentences as I instruct. Do not include any preamble text to the summary just give me the summary with no preface or intro sentence.\\n\\nSubject: bioinformatics\\nNumber of sentences in summary: 5\\n\\nHere are the titles and summaries:\\n\\nTitle: MedGraphNet: Leveraging Multi-Relational Graph Neural Networks and Text Knowledge for Biomedical Predictions\\nSummary: MedGraphNet leverages multi-relational Graph Neural Networks and text knowledge to improve biomedical predictions by initializing nodes using informative embeddings from existing text knowledge, allowing for robust integration of various data types and improved generalizability. 
The model demonstrates superior performance compared to traditional single-relation approaches in scenarios with isolated or sparsely connected nodes, particularly in identifying disease-gene associations and drug-phenotype relationships, and shows promising results in accurately inferring drug side effects without direct training on such data.\\n\\nTitle: High-throughput bacterial aggregation analysis in droplets\\nSummary: The communal lifestyle of bacteria can contribute significantly to antimicrobial resistance by promoting biofilm formation. A key approach to addressing this issue is to develop novel techniques for analyzing bacterial behavior, such as those enabled by droplet-based platforms and image analysis methods.\\n\\nTitle: scParadise: Tunable highly accurate multi-task cell type annotation and surface protein abundance prediction\\nSummary: scAdam outperforms existing methods in annotating rare cell types with high accuracy and consistency across diverse datasets. scEve enhances clustering and cell type separation through improved surface protein prediction, leading to better characterization of complex tissues.\\n\\nTitle: Camera Paths, Modeling, and Image Processing Tools for ArtiaX\\nSummary: ArtiaX is a plugin that has been extended to improve the analysis and visualization of cryo-electron tomography data through advanced visualization techniques. The plugin allows for the generation of diverse models with putative particle positions and orientations, as well as a coarse grained algorithm to rectify overlaps in template matching, driving camera position and facilitating movie creation with fundamental image filtering options.\\n\\nTitle: dScaff - an automatic bioinformatics framework for scaffolding draft de novo assemblies based on reference genome data\\nSummary: dScaff is an automatic bioinformatics framework designed for scaffolding draft de novo assemblies based on reference genome data. 
The tool uses a series of bash and R scripts to create a minimal complete scaffold from a genome assembly, with potential future features to be implemented, including using reference chromosomes or scaffolds.\\n\\nTitle: Jaeger: an accurate and fast deep-learning tool to detect bacteriophage sequences\\nSummary: Jaeger's accuracy and speed in identifying bacteriophage sequences outperform existing deep-learning tools by consistently producing few false positives despite encountering diverse viral sequences. The novel method achieves an estimated 2-27% false discovery rate when applied to over 16,000 metagenomic assemblies, which is significantly lower than the benchmarking paper where deep-learning tools produced many false positives.\\n\\nTitle: AI-Augmented R-Group Exploration in Medicinal Chemistry\\nSummary: The paper presents a novel approach to enhancing free-wing QSAR models by embedding R-groups with atom-centric pharmacophoric features, allowing for the distinction of regioisomers and improved predictivity across 12 public datasets. The proposed method is integrated into an open-source program, enabling its application in various scenarios, including classic free-Wilson analysis and exploration of uncharted chemical space facilitated by AI-generated building blocks.\\n\\nTitle: OPLS-based Multiclass Classification and Data-Driven Inter-Class Relationship Discovery\\nSummary: OPLS-DA models are widely used in metabolomics for two-class comparisons due to their strong discrimination capabilities, but these models face challenges in multiclass settings. 
An extension of OPLS-DA called OPLS-HDA integrates Hierarchical Cluster Analysis with the OPLS-DA framework to create a decision tree that addresses multiclass classification challenges and provides intuitive visualization of inter-class relationships.\\n\\nTitle: STANCE: a unified statistical model to detect cell-type-specific spatially variable genes in spatial transcriptomics\\nSummary: STANCE, a unified statistical model to detect cell-type-specific spatially variable genes in spatial transcriptomics, was developed to address the challenges posed by existing methods in detecting spatially variable genes (SVGs) and cell type-specific spatially variable genes (ctSVGs). The proposed method integrates gene expression, spatial location, and cell type composition through a linear mixed-effect model to identify both SVGs and ctSVGs in an initial stage, followed by a second stage test dedicated to ctSVG detection.\\n\\nTitle: AsaruSim: a single-cell and spatial RNA-Seq Nanopore long-reads simulation workflow\\nSummary: AsaruSim simulates synthetic single-cell long-read Nanopore datasets that closely mimic real experimental data by employing a multi-step process. It includes the creation of a synthetic UMI count matrix, generation of perfect reads, optional PCR amplification, introduction of sequencing errors, and comprehensive quality control reporting.\\n\\nTitle: Building a literature knowledge base towards transparent biomedical AI\\nSummary: LiteralGraph extracts biomedical terms and relationships from PubMed literature, establishing a comprehensive knowledge graph. 
The resulting Genomic Literature Knowledge Base consolidates over 263 million biomedical terms, 14 million relationships, and 10 million genomic events across multiple sources, including nine established repositories.\\n\\nTitle: Accurate non-invasive quantification of astaxanthin content using hyperspectral images and machine learning\\nSummary: The authors investigated a method to accurately quantify astaxanthin content in Haematococcus pluvialis microalgae cultures using hyperspectral images and machine learning. They found that this approach, combining reflectance hyperspectral imaging with a 1-dimensional convolutional neural network, had low average prediction error across a range of astaxanthin contents, although it was unreliable at very low levels (<0.6 micrograms mg-1).\\n\\nTitle: AlphaMut: a deep reinforcement learning model to suggest helix-disrupting mutations\\nSummary: The authors propose a deep reinforcement learning model called AlphaMut to predict helix-disrupting mutations in proteins. AlphaMut identifies amino acids crucial for maintaining structural integrity and predicts key mutations that could alter protein function.\\n\\nTitle: Beyond Static Brain Atlases: AI-Powered Open Databasing and Dynamic Mining of Brain-Wide Neuron Morphometry\\nSummary: NeuroXiv is a large-scale database that provides detailed 3D morphologies of individual neurons mapped to a standard brain atlas, allowing for dynamic, interactive neuroscience applications. 
The database offers a comprehensive collection of 175,149 atlas-oriented reconstructed morphologies of individual neurons from over 518 mouse brains, classified into 292 distinct types and mapped into the Common Coordinate Framework Version 3 (CCFv3).\\n\\nTitle: Metabolic modeling identifies determinants of thermal growth responses in Arabidopsis thaliana\\nSummary: The paper developed an enzyme-constrained model of Arabidopsis thaliana's metabolism, which facilitates predictions of growth-related phenotypes at different temperatures and identifies genes affecting plant growth at suboptimal temperatures. This model was validated using mutant lines, demonstrating its potential in accurately predicting plant thermal responses and providing a template for developing climate-resilient crops.\\n\\nTitle: Decoding Protein Dynamics: ProFlex as a Linguistic Bridge in Normal Mode Analysis\\nSummary: Artificial intelligence has revolutionized structural bioinformatics with AlphaFold being arguably the most impactful development to date. The structural atlases generated by these methods present significant opportunities for unraveling biological mysteries, but also pose challenges in leveraging such massive datasets effectively.\\n\\nTitle: Exploring midgut expression dynamics: longitudinal transcriptomic analysis of adult female Amblyomma americanum midgut and comparative insights with other hard tick species\\nSummary: The study investigates the transcriptomic dynamics of the midgut in adult female Amblyomma americanum ticks during different feeding stages, revealing 15,599 putative DNA coding sequences and highlighting dynamic transcriptional changes as feeding progresses. 
The analysis also identified conserved transcripts across three hard tick species, providing insight into the physiological pathways relevant to the tick midgut and potential avenues for developing control methods targeting multiple tick species.\\n\\nTitle: Designing of thermostable proteins with a desired melting temperature\\nSummary: We developed a regression method for predicting protein melting temperatures (Tm) using 17,312 non-redundant proteins and achieved the highest Pearson correlation of 0.80 with an R2 of 0.63 between predicted and actual Tm values. Our best model, fine-tuned on large language models such as ProtBert, achieved a maximum correlation of 0.89 with an R2 of 0.80, demonstrating improved performance in predicting protein stability at higher temperatures.\\n\\nTitle: Joint Modeling of Cellular Heterogeneity and Condition Effects with scPCA in Single-Cell RNA-Seq\\nSummary: scRNA-seq in multi-condition experiments enables the systematic assessment of treatment effects by analyzing gene expression profiles. scPCA is a flexible DR framework that jointly models cellular heterogeneity and conditioning variables, allowing for an integrated factor representation and revealing transcriptional changes across conditions and components.\\n\\nTitle: Identification of potential inhibitors against Inosine 5'-Monophosphate Dehydrogenase of Cryptosporidium parvum through an integrated in silico approach\\nSummary: A total of 24 bioactive phytochemicals were screened virtually using molecular docking and ADMET analyses to identify potential inhibitors against Inosine 5'-Monophosphate Dehydrogenase (IMPDH) of Cryptosporidium parvum, with four lead compounds identified as Brevelin A, Vernodalin, Luteolin, and Pectolinarigenin. 
The lead compounds were found to possess favorable pharmacokinetic and pharmacodynamic properties, satisfactory toxicity analysis results, and no major side effects or violation of Lipinski's rules of five, indicating the possibility of oral bioavailability as potential drug candidates.\\n\\nTitle: Identification and Diagnostic Potential of Pyroptosis-Related Genes in Endometriosis: A Novel Bioinformatics Analysis\\nSummary: Pyroptosis-related genes were identified through a bioinformatics analysis of endometriosis (EM) transcriptomic datasets, resulting in 26 differentially expressed genes that play a crucial role in the pathogenesis of EM. A novel diagnostic model was constructed using LASSO regression based on pyroptosis scores, which included five key genes: KIF13B, BAG6, MYO5A, HEATR, and AK055981.\\n\\nTitle: Improving the accuracy of pose prediction by incorporating symmetry-related molecules\\nSummary: The study aimed to improve the accuracy of pose prediction in molecular docking by incorporating symmetry-related molecules (SRMs). Redocking protein-ligand complexes with and without SRMs revealed that using SRMs significantly improved the prediction of biologically significant poses, as indicated by MM-GBSA calculations.\\n\\nTitle: Identification and study of Prolyl Oligopeptidases and related sequences in bacterial lineages\\nSummary: The study examined ~32000 completely annotated bacterial genomes from the NCBI RefSeq Assembly database to identify annotated S9 family proteins, resulting in the discovery of ~53,000 bacterial S9 family proteins (referred to as POP homologues) which can be classified into distinct subfamilies through various machine-learning approaches and comprehensive analysis. 
These sequence homologues display distinct subclusters and class-specific motifs suggesting differences in substrate specificity in POP homologues.\\n\\nTitle: Learning-Augmented Sketching Offers Improved Performance for Privacy Preserving and Secure GWAS\\nSummary: The introduction of trusted execution environments (TEEs) such as Intel SGX technology has enabled secure and privacy-preserving computation on the cloud, but stringent resource limitations pose a challenge for some TEEs. The SkSES method, which identifies significant SNPs in GWAS without disclosing sensitive genotype information, has been improved upon with a learning-augmented approach that achieves up to 40% accuracy gain compared to the original SkSES method.\\n\\nTitle: Liberality is More Explainable than PCA of Transcriptome for Vertebrate Embryo Development\\nSummary: Liberality is a quantitative index of cellular differentiation and dedifferentiation that has been widely used for genome-scale data analysis, particularly in understanding vertebrate embryo development. The study analyzed a time course transcriptome dataset on vertebrate embryo development and found a trend that historically annotated embryo developmental stages matched changes in liberality, indicating the potential of liberality to analyze biological phenomena beyond just embryo development.\\n\\nTitle: Bacopa monnieri phytochemicals as promising BACE1 inhibitors for Alzheimers Disease Therapy\\nSummary: Bacopa monnieri phytochemicals are investigated as potential BACE1 inhibitors for Alzheimer's Disease Therapy, with Bacopaside I showing superior binding affinity and interaction profile compared to established synthetic inhibitors. 
The study highlights the promising role of natural compounds in AD treatment, emphasizing their potential to overcome limitations faced in clinical settings, and advocates for a paradigm shift towards integrating traditional medicinal knowledge into contemporary drug discovery efforts.\\n\\nTitle: Accurate Multiple Sequence Alignment of Ultramassive Genome Sets\\nSummary: The current state of multiple sequence alignment (MSA) is insufficient for handling ultramassive genome sets due to challenges in scalability and accuracy. The proposed algorithms, including directed acyclic graph construction, profile hidden Markov model training, and graph-based alignment, significantly improve accuracy and acceleration of MSA compared to widely used MAFFT for genome set sizes ranging from 40,000 to over 4 million.\\n\\nTitle: Machine Learning Driven Simulations of SARS-CoV-2 Fitness Landscape\\nSummary: The SARS-CoV-2 infection is caused by interactions between the receptor binding domain of viral spike proteins and host cell ACE2 receptors, with mutations in the spike protein leading to neutralizing antibody escape and breakthrough infections. Machine learning-driven simulations combined with deep mutational scanning data predict variants of concern not seen in the training data and sample statistics of the fitness landscape, providing insight into the relationship between RBD sequence elements and emerging viral strains.\\n\\nTitle: Modelling dynamics of human NDPK hexamer structure, stability and interactions\\nSummary: The precise assembly of the NDPK hexameric structure into homo- /hetero-oligomeric complexes is necessary for kinase activity but has been poorly understood due to high subunit homology, experimental challenges, and limited data on in vivo heterohexamer formation and subunit abundances across cellular compartments. 
A conserved Arg27 residue plays a key role in hexamer assembly, mediating inter- and intra-molecular monomeric interactions and ensuring similar hexameric assembly across subunits.\\n\\nTitle: GuaCAMOLE: GC-bias aware estimation improves the accuracy of metagenomic species abundances\\nSummary: GuaCAMOLE is a novel computational method that detects and removes GC bias from metagenomic sequencing data, which affects the accuracy of quantifying microbial community compositions. The algorithm reports unbiased abundances and corrects the abundance of clinically relevant GC-poor species by up to a factor of two in gut microbiomes of colorectal cancer patients.\""},{"path":"https://stephenturner.github.io/biorecap/reference/example_preprints.html","id":null,"dir":"Reference","previous_headings":"","what":"Example preprints with summaries — example_preprints","title":"Example preprints with summaries — example_preprints","text":"Example preprints summaries August 6, 2024.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/example_preprints.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Example preprints with summaries — example_preprints","text":"","code":"example_preprints"},{"path":"https://stephenturner.github.io/biorecap/reference/example_preprints.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Example preprints with summaries — example_preprints","text":"tibble returned get_preprints() followed add_prompt() followed add_summary().","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/example_preprints.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Example preprints with summaries — example_preprints","text":"","code":"example_preprints #> # A tibble: 60 × 7 #> source subject title url abstract prompt summary #> #> 1 bioRxiv bioinformatics MedGraphNet: Leveraging… http… Genetic… I am … MedGra… #> 2 
bioRxiv bioinformatics High-throughput bacteri… http… The com… I am … The co… #> 3 bioRxiv bioinformatics scParadise: Tunable hig… http… scRNA-s… I am … scAdam… #> 4 bioRxiv bioinformatics Camera Paths, Modeling,… http… The enh… I am … ArtiaX… #> 5 bioRxiv bioinformatics dScaff - an automatic b… http… Rapid e… I am … dScaff… #> 6 bioRxiv bioinformatics Jaeger: an accurate and… http… Abstrac… I am … Jaeger… #> 7 bioRxiv bioinformatics AI-Augmented R-Group Ex… http… Efficie… I am … The pa… #> 8 bioRxiv bioinformatics OPLS-based Multiclass C… http… Multicl… I am … OPLS-D… #> 9 bioRxiv bioinformatics STANCE: a unified stati… http… A signi… I am … STANCE… #> 10 bioRxiv bioinformatics AsaruSim: a single-cell… http… Motivat… I am … AsaruS… #> # ℹ 50 more rows"},{"path":"https://stephenturner.github.io/biorecap/reference/get_preprints.html","id":null,"dir":"Reference","previous_headings":"","what":"Get bioRxiv/medRxiv preprints — get_preprints","title":"Get bioRxiv/medRxiv preprints — get_preprints","text":"Get bioRxiv/medRxiv preprints","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/get_preprints.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get bioRxiv/medRxiv preprints — get_preprints","text":"","code":"get_preprints(subject = \"all\", clean = TRUE)"},{"path":"https://stephenturner.github.io/biorecap/reference/get_preprints.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get bioRxiv/medRxiv preprints — get_preprints","text":"subject character vector valid bioRxiv /medRxiv subjects. See subjects. clean Logical; try strip graphical abstract information? 
TRUE, strips away text O_FIG C_FIG, words graphical abstract abstract text RSS feed.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/get_preprints.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get bioRxiv/medRxiv preprints — get_preprints","text":"data frame preprints bioRxiv /medRxiv.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/get_preprints.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get bioRxiv/medRxiv preprints — get_preprints","text":"","code":"preprints <- get_preprints(subject=c(\"bioinformatics\", \"Public_and_Global_Health\")) preprints #> # A tibble: 60 × 5 #> source subject title url abstract #> #> 1 bioRxiv bioinformatics SeaMoon: Prediction of molecular motio… http… How pro… #> 2 bioRxiv bioinformatics Clustering individuals using INMTD: a … http… Motivat… #> 3 bioRxiv bioinformatics INSPIRE: interpretable, flexible and s… http… Recent … #> 4 bioRxiv bioinformatics MedGraphNet: Leveraging Multi-Relation… http… Genetic… #> 5 bioRxiv bioinformatics High-throughput bacterial aggregation … http… The com… #> 6 bioRxiv bioinformatics scParadise: Tunable highly accurate mu… http… scRNA-s… #> 7 bioRxiv bioinformatics Camera Paths, Modeling, and Image Proc… http… The enh… #> 8 bioRxiv bioinformatics dScaff - an automatic bioinformatics f… http… Rapid e… #> 9 bioRxiv bioinformatics Jaeger: an accurate and fast deep-lear… http… Abstrac… #> 10 bioRxiv bioinformatics AI-Augmented R-Group Exploration in Me… http… Efficie… #> # ℹ 50 more rows"},{"path":"https://stephenturner.github.io/biorecap/reference/reexports.html","id":null,"dir":"Reference","previous_headings":"","what":"Objects exported from other packages — reexports","title":"Objects exported from other packages — reexports","text":"objects imported packages. Follow links see documentation. 
ollamar list_models, test_connection","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/subjects.html","id":null,"dir":"Reference","previous_headings":"","what":"bioRxiv subjects — subjects","title":"bioRxiv subjects — subjects","text":"Names subjects RSS feeds biorXiv","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/subjects.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"bioRxiv subjects — subjects","text":"","code":"subjects"},{"path":"https://stephenturner.github.io/biorecap/reference/subjects.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"bioRxiv subjects — subjects","text":"list character vectors subjects, one bioRxiv, one medRxiv.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/subjects.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"bioRxiv subjects — subjects","text":"https://www.biorxiv.org/alertsrss","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/subjects.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"bioRxiv subjects — subjects","text":"","code":"subjects #> $biorxiv #> [1] \"all\" #> [2] \"animal_behavior_and_cognition\" #> [3] \"biochemistry\" #> [4] \"bioengineering\" #> [5] \"bioinformatics\" #> [6] \"biophysics\" #> [7] \"cancer_biology\" #> [8] \"cell_biology\" #> [9] \"clinical_trials\" #> [10] \"developmental_biology\" #> [11] \"ecology\" #> [12] \"epidemiology\" #> [13] \"evolutionary_biology\" #> [14] \"genetics\" #> [15] \"genomics\" #> [16] \"immunology\" #> [17] \"microbiology\" #> [18] \"molecular_biology\" #> [19] \"neuroscience\" #> [20] \"paleontology\" #> [21] \"pathology\" #> [22] \"pharmacology_and_toxicology\" #> [23] \"plant_biology\" #> [24] \"scientific_communication_and_education\" #> [25] \"synthetic_biology\" #> [26] \"systems_biology\" #> [27] \"zoology\" #> #> 
$medrxiv #> [1] \"all\" #> [2] \"addiction_medicine\" #> [3] \"allergy_and_immunology\" #> [4] \"anesthesia\" #> [5] \"cardiovascular_medicine\" #> [6] \"dentistry_and_oral_medicine\" #> [7] \"dermatology\" #> [8] \"dermatology\" #> [9] \"endocrinology\" #> [10] \"epidemiology\" #> [11] \"ecology\" #> [12] \"epidemiology\" #> [13] \"forensic_medicine\" #> [14] \"gastroenterology\" #> [15] \"genetic_and_genomic_medicine\" #> [16] \"geriatric_medicine\" #> [17] \"health_economics\" #> [18] \"health_informatics\" #> [19] \"health_policy\" #> [20] \"health_systems_and_quality_improvement\" #> [21] \"hematology\" #> [22] \"hivaids\" #> [23] \"infectious_diseases\" #> [24] \"intensive_care_and_critical_care_medicine\" #> [25] \"medical_education\" #> [26] \"medical_ethics\" #> [27] \"nephrology\" #> [28] \"neurology\" #> [29] \"nursing\" #> [30] \"nutrition\" #> [31] \"obstetrics_and_gynecology\" #> [32] \"occupational_and_environmental_health\" #> [33] \"oncology\" #> [34] \"ophthalmology\" #> [35] \"orthopedics\" #> [36] \"otolaryngology\" #> [37] \"pain_medicine\" #> [38] \"palliative_medicine\" #> [39] \"pathology\" #> [40] \"pediatrics\" #> [41] \"pharmacology_and_therapeutics\" #> [42] \"primary_care_research\" #> [43] \"psychiatry_and_clinical_psychology\" #> [44] \"public_and_global_health\" #> [45] \"radiology_and_imaging\" #> [46] \"rehabilitation_medicine_and_physical_therapy\" #> [47] \"respiratory_medicine\" #> [48] \"rheumatology\" #> [49] \"sexual_and_reproductive_health\" #> [50] \"sports_medicine\" #> [51] \"surgery\" #> [52] \"toxicology\" #> [53] \"transplantation\" #> [54] \"urology\" #>"},{"path":"https://stephenturner.github.io/biorecap/reference/tt_preprints.html","id":null,"dir":"Reference","previous_headings":"","what":"Create a markdown table from prepreprint summaries — tt_preprints","title":"Create a markdown table from prepreprint summaries — tt_preprints","text":"Create markdown table prepreprint 
summaries","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/tt_preprints.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create a markdown table from prepreprint summaries — tt_preprints","text":"","code":"tt_preprints(preprints, cols = c(\"title\", \"summary\"), width = c(1, 3))"},{"path":"https://stephenturner.github.io/biorecap/reference/tt_preprints.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create a markdown table from prepreprint summaries — tt_preprints","text":"preprints Output get_preprints() followed add_prompt() followed add_summary(). cols Columns display resulting markdown table. width Vector relative widths equal length(cols).","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/tt_preprints.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Create a markdown table from prepreprint summaries — tt_preprints","text":"tinytable table.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/tt_preprints.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Create a markdown table from prepreprint summaries — tt_preprints","text":"","code":"# Use built-in example data example_preprints #> # A tibble: 60 × 7 #> source subject title url abstract prompt summary #> #> 1 bioRxiv bioinformatics MedGraphNet: Leveraging… http… Genetic… I am … MedGra… #> 2 bioRxiv bioinformatics High-throughput bacteri… http… The com… I am … The co… #> 3 bioRxiv bioinformatics scParadise: Tunable hig… http… scRNA-s… I am … scAdam… #> 4 bioRxiv bioinformatics Camera Paths, Modeling,… http… The enh… I am … ArtiaX… #> 5 bioRxiv bioinformatics dScaff - an automatic b… http… Rapid e… I am … dScaff… #> 6 bioRxiv bioinformatics Jaeger: an accurate and… http… Abstrac… I am … Jaeger… #> 7 bioRxiv bioinformatics AI-Augmented R-Group Ex… http… Efficie… I am … The pa… #> 8 
bioRxiv bioinformatics OPLS-based Multiclass C… http… Multicl… I am … OPLS-D… #> 9 bioRxiv bioinformatics STANCE: a unified stati… http… A signi… I am … STANCE… #> 10 bioRxiv bioinformatics AsaruSim: a single-cell… http… Motivat… I am … AsaruS… #> # ℹ 50 more rows tt_preprints(example_preprints| title | summary || [MedGraphNet: Leveraging Multi-Relational Graph Neural Networks and Text Knowledge for Biomedical Predictions](http://biorxiv.org/cgi/content/short/2024.09.24.614782v1?rss=1) | MedGraphNet leverages multi-relational Graph Neural Networks and text knowledge to improve biomedical predictions by initializing nodes using informative embeddings from existing text knowledge, allowing for robust integration of various data types and improved generalizability. The model demonstrates superior performance compared to traditional single-relation approaches in scenarios with isolated or sparsely connected nodes, particularly in identifying disease-gene associations and drug-phenotype relationships, and shows promising results in accurately inferring drug side effects without direct training on such data. || [High-throughput bacterial aggregation analysis in droplets](http://biorxiv.org/cgi/content/short/2024.09.24.613170v1?rss=1) | The communal lifestyle of bacteria can contribute significantly to antimicrobial resistance by promoting biofilm formation. A key approach to addressing this issue is to develop novel techniques for analyzing bacterial behavior, such as those enabled by droplet-based platforms and image analysis methods. |},{"path":"https://stephenturner.github.io/biorecap/news/index.html","id":"biorecap-020","dir":"Changelog","previous_headings":"","what":"biorecap 0.2.0","title":"biorecap 0.2.0","text":"Added medRxiv support. get_preprints() function now pull either bioRxiv medRxiv RSS feed depending subject passed . downstream functions reporting updated reflect change (fixes #5). Changed default model llama 3.2 3B. 
Added new source column returned preprints indicating whether preprint came bioRxiv medRxiv. Updated tests.","code":""},{"path":"https://stephenturner.github.io/biorecap/news/index.html","id":"biorecap-011","dir":"Changelog","previous_headings":"","what":"biorecap 0.1.1","title":"biorecap 0.1.1","text":"Fix bug add_summary() caused upstream changes ollamar (fixes #1). Bumped minimum required version ollamar 1.2.1.","code":""},{"path":"https://stephenturner.github.io/biorecap/news/index.html","id":"biorecap-010","dir":"Changelog","previous_headings":"","what":"biorecap 0.1.0","title":"biorecap 0.1.0","text":"Initial release.","code":""}]
+[{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":null,"dir":"","previous_headings":"","what":"Contributing to biorecap","title":"Contributing to biorecap","text":"outlines propose change biorecap. detailed discussion contributing tidyverse packages, please see development contributing guide code review principles.","code":""},{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":"fixing-typos","dir":"","previous_headings":"","what":"Fixing typos","title":"Contributing to biorecap","text":"can fix typos, spelling mistakes, grammatical errors documentation directly using GitHub web interface, long changes made source file. generally means ’ll need edit roxygen2 comments .R, .Rd file. can find .R file generates .Rd reading comment first line.","code":""},{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":"bigger-changes","dir":"","previous_headings":"","what":"Bigger changes","title":"Contributing to biorecap","text":"want make bigger change, ’s good idea first file issue make sure someone team agrees ’s needed. ’ve found bug, please file issue illustrates bug minimal reprex (also help write unit test, needed). See guide create great issue advice.","code":""},{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":"pull-request-process","dir":"","previous_headings":"Bigger changes","what":"Pull request process","title":"Contributing to biorecap","text":"Fork package clone onto computer. haven’t done , recommend using usethis::create_from_github(\"stephenturner/biorecap\", fork = TRUE). Install development dependencies devtools::install_dev_deps(), make sure package passes R CMD check running devtools::check(). R CMD check doesn’t pass cleanly, ’s good idea ask help continuing. Create Git branch pull request (PR). recommend using usethis::pr_init(\"brief-description--change\"). Make changes, commit git, create PR running usethis::pr_push(), following prompts browser. 
title PR briefly describe change. body PR contain Fixes #issue-number. user-facing changes, add bullet top NEWS.md (.e. just first header). Follow style described https://style.tidyverse.org/news.html.","code":""},{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":"code-style","dir":"","previous_headings":"Bigger changes","what":"Code style","title":"Contributing to biorecap","text":"New code follow tidyverse style guide. can use styler package apply styles, please don’t restyle code nothing PR. use roxygen2, Markdown syntax, documentation. use testthat unit tests. Contributions test cases included easier accept.","code":""},{"path":"https://stephenturner.github.io/biorecap/CONTRIBUTING.html","id":"code-of-conduct","dir":"","previous_headings":"","what":"Code of Conduct","title":"Contributing to biorecap","text":"Please note biorecap project released Contributor Code Conduct. contributing project agree abide terms.","code":""},{"path":"https://stephenturner.github.io/biorecap/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2024 Stephen Turner Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. 
EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"https://stephenturner.github.io/biorecap/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Stephen Turner. Author, maintainer.","code":""},{"path":"https://stephenturner.github.io/biorecap/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Turner S (2024). biorecap: Retrieve summarize bioRxiv medRxiv preprints local LLM using ollama. R package version 0.2.0, https://stephenturner.github.io/biorecap/.","code":"@Manual{, title = {biorecap: Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama}, author = {Stephen Turner}, year = {2024}, note = {R package version 0.2.0}, url = {https://stephenturner.github.io/biorecap/}, }"},{"path":"https://stephenturner.github.io/biorecap/index.html","id":"biorecap-","dir":"","previous_headings":"","what":"Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama","title":"Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama","text":"Retrieve summarize bioRxiv medRxiv preprints using local LLM Ollama via ollamar. Turner, S. D. (2024). biorecap: R package summarizing bioRxiv preprints local LLM. arXiv, 2408.11707. 
https://doi.org/10.48550/arXiv.2408.11707.","code":""},{"path":"https://stephenturner.github.io/biorecap/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama","text":"Install biorecap GitHub (keep dependencies=TRUE get Suggests packages needed create HTML report):","code":"# install.packages(\"remotes\") remotes::install_github(\"stephenturner/biorecap\", dependencies=TRUE)"},{"path":[]},{"path":"https://stephenturner.github.io/biorecap/index.html","id":"quick-start","dir":"","previous_headings":"Usage","what":"Quick start","title":"Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama","text":"First, load biorecap library. Let’s make sure Ollama running can talk R: Next can list available models: Write HTML report containing summaries recent preprints select subject areas current working directory. can include bioRxiv medRxiv subjects, biorecap know RSS feed use. 
Example HTML report generated bioRxiv (bioinformatics) infectious diseases (medRxiv) subjects September 25, 2024:","code":"library(biorecap) test_connection() #> Ollama local server running #> #> GET http://localhost:11434/ #> Status: 200 OK #> Content-Type: text/plain #> Body: In memory (17 bytes) list_models() name size parameter_size quantization_level modified 1 gemma2:latest 5.4 GB 9.2B Q4_0 2024-08-07T07:35:15 3 llama3.1:70b 40 GB 70.6B Q4_0 2024-07-24T10:57:08 4 llama3.1:latest 4.7 GB 8.0B Q4_0 2024-07-31T09:38:38 5 llama3.2:latest 2 GB 3.2B Q4_K_M 2024-09-25T14:54:23 6 phi3:latest 2.2 GB 3.8B Q4_0 2024-08-28T04:37:58 biorecap_report(output_dir=\".\", subject=c(\"bioinformatics\", \"infectious_diseases\"), model=\"llama3.1\")"},{"path":"https://stephenturner.github.io/biorecap/index.html","id":"details","dir":"","previous_headings":"Usage","what":"Details","title":"Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama","text":"get_preprints() function retrieves preprints RSS feed either bioRxiv medRxiv, based subject provided. pass one subjects subject argument. add_prompt() function adds prompt preprint used prompt model. Let’s take look one prompts: giving paper’s title abstract. Summarize paper many sentences instruct. include preamble text. Just give summary. Number sentences summary: 2 Title: SeuratExtend: Streamlining Single-Cell RNA-Seq Analysis Integrated Intuitive Framework Abstract: Single-cell RNA sequencing (scRNA-seq) revolutionized study cellular heterogeneity, rapid expansion analytical tools proven blessing curse, presenting researchers significant challenges. , present SeuratExtend, comprehensive R package built upon widely adopted Seurat framework, streamlines scRNA-seq data analysis integrating essential tools databases. SeuratExtend offers user-friendly intuitive interface performing wide range analyses, including functional enrichment, trajectory inference, gene regulatory network reconstruction, denoising. 
package seamlessly integrates multiple databases, Gene Ontology Reactome, incorporates popular Python tools like scVelo, Palantir, SCENIC unified R interface. SeuratExtend enhances data visualization optimized plotting functions carefully curated color schemes, ensuring aesthetic appeal scientific rigor. demonstrate SeuratExtend’s performance case studies investigating tumor-associated high-endothelial venules autoinflammatory diseases, showcase novel applications pathway-Level analysis cluster annotation. SeuratExtend empowers researchers harness full potential scRNA-seq data, making complex analyses accessible wider audience. package, along comprehensive documentation tutorials, freely available GitHub, providing valuable resource single-cell genomics community. add_summary() function uses locally running LLM available Ollama summarize preprint. Let’s add summary. Notice can single pipeline. takes minutes! Let’s take look results: Let’s look one summaries. ’s summary SeuratExtend paper (abstract ): SeuratExtend R package integrates essential tools databases single-cell RNA sequencing (scRNA-seq) data analysis, streamlining process user-friendly interface. package offers various analyses, including functional enrichment gene regulatory network reconstruction, seamlessly integrates multiple databases popular Python tools. biorecap_report() function runs code RMarkdown template, writing resulting HTML CSV file results current working directory. built-subjects list vectors containing available bioRxiv medRxiv subjects. 
create report subjects like (note, take time):","code":"pp <- get_preprints(subject=c(\"bioinformatics\", \"infectious_diseases\")) head(pp) tail(pp) #> # A tibble: 6 × 5 #> source subject title url abstract #> #> 1 bioRxiv bioinformatics MedGraphNet: Leveraging Multi-Relationa… http… Genetic… #> 2 bioRxiv bioinformatics High-throughput bacterial aggregation a… http… The com… #> 3 bioRxiv bioinformatics scParadise: Tunable highly accurate mul… http… scRNA-s… #> 4 bioRxiv bioinformatics Camera Paths, Modeling, and Image Proce… http… The enh… #> 5 bioRxiv bioinformatics dScaff - an automatic bioinformatics fr… http… Rapid e… #> 6 bioRxiv bioinformatics Jaeger: an accurate and fast deep-learn… http… Abstrac… #> # A tibble: 6 × 5 #> source subject title url abstract #> #> 1 medRxiv infectious_diseases Reactogenicity and immunogenicity … http… \"The re… #> 2 medRxiv infectious_diseases A next generation CRISPR diagnosti… http… \"The WH… #> 3 medRxiv infectious_diseases Hospital-onset bacteraemia and fun… http… \"Backgr… #> 4 medRxiv infectious_diseases Co-circulating pathogens of humans… http… \"Histor… #> 5 medRxiv infectious_diseases Integration of Group A Streptococc… http… \"The Ca… #> 6 medRxiv infectious_diseases Deep Learning Models for Predictin… http… \"The Nu… pp <- pp |> add_prompt() pp #> # A tibble: 60 × 6 #> source subject title url abstract prompt #> #> 1 bioRxiv bioinformatics MedGraphNet: Leveraging Multi-R… http… Genetic… I am … #> 2 bioRxiv bioinformatics High-throughput bacterial aggre… http… The com… I am … #> 3 bioRxiv bioinformatics scParadise: Tunable highly accu… http… scRNA-s… I am … #> 4 bioRxiv bioinformatics Camera Paths, Modeling, and Ima… http… The enh… I am … #> 5 bioRxiv bioinformatics dScaff - an automatic bioinform… http… Rapid e… I am … #> 6 bioRxiv bioinformatics Jaeger: an accurate and fast de… http… Abstrac… I am … #> 7 bioRxiv bioinformatics AI-Augmented R-Group Exploratio… http… Efficie… I am … #> 8 bioRxiv bioinformatics 
OPLS-based Multiclass Classific… http… Multicl… I am … #> 9 bioRxiv bioinformatics STANCE: a unified statistical m… http… A signi… I am … #> 10 bioRxiv bioinformatics AsaruSim: a single-cell and spa… http… Motivat… I am … #> # ℹ 50 more rows pp <- get_preprints(subject=c(\"bioinformatics\", \"infectious_diseases\")) |> add_prompt() |> add_summary(model=\"llama3.2\") pp #> # A tibble: 60 × 7 #> source subject title url abstract prompt summary #> #> 1 bioRxiv bioinformatics MedGraphNet: Leveraging… http… Genetic… I am … MedGra… #> 2 bioRxiv bioinformatics High-throughput bacteri… http… The com… I am … The co… #> 3 bioRxiv bioinformatics scParadise: Tunable hig… http… scRNA-s… I am … scAdam… #> 4 bioRxiv bioinformatics Camera Paths, Modeling,… http… The enh… I am … ArtiaX… #> 5 bioRxiv bioinformatics dScaff - an automatic b… http… Rapid e… I am … dScaff… #> 6 bioRxiv bioinformatics Jaeger: an accurate and… http… Abstrac… I am … Jaeger… #> 7 bioRxiv bioinformatics AI-Augmented R-Group Ex… http… Efficie… I am … The pa… #> 8 bioRxiv bioinformatics OPLS-based Multiclass C… http… Multicl… I am … OPLS-D… #> 9 bioRxiv bioinformatics STANCE: a unified stati… http… A signi… I am … STANCE… #> 10 bioRxiv bioinformatics AsaruSim: a single-cell… http… Motivat… I am … AsaruS… #> # ℹ 50 more rows biorecap_report(output_dir=\".\", subject=c(\"bioinformatics\", \"infectious_diseases\"), model=\"llama3.2\") subjects$biorxiv #> [1] \"all\" #> [2] \"animal_behavior_and_cognition\" #> [3] \"biochemistry\" #> [4] \"bioengineering\" #> [5] \"bioinformatics\" #> [6] \"biophysics\" #> [7] \"cancer_biology\" #> [8] \"cell_biology\" #> [9] \"clinical_trials\" #> [10] \"developmental_biology\" #> [11] \"ecology\" #> [12] \"epidemiology\" #> [13] \"evolutionary_biology\" #> [14] \"genetics\" #> [15] \"genomics\" #> [16] \"immunology\" #> [17] \"microbiology\" #> [18] \"molecular_biology\" #> [19] \"neuroscience\" #> [20] \"paleontology\" #> [21] \"pathology\" #> [22] 
\"pharmacology_and_toxicology\" #> [23] \"plant_biology\" #> [24] \"scientific_communication_and_education\" #> [25] \"synthetic_biology\" #> [26] \"systems_biology\" #> [27] \"zoology\" subjects$medrxiv #> [1] \"all\" #> [2] \"addiction_medicine\" #> [3] \"allergy_and_immunology\" #> [4] \"anesthesia\" #> [5] \"cardiovascular_medicine\" #> [6] \"dentistry_and_oral_medicine\" #> [7] \"dermatology\" #> [8] \"dermatology\" #> [9] \"endocrinology\" #> [10] \"epidemiology\" #> [11] \"ecology\" #> [12] \"epidemiology\" #> [13] \"forensic_medicine\" #> [14] \"gastroenterology\" #> [15] \"genetic_and_genomic_medicine\" #> [16] \"geriatric_medicine\" #> [17] \"health_economics\" #> [18] \"health_informatics\" #> [19] \"health_policy\" #> [20] \"health_systems_and_quality_improvement\" #> [21] \"hematology\" #> [22] \"hivaids\" #> [23] \"infectious_diseases\" #> [24] \"intensive_care_and_critical_care_medicine\" #> [25] \"medical_education\" #> [26] \"medical_ethics\" #> [27] \"nephrology\" #> [28] \"neurology\" #> [29] \"nursing\" #> [30] \"nutrition\" #> [31] \"obstetrics_and_gynecology\" #> [32] \"occupational_and_environmental_health\" #> [33] \"oncology\" #> [34] \"ophthalmology\" #> [35] \"orthopedics\" #> [36] \"otolaryngology\" #> [37] \"pain_medicine\" #> [38] \"palliative_medicine\" #> [39] \"pathology\" #> [40] \"pediatrics\" #> [41] \"pharmacology_and_therapeutics\" #> [42] \"primary_care_research\" #> [43] \"psychiatry_and_clinical_psychology\" #> [44] \"public_and_global_health\" #> [45] \"radiology_and_imaging\" #> [46] \"rehabilitation_medicine_and_physical_therapy\" #> [47] \"respiratory_medicine\" #> [48] \"rheumatology\" #> [49] \"sexual_and_reproductive_health\" #> [50] \"sports_medicine\" #> [51] \"surgery\" #> [52] \"toxicology\" #> [53] \"transplantation\" #> [54] \"urology\" biorecap_report(output_dir=\".\", subject=subjects, 
model=\"llama3.2\")"},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt.html","id":null,"dir":"Reference","previous_headings":"","what":"Add prompt to a data frame of preprints — add_prompt","title":"Add prompt to a data frame of preprints — add_prompt","text":"Add prompt data frame preprints","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Add prompt to a data frame of preprints — add_prompt","text":"","code":"add_prompt(preprints, ...)"},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Add prompt to a data frame of preprints — add_prompt","text":"preprints Result get_preprints(). ... Additional arguments build_prompt_preprint().","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Add prompt to a data frame of preprints — add_prompt","text":"data frame preprints prompt added.","code":""},{"path":[]},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Add prompt to a data frame of preprints — add_prompt","text":"","code":"preprints <- get_preprints(subject=c(\"bioinformatics\", \"genomics\")) preprints <- add_prompt(preprints) preprints #> # A tibble: 60 × 6 #> source subject title url abstract prompt #> #> 1 bioRxiv bioinformatics Cell-type-specific mapping of e… http… \"Mappin… \"I am… #> 2 bioRxiv bioinformatics Particular sequence characteris… http… \"Transp… \"I am… #> 3 bioRxiv bioinformatics Machine Learning-Enhanced Drug … http… \"Alzhei… \"I am… #> 4 bioRxiv bioinformatics Generalizable Morphological Pro… http… \"The in… \"I am… #> 5 bioRxiv bioinformatics Local Mean Suppression 
Filter f… http… \"We pre… \"I am… #> 6 bioRxiv bioinformatics FerroEnrich: An Interactive web… http… \"Ferrop… \"I am… #> 7 bioRxiv bioinformatics LOCAS: Multi-label mRNA Localiz… http… \"Tradit… \"I am… #> 8 bioRxiv bioinformatics RanBALL: An Ensemble Random Pro… http… \"As the… \"I am… #> 9 bioRxiv bioinformatics Continual integration of single… http… \"Single… \"I am… #> 10 bioRxiv bioinformatics Protein Sequence Modelling with… http… \"Explor… \"I am… #> # ℹ 50 more rows"},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt_subject.html","id":null,"dir":"Reference","previous_headings":"","what":"Add prompts for an entire subject — add_prompt_subject","title":"Add prompts for an entire subject — add_prompt_subject","text":"Add prompts entire subject","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt_subject.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Add prompts for an entire subject — add_prompt_subject","text":"","code":"add_prompt_subject(preprints, ...)"},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt_subject.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Add prompts for an entire subject — add_prompt_subject","text":"preprints Output get_preprints() followed add_prompt() followed add_summary(). ... 
Additional arguments build_prompt_subject().","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt_subject.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Add prompts for an entire subject — add_prompt_subject","text":"tibble subject prompt column.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_prompt_subject.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Add prompts for an entire subject — add_prompt_subject","text":"","code":"subjects <- example_preprints |> dplyr::group_by(subject) |> add_prompt_subject() #> Warning: Expecting a tibble of class 'preprints_prompt' returned from get_preprints() |> add_prompt(). subjects #> # A tibble: 2 × 2 #> subject prompt #> #> 1 bioinformatics \"I am giving you information about recent bioRxiv/medRxiv… #> 2 infectious_diseases \"I am giving you information about recent bioRxiv/medRxiv…"},{"path":"https://stephenturner.github.io/biorecap/reference/add_summary.html","id":null,"dir":"Reference","previous_headings":"","what":"Generate a summary from a data frame of prompts — add_summary","title":"Generate a summary from a data frame of prompts — add_summary","text":"Generate summary data frame prompts","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_summary.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Generate a summary from a data frame of prompts — add_summary","text":"","code":"add_summary(preprints, model = \"llama3.2\")"},{"path":"https://stephenturner.github.io/biorecap/reference/add_summary.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Generate a summary from a data frame of prompts — add_summary","text":"preprints Output get_preprints() followed add_prompt(). 
model model available Ollama (run ollamar::list_models()) see available.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_summary.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Generate a summary from a data frame of prompts — add_summary","text":"tibble, response column added.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/add_summary.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Generate a summary from a data frame of prompts — add_summary","text":"","code":"if (FALSE) { # \\dontrun{ # Individual papers preprints <- get_preprints(c(\"genomics\", \"bioinformatics\")) |> add_prompt() |> add_summary() preprints } # }"},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap-package.html","id":null,"dir":"Reference","previous_headings":"","what":"biorecap: Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama — biorecap-package","title":"biorecap: Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama — biorecap-package","text":"Retrieve summarize bioRxiv medRxiv preprints local LLM using ollama.","code":""},{"path":[]},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"biorecap: Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama — biorecap-package","text":"Maintainer: Stephen Turner vustephen@gmail.com (ORCID)","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap_report.html","id":null,"dir":"Reference","previous_headings":"","what":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","title":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","text":"Create report bioRxiv/medRxiv 
preprints","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap_report.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","text":"","code":"biorecap_report( output_dir = \".\", subject = NULL, nsentences = 2L, model = \"llama3.1\", use_example_preprints = FALSE, ... )"},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap_report.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","text":"output_dir Directory save report. subject Character vector subjects include report. nsentences Number sentences summarize paper . model model use generating summaries. See ollamar::list_models(). use_example_preprints Use example preprints data included package instead fetching new data bioRxiv/medRxiv. diagnostic/testing purposes . ... arguments passed rmarkdown::render().","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap_report.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","text":"Nothing; called side effects produce report.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/biorecap_report.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Create a report from bioRxiv/medRxiv preprints — biorecap_report","text":"","code":"if (FALSE) { # \\dontrun{ output_dir <- tempdir() biorecap_report(use_example_preprints=TRUE, output_dir=output_dir) biorecap_report(subject=c(\"bioinformatics\", \"genomics\", \"synthetic_biology\"), output_dir=output_dir) } # }"},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_preprint.html","id":null,"dir":"Reference","previous_headings":"","what":"Construct a prompt to summarize a paper — 
build_prompt_preprint","title":"Construct a prompt to summarize a paper — build_prompt_preprint","text":"Construct prompt summarize paper","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_preprint.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Construct a prompt to summarize a paper — build_prompt_preprint","text":"","code":"build_prompt_preprint( title, abstract, nsentences = 2L, instructions = c(\"I am giving you a paper's title and abstract.\", \"Summarize the paper in as many sentences as I instruct.\", \"Do not include any preamble text to the summary\", \"just give me the summary with no preface or intro sentence.\") )"},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_preprint.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Construct a prompt to summarize a paper — build_prompt_preprint","text":"title title paper. abstract abstract paper. nsentences number sentences summarize paper . instructions Instructions prompt. can character vector gets collapsed single string.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_preprint.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Construct a prompt to summarize a paper — build_prompt_preprint","text":"string containing prompt.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_preprint.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Construct a prompt to summarize a paper — build_prompt_preprint","text":"","code":"build_prompt_preprint(title=\"A great paper\", abstract=\"This is the abstract.\") #> [1] \"I am giving you a paper's title and abstract. Summarize the paper in as many sentences as I instruct. 
Do not include any preamble text to the summary just give me the summary with no preface or intro sentence.\\nNumber of sentences in summary: 2\\nTitle: A great paper\\nAbstract: This is the abstract.\""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_subject.html","id":null,"dir":"Reference","previous_headings":"","what":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","title":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","text":"Construct prompt summarize set papers subject","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_subject.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","text":"","code":"build_prompt_subject( subject, title, summary, nsentences = 5L, instructions = c(\"I am giving you information about recent bioRxiv/medRxiv preprints.\", \"I'll give you the subject, preprint titles, and short summary of each paper.\", \"Please provide a general summary new advances in this subject/field in general.\", \"Provide this summary of the field in as many sentences as I instruct.\", \"Do not include any preamble text to the summary\", \"just give me the summary with no preface or intro sentence.\") )"},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_subject.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","text":"subject name subject. title character vector titles subject summary character vector summaries paper provided get_preprints() followed add_prompt() followed add_summary(). nsentences number sentences summarize subject . instructions Instructions prompt. 
can character vector gets collapsed single string.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_subject.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","text":"string containing prompt.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/build_prompt_subject.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Construct a prompt to summarize a set of papers from a subject — build_prompt_subject","text":"","code":"title <- example_preprints |> dplyr::filter(subject==\"bioinformatics\") |> dplyr::pull(title) summary <- example_preprints |> dplyr::filter(subject==\"bioinformatics\") |> dplyr::pull(summary) build_prompt_subject(subject=\"bioinformatics\", title=title, summary=summary) #> [1] \"I am giving you information about recent bioRxiv/medRxiv preprints. I'll give you the subject, preprint titles, and short summary of each paper. Please provide a general summary new advances in this subject/field in general. Provide this summary of the field in as many sentences as I instruct. Do not include any preamble text to the summary just give me the summary with no preface or intro sentence.\\n\\nSubject: bioinformatics\\nNumber of sentences in summary: 5\\n\\nHere are the titles and summaries:\\n\\nTitle: MedGraphNet: Leveraging Multi-Relational Graph Neural Networks and Text Knowledge for Biomedical Predictions\\nSummary: MedGraphNet leverages multi-relational Graph Neural Networks and text knowledge to improve biomedical predictions by initializing nodes using informative embeddings from existing text knowledge, allowing for robust integration of various data types and improved generalizability. 
The model demonstrates superior performance compared to traditional single-relation approaches in scenarios with isolated or sparsely connected nodes, particularly in identifying disease-gene associations and drug-phenotype relationships, and shows promising results in accurately inferring drug side effects without direct training on such data.\\n\\nTitle: High-throughput bacterial aggregation analysis in droplets\\nSummary: The communal lifestyle of bacteria can contribute significantly to antimicrobial resistance by promoting biofilm formation. A key approach to addressing this issue is to develop novel techniques for analyzing bacterial behavior, such as those enabled by droplet-based platforms and image analysis methods.\\n\\nTitle: scParadise: Tunable highly accurate multi-task cell type annotation and surface protein abundance prediction\\nSummary: scAdam outperforms existing methods in annotating rare cell types with high accuracy and consistency across diverse datasets. scEve enhances clustering and cell type separation through improved surface protein prediction, leading to better characterization of complex tissues.\\n\\nTitle: Camera Paths, Modeling, and Image Processing Tools for ArtiaX\\nSummary: ArtiaX is a plugin that has been extended to improve the analysis and visualization of cryo-electron tomography data through advanced visualization techniques. The plugin allows for the generation of diverse models with putative particle positions and orientations, as well as a coarse grained algorithm to rectify overlaps in template matching, driving camera position and facilitating movie creation with fundamental image filtering options.\\n\\nTitle: dScaff - an automatic bioinformatics framework for scaffolding draft de novo assemblies based on reference genome data\\nSummary: dScaff is an automatic bioinformatics framework designed for scaffolding draft de novo assemblies based on reference genome data. 
The tool uses a series of bash and R scripts to create a minimal complete scaffold from a genome assembly, with potential future features to be implemented, including using reference chromosomes or scaffolds.\\n\\nTitle: Jaeger: an accurate and fast deep-learning tool to detect bacteriophage sequences\\nSummary: Jaeger's accuracy and speed in identifying bacteriophage sequences outperform existing deep-learning tools by consistently producing few false positives despite encountering diverse viral sequences. The novel method achieves an estimated 2-27% false discovery rate when applied to over 16,000 metagenomic assemblies, which is significantly lower than the benchmarking paper where deep-learning tools produced many false positives.\\n\\nTitle: AI-Augmented R-Group Exploration in Medicinal Chemistry\\nSummary: The paper presents a novel approach to enhancing free-wing QSAR models by embedding R-groups with atom-centric pharmacophoric features, allowing for the distinction of regioisomers and improved predictivity across 12 public datasets. The proposed method is integrated into an open-source program, enabling its application in various scenarios, including classic free-Wilson analysis and exploration of uncharted chemical space facilitated by AI-generated building blocks.\\n\\nTitle: OPLS-based Multiclass Classification and Data-Driven Inter-Class Relationship Discovery\\nSummary: OPLS-DA models are widely used in metabolomics for two-class comparisons due to their strong discrimination capabilities, but these models face challenges in multiclass settings. 
An extension of OPLS-DA called OPLS-HDA integrates Hierarchical Cluster Analysis with the OPLS-DA framework to create a decision tree that addresses multiclass classification challenges and provides intuitive visualization of inter-class relationships.\\n\\nTitle: STANCE: a unified statistical model to detect cell-type-specific spatially variable genes in spatial transcriptomics\\nSummary: STANCE, a unified statistical model to detect cell-type-specific spatially variable genes in spatial transcriptomics, was developed to address the challenges posed by existing methods in detecting spatially variable genes (SVGs) and cell type-specific spatially variable genes (ctSVGs). The proposed method integrates gene expression, spatial location, and cell type composition through a linear mixed-effect model to identify both SVGs and ctSVGs in an initial stage, followed by a second stage test dedicated to ctSVG detection.\\n\\nTitle: AsaruSim: a single-cell and spatial RNA-Seq Nanopore long-reads simulation workflow\\nSummary: AsaruSim simulates synthetic single-cell long-read Nanopore datasets that closely mimic real experimental data by employing a multi-step process. It includes the creation of a synthetic UMI count matrix, generation of perfect reads, optional PCR amplification, introduction of sequencing errors, and comprehensive quality control reporting.\\n\\nTitle: Building a literature knowledge base towards transparent biomedical AI\\nSummary: LiteralGraph extracts biomedical terms and relationships from PubMed literature, establishing a comprehensive knowledge graph. 
The resulting Genomic Literature Knowledge Base consolidates over 263 million biomedical terms, 14 million relationships, and 10 million genomic events across multiple sources, including nine established repositories.\\n\\nTitle: Accurate non-invasive quantification of astaxanthin content using hyperspectral images and machine learning\\nSummary: The authors investigated a method to accurately quantify astaxanthin content in Haematococcus pluvialis microalgae cultures using hyperspectral images and machine learning. They found that this approach, combining reflectance hyperspectral imaging with a 1-dimensional convolutional neural network, had low average prediction error across a range of astaxanthin contents, although it was unreliable at very low levels (<0.6 micrograms mg-1).\\n\\nTitle: AlphaMut: a deep reinforcement learning model to suggest helix-disrupting mutations\\nSummary: The authors propose a deep reinforcement learning model called AlphaMut to predict helix-disrupting mutations in proteins. AlphaMut identifies amino acids crucial for maintaining structural integrity and predicts key mutations that could alter protein function.\\n\\nTitle: Beyond Static Brain Atlases: AI-Powered Open Databasing and Dynamic Mining of Brain-Wide Neuron Morphometry\\nSummary: NeuroXiv is a large-scale database that provides detailed 3D morphologies of individual neurons mapped to a standard brain atlas, allowing for dynamic, interactive neuroscience applications. 
The database offers a comprehensive collection of 175,149 atlas-oriented reconstructed morphologies of individual neurons from over 518 mouse brains, classified into 292 distinct types and mapped into the Common Coordinate Framework Version 3 (CCFv3).\\n\\nTitle: Metabolic modeling identifies determinants of thermal growth responses in Arabidopsis thaliana\\nSummary: The paper developed an enzyme-constrained model of Arabidopsis thaliana's metabolism, which facilitates predictions of growth-related phenotypes at different temperatures and identifies genes affecting plant growth at suboptimal temperatures. This model was validated using mutant lines, demonstrating its potential in accurately predicting plant thermal responses and providing a template for developing climate-resilient crops.\\n\\nTitle: Decoding Protein Dynamics: ProFlex as a Linguistic Bridge in Normal Mode Analysis\\nSummary: Artificial intelligence has revolutionized structural bioinformatics with AlphaFold being arguably the most impactful development to date. The structural atlases generated by these methods present significant opportunities for unraveling biological mysteries, but also pose challenges in leveraging such massive datasets effectively.\\n\\nTitle: Exploring midgut expression dynamics: longitudinal transcriptomic analysis of adult female Amblyomma americanum midgut and comparative insights with other hard tick species\\nSummary: The study investigates the transcriptomic dynamics of the midgut in adult female Amblyomma americanum ticks during different feeding stages, revealing 15,599 putative DNA coding sequences and highlighting dynamic transcriptional changes as feeding progresses. 
The analysis also identified conserved transcripts across three hard tick species, providing insight into the physiological pathways relevant to the tick midgut and potential avenues for developing control methods targeting multiple tick species.\\n\\nTitle: Designing of thermostable proteins with a desired melting temperature\\nSummary: We developed a regression method for predicting protein melting temperatures (Tm) using 17,312 non-redundant proteins and achieved the highest Pearson correlation of 0.80 with an R2 of 0.63 between predicted and actual Tm values. Our best model, fine-tuned on large language models such as ProtBert, achieved a maximum correlation of 0.89 with an R2 of 0.80, demonstrating improved performance in predicting protein stability at higher temperatures.\\n\\nTitle: Joint Modeling of Cellular Heterogeneity and Condition Effects with scPCA in Single-Cell RNA-Seq\\nSummary: scRNA-seq in multi-condition experiments enables the systematic assessment of treatment effects by analyzing gene expression profiles. scPCA is a flexible DR framework that jointly models cellular heterogeneity and conditioning variables, allowing for an integrated factor representation and revealing transcriptional changes across conditions and components.\\n\\nTitle: Identification of potential inhibitors against Inosine 5'-Monophosphate Dehydrogenase of Cryptosporidium parvum through an integrated in silico approach\\nSummary: A total of 24 bioactive phytochemicals were screened virtually using molecular docking and ADMET analyses to identify potential inhibitors against Inosine 5'-Monophosphate Dehydrogenase (IMPDH) of Cryptosporidium parvum, with four lead compounds identified as Brevelin A, Vernodalin, Luteolin, and Pectolinarigenin. 
The lead compounds were found to possess favorable pharmacokinetic and pharmacodynamic properties, satisfactory toxicity analysis results, and no major side effects or violation of Lipinski's rules of five, indicating the possibility of oral bioavailability as potential drug candidates.\\n\\nTitle: Identification and Diagnostic Potential of Pyroptosis-Related Genes in Endometriosis: A Novel Bioinformatics Analysis\\nSummary: Pyroptosis-related genes were identified through a bioinformatics analysis of endometriosis (EM) transcriptomic datasets, resulting in 26 differentially expressed genes that play a crucial role in the pathogenesis of EM. A novel diagnostic model was constructed using LASSO regression based on pyroptosis scores, which included five key genes: KIF13B, BAG6, MYO5A, HEATR, and AK055981.\\n\\nTitle: Improving the accuracy of pose prediction by incorporating symmetry-related molecules\\nSummary: The study aimed to improve the accuracy of pose prediction in molecular docking by incorporating symmetry-related molecules (SRMs). Redocking protein-ligand complexes with and without SRMs revealed that using SRMs significantly improved the prediction of biologically significant poses, as indicated by MM-GBSA calculations.\\n\\nTitle: Identification and study of Prolyl Oligopeptidases and related sequences in bacterial lineages\\nSummary: The study examined ~32000 completely annotated bacterial genomes from the NCBI RefSeq Assembly database to identify annotated S9 family proteins, resulting in the discovery of ~53,000 bacterial S9 family proteins (referred to as POP homologues) which can be classified into distinct subfamilies through various machine-learning approaches and comprehensive analysis. 
These sequence homologues display distinct subclusters and class-specific motifs suggesting differences in substrate specificity in POP homologues.\\n\\nTitle: Learning-Augmented Sketching Offers Improved Performance for Privacy Preserving and Secure GWAS\\nSummary: The introduction of trusted execution environments (TEEs) such as Intel SGX technology has enabled secure and privacy-preserving computation on the cloud, but stringent resource limitations pose a challenge for some TEEs. The SkSES method, which identifies significant SNPs in GWAS without disclosing sensitive genotype information, has been improved upon with a learning-augmented approach that achieves up to 40% accuracy gain compared to the original SkSES method.\\n\\nTitle: Liberality is More Explainable than PCA of Transcriptome for Vertebrate Embryo Development\\nSummary: Liberality is a quantitative index of cellular differentiation and dedifferentiation that has been widely used for genome-scale data analysis, particularly in understanding vertebrate embryo development. The study analyzed a time course transcriptome dataset on vertebrate embryo development and found a trend that historically annotated embryo developmental stages matched changes in liberality, indicating the potential of liberality to analyze biological phenomena beyond just embryo development.\\n\\nTitle: Bacopa monnieri phytochemicals as promising BACE1 inhibitors for Alzheimers Disease Therapy\\nSummary: Bacopa monnieri phytochemicals are investigated as potential BACE1 inhibitors for Alzheimer's Disease Therapy, with Bacopaside I showing superior binding affinity and interaction profile compared to established synthetic inhibitors. 
The study highlights the promising role of natural compounds in AD treatment, emphasizing their potential to overcome limitations faced in clinical settings, and advocates for a paradigm shift towards integrating traditional medicinal knowledge into contemporary drug discovery efforts.\\n\\nTitle: Accurate Multiple Sequence Alignment of Ultramassive Genome Sets\\nSummary: The current state of multiple sequence alignment (MSA) is insufficient for handling ultramassive genome sets due to challenges in scalability and accuracy. The proposed algorithms, including directed acyclic graph construction, profile hidden Markov model training, and graph-based alignment, significantly improve accuracy and acceleration of MSA compared to widely used MAFFT for genome set sizes ranging from 40,000 to over 4 million.\\n\\nTitle: Machine Learning Driven Simulations of SARS-CoV-2 Fitness Landscape\\nSummary: The SARS-CoV-2 infection is caused by interactions between the receptor binding domain of viral spike proteins and host cell ACE2 receptors, with mutations in the spike protein leading to neutralizing antibody escape and breakthrough infections. Machine learning-driven simulations combined with deep mutational scanning data predict variants of concern not seen in the training data and sample statistics of the fitness landscape, providing insight into the relationship between RBD sequence elements and emerging viral strains.\\n\\nTitle: Modelling dynamics of human NDPK hexamer structure, stability and interactions\\nSummary: The precise assembly of the NDPK hexameric structure into homo- /hetero-oligomeric complexes is necessary for kinase activity but has been poorly understood due to high subunit homology, experimental challenges, and limited data on in vivo heterohexamer formation and subunit abundances across cellular compartments. 
A conserved Arg27 residue plays a key role in hexamer assembly, mediating inter- and intra-molecular monomeric interactions and ensuring similar hexameric assembly across subunits.\\n\\nTitle: GuaCAMOLE: GC-bias aware estimation improves the accuracy of metagenomic species abundances\\nSummary: GuaCAMOLE is a novel computational method that detects and removes GC bias from metagenomic sequencing data, which affects the accuracy of quantifying microbial community compositions. The algorithm reports unbiased abundances and corrects the abundance of clinically relevant GC-poor species by up to a factor of two in gut microbiomes of colorectal cancer patients.\""},{"path":"https://stephenturner.github.io/biorecap/reference/example_preprints.html","id":null,"dir":"Reference","previous_headings":"","what":"Example preprints with summaries — example_preprints","title":"Example preprints with summaries — example_preprints","text":"Example preprints summaries August 6, 2024.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/example_preprints.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Example preprints with summaries — example_preprints","text":"","code":"example_preprints"},{"path":"https://stephenturner.github.io/biorecap/reference/example_preprints.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Example preprints with summaries — example_preprints","text":"tibble returned get_preprints() followed add_prompt() followed add_summary().","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/example_preprints.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Example preprints with summaries — example_preprints","text":"","code":"example_preprints #> # A tibble: 60 × 7 #> source subject title url abstract prompt summary #> #> 1 bioRxiv bioinformatics MedGraphNet: Leveraging… http… Genetic… I am … MedGra… #> 2 
bioRxiv bioinformatics High-throughput bacteri… http… The com… I am … The co… #> 3 bioRxiv bioinformatics scParadise: Tunable hig… http… scRNA-s… I am … scAdam… #> 4 bioRxiv bioinformatics Camera Paths, Modeling,… http… The enh… I am … ArtiaX… #> 5 bioRxiv bioinformatics dScaff - an automatic b… http… Rapid e… I am … dScaff… #> 6 bioRxiv bioinformatics Jaeger: an accurate and… http… Abstrac… I am … Jaeger… #> 7 bioRxiv bioinformatics AI-Augmented R-Group Ex… http… Efficie… I am … The pa… #> 8 bioRxiv bioinformatics OPLS-based Multiclass C… http… Multicl… I am … OPLS-D… #> 9 bioRxiv bioinformatics STANCE: a unified stati… http… A signi… I am … STANCE… #> 10 bioRxiv bioinformatics AsaruSim: a single-cell… http… Motivat… I am … AsaruS… #> # ℹ 50 more rows"},{"path":"https://stephenturner.github.io/biorecap/reference/get_preprints.html","id":null,"dir":"Reference","previous_headings":"","what":"Get bioRxiv/medRxiv preprints — get_preprints","title":"Get bioRxiv/medRxiv preprints — get_preprints","text":"Get bioRxiv/medRxiv preprints","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/get_preprints.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get bioRxiv/medRxiv preprints — get_preprints","text":"","code":"get_preprints(subject = \"all\", clean = TRUE)"},{"path":"https://stephenturner.github.io/biorecap/reference/get_preprints.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get bioRxiv/medRxiv preprints — get_preprints","text":"subject character vector valid bioRxiv /medRxiv subjects. See subjects. clean Logical; try strip graphical abstract information? 
TRUE, strips away text O_FIG C_FIG, words graphical abstract abstract text RSS feed.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/get_preprints.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get bioRxiv/medRxiv preprints — get_preprints","text":"data frame preprints bioRxiv /medRxiv.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/get_preprints.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get bioRxiv/medRxiv preprints — get_preprints","text":"","code":"preprints <- get_preprints(subject=c(\"bioinformatics\", \"Public_and_Global_Health\")) preprints #> # A tibble: 60 × 5 #> source subject title url abstract #> #> 1 bioRxiv bioinformatics Cell-type-specific mapping of enhancer… http… \"Mappin… #> 2 bioRxiv bioinformatics Particular sequence characteristics in… http… \"Transp… #> 3 bioRxiv bioinformatics Machine Learning-Enhanced Drug Discove… http… \"Alzhei… #> 4 bioRxiv bioinformatics Generalizable Morphological Profiling … http… \"The in… #> 5 bioRxiv bioinformatics Local Mean Suppression Filter for Effe… http… \"We pre… #> 6 bioRxiv bioinformatics FerroEnrich: An Interactive web tool f… http… \"Ferrop… #> 7 bioRxiv bioinformatics LOCAS: Multi-label mRNA Localization w… http… \"Tradit… #> 8 bioRxiv bioinformatics RanBALL: An Ensemble Random Projection… http… \"As the… #> 9 bioRxiv bioinformatics Continual integration of single-cell m… http… \"Single… #> 10 bioRxiv bioinformatics Protein Sequence Modelling with Bayesi… http… \"Explor… #> # ℹ 50 more rows"},{"path":"https://stephenturner.github.io/biorecap/reference/reexports.html","id":null,"dir":"Reference","previous_headings":"","what":"Objects exported from other packages — reexports","title":"Objects exported from other packages — reexports","text":"objects imported packages. Follow links see documentation. 
ollamar list_models, test_connection","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/subjects.html","id":null,"dir":"Reference","previous_headings":"","what":"bioRxiv subjects — subjects","title":"bioRxiv subjects — subjects","text":"Names subjects RSS feeds biorXiv","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/subjects.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"bioRxiv subjects — subjects","text":"","code":"subjects"},{"path":"https://stephenturner.github.io/biorecap/reference/subjects.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"bioRxiv subjects — subjects","text":"list character vectors subjects, one bioRxiv, one medRxiv.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/subjects.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"bioRxiv subjects — subjects","text":"https://www.biorxiv.org/alertsrss","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/subjects.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"bioRxiv subjects — subjects","text":"","code":"subjects #> $biorxiv #> [1] \"all\" #> [2] \"animal_behavior_and_cognition\" #> [3] \"biochemistry\" #> [4] \"bioengineering\" #> [5] \"bioinformatics\" #> [6] \"biophysics\" #> [7] \"cancer_biology\" #> [8] \"cell_biology\" #> [9] \"clinical_trials\" #> [10] \"developmental_biology\" #> [11] \"ecology\" #> [12] \"epidemiology\" #> [13] \"evolutionary_biology\" #> [14] \"genetics\" #> [15] \"genomics\" #> [16] \"immunology\" #> [17] \"microbiology\" #> [18] \"molecular_biology\" #> [19] \"neuroscience\" #> [20] \"paleontology\" #> [21] \"pathology\" #> [22] \"pharmacology_and_toxicology\" #> [23] \"plant_biology\" #> [24] \"scientific_communication_and_education\" #> [25] \"synthetic_biology\" #> [26] \"systems_biology\" #> [27] \"zoology\" #> #> 
$medrxiv #> [1] \"all\" #> [2] \"addiction_medicine\" #> [3] \"allergy_and_immunology\" #> [4] \"anesthesia\" #> [5] \"cardiovascular_medicine\" #> [6] \"dentistry_and_oral_medicine\" #> [7] \"dermatology\" #> [8] \"dermatology\" #> [9] \"endocrinology\" #> [10] \"epidemiology\" #> [11] \"ecology\" #> [12] \"epidemiology\" #> [13] \"forensic_medicine\" #> [14] \"gastroenterology\" #> [15] \"genetic_and_genomic_medicine\" #> [16] \"geriatric_medicine\" #> [17] \"health_economics\" #> [18] \"health_informatics\" #> [19] \"health_policy\" #> [20] \"health_systems_and_quality_improvement\" #> [21] \"hematology\" #> [22] \"hivaids\" #> [23] \"infectious_diseases\" #> [24] \"intensive_care_and_critical_care_medicine\" #> [25] \"medical_education\" #> [26] \"medical_ethics\" #> [27] \"nephrology\" #> [28] \"neurology\" #> [29] \"nursing\" #> [30] \"nutrition\" #> [31] \"obstetrics_and_gynecology\" #> [32] \"occupational_and_environmental_health\" #> [33] \"oncology\" #> [34] \"ophthalmology\" #> [35] \"orthopedics\" #> [36] \"otolaryngology\" #> [37] \"pain_medicine\" #> [38] \"palliative_medicine\" #> [39] \"pathology\" #> [40] \"pediatrics\" #> [41] \"pharmacology_and_therapeutics\" #> [42] \"primary_care_research\" #> [43] \"psychiatry_and_clinical_psychology\" #> [44] \"public_and_global_health\" #> [45] \"radiology_and_imaging\" #> [46] \"rehabilitation_medicine_and_physical_therapy\" #> [47] \"respiratory_medicine\" #> [48] \"rheumatology\" #> [49] \"sexual_and_reproductive_health\" #> [50] \"sports_medicine\" #> [51] \"surgery\" #> [52] \"toxicology\" #> [53] \"transplantation\" #> [54] \"urology\" #>"},{"path":"https://stephenturner.github.io/biorecap/reference/tt_preprints.html","id":null,"dir":"Reference","previous_headings":"","what":"Create a markdown table from prepreprint summaries — tt_preprints","title":"Create a markdown table from prepreprint summaries — tt_preprints","text":"Create markdown table prepreprint 
summaries","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/tt_preprints.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create a markdown table from prepreprint summaries — tt_preprints","text":"","code":"tt_preprints(preprints, cols = c(\"title\", \"summary\"), width = c(1, 3))"},{"path":"https://stephenturner.github.io/biorecap/reference/tt_preprints.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create a markdown table from prepreprint summaries — tt_preprints","text":"preprints Output get_preprints() followed add_prompt() followed add_summary(). cols Columns display resulting markdown table. width Vector relative widths equal length(cols).","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/tt_preprints.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Create a markdown table from prepreprint summaries — tt_preprints","text":"tinytable table.","code":""},{"path":"https://stephenturner.github.io/biorecap/reference/tt_preprints.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Create a markdown table from prepreprint summaries — tt_preprints","text":"","code":"# Use built-in example data example_preprints #> # A tibble: 60 × 7 #> source subject title url abstract prompt summary #> #> 1 bioRxiv bioinformatics MedGraphNet: Leveraging… http… Genetic… I am … MedGra… #> 2 bioRxiv bioinformatics High-throughput bacteri… http… The com… I am … The co… #> 3 bioRxiv bioinformatics scParadise: Tunable hig… http… scRNA-s… I am … scAdam… #> 4 bioRxiv bioinformatics Camera Paths, Modeling,… http… The enh… I am … ArtiaX… #> 5 bioRxiv bioinformatics dScaff - an automatic b… http… Rapid e… I am … dScaff… #> 6 bioRxiv bioinformatics Jaeger: an accurate and… http… Abstrac… I am … Jaeger… #> 7 bioRxiv bioinformatics AI-Augmented R-Group Ex… http… Efficie… I am … The pa… #> 8 
bioRxiv bioinformatics OPLS-based Multiclass C… http… Multicl… I am … OPLS-D… #> 9 bioRxiv bioinformatics STANCE: a unified stati… http… A signi… I am … STANCE… #> 10 bioRxiv bioinformatics AsaruSim: a single-cell… http… Motivat… I am … AsaruS… #> # ℹ 50 more rows tt_preprints(example_preprints| title | summary || [MedGraphNet: Leveraging Multi-Relational Graph Neural Networks and Text Knowledge for Biomedical Predictions](http://biorxiv.org/cgi/content/short/2024.09.24.614782v1?rss=1) | MedGraphNet leverages multi-relational Graph Neural Networks and text knowledge to improve biomedical predictions by initializing nodes using informative embeddings from existing text knowledge, allowing for robust integration of various data types and improved generalizability. The model demonstrates superior performance compared to traditional single-relation approaches in scenarios with isolated or sparsely connected nodes, particularly in identifying disease-gene associations and drug-phenotype relationships, and shows promising results in accurately inferring drug side effects without direct training on such data. || [High-throughput bacterial aggregation analysis in droplets](http://biorxiv.org/cgi/content/short/2024.09.24.613170v1?rss=1) | The communal lifestyle of bacteria can contribute significantly to antimicrobial resistance by promoting biofilm formation. A key approach to addressing this issue is to develop novel techniques for analyzing bacterial behavior, such as those enabled by droplet-based platforms and image analysis methods. 
| #> +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+"},{"path":"https://stephenturner.github.io/biorecap/news/index.html","id":"biorecap-020","dir":"Changelog","previous_headings":"","what":"biorecap 0.2.0","title":"biorecap 0.2.0","text":"Added medRxiv support. get_preprints() function now pull either bioRxiv medRxiv RSS feed depending subject passed . downstream functions reporting updated reflect change (fixes #5). Changed default model llama 3.2 3B. Added new source column returned preprints indicating whether preprint came bioRxiv medRxiv. Updated tests.","code":""},{"path":"https://stephenturner.github.io/biorecap/news/index.html","id":"biorecap-011","dir":"Changelog","previous_headings":"","what":"biorecap 0.1.1","title":"biorecap 0.1.1","text":"Fix bug add_summary() caused upstream changes ollamar (fixes #1). Bumped minimum required version ollamar 1.2.1.","code":""},{"path":"https://stephenturner.github.io/biorecap/news/index.html","id":"biorecap-010","dir":"Changelog","previous_headings":"","what":"biorecap 0.1.0","title":"biorecap 0.1.0","text":"Initial release.","code":""}]
Add prompts for an entire subject
- Source:R/biorecap.R
+ Source: R/biorecap.R
add_prompt_subject.Rd
Generate a summary from a data frame of prompts
- Source:R/biorecap.R
+ Source: R/biorecap.R
add_summary.Rd
biorecap: Retrieve and summarize bioRxiv and medRxiv preprints with a local LLM using ollama
- Source:R/biorecap-package.R
+ Source: R/biorecap-package.R
biorecap-package.Rd
Create a report from bioRxiv/medRxiv preprints
- Source:R/biorecap.R
+ Source: R/biorecap.R
biorecap_report.Rd
Construct a prompt to summarize a paper
- Source:R/biorecap.R
+ Source: R/biorecap.R
build_prompt_preprint.Rd
Construct a prompt to summarize a set of papers from a subject
- Source:R/biorecap.R
+ Source: R/biorecap.R
build_prompt_subject.Rd
Create a markdown table from preprint summaries
- Source:R/biorecap.R
+ Source: R/biorecap.R
tt_preprints.Rd