Commit Predictions File with Git Hash Link, Remove Artifact Upload #1350

Merged 15 commits on Jan 20, 2025
Changes from 8 commits
31 changes: 22 additions & 9 deletions .github/workflows/paper_ranking.yml
@@ -5,6 +5,10 @@ on:
- cron: '0 0 1 * *' # runs on the first day of every month
workflow_dispatch:

permissions:
contents: write
issues: write

jobs:
paper-ranking:
runs-on: ubuntu-latest
@@ -41,11 +45,20 @@ jobs:
echo "PYTHONPATH=$PYTHONPATH" # Verify PYTHONPATH
python src/bioregistry/analysis/paper_ranking.py --start-date ${{ env.START_DATE }} --end-date ${{ env.END_DATE }}

- name: Upload Full List as Artifact
uses: actions/upload-artifact@v3
with:
name: full-predictions-list-${{ env.START_DATE }}-to-${{ env.END_DATE }}
path: exports/analyses/paper_ranking/predictions_${{ env.START_DATE }}_to_${{ env.END_DATE }}.tsv
- name: Configure Git
run: |
git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

- name: Commit and Push Changes
run: |
git add exports/analyses/paper_ranking/predictions.tsv
git commit -m "Add predictions file for ${{ env.START_DATE }} to ${{ env.END_DATE }}"
git push

- name: Find Commit Hash
id: get-commit-hash
run: echo "COMMIT_HASH=$(git rev-parse HEAD)" >> $GITHUB_ENV

- name: Find Existing Issue
id: find-issue
@@ -70,17 +83,17 @@ jobs:
script: |
const fs = require('fs');
const issueNumber = ${{ steps.find-issue.outputs.result }};
const startDate = process.env.START_DATE;
const endDate = process.env.END_DATE;
const content = fs.readFileSync(`exports/analyses/paper_ranking/predictions_${startDate}_to_${endDate}.tsv`, 'utf8');
const commitHash = process.env.COMMIT_HASH;
const rankingFileLink = `https://github.com/${{ github.repository }}/blob/${commitHash}/exports/analyses/paper_ranking/predictions.tsv`;
const content = fs.readFileSync(`exports/analyses/paper_ranking/predictions.tsv`, 'utf8');
const lines = content.split('\n').slice(1, 21);
const rows = lines.map(line => {
const [pubmed, title] = line.split('\t');
const link = `https://bioregistry.io/pubmed:${pubmed}`;
return `| [${pubmed}](${link}) | ${title} |`;
});
const tableHeader = '| PubMed ID | Title |\n| --- | --- |\n';
const commentBody = `This issue contains monthly updates to an automatically ranked list of PubMed papers as candidates for curation in the Bioregistry. Papers may be relevant in at least three ways: \n(1) as a new prefix for a resource that can be added to the Bioregistry,\n(2) as a provider for an existing prefix, or\n(3) as a new publication for an existing prefix already in the Bioregistry.\n\nThese curations can happen in separate issues and pull requests. The full list of ranked papers can be found [here](https://github.com/${{ github.repository }}/blob/main/exports/analyses/paper_ranking/predictions_${startDate}_to_${endDate}.tsv). If you review any of these papers for relevance, you should edit the curated papers file [here](https://github.com/${{ github.repository }}/blob/main/src/bioregistry/data/curated_papers.tsv); these curations are taken into account when retraining the ranking model.\n\n**New entries for ${startDate} to ${endDate}:**\n\n${tableHeader}${rows.join('\n')}`;
const commentBody = `This issue contains monthly updates to an automatically ranked list of PubMed papers as candidates for curation in the Bioregistry. Papers may be relevant in at least three ways: \n(1) as a new prefix for a resource that can be added to the Bioregistry,\n(2) as a provider for an existing prefix, or\n(3) as a new publication for an existing prefix already in the Bioregistry.\n\nThese curations can happen in separate issues and pull requests. The full list of ranked papers can be found [here](${rankingFileLink}). If you review any of these papers for relevance, you should edit the curated papers file [here](https://github.com/${{ github.repository }}/blob/main/src/bioregistry/data/curated_papers.tsv); these curations are taken into account when retraining the ranking model.\n\n**New entries for ${startDate} to ${endDate}:**\n\n${tableHeader}${rows.join('\n')}`;

if (issueNumber) {
await github.rest.issues.createComment({
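For anyone reading the script step above in isolation, the two pieces of logic it relies on (a permalink pinned to the commit hash, and a markdown table built from the first 20 data rows of the TSV) can be sketched as standalone functions. This is only an illustrative sketch, not code from this PR: the names `buildPermalink` and `buildTable` are hypothetical, and it assumes, as the script does, that the first two TSV columns are the PubMed ID and the title. Unlike the workflow script, it also skips blank lines, so a file with fewer than 20 rows and a trailing newline would not produce an empty table row.

```javascript
// Hypothetical helpers for illustration; they are not part of the workflow in this PR.

// Build a link to a file as it existed at a specific commit, so the link
// keeps pointing at that month's predictions even after the file is overwritten.
function buildPermalink(repository, commitHash, path) {
  return `https://github.com/${repository}/blob/${commitHash}/${path}`;
}

// Turn the first `limit` data rows of a TSV string into a markdown table.
// Assumes the first two columns are the PubMed ID and the paper title.
function buildTable(tsv, limit = 20) {
  const rows = tsv
    .split('\n')
    .slice(1)                              // drop the header row
    .filter(line => line.trim() !== '')    // skip blank lines such as a trailing newline
    .slice(0, limit)                       // keep at most `limit` rows
    .map(line => {
      const [pubmed, title] = line.split('\t');
      return `| [${pubmed}](https://bioregistry.io/pubmed:${pubmed}) | ${title} |`;
    });
  return ['| PubMed ID | Title |', '| --- | --- |', ...rows].join('\n');
}
```

In the workflow these would be called with `github.repository`, the `COMMIT_HASH` environment variable, and the contents of `exports/analyses/paper_ranking/predictions.tsv`.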