Add community indexer #10680

Open
wants to merge 20 commits into
base: master

Contributor

@rbennettcw rbennettcw commented Jan 23, 2025

Link to Issue

Closes: #10618

TODO:

  • Verify community search works
  • Verify pagination performance
  • Verify clanker community metadata

Description of Changes

  • Adds standalone script to fetch all current clanker tokens and create a community for each
  • Adds cron job + community indexer policy which fetches clanker tokens and creates a community for each token found

Test Plan

  • Run migrations
  • Set envs:
    • COMMUNITY_INDEXER_CRON='* * * * *'
    • MAX_CLANKER_BACKFILL=1000
  • Run backfill script: pnpm backfill-clanker-tokens
    • It should create 1000 new communities from the latest clanker tokens
  • Run message relayer and consumer – it should pull in the latest tokens every minute
    • Check the clanker.world home page to confirm that the latest tokens listed there show up in the logs every minute

Deployment Plan

  • Set envs:
    • MAX_CLANKER_BACKFILL=0 – will backfill all tokens
  • Run the pnpm backfill-clanker-tokens script on a new heroku instance
  • When finished, set env:
    • COMMUNITY_INDEXER_CRON='0 * * * *' – indexer will trigger every hour
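
The two MAX_CLANKER_BACKFILL settings above (1000 for testing, 0 for "all tokens") could be interpreted roughly as follows. This is only a sketch: the helper name `parseBackfillLimit` is hypothetical, and treating an unset variable the same as 0 is an assumption, not behavior confirmed by this PR.

```typescript
// Hypothetical helper, not code from this PR.
// Turns the raw MAX_CLANKER_BACKFILL value into a numeric cap.
function parseBackfillLimit(raw: string | undefined): number {
  // Assumption: an unset variable behaves like '0' (no cap).
  const value = raw === undefined ? '0' : raw.trim();
  if (!/^\d+$/.test(value)) {
    throw new Error(
      'MAX_CLANKER_BACKFILL must be a non-negative integer, got: ' + raw,
    );
  }
  const limit = Number(value);
  // 0 means "backfill all tokens", so it maps to an unbounded cap.
  return limit === 0 ? Number.POSITIVE_INFINITY : limit;
}
```

With this reading, `parseBackfillLimit('1000')` caps the backfill at 1000 tokens, while `parseBackfillLimit('0')` removes the cap.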

Other Considerations

N/A

@dillchen dillchen added this to the Community Homepage milestone Jan 23, 2025
@rbennettcw rbennettcw requested a review from Rotorsoft January 23, 2025 22:03
@@ -0,0 +1,25 @@
import { z } from 'zod';

export const CommunityIndexer = z.object({
Contributor

Do we need a new model for this? It looks like a cache for a retry utility, so it would probably be better in Redis.

Contributor Author

It's state, but I wouldn't say it's a cache.

There are 50K clanker tokens to fetch initially, so there should be a robust way of tracking progress. Also, there will be multiple community indexers in the near future for different sources. Each indexer will fetch many tokens initially, then periodically fetch the newest ones.

It works pretty much exactly the way the evm listener tracks the last block that it polled.
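
A rough sketch of that watermark idea, for illustration only. The field names (`source`, `status`, `watermark`), the status values, and the `advanceWatermark` helper are all hypothetical and are not taken from the schema in this PR.

```typescript
// Hypothetical state shape: one row per indexer source.
type IndexerStatus = 'idle' | 'pending' | 'error';

interface IndexerState {
  source: string; // e.g. 'clanker'
  status: IndexerStatus;
  watermark: Date | null; // creation time of the newest token seen; null before the first run
}

interface IndexedToken {
  id: number;
  createdAt: Date;
}

// Keeps only tokens newer than the stored watermark and returns the state
// to persist afterwards, much like an EVM listener advancing its last block.
function advanceWatermark(
  state: IndexerState,
  fetched: IndexedToken[],
): { fresh: IndexedToken[]; next: IndexerState } {
  const watermark = state.watermark;
  const fresh = fetched.filter(
    (t) => watermark === null || t.createdAt.getTime() > watermark.getTime(),
  );
  let newest = watermark;
  for (const t of fresh) {
    if (newest === null || t.createdAt.getTime() > newest.getTime()) {
      newest = t.createdAt;
    }
  }
  return { fresh, next: { ...state, status: 'idle', watermark: newest } };
}
```

The initial backfill and the periodic runs can share this logic: the first run starts with a `null` watermark and accepts everything, and later runs only accept tokens created after the stored value.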

Contributor Author

RE: using Redis

After thinking about it more, my main concern is the lack of enforced typing and migrations. Right now we just store the watermark and status, but if it ever becomes more sophisticated than that (which is likely, considering how quickly requirements change), we won't have a framework for migrating the cache. Or I guess we'd just destroy the cache and refetch everything, which will eventually be 100K+ tokens from clanker + pump.fun. So the choice is between being error-prone and being inefficient.

Storing it in Postgres is a bit weird since it's infra state and not model state, but it's the best option for building something robust that handles future needs. The next best thing would be a separate PG DB for these things, but that's overkill.

@rbennettcw rbennettcw marked this pull request as ready for review January 27, 2025 23:53
@rbennettcw rbennettcw requested a review from kurtassad January 27, 2025 23:53
@rbennettcw
Contributor Author

Still need to fix the tags and image upload.

@rbennettcw rbennettcw requested a review from Rotorsoft January 27, 2025 23:54
@dillchen dillchen linked an issue Jan 31, 2025 that may be closed by this pull request
@mzparacha mzparacha linked an issue Jan 31, 2025 that may be closed by this pull request
@dillchen
Contributor

dillchen commented Feb 3, 2025

Let's wait to merge/deploy this until the other community homepage tickets (product side) are merged in, because the communities won't be useful until then.

@rbennettcw
Contributor Author

I have a bunch of conflicts to fix, but the main idea is there.

@timolegros before you dig into this, it's worth noting that the clanker API doesn't allow you to jump to a specific ID/timestamp or any sort of watermark. You can jump to an arbitrary page number, but that's useless if you don't know what's on those pages. So the implementation reflects that limitation.
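
To illustrate the limitation, here is a minimal sketch of paging from the front until known tokens show up. It is not the code in this PR. It assumes pages are numbered from 1 and return the newest tokens first, which is an assumption about the API. `fetchPage`, `PagedToken`, and `collectNewTokens` are hypothetical names, and the fetch is synchronous only to keep the example short (a real client would be async).

```typescript
// Hypothetical token shape; createdAt is epoch milliseconds (assumed).
interface PagedToken {
  id: number;
  createdAt: number;
}

// Hypothetical page fetcher, injected so the paging logic can be tested.
type FetchPage = (page: number) => PagedToken[];

// Walks pages starting at page 1 and stops at the first token that is not
// newer than the watermark, since the API cannot seek to the watermark itself.
function collectNewTokens(
  fetchPage: FetchPage,
  watermark: number,
  maxPages: number,
): PagedToken[] {
  const fresh: PagedToken[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const tokens = fetchPage(page);
    if (tokens.length === 0) {
      break; // no more pages
    }
    for (const token of tokens) {
      if (token.createdAt <= watermark) {
        return fresh; // reached tokens that were already indexed
      }
      fresh.push(token);
    }
  }
  return fresh;
}
```

Because new tokens can shift page boundaries while paging is in progress, a real implementation would likely also need to de-duplicate by token ID.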

Development

Successfully merging this pull request may close these issues.

Clanker Indexing (stub) Index Clanker Communities Generating Community Page
4 participants