Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic memory logging #4234

Merged
merged 3 commits into from
Mar 8, 2025
Merged

Add basic memory logging #4234

merged 3 commits into from
Mar 8, 2025

Conversation

Weves
Copy link
Contributor

@Weves Weves commented Mar 7, 2025

Description

Fixes https://linear.app/danswer/issue/DAN-1544/add-basic-memory-loggin

How Has This Been Tested?

Ran docker compose locally, verified I saw the logs as expected in the dir I expected + verified that on container restart the logs were persisted.

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

@Weves Weves requested a review from a team as a code owner March 7, 2025 23:47
Copy link

vercel bot commented Mar 7, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
internal-search ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 8, 2025 3:09am

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary

Added memory monitoring functionality to track and log process memory usage and CPU metrics across the application, with persistent storage and rotation capabilities.

  • Added new celery_worker_monitoring in backend/supervisord.conf to handle dedicated memory monitoring tasks
  • Implemented memory_monitoring.py with rotating file handler (10MB limit, 5 backups) and structured logging format for memory/CPU metrics
  • Added persistent log_store volume mounted at /var/log/persisted-logs across all Docker Compose configurations
  • Integrated emit_process_memory function in indexing tasks to track worker memory usage every 60 seconds
  • Added memory monitoring to both api_server and background services in production configuration for comprehensive coverage

10 file(s) reviewed, 6 comment(s)
Edit PR Review Bot Settings | Greptile


# Create a dedicated logger for memory monitoring
memory_logger = logging.getLogger("memory_monitoring")
memory_logger.setLevel(logging.INFO)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Set propagate=False to prevent duplicate logs if parent logger has handlers

Suggested change
memory_logger.setLevel(logging.INFO)
memory_logger.setLevel(logging.INFO)
memory_logger.propagate = False

Comment on lines +257 to +259
# optional, only for debugging purposes
volumes:
- log_store:/var/log/persisted-logs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Volume mount is marked as optional/debugging but is required for memory monitoring to work. Consider removing the 'optional' comment or clarifying when it's needed.

Comment on lines 59 to 60
except Exception as e:
logger.error(f"Error monitoring worker memory: {e}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Generic exception handling masks specific issues like NoSuchProcess or AccessDenied. Consider handling psutil.Error explicitly.

try:
process = psutil.Process(pid)
memory_info = process.memory_info()
cpu_percent = process.cpu_percent(interval=0.1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: cpu_percent() with interval=0.1 blocks execution. Consider using cpu_percent() without interval for non-blocking operation.

Comment on lines +71 to +72
volumes:
- log_store:/var/log/persisted-logs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Volume mount is only added to background service, but memory_monitoring.py suggests api_server might also need access to write logs

Comment on lines +213 to +214
volumes:
- log_store:/var/log/persisted-logs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Consider adding size limits to the log_store volume to prevent unbounded disk usage

pid = job.process.pid
if pid is not None:
# Only emit memory info once per minute (60 seconds)
current_time = time.time()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time.monotonic preferable in most cases

# Regular application logger
logger = setup_logger()

# Set up a dedicated memory monitoring logger
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will there be sufficient information in the log line to distinguish between various indexing connector sources so we can zero in on what failed? Is there any intention to use this from sources other than indexing where we would need more metadata to distinguish those from indexing metrics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should already be possible via the process_name param (+ additional_metadata). I'm not sure if that's what you're referring to?

@Weves Weves enabled auto-merge March 8, 2025 03:05
@Weves Weves added this pull request to the merge queue Mar 8, 2025
Merged via the queue into main with commit 0d3c72a Mar 8, 2025
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants