backup fails due to not enough space for execution #242

Closed
harshad16 opened this issue Sep 20, 2022 · 4 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. sig/devsecops Categorizes an issue or PR as relevant to SIG DevSecOps. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@harshad16
Member

Describe the bug

2022-09-19 01:33:08,760 1 ERROR thoth.graph_backup_job:111: Error saving the pg dump Command exited with non-zero status code (1): pg_dump: [archiver] could not write to output file: No space left on device
.
Traceback (most recent call last):
  File "app.py", line 95, in main
    run_command(
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/analyzer/command.py", line 107, in run_command
    raise CommandError(error_msg, command=command, is_json=is_json)
thoth.analyzer.command.CommandError: Command exited with non-zero status code (1): pg_dump: [archiver] could not write to output file: No space left on device

To Reproduce
Steps to reproduce the behavior:

  1. Go to thoth-graph-stage
  2. See error in the backup job pod

Expected behavior
The job backs up the database periodically.

@harshad16
Member Author

harshad16 commented Sep 20, 2022

The fix is to write the dump to a persistent volume instead of the pod's in-memory storage.

  • mount a volume to the graph-backup-job
  • adjust the application to utilize that mounted volume for backing up the db.
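
A minimal sketch of the second step, assuming the volume is mounted into the job pod and its path is passed through an environment variable. The variable name `GRAPH_BACKUP_DIR`, the default `/mnt/backup`, and both helper functions are hypothetical and not taken from the graph-backup-job code; only the `pg_dump` options `--dbname` and `--file` are real.

```python
import os
from datetime import datetime, timezone
from pathlib import PurePosixPath

# Hypothetical name of the env var that carries the volume mount path.
BACKUP_DIR_ENV = "GRAPH_BACKUP_DIR"


def dump_path(now=None, env=None):
    """Return a timestamped dump file path inside the mounted volume."""
    env = os.environ if env is None else env
    # PurePosixPath because the path lives inside a Linux container.
    backup_dir = PurePosixPath(env.get(BACKUP_DIR_ENV, "/mnt/backup"))
    now = now or datetime.now(timezone.utc)
    return backup_dir / f"pg_dump-{now:%Y%m%d-%H%M%S}.sql"


def pg_dump_command(database, output):
    """Build a pg_dump invocation that writes to `output` instead of stdout."""
    return ["pg_dump", f"--dbname={database}", f"--file={output}"]
```

The resulting list could then be handed to whatever the job already uses to run commands; how the existing `run_command` call takes its arguments is not shown in this issue, so that part is left out.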

/triage accepted
/sig devsecops
/priority critical-urgent

@sesheta sesheta added triage/accepted Indicates an issue or PR is ready to be actively worked on. sig/devsecops Categorizes an issue or PR as relevant to SIG DevSecOps. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Sep 20, 2022
@harshad16 harshad16 added the kind/bug Categorizes issue or PR as related to a bug. label Sep 20, 2022
@harshad16 harshad16 self-assigned this Sep 20, 2022
@harshad16 harshad16 moved this to New in SIG-DevSecOps Sep 22, 2022
@VannTen
Member

VannTen commented Sep 22, 2022

Is the dump sent elsewhere after that? If it is, maybe we can use an emptyDir instead of a persistent volume?

I don't have a good sense of how much local space the OpenShift nodes have or how large the dump is.

@codificat codificat moved this to 🆕 New in Planning Board Sep 26, 2022
@harshad16
Member Author

harshad16 commented Sep 26, 2022

The dump is uploaded to S3 storage.
Yes, we could use an emptyDir, but since we already had PV storage, we used that.
The dump size has currently reached 15 GB,
and the node storage that an emptyDir depends on varies from cluster to cluster.
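
Since the available space differs per cluster, the job could check the target directory before starting the dump and fail early with a clear message. This is only a sketch: the helper name and the 20% headroom are assumptions, and the expected dump size would have to come from configuration or a previous run.

```python
import shutil

GIB = 1024 ** 3


def has_room_for_dump(directory, expected_dump_bytes, headroom=0.2):
    """Return True if the filesystem holding `directory` can fit the dump.

    `headroom` is an assumed safety margin on top of the expected size.
    """
    free = shutil.disk_usage(directory).free
    return free >= expected_dump_bytes * (1 + headroom)
```

For the size mentioned above, a call would look like `has_room_for_dump("/mnt/backup", 15 * GIB)`, where `/mnt/backup` stands in for wherever the volume is mounted.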

@harshad16
Member Author

This is completed; the change is synced and working.
https://console-openshift-console.apps.ocp4.prod.psi.redhat.com/k8s/ns/thoth-graph-stage/cronjobs/graph-backup/jobs
Closing the issue.
Thanks everyone for working on it. 💯

Repository owner moved this from 🆕 New to ✅ Done in SIG-DevSecOps Oct 12, 2022
Repository owner moved this from 🏗 In progress to ✅ Done in Planning Board Oct 12, 2022