garm keeps deleting runners forever #331

Closed
maigl opened this issue Jan 29, 2025 · 3 comments

Comments

@maigl
Contributor

maigl commented Jan 29, 2025

Hi,

I have a garm instance with multiple pools for multiple cloud backends. Today one of my clouds (OpenStack) died, and garm could not create or delete any instances on it.

Garm then tried constantly to delete the runners it still had in its list, which led to high CPU load and a great many errors in the logs.
I also got database locks.

Errors from the log:

garm-prod-6bd95c77fd-p7xvs garm 2025-01-28T17:51:45.597460732+01:00 time=2025-01-28T16:51:45.597Z level=INFO msg="removing instance from pool" runner_name=road-runner-os022-s-s6XaosPoNiHt pool_id=xxx pool_type=enterprise
....

and

garm-prod-6bd95c77fd-wrc55 garm 2025-01-28T17:38:51.686397993+01:00 time=2025-01-28T16:38:51.686Z level=ERROR msg="failed to update runner status" error="database is locked\nupdating instance\ngithub.com/cloudbase/garm/database/sql.(*sqlDatabase).UpdateInstance\n\t/tmp/git_cache/garm_repo/database/sql/instances.go:245\ngithub.com/cloudbase/garm/runner/pool.(*basePoolManager).setInstanceStatus

I also tried to delete the 'lost' runners with --force, but due to the database locks I had to try this multiple times until they were successfully deleted. After force-deleting the runners, the CPU load went down and the database locks disappeared.

Michael Kuhnt [email protected] Mercedes-Benz Tech Innovation GmbH (ProviderInformation)

@maigl
Contributor Author

maigl commented Jan 29, 2025

I'll try these and see if they help:

#329
#328

@gabriel-samfira
Member

Hi @maigl ,

Any updates after applying the two patches?

@maigl
Contributor Author

maigl commented Feb 13, 2025

Hi @gabriel-samfira, we deployed both features and it's much, much better. The number of database locks is now very low, but not zero.

I think it's worth adding both features.

maigl closed this as completed Feb 13, 2025