Clone failed: could not read Username for '$url': No such device or address #66

Closed
Rubonnek opened this issue Mar 22, 2023 · 35 comments

Comments

@Rubonnek

Rubonnek commented Mar 22, 2023

Describe the bug

Say you have two pipelines defined where the first clones fine but takes about an hour to finish. The default clone operation fails on the second pipeline with the message:

fatal: could not read Username for '$url': No such device or address

when running against Forgejo v1.19 even when WOODPECKER_AUTHENTICATE_PUBLIC_REPOS=true is set.

I believe this issue is related to some timeout value on Forgejo's side since in its logs I get a 404 Unauthorized access message.

System Info
The /version slug is not working for me against my server instance -- I'm getting a 404.

Version: next-3a475ce2
Image: docker.io/woodpeckerci/woodpecker-server:next
Hash: 09e1c0597a92

Compose file example:

version: "3.8"
services:
  app:
    image: docker.io/woodpeckerci/woodpecker-server:next
    ports:
      - "<REDACTED>"
    environment:
      - WOODPECKER_OPEN=true
      - WOODPECKER_ADMIN=<REDACTED>
      - WOODPECKER_HOST=https://<REDACTED>
      - WOODPECKER_AGENT_SECRET=<REDACTED>
      - WOODPECKER_GITEA=true
      - WOODPECKER_GITEA_URL=https://<REDACTED>
      - WOODPECKER_GITEA_CLIENT=<REDACTED>
      - WOODPECKER_GITEA_SECRET=<REDACTED>
      - WOODPECKER_AUTHENTICATE_PUBLIC_REPOS=true
    volumes:
      - woodpecker:/var/lib/woodpecker/
volumes:
    woodpecker:

Additional Context

As a workaround, I implemented my own clone step in an Alpine container, using an access token as the password instead. For example, cloning https://$USER:[email protected]/$ORG/$REPO works fine for me.

This is what I'm using specifically:

skip_clone: true

pipeline:
  clone:
    image: docker.io/alpine/git:latest
    secrets:
      - source: access-token-secret
        target: ACCESS_TOKEN
    commands:
      - git init -b $$CI_COMMIT_BRANCH
      # NOTE: Replace GIT_USER on the next line
      - MODIFIED_REPO_URL=$$(printf "%s\n" "$$CI_REPO_REMOTE" | sed -e "s|https://|https://GIT_USER:$$ACCESS_TOKEN@|g")
      - git remote add origin $$MODIFIED_REPO_URL
      - git fetch --no-tags --depth=1 --filter=tree:0 origin "+$$CI_COMMIT_REF"
      - git reset --hard -q $$CI_COMMIT_SHA
      - git submodule update --init --recursive
      - git lfs fetch
      - git lfs checkout
@6543
Member

6543 commented Mar 23, 2023

that has something to do with the crafted netrc config ...

@6543
Member

6543 commented Mar 23, 2023

... not sure what exactly goes wrong, though ... based on the info you provided

@Sebclem

Sebclem commented Mar 23, 2023

I have the same issue, and I think I can add some information:

.woodpecker.yml

clone:
  git:
    image: woodpeckerci/plugin-git:2.0.3
    settings:
      recursive: false

pipeline:
  Check docker-compose files:
    image: docker/compose
    pull: true
    commands:
      - apk add --no-cache bash
      - bash ./test-all
    when:
      - event: "push"
        branch: [main, master]
      - event: [pull_request, manual, deployment]

  Deploy dockers:
    image: appleboy/drone-ssh
    pull: true
    settings:
      host: xxx.xxxx.xxx
      username: root
      key:
        from_secret: ansible_private_key
      port: 22
      command_timeout: 2h
      script:
        - cd /opt/docker-compose
        - git pull
        - ./deploy-all
    when:
      environment: production
      event: deployment

when:
  - event: "push"
    branch: [main, master]
  - event: [pull_request, manual, deployment]

With this file I have the same issue:

+ git init -b master
Initialized empty Git repository in /woodpecker/src/git.xxxxx.xxxx/sebclem/docker-vps/.git/
+ git remote add origin https://git.xxx.xxx/sebclem/docker-vps.git
+ git fetch --no-tags --depth=1 --filter=tree:0 origin +refs/heads/master:
fatal: could not read Username for 'https://git.xxxxx.xxxx': No such device or address
exit status 128

BUT, if I remove the clone part, it works (I get another error caused by a submodule, which is why I use recursive: false):

.woodpecker.yml

pipeline:
  Check docker-compose files:
    image: docker/compose
    pull: true
    commands:
      - apk add --no-cache bash
      - bash ./test-all
    when:
      - event: "push"
        branch: [main, master]
      - event: [pull_request, manual, deployment]

  Deploy dockers:
    image: appleboy/drone-ssh
    pull: true
    settings:
      host: xxxxx.xxxxx.xxxxx
      username: root
      key:
        from_secret: ansible_private_key
      port: 22
      command_timeout: 2h
      script:
        - cd /opt/docker-compose
        - git pull
        - ./deploy-all
    when:
      environment: production
      event: deployment

when:
  - event: "push"
    branch: [main, master]
  - event: [pull_request, manual, deployment]

Output:

+ git init -b master
Initialized empty Git repository in /woodpecker/src/git.sebclem.fr/sebclem/docker-vps/.git/
+ git remote add origin https://git.xxxxx.xxxxx/sebclem/docker-vps.git
+ git fetch --no-tags --depth=1 --filter=tree:0 origin +refs/heads/master:
From https://git.xxxx.xxxx/sebclem/docker-vps
 * branch            master     -> FETCH_HEAD
 * [new branch]      master     -> origin/master
+ git reset --hard -q e0b90a4126347b6bbb0092588dc7ff56d91784d0
+ git submodule update --init --recursive
fatal: No url found for submodule path 'django_vache/django-vache' in .gitmodules
exit status 128

Edit:
I forgot to add that before today this was working great (last success: 2 days ago).
I'm using the woodpecker:next Docker image for both runner and server.

@Rubonnek
Author

Rubonnek commented Mar 23, 2023

BUT, if I remove the clone

Now that you mention it, I recall trying to configure the clone plugin like that and I stumbled upon the same issue.

In fact, I was able to reproduce the issue with just this:

clone:
  git:
    image: woodpeckerci/plugin-git:2.0.3

@anbraten
Member

Could be related to the change from #1352

@Wojnr

Wojnr commented Mar 28, 2023

Could be related to the change from #1352

It probably is. After rolling back to the Docker image from before this PR, almost everything works again.

@Sebclem

Sebclem commented Apr 12, 2023

As a workaround, you can disable this in the repository settings; it works for me:
[screenshot of the repository setting]

@patrickuhlmann

patrickuhlmann commented May 7, 2023

I do have the same issue (message fatal: could not read Username for '***': No such device or address). One thing I noticed is that it seems to work fine if only one pipeline is run at a time (which means that they can start immediately). As soon as jobs are queuing (which means that they start delayed) they run into this problem. I disabled (unchecked) "Only inject netrc credentials into trusted containers" but it still doesn't work.

I am also using the next version of woodpecker with Gitea/Forgejo and my step is configured like that:

clone:
  git:
    image: woodpeckerci/plugin-git
    environment:
      - PLUGIN_LFS=false
      - PLUGIN_SKIP_VERIFY=true

Also, when I restart the pipeline later (without any configuration change or new commit), the build succeeds. I am pretty sure the relevant factor is that more jobs are started than agents are available, so some jobs are delayed.

@patrickuhlmann

... not sure what exactly goes wrong, though ... based on the info you provided

What info would be useful to provide?

@pat-s
Contributor

pat-s commented May 8, 2023

Unchecking "Only inject netrc credentials into trusted containers" worked for me.

@patrickuhlmann

patrickuhlmann commented May 8, 2023

I found this issue today by coincidence: gitkraken/vscode-gitlens#1027. They report having this problem when using HTTPS instead of SSH. I then saw in the log that woodpeckerci/plugin-git indeed uses the html_url: git remote add origin https://forgejo.***.ch/***/***.git.

I checked the content of the webhook in Forgejo. It contains both urls:

    "html_url": "https://forgejo.***.ch/***/***",
    "ssh_url": "git@forgejo.***.ch:***/***.git",

Maybe switching to SSH would make it more stable?

@lafriks
Contributor

lafriks commented May 10, 2023

Maybe switching to SSH would make it more stable?

You can use SSH only with a user's SSH key, and Woodpecker does not have it and should not have it either, so that's not really an option.

@pat-s
Contributor

pat-s commented May 15, 2023

Unchecking "Only inject netrc credentials into trusted containers" worked for me.

I wonder if this should be the default given how many people seem to face issues with it. And not all of them will arrive here and read through this issue?

@lafriks
Contributor

lafriks commented May 16, 2023

As this is only in the development version and 1.0 will be breaking anyway, it's better to be secure by default.

@patrickuhlmann

Unchecking "Only inject netrc credentials into trusted containers" worked for me.

I wonder if this should be the default given how many people seem to face issues with it. And not all of them will arrive here and read through this issue?

I still have the issue even when I uncheck the option. Doesn't seem to be a "reliable workaround".

@pat-s
Contributor

pat-s commented Nov 6, 2023

@patrickuhlmann Do you still face this error with the latest plugin version and latest WP server? If so, could you post your setup in more detail and also what repo options are enabled/disabled?

@patrickuhlmann

patrickuhlmann commented Nov 7, 2023

I updated all components and still face the same issue.

I run everything in docker containers on a Synology Diskstation. I have the following containers:

  • codeberg/forgejo/forgejo:1.20
  • woodpeckerci/woodpecker-server:v1.0.4
  • woodpecker/woodpecker-agent:v1.0.4 (first runner)
  • woodpecker/woodpecker-agent:v1.0.4 (second runner)

The error happens when I run renovate. This job runs for more than an hour and occupies one runner. It creates many pull requests, which in turn trigger builds on other repositories. These builds run on the second runner. In the beginning everything works fine. After a while (when lots of jobs are queued up) they start to fail.

The job output is always like this:

+ git config --global http.sslCAInfo /opt/MyLAN.crt
+ git init -b master
Initialized empty Git repository in /woodpecker/src/forgejo.me.ch/My/repo.git/
+ git remote add origin https://forgejo.me.ch/My/repo.git
+ git fetch --no-tags --depth=1 --filter=tree:0 origin +refs/pull/108/head:
fatal: could not read Username for 'https://forgejo.me.ch': No such device or address
exit status 128

The configuration of the job(s) is:

Project settings
* Allow Pull Requests checked
* Trusted checked

Timeout
* 5min

All pipelines are similar. For example

clone:
  git:
    image: woodpeckerci/plugin-git
    environment:
      - PLUGIN_LFS=false
      - PLUGIN_CUSTOM_SSL_PATH=/opt/MeLAN.crt
    volumes:
      - /volume1/docker/woodpecker/MeLAN.crt:/opt/MeLAN.crt

pipeline:
  verify:
    image: gradle:8.1.0-jdk17-focal
    commands:
    - gradle assemble
    - gradle check
    volumes:
    - /volume1/docker/woodpecker/gradle:/root/.gradle

In the forgejo log I see

..rvices/auth/basic.go:130:Verify() [E] UserSignIn: user's password is invalid [uid: 1, name: patrick]
...s/auth/middleware.go:23:func1() [E] Failed to verify user: user's password is invalid [uid: 1, name: patrick]
 ...eb/routing/logger.go:102:func1() [I] router: completed GET /My/repo.git/info/refs?service=git-upload-pack for 172.17.0.9:0, 401 Unauthorized in 80.4ms @ auth/middleware.go:20(auth.Auth)

In the runner I see

ERR grpc error: wait(): code: Unknown: rpc error: code = Unknown desc = Step finished with exit code 1,  | error=rpc error: code = Unknown desc = Step finished with exit code 1, 
WRN cancel signal received | repo=My/repo pipeline=104 id=3156 error=rpc error: code = Unknown desc = Step finished with exit code 1, 

In the woodpecker logs, I see

 ip=172.17.0.3 latency=12411.027311 method=POST path=/hook status=500 user-agent=Go-http-client/1.1
ERR failure to save pipeline for My/repo | error=database is locked
ERR error=Error #01: failure to save pipeline for My/repo

One thing I am wondering is this "database is locked". Is this normal? Might the problem be that I am using an sqlite3 database?

@pat-s
Contributor

pat-s commented Nov 7, 2023

One thing I am wondering is this "database is locked". Is this normal? Might the problem be that I am using an sqlite3 database?

Likely, I've seen this error with sqlite3 in the past when there was too much load on the DB. And as you're saying, during the renovate run more runs spin up from the PRs opened by renovate which then likely overload the sqlite3 DB.

sqlite3 is usually only suitable for dev purposes; it's better to use Postgres or MySQL even for semi-production home use.
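
For readers following along, here is a minimal sketch of what pointing the Woodpecker server at Postgres could look like in a Compose file. `WOODPECKER_DATABASE_DRIVER` and `WOODPECKER_DATABASE_DATASOURCE` are Woodpecker server settings; the service name `db`, the user and database name `woodpecker`, the volume name, and the Postgres image tag are assumptions for illustration and do not come from this thread.

```yaml
services:
  app:
    image: docker.io/woodpeckerci/woodpecker-server:next
    environment:
      # Switch from the default SQLite database to Postgres.
      - WOODPECKER_DATABASE_DRIVER=postgres
      # Hypothetical host, user, and database name; adjust to your setup.
      - WOODPECKER_DATABASE_DATASOURCE=postgres://woodpecker:<REDACTED>@db:5432/woodpecker?sslmode=disable
    depends_on:
      - db
  db:
    image: docker.io/library/postgres:16  # assumed tag
    environment:
      - POSTGRES_USER=woodpecker
      - POSTGRES_PASSWORD=<REDACTED>
      - POSTGRES_DB=woodpecker
    volumes:
      - woodpecker-db:/var/lib/postgresql/data
volumes:
  woodpecker-db:
```

This only targets the "database is locked" message from the server log; whether it has any effect on the clone failure itself is a separate question.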

@thechubbypanda

Hi all, I'm running into this problem with Gitea latest and woodpecker latest as of today.
Checking/unchecking "Only inject netrc credentials into trusted containers" has no effect on the outcome of the job.
I've also tried a brand new repository (private and public), with and without the checkbox above enabled. All the same.

@patrickuhlmann

I switched to Postgres but still face the same problem. By the way, it would have surprised me if SQLite had been the cause, as it is very much underestimated. Years ago it was already able to handle thousands of selects and inserts in a very short time, even on multi-gigabyte databases (you will find plenty of benchmark info if you search for it).

@pat-s
Contributor

pat-s commented Dec 19, 2023

@thechubbypanda can you verify the issue only exists in WP latest and not in WP 2.0.0? If so, maybe you can track it down further to a specific commit? All main branch commits have associated images.

@patrickuhlmann

I am now running WP 2.0.0 and still have this issue

@thechubbypanda

thechubbypanda commented Dec 19, 2023

Ok so:

WP Server: next-1ca549190b OR v2.0.0 OR v1.0.5
WP Agent: next-1ca549190b OR v2.0.0 OR v1.0.5
plugin-git: latest

Gives the same error unless I overwrite the clone step:

+ git fetch --no-tags origin +master:
fatal: no path specified; see 'git help pull' for valid url syntax
exit status 128

Conclusion: Something else is wrong

@pat-s
Contributor

pat-s commented Dec 20, 2023

@thechubbypanda Thanks. Strange though, as many people use 2.0+ by now and we haven't heard of more issues like this.

I also administer multiple instances and haven't come across the issue in months.

Are you running WP in docker or via a host install?

And since you wrote

Hi all, I'm running into this problem with Gitea latest and woodpecker latest as of today.

Did it work before with an older version/different setup?

@thechubbypanda

By "today" I merely meant that I had just tested it, given that the issue is a few months old now.

I'm running dockerized at the moment.

I started spinning up a completely clean installation of both Gitea and woodpecker yesterday. Will report back if that works. Then it's a matter of narrowing down what setting or situation is causing the problem.

I will note that it's extremely tough to debug the docker images given they appear to not even have sh installed.

@thechubbypanda

thechubbypanda commented Dec 20, 2023

FOUND IT @pat-s:
GITEA__service__REQUIRE_SIGNIN_VIEW: true

Set it to true and the error appears; set it to false and it works as expected.

@qwerty287
Contributor

Can you try to set https://woodpecker-ci.org/docs/administration/server-config#woodpecker_authenticate_public_repos to true?

@pat-s
Contributor

pat-s commented Dec 21, 2023

I will note that it's extremely tough to debug the docker images given they appear to not even have sh installed.

Yes, this is (partly) known. My guess, though, would have been that an issue in the image is very unlikely, as otherwise we would have gotten many more reports here.

It is likely that WOODPECKER_AUTHENTICATE_PUBLIC_REPOS helps; it might do partly the same thing that GITEA__service__REQUIRE_SIGNIN_VIEW does.

@qwerty287 Maybe we should add a warning if Gitea is used as a forge and WOODPECKER_AUTHENTICATE_PUBLIC_REPOS is not true?

@thechubbypanda

Is there a scenario where just having that enabled by default is a bad idea? Or at least inverting it?

@qwerty287
Contributor

qwerty287 commented Dec 21, 2023

Maybe we should add a warning if Gitea is used as a forge and WOODPECKER_AUTHENTICATE_PUBLIC_REPOS is not true?

No, because that's only necessary if you require logins for everything. If GITEA__service__REQUIRE_SIGNIN_VIEW is false, everything works. Also, the same can apply to other forges; it is not Gitea-specific.
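
As a sketch of the combination discussed here: when the forge requires sign-in for every view, the Woodpecker server has to authenticate clones of public repositories as well. Both variable names come from this thread; the service names and the Gitea image are assumptions for illustration.

```yaml
services:
  gitea:
    image: docker.io/gitea/gitea:latest  # assumed image
    environment:
      # With sign-in required everywhere, unauthenticated HTTPS clones
      # of public repositories are expected to be rejected.
      - GITEA__service__REQUIRE_SIGNIN_VIEW=true
  woodpecker-server:
    image: docker.io/woodpeckerci/woodpecker-server:next
    environment:
      # Pass credentials to the clone step for public repositories too.
      - WOODPECKER_AUTHENTICATE_PUBLIC_REPOS=true
```

If `REQUIRE_SIGNIN_VIEW` is left at false, the second setting should not be needed for public repositories.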

@thechubbypanda

It is likely that WOODPECKER_AUTHENTICATE_PUBLIC_REPOS helps

Well regardless, that has fixed the issue for me at least. Thanks

@pat-s
Contributor

pat-s commented Dec 21, 2023

The error message is quite non-descriptive, and it's not easy for users to find out how to solve the issue. Even searching the main repo is tricky, as the issue is being discussed and reported here, and not all users will even arrive here.

I wonder how this can be better communicated - in the end, such details/issues lead to a bad user experience.

@pat-s
Contributor

pat-s commented Dec 21, 2023

Closing here now as the issue seems to be resolved and other users arriving here should find the solution.

Yet I think we should somehow assert GITEA__service__REQUIRE_SIGNIN_VIEW if that's the real underlying cause of this error?

pat-s closed this as completed Dec 21, 2023
@6543
Member

6543 commented Dec 21, 2023

So it was #25 all along :/

@patrickuhlmann

For me, I think the problem was weak hardware. As soon as I switched from my DiskStation to a dedicated computer, and Forgejo as well as Woodpecker ran much faster, the problem was gone.
