add tarpit responder #48
base: main
Conversation
Force-pushed from 5bb9f26 to d9bea6c
Signed-off-by: circa10a <[email protected]>
Force-pushed from d9bea6c to 7fa53e5
I love the idea of it lol, but I think something more akin to instantly responding with headers and then very slowly streaming out text (like the Bee Movie script I linked in #39) would be better. Right now I would assume most scrapers would simply time out and move on instead of waiting for headers + response. Perhaps something like the following:

```go
func (t *TarpitResponder) ServeHTTP(w http.ResponseWriter, r *http.Request, _ caddyhttp.Handler) error {
	// Send the headers
	...
	flusher, ok := w.(http.Flusher)
	if !ok {
		return errors.New("streaming not supported") // or just... return something else
	}
	ctx := r.Context()
	// t.Response being the movie script or whatever the user chooses
	// (could be loaded from a file)
	buf := []byte(t.Response)
	for i := 0; i < len(buf); i++ {
		select {
		case <-ctx.Done():
			return nil // request canceled
		default:
			w.Write([]byte{buf[i]})
			flusher.Flush()
			time.Sleep(100 * time.Millisecond) // Configurable bytes/sec
		}
	}
	return nil
}
```

Do note my POC does not handle client disconnects. Also note my impl does byte-by-byte chunking, which would NOT be efficient; a better chunking method would be needed.
see comment
Ah yeah, I see what you mean. I wasn't quite sure of the meaning of #39 at first. I like the idea of being able to flood attackers/crawlers with more nonsense much more. I'll work on getting that implemented.
In that case, #1 may be of particular interest to you. I linked the repository where I already have my work for that done; essentially it's Markov text-chain generation. I experimented with a couple of different options for generating realistic-looking text, but I couldn't think of anything besides Markov chains with a low enough memory and compute footprint to be usable. If you have any other ideas, just put them in there!
I would expand on this to include a few anchor tags that link back to the page itself, so the scraper gets stuck inside an endless scraping loop.
Addresses #38, which is to add a tarpit responder, ultimately designed to waste crawlers' time. The responder allows a user to configure response headers, how long the request should take, the response code, and reuses the existing message field to optionally return a customized message. If a tarpit responder is enabled but no configuration options are provided, the defaults are:
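For illustration only, a Caddyfile using such a responder might look something like the following; the directive and option names here are guesses based on the options described above, not the PR's actual syntax:

```caddyfile
example.com {
	tarpit {
		duration 30s
		response_code 200
		headers {
			Content-Type text/html
		}
		message "According to all known laws of aviation..."
	}
}
```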
Validation:
Caddyfile:
Request/response: