http_server: Disable keepalive automatically when HTTP/1.1 and keepalive is enabled for curl requests #9019

cosmo0920
Contributor

@cosmo0920 cosmo0920 commented Jun 28, 2024

With curl's default options, the current in_splunk keeps the connection open because the keepalive flag is enabled.
However, this behavior is undesirable because Splunk HEC clients expect an OK response of the form:

HTTP/1.1 200 OK
content-type: application/json
{"text":"Success","code":0}

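This response is not compatible with HTTP/1.1 keepalive: it carries neither a Content-Length header nor chunked encoding, so the client can only detect the end of the body when the connection is closed.
So we need to turn keepalive off for the curl testing case.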
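
To illustrate why the response above cannot be used on a kept-alive connection, here is a minimal Python sketch of how an HTTP/1.1 client decides where a response body ends. It is a simplification of the message-framing rules, not Fluent Bit's or curl's actual code, and the function names `body_framing` and `can_keep_alive` are hypothetical. It matches the note curl prints in the verbose log in this description ("no chunk, no close, no size. Assume close to signal end").

```python
def body_framing(header_lines):
    """Classify how a client would find the end of a response body.

    Simplified: ignores bodiless status codes (1xx, 204, 304) and
    responses to HEAD requests.
    """
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip().lower()

    if "chunked" in headers.get("transfer-encoding", ""):
        return "chunked"
    if "content-length" in headers:
        return "content-length"
    # No explicit framing: the body ends only when the server closes
    # the connection.
    return "close"


def can_keep_alive(header_lines):
    """A connection is reusable only if the body has explicit framing
    and the server did not send 'Connection: close'."""
    wants_close = any(
        line.lower().replace(" ", "") == "connection:close"
        for line in header_lines
    )
    return body_framing(header_lines) != "close" and not wants_close


# Headers of the HEC response shown above
hec_headers = ["content-type: application/json"]
print(body_framing(hec_headers))    # close
print(can_keep_alive(hec_headers))  # False
```

With only a `content-type` header, the sketch falls through to the "close" case, which is why the server either has to close the connection or add explicit framing.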

Closes #9010


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
[INPUT]
    Name splunk
    Tag splunk.test.ingest
    Port 8090
    Splunk_Token 7bba8847-4aee-4e62-ba7b-08c6139e42b9,4e63c0c9-c3b5-4a0a-bf4d-bfd5bc0d0070
    store_token_in_metadata Off
    Buffer_Max_Size 10M
    tls On
    tls.verify Off
    tls.crt_file ./cert/ca_cert.pem
    tls.key_file ./cert/ca_key.pem
    tls.key_passwd fluentd

[OUTPUT]
    Name stdout
    Match *
  • Debug log output from testing the change
Fluent Bit v3.1.0
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

___________.__                        __    __________.__  __          ________  
\_   _____/|  |  __ __   ____   _____/  |_  \______   \__|/  |_  ___  _\_____  \ 
 |    __)  |  | |  |  \_/ __ \ /    \   __\  |    |  _/  \   __\ \  \/ / _(__  < 
 |     \   |  |_|  |  /\  ___/|   |  \  |    |    |   \  ||  |    \   / /       \
 \___  /   |____/____/  \___  >___|  /__|    |______  /__||__|     \_/ /______  /
     \/                     \/     \/               \/                        \/ 

[2024/06/28 17:14:18] [ info] Configuration:
[2024/06/28 17:14:18] [ info]  flush time     | 1.000000 seconds
[2024/06/28 17:14:18] [ info]  grace          | 5 seconds
[2024/06/28 17:14:18] [ info]  daemon         | 0
[2024/06/28 17:14:18] [ info] ___________
[2024/06/28 17:14:18] [ info]  inputs:
[2024/06/28 17:14:18] [ info]      splunk
[2024/06/28 17:14:18] [ info] ___________
[2024/06/28 17:14:18] [ info]  filters:
[2024/06/28 17:14:18] [ info] ___________
[2024/06/28 17:14:18] [ info]  outputs:
[2024/06/28 17:14:18] [ info]      stdout.0
[2024/06/28 17:14:18] [ info] ___________
[2024/06/28 17:14:18] [ info]  collectors:
[2024/06/28 17:14:18] [ info] [fluent bit] version=3.1.0, commit=060418c51b, pid=1345270
[2024/06/28 17:14:18] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2024/06/28 17:14:18] [ info] [output:stdout:stdout.0] worker #0 started
[2024/06/28 17:14:18] [ info] [storage] ver=1.1.6, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/06/28 17:14:18] [ info] [cmetrics] version=0.9.1
[2024/06/28 17:14:18] [ info] [ctraces ] version=0.5.1
[2024/06/28 17:14:18] [ info] [input:splunk:splunk.0] initializing
[2024/06/28 17:14:18] [ info] [input:splunk:splunk.0] storage_strategy='memory' (memory only)
[2024/06/28 17:14:18] [debug] [splunk:splunk.0] created event channels: read=21 write=22
[2024/06/28 17:14:18] [debug] [downstream] listening on 0.0.0.0:9880
[2024/06/28 17:14:18] [debug] [stdout:stdout.0] created event channels: read=24 write=25
[2024/06/28 17:14:18] [ info] [sp] stream processor started
[2024/06/28 17:14:20] [debug] [task] created task=0x5f10380 id=0 OK
[0] splunk.0: [[1719562460.101198583, {"hec_token"=>"Splunk secret-token"}], {"User"=>"Admin", "password"=>"my_secret_password", "Event"=>"Some text in the event"}]
[2024/06/28 17:14:20] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2024/06/28 17:14:20] [debug] [out flush] cb_destroy coro_id=0
[2024/06/28 17:14:20] [debug] [task] destroy task=0x5f10380 (task_id=0)
^C^C[2024/06/28 17:14:28] [engine] caught signal (SIGINT)
[2024/06/28 17:14:28] [ warn] [engine] service will shutdown in max 5 seconds
[2024/06/28 17:14:28] [ info] [input] pausing splunk.0
[2024/06/28 17:14:29] [ info] [engine] service has stopped (0 pending tasks)
[2024/06/28 17:14:29] [ info] [input] pausing splunk.0
[2024/06/28 17:14:29] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2024/06/28 17:14:29] [ info] [output:stdout:stdout.0] thread worker #0 stopped

CURL logs

% LC_ALL=C curl -vvv \
  --url http://localhost:8090/services/collector \
  --header 'Authorization: Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9' \
  --header 'Content-Type: application/json' \
  --data '{"User":"Admin","password":"my_secret_password","Event":"Some text in the event"}'
* processing: http://localhost:8090/services/collector
*   Trying [::1]:8090...
* connect to ::1 port 8090 failed: Connection refused
*   Trying 127.0.0.1:8090...
* Connected to localhost (127.0.0.1) port 8090
> POST /services/collector HTTP/1.1
> Host: localhost:8090
> User-Agent: curl/8.2.1
> Accept: */*
> Authorization: Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9
> Content-Type: application/json
> Content-Length: 81
> 
< HTTP/1.1 200 OK
< content-type: application/json
* no chunk, no close, no size. Assume close to signal end
< 
* Closing connection
{"text":"Success","code":0}

HTTP/2 upgrade with TLS:

% LC_ALL=C curl -k -vvv \
  --url https://localhost:8090/services/collector \
  --header 'Authorization: Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9' \
  --header 'Content-Type: application/json' \
  --data '{"User":"Admin","password":"my_secret_password","Event":"Some text in the event"}'
* processing: https://localhost:8090/services/collector
*   Trying [::1]:8090...
* connect to ::1 port 8090 failed: Connection refused
*   Trying 127.0.0.1:8090...
* Connected to localhost (127.0.0.1) port 8090
* ALPN: offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN: server accepted h2
* Server certificate:
*  subject: C=US; ST=CA; L=Mountain View; CN=Fluentd Forward CA
*  start date: Jan  1 00:00:00 1970 GMT
*  expire date: Nov 26 07:39:41 2028 GMT
*  issuer: C=US; ST=CA; L=Mountain View; CN=Fluentd Forward CA
*  SSL certificate verify result: self-signed certificate (18), continuing anyway.
* using HTTP/2
* h2 [:method: POST]
* h2 [:scheme: https]
* h2 [:authority: localhost:8090]
* h2 [:path: /services/collector]
* h2 [user-agent: curl/8.2.1]
* h2 [accept: */*]
* h2 [authorization: Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9]
* h2 [content-type: application/json]
* h2 [content-length: 81]
* Using Stream ID: 1
> POST /services/collector HTTP/2
> Host: localhost:8090
> User-Agent: curl/8.2.1
> Accept: */*
> Authorization: Splunk 7bba8847-4aee-4e62-ba7b-08c6139e42b9
> Content-Type: application/json
> Content-Length: 81
> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
< HTTP/2 200 
< content-type: application/json
< 
* Connection #0 to host localhost left intact
{"text":"Success","code":0}
  • Attached Valgrind output that shows no leaks or memory corruption was found
==1986034== 
==1986034== HEAP SUMMARY:
==1986034==     in use at exit: 0 bytes in 0 blocks
==1986034==   total heap usage: 17,425 allocs, 17,425 frees, 2,684,107 bytes allocated
==1986034== 
==1986034== All heap blocks were freed -- no leaks are possible
==1986034== 
==1986034== For lists of detected and suppressed errors, rerun with: -s
==1986034== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature

Not needed.

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@edsiper
Member

edsiper commented Jun 28, 2024

@cosmo0920

Regarding keepalive functionality, in the Upstream interface, we implemented keepalive as an internal feature:

https://github.com/fluent/fluent-bit/blob/master/src/flb_upstream.c#L63-L66

For downstream, the available options are listed here. I am wondering whether it makes sense to add keepalive there instead of in each plugin itself:

https://github.com/fluent/fluent-bit/blob/master/src/flb_downstream.c#L36-L61

@cosmo0920 cosmo0920 changed the title in_splunk: Provide keepalive option to be able to switch on/off in_splunk: http_server: Provide keepalive option to be able to switch on/off Jun 28, 2024
@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-keepalive-option-to-provide-choices-for-it branch from ffdf963 to ce2d751 Compare June 28, 2024 15:50
@cosmo0920 cosmo0920 changed the title in_splunk: http_server: Provide keepalive option to be able to switch on/off in_splunk: http_server: Deny to enable keepalive on in_splunk and unified keepalive settings for downstream Jun 28, 2024
@edsiper edsiper added this to the Fluent Bit v3.1.0 milestone Jun 30, 2024
@cosmo0920
Contributor Author

I also confirmed that the HTTP/2 upgrade over TLS (h2) works for HTTPS with the TLS settings.

@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-keepalive-option-to-provide-choices-for-it branch from ce2d751 to 87ee0f5 Compare July 2, 2024 10:29
@cosmo0920 cosmo0920 changed the title in_splunk: http_server: Deny to enable keepalive on in_splunk and unified keepalive settings for downstream http_server: Disable keepalive automatically when HTTP/2 upgrade is failed Jul 2, 2024
@cosmo0920 cosmo0920 changed the title http_server: Disable keepalive automatically when HTTP/2 upgrade is failed http_server: in_splunk: Disable keepalive automatically when HTTP/1.1 and keepalive is enabled Jul 3, 2024
@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-keepalive-option-to-provide-choices-for-it branch from 3bba03f to 8ebbf20 Compare July 3, 2024 02:11
@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-keepalive-option-to-provide-choices-for-it branch from 8ebbf20 to b27bff1 Compare July 3, 2024 03:43
@cosmo0920 cosmo0920 changed the title http_server: in_splunk: Disable keepalive automatically when HTTP/1.1 and keepalive is enabled http_server: Disable keepalive automatically when HTTP/1.1 and keepalive is enabled for curl requests Jul 3, 2024
@edsiper
Member

edsiper commented Jul 3, 2024

closing this since #9036 fixes the content-length problem
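
The contents of #9036 are not shown in this thread, so the following is only an illustration of the general idea behind a Content-Length fix, not that change itself. The Python sketch below builds the same HEC success response with an explicit `content-length` header, which lets an HTTP/1.1 client find the end of the body without waiting for the connection to close. The helper name `build_hec_ok_response` is hypothetical.

```python
import json


def build_hec_ok_response():
    """Return the raw bytes of an HEC-style 200 response with explicit framing."""
    # Compact separators reproduce the body format shown in this PR.
    body = json.dumps(
        {"text": "Success", "code": 0}, separators=(",", ":")
    ).encode("utf-8")
    head = (
        "HTTP/1.1 200 OK\r\n"
        "content-type: application/json\r\n"
        f"content-length: {len(body)}\r\n"  # length in bytes, not characters
        "\r\n"
    ).encode("ascii")
    return head + body


response = build_hec_ok_response()
print(response.decode("utf-8"))
```

Because the length is computed from the encoded body, the header stays correct if the payload ever contains non-ASCII characters.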

@edsiper edsiper closed this Jul 3, 2024
Development

Successfully merging this pull request may close these issues.

in_splunk is not replying with the expected ok message
2 participants