-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report cache_peer context in probe and standby pool messages #277
base: bag51
Are you sure you want to change the base?
Report cache_peer context in probe and standby pool messages #277
Conversation
Initial implementation.
src/neighbors.cc
Outdated
@@ -1113,7 +1113,7 @@ neighborUp(const CachePeer * p) | |||
{ | |||
if (!p->tcp_up) { | |||
// TODO: When CachePeer gets its own CodeContext, pass that context instead of nullptr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this PR completes this TODO:
// TODO: When CachePeer gets its own CodeContext, pass that context instead of nullptr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/CachePeer.cc
Outdated
CachePeerProbeCodeContext::detailCodeContext(std::ostream &os) const | ||
{ | ||
if (peer.valid()) | ||
os << Debug::Extra << "current cache_peer probe: " << peer->name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see our commit a555a85 regarding CachePeer reporting rules.
os << Debug::Extra << "current cache_peer probe: " << peer->name; | |
os << Debug::Extra << "current cache_peer probe: " << *peer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
src/CachePeer.cc
Outdated
if (peer.valid()) | ||
return peer->id.detach(); | ||
|
||
return ScopedId("CachePeerProbeCodeContext w/o peer"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's simplify this code and not add CachePeer::id in this PR. We already have a TODO about adding InstanceId to such classes. Also, the whole idea of having codeContextGist() returning an identifier may be an API bug: We should rethink how FadingCodeContext works, so that we can avoid identifier copying and convert this method into an std::ostream-printing method.
if (peer.valid()) | |
return peer->id.detach(); | |
return ScopedId("CachePeerProbeCodeContext w/o peer"); | |
// Unfortunately, peer->name lifetime is too short. See also: AnyP::PortCfg::codeContextGist(). | |
return ScopedId("cache_peer probe"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
src/PeerPoolMgr.h
Outdated
@@ -18,6 +18,20 @@ class HttpRequest; | |||
class CachePeer; | |||
class CommConnectCbParams; | |||
|
|||
class PeerPoolMgrContext : public CodeContext |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see whether this class can be generalized into DetailedCodeContext class that takes an SBuf detail and a c-string gist as constructor parameters (and stores that info in its data members, to be reported using CodeContext APIs). If that works, use DetailedCodeContext instead of CachePeerProbeCodeContext as well.
Eventually, CachePeer will be reference-counted and, hence, may not need a dedicated CodeContext data member, but (re)using DetailedCodeContext for CachePeer now may help us make progress while those (complex) changes are pending (and waiting other/stuck smooth reconfiguration PRs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (30978d0).
src/CachePeer.cc
Outdated
if (peer.valid()) | ||
return peer->id.detach(); | ||
|
||
return ScopedId("CachePeerProbeCodeContext w/o peer"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
src/CachePeer.cc
Outdated
CachePeerProbeCodeContext::detailCodeContext(std::ostream &os) const | ||
{ | ||
if (peer.valid()) | ||
os << Debug::Extra << "current cache_peer probe: " << peer->name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
src/neighbors.cc
Outdated
@@ -1113,7 +1113,7 @@ neighborUp(const CachePeer * p) | |||
{ | |||
if (!p->tcp_up) { | |||
// TODO: When CachePeer gets its own CodeContext, pass that context instead of nullptr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/PeerPoolMgr.h
Outdated
@@ -18,6 +18,20 @@ class HttpRequest; | |||
class CachePeer; | |||
class CommConnectCbParams; | |||
|
|||
class PeerPoolMgrContext : public CodeContext |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/neighbors.cc
Outdated
ipcache_nbgethostbyname(p->host, peerDNSConfigure, p); | ||
for (p = Config.peers; p; p = p->next) { | ||
CallService(p->probeCodeContext, [&] { | ||
ipcache_nbgethostbyname(p->host, peerDNSConfigure, p); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CallService() is for switching to "service" context. ipcache_nbgethostbyname() service is not peer-specific -- the DNS lookup service serves all peers and may even share one lookup across multiple service requesting clients. We should be setting the service caller context here, not the service context. I think it would be more appropriate to use explicit CodeContext::Reset() calls in this loop, just like in neighborsUdpPing() loop modified in this PR.
N.B. peerProbeConnect() is a peer-specific "service".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
src/PeerPoolMgr.cc
Outdated
@@ -32,7 +33,8 @@ PeerPoolMgr::PeerPoolMgr(CachePeer *aPeer): AsyncJob("PeerPoolMgr"), | |||
request(), | |||
transportWait(), | |||
encryptionWait(), | |||
addrUsed(0) | |||
addrUsed(0), | |||
context(new DetailedCodeContext("cache_peer standby pool", ToSBuf("current cache_peer standby pool", *peer))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing delimiter?
context(new DetailedCodeContext("cache_peer standby pool", ToSBuf("current cache_peer standby pool", *peer))) | |
context(new DetailedCodeContext("cache_peer standby pool", ToSBuf("current cache_peer standby pool: ", *peer))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/PeerPoolMgr.h
Outdated
@@ -69,6 +92,9 @@ class PeerPoolMgr: public AsyncJob | |||
JobWait<Security::BlindPeerConnector> encryptionWait; | |||
|
|||
unsigned int addrUsed; ///< counter for cycling through peer addresses | |||
|
|||
public: | |||
DetailedCodeContext::Pointer context; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding CodeContext to PeerPoolMgr job is a red flag for me: We have numerous AsyncJobs that do master transaction-related work. Virtually none1 of them store a CodeContext pointer to report that master transaction -- after their ALE-based code context is established, those jobs automatically get the right context via AsyncCalls and such. DetailedCodeContext is different from ALE-based context, but I see no reason for PeerPoolMgr job to be different with regard to its overall context storage approach.
If we can apply existing ALE CodeContext handling/storage approach to PeerPoolMgr jobs without heroic efforts, let's do that instead of adding this data member.
Footnotes
-
TunnelStateData stores CodeContext but with a TODO to remove that storage. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's move it from the job, but we have to keep it somewhere anyway in order to pass to PeerPoolMgr::checkpoint(). I moved it to CachePeer::standby structure, which looks logical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's move it from the job,
Hm... Its precise location (inside the job) was not the red flag I was trying to describe. The red flag was the addition of that CodeContext data member (as such, in any class), not the addition of PeerPoolMgr::context (specifically). Sorry.
we have to keep it somewhere anyway in order to pass to PeerPoolMgr::checkpoint()
Aha, PeerPoolMgr::Checkpoint() is the piece I was missing! We have to store this CodeContext (somewhere) because PeerPoolMgr::Checkpoint() is not called via events or async calls and, hence, cannot reuse those mechanisms for relaying context back into PeerPoolMgr job. PeerPoolMgr is probably the best location for this context storage because this context is specific to PeerPoolMgr. Please undo that move.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are zeroing-in on the final solution here. Please note that some change requests will require more code adjustments than a single GitHub suggestion can express.
src/PeerPoolMgr.h
Outdated
|
||
void setMasterXaction(const MasterXaction::Pointer &mx) { masterXaction = mx; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not add this unused method:
void setMasterXaction(const MasterXaction::Pointer &mx) { masterXaction = mx; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
src/PeerPoolMgr.cc
Outdated
transportWait(), | ||
encryptionWait(), | ||
addrUsed(0) | ||
{ | ||
const auto mx = MasterXaction::MakePortless<XactionInitiator::initPeerPool>(); | ||
context = new DetailedCodeContext("cache_peer standby pool", ToSBuf("current cache_peer standby pool: ", *peer), mx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR code avoids storing CachePeer inside DetailedCodeContext by computing the corresponding detail in advance. Let's assume that decision to precompute was correct.
If possible, please apply the same "let's precompute!" decision to MasterXaction storage, eliminating DetailedCodeContext::masterXaction data member and generalizing/simplifying DetailedCodeContext to better match the new class name/intent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a new SBuf field (storing mx->id) instead.
src/PeerPoolMgr.h
Outdated
public: | ||
typedef RefCount<DetailedCodeContext> Pointer; | ||
|
||
DetailedCodeContext(const char *gist, const SBuf &detail, const MasterXaction::Pointer &mx) : gist_(gist), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and elsewhere, please use natural language "items: item1, item2" formatting for labeled lists:
DetailedCodeContext(const char *gist, const SBuf &detail, const MasterXaction::Pointer &mx) : gist_(gist), | |
DetailedCodeContext(const char *gist, const SBuf &detail, const MasterXaction::Pointer &mx): gist_(gist), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/PeerPoolMgr.h
Outdated
/// CodeContext for cache_peer related classes | ||
class DetailedCodeContext : public CodeContext |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming other change requests in this review are correct:
/// CodeContext for cache_peer related classes | |
class DetailedCodeContext : public CodeContext | |
/// CodeContext with constant details known at construction time | |
class PrecomputedCodeContext: public CodeContext |
And move this declaration to src/base/PrecomputedCodeContext.{h,cc}.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/PeerPoolMgr.h
Outdated
@@ -32,6 +56,8 @@ class PeerPoolMgr: public AsyncJob | |||
explicit PeerPoolMgr(CachePeer *aPeer); | |||
~PeerPoolMgr() override; | |||
|
|||
DetailedCodeContext::Pointer context; ///< the pool manager context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename -- there are too many contexts in this context :-).
DetailedCodeContext::Pointer context; ///< the pool manager context | |
DetailedCodeContext::Pointer codeContext; |
I cannot think of any useful documentation for this self-documented declaration. The proposed description in current PR code reads like "foo is foo" to me. Existing official code has many self-documented codeContext data members, so I hope we do not have to describe this one either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (7fd1491).
src/PeerPoolMgr.cc
Outdated
transportWait(), | ||
encryptionWait(), | ||
addrUsed(0) | ||
{ | ||
const auto mx = MasterXaction::MakePortless<XactionInitiator::initPeerPool>(); | ||
context = new DetailedCodeContext("cache_peer standby pool", ToSBuf("current cache_peer standby pool: ", *peer), mx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a new SBuf field (storing mx->id) instead.
src/PeerPoolMgr.h
Outdated
@@ -32,6 +56,8 @@ class PeerPoolMgr: public AsyncJob | |||
explicit PeerPoolMgr(CachePeer *aPeer); | |||
~PeerPoolMgr() override; | |||
|
|||
DetailedCodeContext::Pointer context; ///< the pool manager context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed.
src/PeerPoolMgr.h
Outdated
|
||
void setMasterXaction(const MasterXaction::Pointer &mx) { masterXaction = mx; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
src/PeerPoolMgr.h
Outdated
/// CodeContext for cache_peer related classes | ||
class DetailedCodeContext : public CodeContext |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/PeerPoolMgr.h
Outdated
public: | ||
typedef RefCount<DetailedCodeContext> Pointer; | ||
|
||
DetailedCodeContext(const char *gist, const SBuf &detail, const MasterXaction::Pointer &mx) : gist_(gist), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with two minor adjustments. Please resolve merge conflicts as well.
Please also review new PR title/description and adjust as needed.
|
||
codeContext = new PrecomputedCodeContext("cache_peer standby pool", ToSBuf("current cache_peer standby pool: ", *peer, | ||
Debug::Extra, "current master transaction: ", mx->id)); | ||
// ErrorState, getOutgoingAddress(), and other APIs may require a request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep request-related code lines separate:
// ErrorState, getOutgoingAddress(), and other APIs may require a request. | |
// ErrorState, getOutgoingAddress(), and other APIs may require a request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
#include "base/InstanceId.h" | ||
#include "sbuf/SBuf.h" | ||
|
||
/// CodeContext with constant details known at construction time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, compiling this header code requires including <ostream>
because this code uses "<<" expressions. Please include that STL header even if it is currently included (directly or indirectly) by the headers above.
/// CodeContext with constant details known at construction time | |
#include <ostream> | |
/// CodeContext with constant details known at construction time |
IIRC, including <iosfwd>
is not enough in such cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
The absence of the usual "current master transaction:..." detail in
certain errors raises “Has Squid lost the current transaction context?”
red flags. In some cases, Squid may have lost that context, but for
cache_peer TCP probes, Squid has not because those probes are not
associated with master transactions. It is difficult to distinguish the
two cases because no processing context is reported. To address those
concerns, we now report current cache_peer probing context (instead of
just not reporting absent master transaction context):
When maintaining a cache_peer standy=N connection pool, Squid has and
now reports both contexts, attributing messages to pool maintenance:
The new PrecomputedCodeContext class handles both reporting cases and
can be reused whenever the cost of pre-computing detailCodeContext()
output is acceptable.