CNDB-12425: A few reproduction tests and a preliminary patch, WIP #1529
base: main
9b73538
to
e03373a
Compare
@@ -77,7 +76,7 @@ public void tesConcurrencyFactor()
 // verify that a low concurrency factor is not capped by the max concurrency factor
 PartitionRangeReadCommand command = command(cfs, 50, 50);
 try (RangeCommandIterator partitions = RangeCommands.rangeCommandIterator(command, ONE, System.nanoTime(), ReadTracker.NOOP);
-ReplicaPlanIterator ranges = new ReplicaPlanIterator(command.dataRange().keyRange(), command.indexQueryPlan(), keyspace, ONE))
+ReplicaPlanIterator ranges = new ReplicaPlanIterator(command.dataRange().keyRange(), command.indexQueryPlan(), keyspace, ONE, false))
This should be command.rowFilter().allowFiltering
@@ -362,13 +365,24 @@ public synchronized Future<?> addIndex(IndexMetadata indexDef, boolean isNewCF)
 * @param queryPlan a query plan
 * @throws IndexNotAvailableException if the query plan has any index that is not queryable
 */
-public void checkQueryability(Index.QueryPlan queryPlan)
+public boolean isQueryableThroughIndex(Index.QueryPlan queryPlan, boolean allowsFiltering)
This method can now return true or false, or throw an exception, and it also emits a client warning. That looks like too many side effects for a method whose name starts with boolean is..., which might suggest a simpler behaviour. I would either:
a) Split it into two simpler, separate boolean methods that tell whether all the indexes in the plan are building/queryable, and let ReadCommand#executeLocally do the AF check, throw the exceptions and emit the warnings.
b) Transform it into a SecondaryIndexManager#searcherFor(Index.QueryPlan, boolean) method keeping most of its responsibilities, returning the searcher if it's possible to build it, null if the index is building, and throwing an exception if it's not queryable.
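Option (b) above could look roughly like the following sketch. It is only an illustration under stated assumptions: `Status`, `Searcher`, `IndexNotAvailableException` and the `searcherFor` signature are simplified, hypothetical stand-ins written for this example, not the actual Cassandra types, and tying the `null` result to the allow-filtering flag is one possible reading of the suggestion.

```java
import java.util.List;

public class SearcherForSketch
{
    // Hypothetical, simplified stand-in for the index status seen by the manager.
    enum Status { BUILDING, QUERYABLE, NOT_QUERYABLE }

    // Hypothetical stand-in for an index searcher.
    interface Searcher { String describe(); }

    // Hypothetical stand-in for the exception thrown when an index cannot be used.
    static class IndexNotAvailableException extends RuntimeException
    {
        IndexNotAvailableException(String message) { super(message); }
    }

    /**
     * Returns a searcher when every index in the plan is queryable, returns null when
     * at least one index is still building and the query allows filtering (so the
     * caller can fall back to filtering), and throws in every other case.
     */
    static Searcher searcherFor(List<Status> planIndexStatuses, boolean allowsFiltering)
    {
        boolean anyBuilding = false;
        for (Status status : planIndexStatuses)
        {
            if (status == Status.QUERYABLE)
                continue;
            if (status == Status.BUILDING && allowsFiltering)
            {
                anyBuilding = true;
                continue;
            }
            throw new IndexNotAvailableException("Index is not available, status: " + status);
        }
        if (anyBuilding)
            return null;
        return () -> "searcher over " + planIndexStatuses.size() + " index(es)";
    }

    public static void main(String[] args)
    {
        System.out.println(searcherFor(List.of(Status.QUERYABLE), false).describe());
        System.out.println(searcherFor(List.of(Status.QUERYABLE, Status.BUILDING), true));
    }
}
```

With this shape the caller has a single place to branch on: a non-null searcher means the index path is used, null means filtering, and the exception propagates to the client. The warning itself would stay with the caller rather than inside the lookup.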
src/java/org/apache/cassandra/index/sai/StorageAttachedIndexGroup.java
 * @return a new query plan for the specified {@link RowFilter} and {@link Index}, {@code null} otherwise
 */
@Nullable
QueryPlan queryPlanForIndices(RowFilter rowFilter, Set<Index> indexes);
I'm still thinking about how we could make this the only method and get rid of queryPlanFor(RowFilter).
I haven't addressed this yet, I will come back to it soon
7138378
to
8e0677f
Compare
src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
}

public void testAllowFilteringDuringIndexBuildsOn3NodeCluster(boolean isCreateIndex, Index.Status buildStatus) throws Exception
{
This is long and ugly, but covers all cases. I will refactor it soon. It wasn't a priority for now
@@ -1156,6 +1156,8 @@ public void testIndexQueriesWithIndexNotReady()
{
execute("DROP index " + KEYSPACE + ".testIndex");
}

execute("SELECT value FROM %s WHERE value = 2 ALLOW FILTERING");
Technically this is the only non-SAI test... we need more.
final Injections.Barrier blockIndexBuild = Injections.newBarrier("block_index_build", 2, false)
    .add(InvokePointBuilder.newInvokePoint().onClass(StorageAttachedIndex.class)
                           .onMethod("startInitialBuild"))
    .build();
I was planning to also test 2i here, but in practice this only tests SAI... for now.
test/unit/org/apache/cassandra/index/sai/cql/AllowFilteringTest.java
Still not ready for full review...
✔️ Build ds-cassandra-pr-gate/PR-1529 approved by Butler
// if the status of the index is building and there is allow filtering - that is ok too
if (considerAllowFiltering && status == Index.Status.INITIALIZED && allowFiltering)
    continue;
I have to think about it more thoroughly, but this looks like a good place to emit the client warnings that are currently raised on the replica side. We might have one warning message per index-building replica, so clients can know which nodes are still initializing their indexes and are going to use filtering.
I added:
// if the status of the index is building and there is allow filtering - that is ok too
if (considerAllowFiltering && status == Index.Status.INITIAL_BUILD_STARTED && !index.isQueryable(status) && allowFiltering)
{
ClientWarn.instance.warn(String.format("Query fell back to ALLOW FILTERING because index %s is still building on endpoint %s",
index.getIndexMetadata().name,
replica.endpoint()));
continue;
}
which led to multiple warnings for the same node in tests.
I decided to just bring up a single-node C* and try a single query during the index build:
cqlsh:k> CREATE CUSTOM INDEX ON t(k) USING 'StorageAttachedIndex';
cqlsh:k> SELECT * FROM t WHERE k=200 ALLOW FILTERING;
pk | i | j | k | vec
----+---+---+---+-----
(0 rows)
Warnings :
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoint localhost/127.0.0.1:7000
I have to dig into this tomorrow.... no more energy today
I suspect that's because of virtual nodes, with 16 tokens per node. Rather than emitting the client warning immediately, the endpoints can be collected in a set:
Set<InetAddressAndPort> filteringEndpoints = new HashSet<>();
and then a single warning with the unique addresses can be emitted after the loop. For example:
Query fell back to ALLOW FILTERING because index t_k_idx is still building on endpoints 192.168.0.1:7000, 192.168.0.2:7000
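A minimal, self-contained sketch of that idea follows. It assumes plain `String` endpoints instead of `InetAddressAndPort`, a hypothetical `buildWarning` helper, and a `LinkedHashSet` rather than the `HashSet` from the comment so that the order of addresses in the message is deterministic. None of these names come from the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class FilteringWarningSketch
{
    /**
     * Collapses the per-range endpoints into a set and builds a single warning.
     * Returns null when no endpoint had to fall back to filtering.
     * Hypothetical helper, written only for this example.
     */
    static String buildWarning(String indexName, List<String> endpointsPerRange)
    {
        // With vnodes the same endpoint shows up once per token range, so deduplicate first.
        Set<String> filteringEndpoints = new LinkedHashSet<>(endpointsPerRange);
        if (filteringEndpoints.isEmpty())
            return null;

        return String.format("Query fell back to ALLOW FILTERING because index %s is still building on endpoints %s",
                             indexName,
                             String.join(", ", filteringEndpoints));
    }

    public static void main(String[] args)
    {
        // Simulate 16 token ranges owned by the same node plus one range on another node.
        List<String> perRange = new ArrayList<>();
        for (int i = 0; i < 16; i++)
            perRange.add("192.168.0.1:7000");
        perRange.add("192.168.0.2:7000");

        System.out.println(buildWarning("t_k_idx", perRange));
    }
}
```

In the real code the resulting string would presumably be passed to `ClientWarn.instance.warn(...)` once, after the loop over the replicas, instead of once per iteration.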
I was wondering whether "fell back to ALLOW FILTERING" will be clear enough for users, considering that they have just written ALLOW FILTERING in the query and, strictly, ALLOW FILTERING is a permission to filter and not the action of filtering. Perhaps the message would be a bit clearer this way:
The query won't use the indexes a, b and c on endpoints 192.168.0.1:7000, 192.168.0.2:7000 because the indexes are still building on those nodes.
Feel free to ignore if you don't agree; I'm just giving ideas.
...org/apache/cassandra/distributed/test/sai/AllowFilteringDuringIndexBuildDistributedTest.java
8e0677f
to
552a7b6
Compare
test/distributed/org/apache/cassandra/distributed/test/ByteBuddyUtils.java
test/distributed/org/apache/cassandra/distributed/test/index/IndexTestBase.java
…t include: - feature flag - checks we are on the new messaging version added for ANNOptions - we fall back to allow filtering only on Index Creation. Currently we also fall back to ALLOW FILTERING if we use nodetool to rebuild indexes
…ull rebuilds Added some ugly testing to IndexAvailabilityTest to confirm queries with the two build statuses
…ordinator and know the plan may change when it reaches the replica and we rebuild it. This will change with CNDB-13129 Added new IndexBuildDuringBootstrapTest. Handle bootstrapping if we think we should? Address other nits and fixes. Rebased on top of Michael's messaging version bump and related fixes.
552a7b6
to
8c95eb6
Compare
Split to prevent timeouts. Also add cluster sharing to speed it up.
Fix AllIndexImplementationsTest and extend it to cover other index implementations.
if (requiresFiltering)
    assertInvalidThrowMessage(error, InvalidRequestException.class, query);
else if (duringInitialBuild)
    assertInvalidThrow(IndexNotAvailableException.class, query);
@adelapena, after I rebased on top of CNDB-12620, I realized that no matter what exception I put here, the tests for all indexes other than SAI always pass...
I think we don't really need to do the build injections for anything but SAI, for two reasons:
- All indexes but SAI are always queryable according to Index.isQueryable, though it seems that during the build at least SASI does not return results - https://github.com/riptano/cndb/issues/12931
- I think there was a bug in the test: once I fix it, we actually get "An index involved in this query does not support disjunctive queries using the OR operator" from the first query. We were never hitting the if (duringInitialBuild) with non-SAI indexes.
Queries that could use an index will fail while the index is being built, even if allow filtering is specified.
This is an availability issue for people who migrate from using ALLOW FILTERING to indexes.
What does this PR fix and why was it fixed
...
We enable CC to fall back to ALLOW FILTERING on the initial index build, which is considered safe. This is done by adding allowFiltering information (whether it exists in the query and whether the query is supported with ALLOW FILTERING) to the RowFilter, and by adding a new Index.Status, INITIAL_BUILD_STARTED.
In CNDB, later rebuilds are also safe, as the index is queryable while the compactor is rebuilding.
This is acknowledged in the CC patch, as we check not only that an index is building and we have ALLOW FILTERING, but also that the index is not queryable. The difference between Astra and CC is that in CC the index is never queryable while it is building.
While I still have to address additional testing for index builds during bootstrapping and 2i testing of the patch in CC, that does not matter for Astra, so I believe the patch can be reviewed in parallel. I added two tests as per my conversation with @jasonstack, and they pass. Please let me know if there are any other cases they may need to address and whether the tests are what they had to be.
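The three-part check described in this description (index in its initial build, index not queryable, ALLOW FILTERING present) can be sketched as a small predicate. This is only an illustration: the `Status` enum below is a simplified, hypothetical subset and `canFallBackToFiltering` is a name invented for this example, not a method from the patch.

```java
public class FallbackDecisionSketch
{
    // Simplified, hypothetical subset of index statuses; INITIAL_BUILD_STARTED is the one this PR adds.
    enum Status { INITIAL_BUILD_STARTED, FULL_REBUILD_STARTED, BUILD_SUCCEEDED, BUILD_FAILED }

    /**
     * The query may skip the index and filter instead only when all three conditions hold:
     * the query carries ALLOW FILTERING, the index is in its initial build,
     * and the index reports itself as not queryable in that status.
     */
    static boolean canFallBackToFiltering(Status status, boolean indexQueryable, boolean allowFiltering)
    {
        return allowFiltering
               && status == Status.INITIAL_BUILD_STARTED
               && !indexQueryable;
    }

    public static void main(String[] args)
    {
        // Initial build, index not queryable, ALLOW FILTERING present: fall back.
        System.out.println(canFallBackToFiltering(Status.INITIAL_BUILD_STARTED, false, true));
        // Same state without ALLOW FILTERING: no fallback, the query is rejected as before.
        System.out.println(canFallBackToFiltering(Status.INITIAL_BUILD_STARTED, false, false));
        // Index still queryable while rebuilding (the CNDB case): keep using the index.
        System.out.println(canFallBackToFiltering(Status.FULL_REBUILD_STARTED, true, true));
    }
}
```

Keeping the queryability test separate from the status test is what lets the same condition cover both deployments: where the index stays queryable during a rebuild, the predicate is false and the index path is used.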
For the addition of allowFiltering, I had to bump the messaging version.
The migration tests failed, but I believe this will be addressed with https://github.com/riptano/cndb/pull/13095. It is solving the issues from bumping the messaging version in CC. It is safe to ignore them for now.
Everything else seems to have passed.
Still missing additional 2i testing and testing of builds during bootstrapping. Also, I want to add a feature flag.
Checklist before you submit for review
NoSpamLogger for log lines that may appear frequently in the logs