This repository has been archived by the owner on Oct 8, 2021. It is now read-only.

strongly connected components are not strongly connected? #348

Closed
arekfu opened this issue Apr 29, 2016 · 21 comments
Labels
bug (confirmed bug producing incorrect results)

Comments

@arekfu
Contributor

arekfu commented Apr 29, 2016

I generate strongly connected components of random graphs and subsequently check if they are strongly connected. The result is systematically false. Am I missing something?

import LightGraphs: strongly_connected_components, is_strongly_connected,
                    induced_subgraph, erdos_renyi

function scc_ok(graph)
  scc = strongly_connected_components(graph)
  scc_as_subgraphs = map(i -> induced_subgraph(graph, i), scc)
  return all(is_strongly_connected, scc_as_subgraphs)
end

println(all(scc_ok, [erdos_renyi(400, 0.01, is_directed=true) for i in 1:100]))
@sbromberger
Owner

Yup, this definitely smells like a bug, though it's not "systematically" false. I've got one graph that does it.

The graph: https://gist.github.com/sbromberger/af5912dd34190a8a2ed6e0ff9d0c0974

The subgraph that fails the check: https://gist.github.com/sbromberger/21bc112240f0308fc63de8e198fe2715

sbromberger added the bug label (confirmed bug producing incorrect results) on Apr 29, 2016
@CarloLucibello
Contributor

Could this be a bug introduced in #332 ?

Can someone try the same test on a previous commit? I don't have time at the moment, but I can get to it in the next few days.

@sbromberger
Owner

I can, but there are tons of warnings thrown on the new build: see https://gist.github.com/sbromberger/6a81432250b093b09d442a008177ab94. I'll file a new issue.

@sbromberger
Owner

@CarloLucibello looks like you may be off the hook - this problem existed in 0.5.2.

@sbromberger
Owner

sbromberger commented Apr 29, 2016

sccfail.jgz.zip

Unzip this, then load sccfail.jgz. The big graph is g, the one that fails the scc test is bad.

@arekfu
Contributor Author

arekfu commented Apr 30, 2016

Incidentally, I think it would be a good idea to make extensive use of the testing strategy in the example above (asserting invariants on several instances of random graphs) in the package tests. Do you currently do this?

@CarloLucibello
Contributor

A smaller graph that fails the test:

julia> println(scc_ok(erdos_renyi(20, 0.1, is_directed=true, seed=17)))
false

@CarloLucibello
Contributor

> Incidentally, I think it would be a good idea to make extensive use of the testing strategy in the example above (asserting invariants on several instances of random graphs) in the package tests. Do you currently do this?

We mostly do tests on handmade small graphs, and we certainly could check for more invariants. It would be nice to see more functions like scc_ok in the test suite.

@CarloLucibello
Contributor

Smaller still:

julia> ok = scc_ok(erdos_renyi(11, 0.12, is_directed=true, seed=17))
false

@sbromberger
Owner

> Incidentally, I think it would be a good idea to make extensive use of the testing strategy in the example above (asserting invariants on several instances of random graphs) in the package tests. Do you currently do this?

No, because by definition they are not deterministic tests and can yield varying results. See #323 for more discussion.

@arekfu
Contributor Author

arekfu commented Apr 30, 2016

They are deterministic if you specify the seeds, aren't they?
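A quick sketch of that claim in plain Julia (nothing LightGraphs-specific; within a single Julia version, the same seed reproduces the same stream, so any failing seed can be replayed):

```julia
using Random  # MersenneTwister lives in the Random stdlib on modern Julia

# Same seed ⇒ same stream, within a given Julia version.
# Cross-version or cross-platform stream stability is the separate
# question raised below.
a = rand(MersenneTwister(17), 5)
b = rand(MersenneTwister(17), 5)
println(a == b)
```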

@sbromberger
Owner

I'm not sure determinism crosses system bounds.

@arekfu
Contributor Author

arekfu commented Apr 30, 2016

Still (last comment about this side issue), testing on random graphs might be nondeterministic and you might run into spurious failures now and then, but at the same time it would enormously improve your coverage. If your code works on thousands of random graphs, the probability that it contains a bug becomes negligible.
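The proposed harness could look something like the sketch below. Here invariant_holds is a hypothetical placeholder for a real predicate such as scc_ok; the point is the seeded loop that makes every failure reproducible, not the graph code:

```julia
using Random

# Hypothetical invariant standing in for a predicate like scc_ok.
# Trivially true here so the harness itself is runnable.
invariant_holds(rng) = length(randperm(rng, 10)) == 10

# Run the invariant on many seeded instances and collect failing seeds,
# so any failure can be replayed by re-running with that seed.
failing = [seed for seed in 1:100 if !invariant_holds(MersenneTwister(seed))]
println(isempty(failing) ? "all seeds pass" : "failing seeds: $failing")
```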

@sbromberger
Owner

That's certainly true, but the set of functions we could run these on is limited to those that return a boolean (or can be coerced to one), since we'd use any/all. Centrality measures, for example, can't be tested this way. Balance this against the time it takes to run tests, and it's not a straightforward "let's definitely do this".

That said, I do like the idea as long as we're not imposing too much of a burden on the testing framework. It might be worthwhile to see what functions can benefit from tests of random graphs and then make sure they don't take forever to run.

(...and I hope this isn't really your last comment on this issue, since it's worthy of discussion - please reconsider.)

@sbromberger
Owner

This is still a bug.

@arekfu
Contributor Author

arekfu commented May 2, 2016

Even smaller:

erdos_renyi(4, 0.25, seed=436, is_directed=true)

@arekfu
Contributor Author

arekfu commented May 2, 2016

OK, so this is interesting:

julia> g=erdos_renyi(4, 0.25, seed=436, is_directed=true)
{4, 4} directed graph

julia> collect(edges(g))
4-element Array{Pair{Int64,Int64},1}:
 edge 1 - 3
 edge 1 - 4
 edge 2 - 3
 edge 4 - 2

julia> g2=DiGraph(4);  add_edge!(g2, 1, 2); add_edge!(g2, 2, 4); add_edge!(g2, 4, 3); add_edge!(g2, 1, 3);

julia> scc_ok(g)
false

julia> scc_ok(g2)
true

The interesting point is that g and g2 are isomorphic (g2 is g with vertices 2 and 4 exchanged). So I instrumented the calls to the TarjanVisitor methods (forgive my OO jargon) and I got:

julia> scc_ok(g)
INFO: discover_vertex!(LightGraphs.TarjanVisitor(Int64[],Int64[],[0,0,0,0],Array{Int64,1}[]), 1)
INFO: examine_neighbor!(LightGraphs.TarjanVisitor([1],[1],[1,0,0,0],Array{Int64,1}[]), 1, 3, 0, 0)
INFO: discover_vertex!(LightGraphs.TarjanVisitor([1],[1],[1,0,0,0],Array{Int64,1}[]), 3)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1,3],[1,2],[1,0,2,0],Array{Int64,1}[]), 3)
INFO: examine_neighbor!(LightGraphs.TarjanVisitor([1],[1],[1,0,2,0],[[3]]), 1, 4, 0, 0)
INFO: discover_vertex!(LightGraphs.TarjanVisitor([1],[1],[1,0,2,0],[[3]]), 4)
INFO: examine_neighbor!(LightGraphs.TarjanVisitor([1,4],[1,2],[1,0,2,2],[[3]]), 4, 2, 0, 0)
INFO: discover_vertex!(LightGraphs.TarjanVisitor([1,4],[1,2],[1,0,2,2],[[3]]), 2)
INFO: examine_neighbor!(LightGraphs.TarjanVisitor([1,4,2],[1,2,3],[1,3,2,2],[[3]]), 2, 3, 2, 0)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1,4,2],[1,2],[1,3,2,2],[[3]]), 2)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1,4,2],[1,2],[1,3,2,2],[[3]]), 4)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1],[1],[1,3,2,2],[[3],[4,2]]), 1)
INFO: discover_vertex!(LightGraphs.TarjanVisitor(Int64[],Int64[],[0],Array{Int64,1}[]), 1)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1],[1],[1],Array{Int64,1}[]), 1)
INFO: discover_vertex!(LightGraphs.TarjanVisitor(Int64[],Int64[],[0,0],Array{Int64,1}[]), 1)
INFO: examine_neighbor!(LightGraphs.TarjanVisitor([1],[1],[1,0],Array{Int64,1}[]), 1, 2, 0, 0)
INFO: discover_vertex!(LightGraphs.TarjanVisitor([1],[1],[1,0],Array{Int64,1}[]), 2)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1,2],[1,2],[1,2],Array{Int64,1}[]), 2)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1],[1],[1,2],[[2]]), 1)
false

julia> scc_ok(g2)
INFO: discover_vertex!(LightGraphs.TarjanVisitor(Int64[],Int64[],[0,0,0,0],Array{Int64,1}[]), 1)
INFO: examine_neighbor!(LightGraphs.TarjanVisitor([1],[1],[1,0,0,0],Array{Int64,1}[]), 1, 2, 0, 0)
INFO: discover_vertex!(LightGraphs.TarjanVisitor([1],[1],[1,0,0,0],Array{Int64,1}[]), 2)
INFO: examine_neighbor!(LightGraphs.TarjanVisitor([1,2],[1,2],[1,2,0,0],Array{Int64,1}[]), 2, 4, 0, 0)
INFO: discover_vertex!(LightGraphs.TarjanVisitor([1,2],[1,2],[1,2,0,0],Array{Int64,1}[]), 4)
INFO: examine_neighbor!(LightGraphs.TarjanVisitor([1,2,4],[1,2,3],[1,2,0,3],Array{Int64,1}[]), 4, 3, 0, 0)
INFO: discover_vertex!(LightGraphs.TarjanVisitor([1,2,4],[1,2,3],[1,2,0,3],Array{Int64,1}[]), 3)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1,2,4,3],[1,2,3,4],[1,2,4,3],Array{Int64,1}[]), 3)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1,2,4],[1,2,3],[1,2,4,3],[[3]]), 4)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1,2],[1,2],[1,2,4,3],[[3],[4]]), 2)
INFO: examine_neighbor!(LightGraphs.TarjanVisitor([1],[1],[1,2,4,3],[[3],[4],[2]]), 1, 3, 4, 0)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1],[1],[1,2,4,3],[[3],[4],[2]]), 1)
INFO: discover_vertex!(LightGraphs.TarjanVisitor(Int64[],Int64[],[0],Array{Int64,1}[]), 1)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1],[1],[1],Array{Int64,1}[]), 1)
INFO: discover_vertex!(LightGraphs.TarjanVisitor(Int64[],Int64[],[0],Array{Int64,1}[]), 1)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1],[1],[1],Array{Int64,1}[]), 1)
INFO: discover_vertex!(LightGraphs.TarjanVisitor(Int64[],Int64[],[0],Array{Int64,1}[]), 1)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1],[1],[1],Array{Int64,1}[]), 1)
INFO: discover_vertex!(LightGraphs.TarjanVisitor(Int64[],Int64[],[0],Array{Int64,1}[]), 1)
INFO: close_vertex!(LightGraphs.TarjanVisitor([1],[1],[1],Array{Int64,1}[]), 1)
true

@arekfu
Contributor Author

arekfu commented May 2, 2016

Pull request submitted.

@arekfu
Contributor Author

arekfu commented May 2, 2016

@sbromberger concerning the general issue of testing on random inputs, I think the non-determinism is not really an issue if you can track which seeds caused the tests to fail. Also, in my domain (Monte Carlo simulations), a pseudo-random number generator that yields different sequences for the same seed on different systems would probably be considered buggy.

I have no idea how long your tests take right now, but you could amortize the time necessary to generate the test graphs by generating them once and reusing them for all your testable predicates.

Finally, I think this strategy could apply to non-boolean functions, too, provided that you can say something non-trivial about the probability distribution of your statistics. Take a centrality measure, for instance, and construct a confidence interval for the resulting value. Your test fails if the sample statistic falls outside the confidence interval. This requires you to set a significance level, i.e. the false-positive rate that you are ready to tolerate.
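That idea can be sketched without any graph machinery. The toy statistic and the values of μ, σ, and z below are illustrative assumptions, not real centrality numbers:

```julia
using Random, Statistics

# Check that a seeded sample mean of a statistic falls inside a
# z-interval around its known expectation; z sets the tolerated
# false-positive rate.
function ci_test(statistic, n, μ, σ; z = 6.0, seed = 1)
    rng = MersenneTwister(seed)
    xs = [statistic(rng) for _ in 1:n]
    abs(mean(xs) - μ) <= z * σ / sqrt(n)
end

# Toy statistic: Uniform(0,1) draws, so μ = 0.5 and σ = sqrt(1/12).
println(ci_test(rand, 10_000, 0.5, sqrt(1/12)))
```

With a fixed seed the test is fully deterministic; the significance level only matters when choosing how wide an interval to accept.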

@sbromberger
Owner

@arekfu I like the idea of MC simulation. If you have some time, would you mind mocking up a few example tests?

@sbromberger
Owner

Closed by #351
