@DiannaHohensee
Created November 28, 2023 23:55
//// There should be code that marks node_t1 as healthy at some point.
//// Nothing is ever published to node 1:
//// node 1 gets a successful join response, but never receives the publication.
//// Node 1 never receives the cluster state publication that includes itself, so it never cancels running for master.
//// Does onClusterChange cancel the election?
//// BUT the problem is the faulty node:
//// node 2 allowed the join, but left it on the faulty list.
//// Check in with Tanguy.
// Disruption scheme: partition 1 = [node_t2], partition 2 = [node_t1, node_t0]
[2023-11-29T09:29:44,024][INFO ][o.e.d.ClusterDisruptionIT] [testAckedIndexing] disruption scheme [network disruption (disruption type: network disconnects, disrupted links: two partitions (partition 1: [node_t2] and partition 2: [node_t1, node_t0]))] added
---------------------------
// node_t0 is master
[2023-11-29T09:29:42,758][INFO ][o.e.c.s.MasterService ] [node_t0] elected-as-master ([2] nodes joined in term 1)[_FINISH_ELECTION_, {node_t0}{VX3YUxdaRE2JctmIIQ5Stg}{KwOfqsRNTsCtflzeIIo4yg}{node_t0}{127.0.0.1}{127.0.0.1:13483}{cdfhilmrstw}{8.12.0}{7000099-8500004} completing election, {node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004} completing election], term: 1, version: 1, delta: master node changed {previous [], current [{node_t0}{VX3YUxdaRE2JctmIIQ5Stg}{KwOfqsRNTsCtflzeIIo4yg}{node_t0}{127.0.0.1}{127.0.0.1:13483}{cdfhilmrstw}{8.12.0}{7000099-8500004}]}, added {{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}}
// add -- node_t1
[2023-11-29T09:29:42,958][INFO ][o.e.c.s.MasterService ] [node_t0] node-join[{node_t1}{JaUPCfTPTBCogOwdG9o2Qw}{Xxw8zst0QM-aB2Fd9eAFMQ}{node_t1}{127.0.0.1}{127.0.0.1:13481}{cdfhilmrstw}{8.12.0}{7000099-8500004} joining], term: 1, version: 2, delta: added {{node_t1}{JaUPCfTPTBCogOwdG9o2Qw}{Xxw8zst0QM-aB2Fd9eAFMQ}{node_t1}{127.0.0.1}{127.0.0.1:13481}{cdfhilmrstw}{8.12.0}{7000099-8500004}}
// DISRUPTION --------
[2023-11-29T09:29:44,424][INFO ][o.e.t.d.NetworkDisruption] [testAckedIndexing] start disrupting (disruption type: network disconnects, disrupted links: two partitions (partition 1: [node_t2] and partition 2: [node_t1, node_t0]))
// mark faulty -- isolated
[2023-11-29T09:29:44,461][DEBUG][o.e.c.c.FollowersChecker ] [node_t0] FollowerChecker{discoveryNode={node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}, failureCountSinceLastSuccess=1, timeoutCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=1} disconnected
[2023-11-29T09:29:44,461][DEBUG][o.e.c.c.FollowersChecker ] [node_t0] FollowerChecker{discoveryNode={node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}, failureCountSinceLastSuccess=0, timeoutCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=1} marking node as faulty
// remove -- isolated
[2023-11-29T09:29:44,515][INFO ][o.e.c.s.MasterService ] [node_t0] node-left[{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004} reason: disconnected], term: 1, version: 13, delta: removed {{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}}
// STOP DISRUPTING -------
[2023-11-29T09:29:48,471][INFO ][o.e.t.d.NetworkDisruption] [testAckedIndexing] stop disrupting (disruption scheme: network disconnects, disrupted links: two partitions (partition 1: [node_t2] and partition 2: [node_t1, node_t0]))
// add -- isolated
[2023-11-29T09:29:48,598][INFO ][o.e.c.s.MasterService ] [node_t0] node-join[{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004} joining, removed [3.8s/3867ms] ago with reason [disconnected]], term: 1, version: 19, delta: added {{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}}
// DISRUPTION --------
[2023-11-29T09:29:48,876][INFO ][o.e.t.d.NetworkDisruption] [testAckedIndexing] start disrupting (disruption type: network disconnects, disrupted links: two partitions (partition 1: [node_t2] and partition 2: [node_t1, node_t0]))
// mark faulty -- isolated
[2023-11-29T09:29:49,008][DEBUG][o.e.c.c.FollowersChecker ] [node_t0] FollowerChecker{discoveryNode={node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}, failureCountSinceLastSuccess=0, timeoutCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=1} marking node as faulty
// remove -- isolated
[2023-11-29T09:29:49,017][INFO ][o.e.c.s.MasterService ] [node_t0] node-left[{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004} reason: disconnected], term: 1, version: 20, delta: removed {{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}}
// STOP DISRUPTING -------
[2023-11-29T09:29:49,806][INFO ][o.e.t.d.NetworkDisruption] [testAckedIndexing] stop disrupting (disruption scheme: network disconnects, disrupted links: two partitions (partition 1: [node_t2] and partition 2: [node_t1, node_t0]))
// add -- isolated
[2023-11-29T09:29:49,970][INFO ][o.e.c.s.MasterService ] [node_t0] node-join[{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004} joining, removed [1s/1021ms] ago with reason [disconnected], [2] total removals], term: 1, version: 21, delta: added {{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}}
// DISRUPTION --------
[2023-11-29T09:29:50,075][INFO ][o.e.t.d.NetworkDisruption] [testAckedIndexing] start disrupting (disruption type: network disconnects, disrupted links: two partitions (partition 1: [node_t2] and partition 2: [node_t1, node_t0]))
// mark faulty -- isolated
[2023-11-29T09:29:50,140][DEBUG][o.e.c.c.FollowersChecker ] [node_t0] FollowerChecker{discoveryNode={node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}, failureCountSinceLastSuccess=0, timeoutCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=1} marking node as faulty
// remove -- isolated
[2023-11-29T09:29:50,164][INFO ][o.e.c.s.MasterService ] [node_t0] node-left[{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004} reason: disconnected], term: 1, version: 23, delta: removed {{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}}
// STOP DISRUPTING -------
[2023-11-29T09:29:55,129][INFO ][o.e.t.d.NetworkDisruption] [testAckedIndexing] stop disrupting (disruption scheme: network disconnects, disrupted links: two partitions (partition 1: [node_t2] and partition 2: [node_t1, node_t0]))
// add -- isolated
[2023-11-29T09:29:55,190][INFO ][o.e.c.s.MasterService ] [node_t0] node-join[{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004} joining, removed [4.8s/4885ms] ago with reason [disconnected], [3] total removals], term: 1, version: 24, delta: added {{node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}}
// mark faulty -- isolated -- suite stopping
[2023-11-29T09:29:57,006][DEBUG][o.e.c.c.FollowersChecker ] [node_t0] FollowerChecker{discoveryNode={node_t2}{l1lBNs0yTfihvodX-rBZGw}{WRnfVKMFTHWdnyqCfchohA}{node_t2}{127.0.0.1}{127.0.0.1:13482}{cdfhilmrstw}{8.12.0}{7000099-8500004}, failureCountSinceLastSuccess=0, timeoutCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=1} marking node as faulty
// mark faulty -- t1, friendly -- suite stopping
[2023-11-29T09:29:57,016][DEBUG][o.e.c.c.FollowersChecker ] [node_t0] FollowerChecker{discoveryNode={node_t1}{JaUPCfTPTBCogOwdG9o2Qw}{Xxw8zst0QM-aB2Fd9eAFMQ}{node_t1}{127.0.0.1}{127.0.0.1:13481}{cdfhilmrstw}{8.12.0}{7000099-8500004}, failureCountSinceLastSuccess=0, timeoutCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=1} marking node as faulty
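
The `//` annotations above (add, remove, mark faulty, disruption start/stop) were written by hand. A minimal sketch of how the same timeline could be pulled out of the raw log automatically, assuming every log line follows the `[timestamp][LEVEL][logger] [node] message` layout seen here. The names `LINE_RE`, `classify`, and `summarize` are hypothetical helpers invented for this illustration, not part of Elasticsearch.

```python
import re

# Assumed layout, taken from the lines above:
#   [timestamp][LEVEL][logger] [reporting node] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<logger>[^\]]+?)\s*\] "
    r"\[(?P<node>[^\]]+)\] "
    r"(?P<msg>.*)$"
)

# The first {node_...} token in a message names the node the event is about.
SUBJECT_RE = re.compile(r"\{(node_\w+)\}")


def classify(msg):
    """Map a log message to one of the hand-written annotation labels."""
    if msg.startswith("node-join["):
        return "add"
    if msg.startswith("node-left["):
        return "remove"
    if msg.startswith("elected-as-master"):
        return "elected"
    if msg.startswith("start disrupting"):
        return "disruption start"
    if msg.startswith("stop disrupting"):
        return "disruption stop"
    if msg.endswith("marking node as faulty"):
        return "mark faulty"
    return None  # e.g. a FollowerChecker "disconnected" line is skipped


def summarize(lines):
    """Return (timestamp, reporting node, event, subject node) tuples."""
    events = []
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # "//" annotations and blank lines are not log lines
        event = classify(match.group("msg"))
        if event is None:
            continue
        subject = SUBJECT_RE.search(match.group("msg"))
        events.append((
            match.group("ts"),
            match.group("node"),
            event,
            subject.group(1) if subject else None,
        ))
    return events
```

Passing the lines of this file to `summarize` and filtering on the subject node would show at a glance whether a "mark faulty" for a node is ever followed by a later "add" for that same node, which is the question raised in the notes at the top.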