-
Notifications
You must be signed in to change notification settings - Fork 6.7k
fix: incorrect member matching when removing etcd nodes #11488
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: incorrect member matching when removing etcd nodes #11488
Conversation
Signed-off-by: bo.jiang <[email protected]>
/ok-to-test Are there any more precise examples? If the filtering is done through IP, there should not be any duplication. Not sure if adding |
Can't we instead use the json output from etcdctl and use ansible filter / json_query to have an exact match ? |
😀 I have tested and verified the reliability of -w. We can assume that the output list from $ cat test_etcd_member_list
86c62471680302d1, started, node1, https://10.20.0.1:2380, https://10.20.0.1:2379, false
86c62471680302d2, started, node2, https://10.20.0.2:2380, https://10.20.0.2:2379, false
86c62471680302d3, started, node3, https://10.20.0.21:2380, https://10.20.0.21:2379, false
86c62471680302d4, started, node4, https://10.20.0.22:2380, https://10.20.0.22:2379, false
86c62471680302d5, started, node5, https://10.20.0.23:2380, https://10.20.0.23:2379, false when we do not use -w, it will match multiple lines: $ cat test_etcd_member_list | grep 10.20.0.2
86c62471680302d2, started, node2, https://10.20.0.2:2380, https://10.20.0.2:2379, false
86c62471680302d3, started, node3, https://10.20.0.21:2380, https://10.20.0.21:2379, false
86c62471680302d4, started, node4, https://10.20.0.22:2380, https://10.20.0.22:2379, false
86c62471680302d5, started, node5, https://10.20.0.23:2380, https://10.20.0.23:2379, false when using -w, it treats the string cat test_etcd_member_list | grep -w 10.20.0.2
86c62471680302d2, started, node2, https://10.20.0.2:2380, https://10.20.0.2:2379, false 🙂 additionally, I have seen similar usage in the project, for example: https://github.com/kubernetes-sigs/kubespray/blob/v2.25.0/roles/etcd/tasks/join_etcd_member.yml#L31 🤔 finally, using json_query might also be feasible, but it means I would need to add a task to obtain the register variable from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@ErikJiang Thanks for the explanation. We can first mitigate this problem and think about whether or not we need to change it to json_output. |
🤔 finally, using json_query might also be feasible, but it means I would need to add a task to obtain the register variable from `etcdctl member list`, which complicates the process.
Not really, you just register the whole output in 'Lookup etcd member id' (well in that case we would probably rename that), and extract it directly in the remove tasks with a filter (something like `etcdctl member remove {{ etcd_member_list | json_query(<query) }}` )
It might even be a `from_json | <other filter>`, depending on the structure.
If the query is too hard, fine, but we should at least try to reduce shell usage in kubespray.
And btw, this makes me think that we might use something simpler than `json_query` without a shell.
`(output.stdout_lines | select('contains', ':{{ node_ip }}:'))[0] | split(',')[1]` => this would do the trick, don't you think ?
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ErikJiang, mzaian, tico88612 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
👍 Your suggestion makes sense, and I will consider it in future optimizations. Thanks! |
…igs#11488) Signed-off-by: bo.jiang <[email protected]>
What type of PR is this?
/kind bug
What this PR does / why we need it:
When lookup etcd member ID, using grep for partial matching on node_ip can lead to multiple etcd members being matched. Adding the -w option allows for precisely matching the unique line corresponding to node_ip, thereby avoiding this issue.
Which issue(s) this PR fixes:
Fixes #11482
Special notes for your reviewer:
Does this PR introduce a user-facing change?: