ErikJiang
Member

@ErikJiang ErikJiang commented Aug 30, 2024

What type of PR is this?
/kind bug

What this PR does / why we need it:
When looking up the etcd member ID, using grep to do a partial match on node_ip can match multiple etcd members (for example, 10.20.0.2 also matches 10.20.0.21). Adding the -w option makes grep match node_ip only as a whole word, so only the line that belongs to node_ip is returned and the issue is avoided.

Which issue(s) this PR fixes:

Fixes #11482

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

fix: incorrect member matching when removing etcd nodes

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 30, 2024
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 30, 2024
@tico88612
Member

/ok-to-test

Do you have a more concrete example? If the filtering is done by IP, there should not be any duplicate matches, so I'm not sure adding -w would improve anything.

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Aug 30, 2024
@VannTen
Contributor

VannTen commented Aug 30, 2024

Can't we instead use the JSON output from etcdctl and an Ansible filter / json_query to get an exact match?

@ErikJiang
Member Author

ErikJiang commented Aug 30, 2024

😀 I have tested and verified the reliability of -w. We can assume that the output list from etcdctl member list is as follows:

$ cat test_etcd_member_list
86c62471680302d1, started, node1, https://10.20.0.1:2380, https://10.20.0.1:2379, false
86c62471680302d2, started, node2, https://10.20.0.2:2380, https://10.20.0.2:2379, false
86c62471680302d3, started, node3, https://10.20.0.21:2380, https://10.20.0.21:2379, false
86c62471680302d4, started, node4, https://10.20.0.22:2380, https://10.20.0.22:2379, false
86c62471680302d5, started, node5, https://10.20.0.23:2380, https://10.20.0.23:2379, false

Without -w, grep matches multiple lines:

$ cat test_etcd_member_list | grep 10.20.0.2
86c62471680302d2, started, node2, https://10.20.0.2:2380, https://10.20.0.2:2379, false
86c62471680302d3, started, node3, https://10.20.0.21:2380, https://10.20.0.21:2379, false
86c62471680302d4, started, node4, https://10.20.0.22:2380, https://10.20.0.22:2379, false
86c62471680302d5, started, node5, https://10.20.0.23:2380, https://10.20.0.23:2379, false

With -w, grep matches 10.20.0.2 only as a whole word, so 10.20.0.21, 10.20.0.22, and 10.20.0.23 no longer match and we get a single result line:

$ cat test_etcd_member_list | grep -w 10.20.0.2
86c62471680302d2, started, node2, https://10.20.0.2:2380, https://10.20.0.2:2379, false
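
For anyone who wants to reproduce this without a cluster, here is a minimal sketch that feeds the sample listing above to grep and pulls out the member ID. The variable names (members, node_ip, loose_count, member_id) are made up for illustration, and the cut step is only an assumption about how the ID might be extracted from the matched line. Adding -F next to -w is an optional extra that is not part of this PR: it makes grep treat the dots in the IP as literal characters instead of regex wildcards.

```shell
#!/bin/sh
# Sample data copied from the listing above; not real cluster output.
members='86c62471680302d1, started, node1, https://10.20.0.1:2380, https://10.20.0.1:2379, false
86c62471680302d2, started, node2, https://10.20.0.2:2380, https://10.20.0.2:2379, false
86c62471680302d3, started, node3, https://10.20.0.21:2380, https://10.20.0.21:2379, false
86c62471680302d4, started, node4, https://10.20.0.22:2380, https://10.20.0.22:2379, false
86c62471680302d5, started, node5, https://10.20.0.23:2380, https://10.20.0.23:2379, false'

# Hypothetical stand-in for the node_ip of the member being removed.
node_ip='10.20.0.2'

# Partial match: counts every line that contains the IP as a substring
# (4 lines with this sample: node2 through node5).
loose_count=$(printf '%s\n' "$members" | grep -c "$node_ip")

# Whole-word, fixed-string match: -w rejects 10.20.0.21 and friends,
# -F keeps the dots literal. cut takes the first comma-separated field.
member_id=$(printf '%s\n' "$members" | grep -wF "$node_ip" | cut -d, -f1)

echo "partial matches: $loose_count"
echo "member id with -w: $member_id"
```

With this sample data, the whole-word match leaves only the node2 line, so member_id holds a single ID instead of four.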

🙂 additionally, I have seen similar usage in the project, for example: https://github.com/kubernetes-sigs/kubespray/blob/v2.25.0/roles/etcd/tasks/join_etcd_member.yml#L31

🤔 Finally, using json_query might also be feasible, but it means I would need to add a task that registers the output of etcdctl member list in a variable, which complicates the process.
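
As a possible middle ground between grep -w and a JSON-based query, an exact comparison on the peer URL field could be sketched with awk. This is not what the PR does. It assumes the comma-separated output format shown in the sample above, and it assumes the peer URL is https on port 2380 as in that sample; members, node_ip, peer_url, and member_id are hypothetical names.

```shell
#!/bin/sh
# Sample data copied from the listing above; not real cluster output.
members='86c62471680302d1, started, node1, https://10.20.0.1:2380, https://10.20.0.1:2379, false
86c62471680302d2, started, node2, https://10.20.0.2:2380, https://10.20.0.2:2379, false
86c62471680302d3, started, node3, https://10.20.0.21:2380, https://10.20.0.21:2379, false
86c62471680302d4, started, node4, https://10.20.0.22:2380, https://10.20.0.22:2379, false
86c62471680302d5, started, node5, https://10.20.0.23:2380, https://10.20.0.23:2379, false'

# Hypothetical stand-in for the node_ip of the member being removed.
node_ip='10.20.0.2'

# Assumption: scheme and peer port match the sample listing.
peer_url="https://${node_ip}:2380"

# Split each line on ", " and print the first field (member ID) only when
# the fourth field (peer URL) is exactly equal to the expected URL.
member_id=$(printf '%s\n' "$members" |
  awk -F', ' -v url="$peer_url" '$4 == url { print $1 }')

echo "$member_id"
```

The comparison is a plain string equality on one field, so neither substrings nor regex wildcards come into play, at the cost of depending on the column layout and the port.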

@VannTen @tico88612

Member

@tico88612 tico88612 left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 30, 2024
@tico88612
Member

@ErikJiang Thanks for the explanation.

We can mitigate the problem with this fix first, and then think about whether we need to switch to JSON output.

@VannTen
Contributor

VannTen commented Aug 30, 2024 via email

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ErikJiang, mzaian, tico88612


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 31, 2024
@k8s-ci-robot k8s-ci-robot merged commit db0138b into kubernetes-sigs:master Aug 31, 2024
40 checks passed
@ErikJiang
Member Author

👍 Your suggestion makes sense, and I will consider it in future optimizations. Thanks!

@VannTen

kpoxo6op pushed a commit to kpoxo6op/kubespray that referenced this pull request Dec 27, 2024
Development

Successfully merging this pull request may close these issues.

Remove unexpected members from etcd member list