Given a plain text CSV file like the following...
one,three,two
apple,pear,orange
dog,cat,parrot
parrot,cat,dog
three,two,one
three,one,two
chair,table,lamp
...I want to identify groups of lines that contain the same three items in any order.
In Vim, I sort the words on each line into alphabetical order with the command below. The % range filters the whole buffer; to process only visually selected lines, select them and type :! so that Vim fills in the '<,'> range instead.
:%!while IFS= read -r line; do echo "$line" | python3 -c 'import sys; print(",".join(sorted(sys.stdin.read().strip().split(","))))'; done
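The per-line step is easier to follow outside the one-liner. Here is a minimal sketch of the same normalization as a plain Python function; the name sort_fields is my own, and it assumes simple comma-separated values with no quoted or escaped fields.

```python
def sort_fields(line, sep=","):
    """Return `line` with its separator-delimited fields in alphabetical order."""
    return sep.join(sorted(line.strip().split(sep)))

# A few sample rows from the CSV above.
rows = ["one,three,two", "apple,pear,orange", "parrot,cat,dog"]
normalized = [sort_fields(row) for row in rows]
print(normalized)
# ['one,three,two', 'apple,orange,pear', 'cat,dog,parrot']
```

Lines that hold the same items in a different order end up identical after this step, which is what makes the duplicate count possible.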
Then, to sort the lines alphabetically, count (and label) the duplicates, and order the result from most duplicates to fewest, I run:
:%!sort | uniq -c | sort -r
That yields the following:
3 one,three,two
2 cat,dog,parrot
1 chair,lamp,table
1 apple,orange,pear
The two steps can be combined into a single command, as follows:
:%!while IFS= read -r line; do echo "$line" | python3 -c 'import sys; print(",".join(sorted(sys.stdin.read().strip().split(","))))'; done | sort | uniq -c | sort -r
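The combined command starts a new Python process for every line. As an alternative, here is a rough sketch that does the whole job in one Python process using collections.Counter. The function name count_groups is hypothetical, and the sketch again assumes plain comma-separated values without quoting.

```python
from collections import Counter

def count_groups(lines, sep=","):
    """Count lines that hold the same fields, ignoring field order."""
    keys = (
        sep.join(sorted(line.strip().split(sep)))
        for line in lines
        if line.strip()  # skip blank lines
    )
    counts = Counter(keys)
    # Most frequent first; ties are broken in reverse alphabetical order.
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)

sample = [
    "one,three,two",
    "apple,pear,orange",
    "dog,cat,parrot",
    "parrot,cat,dog",
    "three,two,one",
    "three,one,two",
    "chair,table,lamp",
]

for key, count in count_groups(sample):
    print(count, key)
```

For the sample data this prints the same four groups in the same order as the output above. To use it as a Vim filter, one option is to pass sys.stdin to count_groups instead of the sample list, save the script under a name of your choosing, and run it with :%!python3 followed by that file name.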