wdcnt [-p|-z] [-e] files ...
wdcnt [-p|-z] [-e] < file
gnuplot> set log xy
gnuplot> plot "< wdcnt file"
Reports probability instead of number of occurrences. Frequencies are normalized so that they sum to 1.0.
Reports relative frequency instead of number of occurrences. The most frequent word is assigned 1.0.
Does not use KAKASI. This option is NOT useful for Japanese documents.
Prints usage and version, then exits.
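
The difference between the probability report and the relative-frequency report comes down to the divisor. The Python sketch below is only an illustration, not wdcnt's own code: the helper name normalize and its mode strings are hypothetical, and it assumes "probability" means each count divided by the total number of words.

```python
from collections import Counter


def normalize(counts, mode):
    """Scale raw word counts (hypothetical helper, not part of wdcnt).

    mode="probability": divide by the total, so the values sum to 1.0
                        (assumed meaning of the probability report).
    mode="relative":    divide by the largest count, so the most
                        frequent word gets 1.0.
    """
    if not counts:
        return {}
    if mode == "probability":
        denom = sum(counts.values())
    elif mode == "relative":
        denom = max(counts.values())
    else:
        raise ValueError(f"unknown mode: {mode}")
    return {word: n / denom for word, n in counts.items()}


# Raw counts here: the=3, and=2, cat=1, dog=1, bird=1 (8 words in total).
counts = Counter("the cat and the dog and the bird".split())
prob = normalize(counts, "probability")  # "the" -> 3/8
rel = normalize(counts, "relative")      # "the" -> 3/3
```

Both reports keep the ranking of the raw counts; only the scale changes, so a log-log plot has the same shape in either mode.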
For English documents, a traditional one-liner is well known:
% cat files ... | tr -s '\040' '\012' | sort | uniq -c | sort -n -r
Word separation is not accurate.
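
The traditional one-liner above splits on spaces only, so "dog." and "dog", or "The" and "the", are counted as different words. A minimal Python sketch of the same count-and-sort idea with a slightly stricter tokenizer might look like this. The function name count_words and the tokenization rule (lowercase, runs of ASCII letters and apostrophes) are assumptions made for illustration, not what wdcnt does.

```python
import re
from collections import Counter


def count_words(text):
    """Return (word, count) pairs, most frequent first (hypothetical helper)."""
    # Assumed tokenization rule: fold case, then take runs of ASCII
    # letters and apostrophes as words. Punctuation and digits are dropped.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()


sample = "The cat saw the dog. The dog ran."
result = count_words(sample)
for word, n in result:
    print(n, word)
```

Like the one-liner, this handles only space-delimited languages; it does nothing for Japanese text.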