A Rust command-line tool that makes a list of homophones from Wiktionary.
- Some lists of homophones
- A Rust command-line tool to generate your own list of homophones, based on an inputted word list.
If you're looking for lists of homophones, look in the homophone-lists directory.
Lists that are labeled "as pairs" contain pairs of homophones separated by commas. Example:
acts,ax
adds,ads
adds,adze
Lists labeled "as singles" are these same words, but with each on its own line, no commas:
acts
ax
adds
ads
adds
adze
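The relationship between the two formats can be sketched in a few lines of Rust. This is only an illustration of the file formats described above, not the tool's actual code; the function name `pairs_to_singles` is hypothetical.

```rust
/// Turn the contents of an "as pairs" list into "as singles" lines.
/// Hypothetical helper for illustration; not part of the homophones tool.
fn pairs_to_singles(pairs: &str) -> Vec<String> {
    pairs
        .lines()
        // Each line looks like "acts,ax"; split it at the comma.
        .flat_map(|line| line.split(','))
        .map(|word| word.trim().to_string())
        // Skip anything left empty by blank lines or stray commas.
        .filter(|word| !word.is_empty())
        .collect()
}
```

Called on the three pairs shown earlier, this would yield the six single words in the same order, with "adds" appearing twice because it belongs to two pairs.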
Any lists with "cleaned" in the file name have been cleaned in some way, likely trimmed of whitespace, de-duplicated, and sorted alphabetically. I'd recommend using cleaned files.
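As a rough idea of what that kind of cleaning involves, here is a minimal sketch that trims whitespace, drops duplicates, and sorts a "singles" list. It is an assumption about the general approach, not the exact process used to produce the cleaned files; `clean_list` is a hypothetical name.

```rust
use std::collections::BTreeSet;

/// Trim, de-duplicate, and alphabetically sort a list of words (one per line).
/// Hypothetical helper for illustration only.
fn clean_list(raw: &str) -> Vec<String> {
    // A BTreeSet keeps its entries unique and in sorted order.
    let unique: BTreeSet<String> = raw
        .lines()
        .map(|line| line.trim().to_string())
        .filter(|line| !line.is_empty())
        .collect();
    unique.into_iter().collect()
}
```

Note that the sort here is a plain byte-wise string comparison, so uppercase words would sort before lowercase ones.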
To make your own list of homophones, you'll want to use the included command-line tool.
The command-line tool looks up each word from the input word list file(s) on Wiktionary and scrapes the page for homophones.
- Install Rust if you haven't already
- Run:
cargo install --git https://github.com/sts10/homophones --branch main
Usage: homophones [OPTIONS] <Inputted word lists>...
Arguments:
<Inputted word lists>... Word list input files. Can provide more than one (they'll be combined)
Options:
-p, --pairs <PAIRS_OUTPUT> Path for outputted file for list of PAIRS of homophones
-s, --singles <SINGLES_OUTPUT> Path for outputted file for list of SINGLE homophones
-f, --force Force overwrite of output file(s), if it exists
-h, --help Print help
-V, --version Print version
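To illustrate what "combined" could mean when several input files are given, here is a minimal sketch that reads each file and appends its words to one list. Whether the real tool also de-duplicates or reorders the combined input is not stated here, so this sketch does neither; `combine_word_lists` is a hypothetical name.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read several word list files and return all their words as one list.
/// Hypothetical helper; assumes one word per line.
fn combine_word_lists<P: AsRef<Path>>(paths: &[P]) -> io::Result<Vec<String>> {
    let mut words = Vec::new();
    for path in paths {
        let contents = fs::read_to_string(path)?;
        words.extend(
            contents
                .lines()
                .map(str::trim)
                // Ignore blank lines.
                .filter(|word| !word.is_empty())
                .map(String::from),
        );
    }
    Ok(words)
}
```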
Take words from a file called input_list.txt and print a list of homophones, one per line, to a new file called some_homophones_as_a_single_list.txt:
homophones -s some_homophones_as_a_single_list.txt input_list.txt
Take words from a file called input_list.txt and print a list of homophones, a pair of homophones per line, to a new file called some_homophones_as_pairs_list.txt:
homophones -p some_homophones_as_pairs_list.txt input_list.txt
Do both of those things at once!
homophones -s some_homophones_as_a_single_list.txt -p some_homophones_as_pairs_list.txt input_list.txt
- Make it async!
This project scrapes homophones from Wiktionary. In an effort to comply with Wiktionary's Terms of Use, the lists/results generated by this project, including files in the ./homophone-lists directory, are, like the text of Wiktionary itself, available under the Creative Commons Attribution-ShareAlike License.
The code of this project is available under the Blue Oak Model License.