eliza.py

ELIZA is a natural language processing program developed from 1964 to 1966 by Joseph Weizenbaum, originally implemented in MAD-SLIP. The 1966 paper is listed under References.

ELIZA uses pattern matching, decomposition and reassembly rules to emulate a Rogerian psychotherapist.

(Inputs taken from Weizenbaum's 1966 paper, pp.1-2.)

Program flow

Download

Requirements

Python 3.x (should also work with Python 2 after reverting some of the syntax to its Python 2 form)

Via command-line

# Clone the repository
$ git clone https://github.com/rdimaio/eliza-py

Usage

Starting the program

# Go into the project folder
$ cd eliza-py

# Execute the file
$ python eliza.py
Eliza: Hi.
You: [type something]

Example conversation (using the same inputs as the original paper, pp. 1-2)

You: Men are all alike.
Eliza: In what way?
You: They're always bugging us about something or other.
Eliza: Can you think of a specific example?
You: Well, my boyfriend made me come here.
Eliza: Your boyfriend made you come here?
You: He says I'm depressed much of the time.
Eliza: I am sorry to hear that you are depressed.
You: It's true. I am unhappy.
Eliza: Do you think coming here will help you not to be unhappy?
You: I need some help, that much seems certain.
Eliza: What would it mean to you if you got some help?
You: Perhaps I could learn to get along with my mother.
Eliza: Tell me more about your family.

Exiting the program

# You can type bye, goodbye, done, quit or exit. Or use CTRL+C, if you want to be rude about it...
You: bye
Eliza: Goodbye.

Script structure

`scripts/general.json`

This script holds general English-language information that is not tied to the other script the program uses, along with inputs the program itself relies on.

substitutions: specifies which keywords should be substituted before applying a custom script
tags: specifies keywords within the same semantic field
memory_inputs: array of keywords that prompt the generation of an additional response added to the memory stack
exit_inputs: array of keywords that can be used to quit the program
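To make the four fields more concrete, here is a minimal sketch of how such a script might be shaped and used. The field names come from the list above and the exit keywords from the usage section; every other value, and the helpers `apply_substitutions` and `is_exit_input`, are assumptions made for illustration and not the contents of the actual file or code.

```python
# Hypothetical excerpt mirroring the four fields of scripts/general.json.
# Apart from the exit keywords, the values are illustrative assumptions.
general_script = {
    "substitutions": {"i'm": "i am", "you're": "you are"},
    "tags": {"family": ["mother", "father", "sister", "brother"]},
    "memory_inputs": ["my"],
    "exit_inputs": ["bye", "goodbye", "done", "quit", "exit"],
}


def apply_substitutions(text, substitutions):
    """Lowercase the input and replace every word listed in `substitutions`."""
    words = text.lower().split()
    return " ".join(substitutions.get(word, word) for word in words)


def is_exit_input(text, exit_inputs):
    """Return True if the whole input is one of the exit keywords."""
    return text.strip().lower() in exit_inputs


print(apply_substitutions("I'm unhappy", general_script["substitutions"]))
print(is_exit_input("bye", general_script["exit_inputs"]))
```

Substituting before any keyword lookup means the custom script only has to list one spelling of each keyword.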

`scripts/doctor.json`

This script simulates a Rogerian psychotherapist. It was filled in according to the appendix in the original paper (p. 9), including ranks. Another great reference is the script file from Charles Hayden's Java implementation of ELIZA. Some small additions have been made to make the program feel a bit nicer (e.g. the program responds to greetings).

Each element in the JSON file follows this structure:

keyword: keyword that the program looks for in the user's input (after substitution, like in the original implementation)
- Two special keywords exist:
  - $: specifies that a generic answer should be given
  - ^: specifies that an answer from the memory stack should be given
rank: rank of that keyword
rules: Array of decomposition rules and matching reassembly rules in the form:
- decomp: Decomposition rule (using the same syntax as the original 1966 paper)
- reassembly: Array of reassembly rules to be used with the decomposition rule specified in decomp
  - Reassembly rules use 1-indexing like in the original paper; note that a tag in a decomposition rule counts as two components in its reassembly rules instead of one (to be able to use regex)
- last_used_reassembly_rule: ID of the last used reassembly rule for this decomposition rule (0-indexed); it is incremented every time the decomposition rule is matched, and it cycles back to the beginning once the last reassembly rule in the array has been used.
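The structure above can be sketched in a few lines of Python. This is a simplified illustration, not the code in `eliza.py`: the element shown is a hypothetical example, the helpers `decomp_to_regex` and `respond` are made-up names, only `0` (any number of words) and literal words are handled, and tags are left out. It also assumes that the reassembly rule at the stored index is used first and the index is advanced afterwards.

```python
import re

# Hypothetical element following the structure described above.
element = {
    "keyword": "remember",
    "rank": 5,
    "rules": [
        {
            "decomp": "0 you remember 0",
            "reassembly": [
                "Do you often think of 4?",
                "Does thinking of 4 bring anything else to mind?",
            ],
            "last_used_reassembly_rule": 0,
        }
    ],
}


def decomp_to_regex(decomp):
    """Turn a decomposition rule into a regex with one group per component."""
    parts = []
    for token in decomp.split():
        if token == "0":
            parts.append(r"(.*?)")  # any number of words, possibly none
        else:
            parts.append(r"(\b" + re.escape(token) + r"\b)")  # literal word
    return re.compile(r"^\s*" + r"\s*".join(parts) + r"\s*$", re.IGNORECASE)


def respond(rule, text):
    """Apply one decomposition rule; return a reassembled reply or None."""
    match = decomp_to_regex(rule["decomp"]).match(text)
    if match is None:
        return None
    index = rule["last_used_reassembly_rule"]
    template = rule["reassembly"][index]
    # Advance the counter and wrap around after the last reassembly rule.
    rule["last_used_reassembly_rule"] = (index + 1) % len(rule["reassembly"])
    # Numbers in the template are 1-indexed references to the components.
    return re.sub(r"\d+", lambda m: match.group(int(m.group())), template)


rule = element["rules"][0]
print(respond(rule, "you remember your childhood"))
print(respond(rule, "you remember your childhood"))
```

The input is assumed to have gone through substitution already, which is why the rule matches `you remember` rather than `I remember`. Calling `respond` repeatedly on the same rule walks through the reassembly array and then starts over.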

FAQ

Differences from original implementation

Keyword ranking:
- Original implementation: keywords are not guaranteed to be ranked in descending order; as seen in Fig. 2 on p. 4 of the original paper, a keyword is placed on top of the keystack if its rank is higher than the highest rank encountered in the sentence so far, otherwise it is placed on the bottom of the keystack.
- This implementation: keywords are guaranteed to be ranked in descending order.
Sentence tokenization:
- Original implementation: if a comma/period is encountered and a keyword has already been found, all subsequent text is deleted (p. 2).
- This implementation: sentences are split based on punctuation (—,.:;-), and the sentence with the highest ranked keyword is chosen to be decomposed.
- Main reasons:
  - The emphasis of the user's input may not necessarily be in the first section of the sentence
  - The section with the highest ranked keyword has a higher chance of having decomposition rules for that keyword, as it has a rank in the first place
Tags:
- Original implementation: DLIST is used to indicate tags.
- This implementation: tag is used to indicate tags.
- The functionality is the same.
Memory stack:
- Original implementation: the keyword my is associated with the memory stack (p. 6).
- This implementation: the memory stack is called when no matching decomposition rule is found.
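The first two differences can be illustrated with a short sketch. The ranks in `RANKS` and all function names are hypothetical and chosen only to show the behaviour described above; they are not taken from `eliza.py` or from the doctor script.

```python
import re

# Hypothetical keyword ranks, for illustration only.
RANKS = {"alike": 10, "remember": 5, "mother": 2, "sorry": 0}


def keystack_original(words, ranks):
    """Original behaviour: a keyword goes on top of the stack if its rank
    beats the highest rank seen so far, otherwise on the bottom."""
    stack, highest = [], -1
    for word in words:
        if word not in ranks:
            continue
        if ranks[word] > highest:
            stack.insert(0, word)
            highest = ranks[word]
        else:
            stack.append(word)
    return stack


def keystack_sorted(words, ranks):
    """This implementation's behaviour: keywords in descending rank order."""
    found = [word for word in words if word in ranks]
    return sorted(found, key=lambda word: ranks[word], reverse=True)


def split_segments(text):
    """Split the input on the punctuation listed above (\\u2014 is the long dash)."""
    return [s.strip() for s in re.split(r"[\u2014,.:;-]", text) if s.strip()]


def pick_segment(text, ranks):
    """Return the segment holding the highest-ranked keyword, or None."""
    best, best_rank = None, -1
    for segment in split_segments(text):
        for word in segment.lower().split():
            if ranks.get(word, -1) > best_rank:
                best, best_rank = segment, ranks[word]
    return best


words = ["remember", "sorry", "mother", "alike"]
print(keystack_original(words, RANKS))
print(keystack_sorted(words, RANKS))
print(pick_segment("I am sorry, but I remember my mother", RANKS))
```

With these made-up ranks, the original scheme leaves `sorry` ahead of `mother` even though it ranks lower, while the sorted version does not. For the sample sentence, the original approach would stop at the comma after finding `sorry`, whereas `pick_segment` chooses the second segment because `remember` ranks higher.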

Why are scripts stored in JSON and not CSV?

In the doctor script, each keyword has a variable number of decomposition rules, and each decomposition rule has a variable number of reassembly rules. I think JSON can store this nested structure in a much more intuitive way.

The general script could be stored as CSV, as there is no nesting, but I preferred to use JSON again to remain consistent with the other script.

Future work

Allow the user to edit the script during a session by typing "edit" as in the original implementation (p. 7 of the paper)
Translate to other languages (Italian, Spanish, ...)
Consider including a randomized delay before the program responds, strengthening the human-like feel of the conversation

References

J. Weizenbaum, “ELIZA-a computer program for the study of natural language communication between man and machine,” Communications of the ACM, vol. 9, no. 1, pp. 36–45, Jan. 1966.
The script file from Charles Hayden's Java implementation of ELIZA

Tools

Demo animation: asciinema and termtosvg
Flowchart: draw.io
