Here are some suggestions:
Use descriptive variable names
Use more descriptive names, such as word_length instead of z.
Use functions to avoid repetition
As it stands, there is quite a bit of repetition in your code. The first instance is the reading of the files. I suggest using something like this:
def read_file(file_name):
    return [line.strip() for line in open(file_name).readlines()]
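A small variation, as a sketch only: opening the file in a with block closes it as soon as the lines have been read, instead of leaving that to the garbage collector. The demo file and its contents below are made up for illustration.

```python
import os
import tempfile


def read_file(file_name):
    """Return a list of all whitespace-stripped lines taken from file_name."""
    # The with block closes the file even if reading raises an exception.
    with open(file_name) as handle:
        return [line.strip() for line in handle]


# Hypothetical demo data, written to a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "ani_name.txt")
with open(demo_path, "w") as handle:
    handle.write("chat\n  chien \nours\n")

names = read_file(demo_path)
print(names)
```

Iterating over the file object directly also makes the readlines() call unnecessary.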
Similarly, you should put the printing part of the code into a function as well:
def print_animal(words, mode):
    # range(1, 6) covers the modes 1..5 that random.randint(1, 5) can return
    assert mode in range(1, 6)
    s = words["name"]
    if mode in (3, 5):
        s += " " + words["adjective"]
    if mode in (1, 4, 5):
        s += " " + words["origin"]
    if mode > 1:
        s += " " + words["bodypart"] + " a " + words["color"]
    print(s)
This builds a string first, using some simplifications of your logic, and then prints it. Note that if you don't change the filenames, you will have to change the keys of the words dictionary here as well. The string has the words separated by spaces instead of newlines, which I think makes it more natural to read. If you do need or want the newlines, you will have to replace the appropriate spaces with newlines (\n).
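If both layouts are useful, one option is to return the string and take the separator as a parameter. This is only a sketch: describe_animal and its sep parameter are names I made up, and the sample words are placeholders.

```python
def describe_animal(words, mode, sep=" "):
    """Build the description for the given mode, joining the parts with sep."""
    assert mode in range(1, 6)
    parts = [words["name"]]
    if mode in (3, 5):
        parts.append(words["adjective"])
    if mode in (1, 4, 5):
        parts.append(words["origin"])
    if mode > 1:
        # Body part and color stay together on one line.
        parts.append(words["bodypart"] + " a " + words["color"])
    return sep.join(parts)


# Placeholder words, not taken from the real word lists.
sample = {"name": "chat", "adjective": "petit", "origin": "persan",
          "bodypart": "queue", "color": "rouge"}
print(describe_animal(sample, 5))
print(describe_animal(sample, 1, sep="\n"))
```

Returning the string instead of printing it also makes the function easier to test.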
Note that your branch of
r == 0
will never be executed, because random.randint(1,5) generates a random integer in the closed range [1,5].
Use list/dictionary comprehensions
Become familiar with comprehensions; they let you avoid a lot of code duplication. For example, they allow you to write:
WORD_TYPES = ["name", "adjective", "origin", "bodypart", "color"]
ALL_WORDS = {word_type: read_file("ani_"+word_type+".txt") for word_type in WORD_TYPES}
The above comprehension loops over the different input files and stores the resulting words as lists in a dictionary where the key is the word type (name, origin, ...).
For this code to work, you'd have to rename the files to the format ani_name.txt, ani_adjective.txt and so forth. Since I don't speak French very well, it was easier for me to use more distinctive names here.
By the way, global constants should be written in CAPITAL_LETTERS to distinguish them from local variables.
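For readers new to the syntax, here is a small self-contained illustration that a dictionary comprehension is shorthand for filling a dictionary in a loop. The word lists are invented stand-ins for the file contents.

```python
WORD_TYPES = ["name", "color"]
# Invented stand-in for the lists that read_file would return.
SAMPLE_LINES = {"name": ["chat", "ours"], "color": ["rouge", "bleu"]}

# Explicit loop.
lengths_loop = {}
for word_type in WORD_TYPES:
    lengths_loop[word_type] = [len(word) for word in SAMPLE_LINES[word_type]]

# Equivalent dictionary comprehension.
lengths_comp = {word_type: [len(word) for word in SAMPLE_LINES[word_type]]
                for word_type in WORD_TYPES}

print(lengths_comp == lengths_loop)
```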
Similarly, you can use a comprehension to first select the words of the right length:
word_length = random.randint(1, 100)
words = {
    word_type: list(filter(
        lambda word: len(word) == (word_length, word_length-2)[word_type == "bodypart"],
        ALL_WORDS[word_type]))
    for word_type in WORD_TYPES
}
Here the
(word_length, word_length-2)[word_type == "bodypart"]
part is a short way to select a length of word_length-2 if a body part is requested and word_length otherwise, since Python treats True as 1 and False as 0 when indexing.
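To see why the indexing trick works, and how it compares with a conditional expression (which many readers find easier to follow), consider this sketch. Both helper names are hypothetical.

```python
def target_length_indexed(word_type, word_length):
    # bool is a subclass of int, so False picks index 0 and True picks index 1.
    return (word_length, word_length - 2)[word_type == "bodypart"]


def target_length_conditional(word_type, word_length):
    # Same result, spelled out as a conditional expression.
    return word_length - 2 if word_type == "bodypart" else word_length


for word_type in ("name", "bodypart"):
    print(word_type,
          target_length_indexed(word_type, 7),
          target_length_conditional(word_type, 7))
```

One difference worth knowing: the tuple version evaluates both alternatives before choosing, while the conditional expression evaluates only the one it needs.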
For the words chosen to make up the description you can also use a dictionary comprehension:
chosen_words = {word_type: random.choice(words[word_type]) for word_type in WORD_TYPES}
You will still need the try...except IndexError part in case one of the word types does not contain any word of the requested length. Since word_length can be up to 100, this will likely happen quite often, so you might end up with far fewer than 120 descriptions. To ameliorate this somewhat, you could calculate the smallest of the per-type maximum word lengths (above which you will always hit the IndexError for at least one of the word types):
MIN_MAX_LENGTH = min([max([len(word) for word in ALL_WORDS[word_type]]) for word_type in WORD_TYPES])
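A worked example with invented word lists shows what this bound means. The lists here are placeholders, not the real files.

```python
# Placeholder word lists, invented for this example.
ALL_WORDS = {
    "name": ["chat", "ours", "girafe"],        # longest word has 6 letters
    "color": ["bleu", "rouge", "turquoise"],   # longest word has 9 letters
}
WORD_TYPES = list(ALL_WORDS)

# Longest word per type, then the smallest of those maxima.
longest = {word_type: max(len(word) for word in ALL_WORDS[word_type])
           for word_type in WORD_TYPES}
MIN_MAX_LENGTH = min(longest.values())

print(longest)
print(MIN_MAX_LENGTH)
```

With these lists, any word_length above 6 leaves the name list empty, so drawing from random.randint(1, MIN_MAX_LENGTH) avoids those guaranteed misses. Lengths below the bound can still miss, and the offset of 2 for body parts is not accounted for in this sketch.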
Document your code
See below.
Result
My final code using the above suggestions (untested):
#!/usr/bin/env python
import random

# Program to read in word parts from the files 'ani_name.txt',
# 'ani_adjective.txt', 'ani_origin.txt', 'ani_bodypart.txt' and
# 'ani_color.txt' and construct 120 random descriptions of animals from them.
# The words in each description of an animal are ensured to have the same
# length.


def read_file(file_name):
    """Return a list of all whitespace-stripped lines taken from file_name"""
    return [line.strip() for line in open(file_name).readlines()]


def print_animal(words, mode):
    """Print a description of an animal, given a dictionary of words with
    different word_types as key and a mode detailing which word_types to use

    mode
    1) prints name and origin
    2) prints name and a bodypart with an associated color
    3) prints name, adjective and a bodypart with an associated color
    4) prints name, origin and a bodypart with an associated color
    5) prints name, adjective, origin and a bodypart with an associated color
    """
    s = words["name"]
    if mode in (3, 5):
        s += " " + words["adjective"]
    if mode in (1, 4, 5):
        s += " " + words["origin"]
    if mode > 1:
        s += " " + words["bodypart"] + " a " + words["color"]
    print(s)


WORD_TYPES = ["name", "adjective", "origin", "bodypart", "color"]
ALL_WORDS = {word_type: read_file("ani_"+word_type+".txt") for word_type in WORD_TYPES}
MAX_MIN_LENGTH = max([min([len(word) for word in ALL_WORDS[word_type]]) for word_type in WORD_TYPES])
MIN_MAX_LENGTH = min([max([len(word) for word in ALL_WORDS[word_type]]) for word_type in WORD_TYPES])

for n in range(120):
    word_length = random.randint(MAX_MIN_LENGTH, MIN_MAX_LENGTH)
    # Keep only words of the requested length (body parts are 2 shorter)
    words = {
        word_type: list(filter(
            lambda word: len(word) == (word_length, word_length-2)[word_type == "bodypart"],
            ALL_WORDS[word_type]))
        for word_type in WORD_TYPES
    }
    try:
        chosen_words = {word_type: random.choice(words[word_type]) for word_type in WORD_TYPES}
    except IndexError:
        # At least one word type has no word of this length; skip this round
        continue
    print_animal(chosen_words, mode=random.randint(1, 5))
Further improvements
Use command line arguments. Support at least giving n (the number of descriptions generated), maybe even the filenames of the word lists.
Maybe combine the word lists and add a column specifying what kind of word type each word is. This will lead to a lot of modifications, though.
Don't use a magic integer as mode for print_animal, but build a list of included word_types, so its signature becomes simply
print_animal(words, word_types)
Even better, build an animal object and only fill in the needed parts to describe it. Give it a print function that prints all existing parts.
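Two of these ideas can be sketched together: a small animal class that prints only the parts it was given, and a command-line option for the number of descriptions. Everything named here (Animal, describe, parse_args, the -n/--count option) is hypothetical, and the sample words are placeholders.

```python
import argparse


class Animal(object):
    """Holds only the parts that were chosen for one description."""

    # Order in which existing parts are printed.
    ORDER = ["name", "adjective", "origin", "bodypart", "color"]

    def __init__(self, **parts):
        self.parts = parts

    def describe(self):
        pieces = []
        for word_type in self.ORDER:
            if word_type in self.parts:
                # Keep the " a " connector in front of the color.
                if word_type == "color":
                    pieces.append("a")
                pieces.append(self.parts[word_type])
        return " ".join(pieces)


def parse_args(argv=None):
    """Parse the number of descriptions; argv=None means sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Generate random animal descriptions.")
    parser.add_argument("-n", "--count", type=int, default=120,
                        help="number of descriptions to generate")
    return parser.parse_args(argv)


# Passing an explicit list here stands in for real command-line input.
args = parse_args(["-n", "2"])
for _ in range(args.count):
    print(Animal(name="chat", bodypart="queue", color="rouge").describe())
```

With this shape, the magic mode integer disappears: the caller decides which parts to pass in, and describe() simply prints whatever exists.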