Sunday, April 22, 2007

Farm News 04-22-07

Sunday morning, after chores, 66°

Bunny and Kitty Adventures
On Tuesday the bunnies and kittens went to school. They visited two classrooms at the school and then went to the day care center for more visiting. They were very well behaved and left no messes behind. By Thursday the bunnies were hopping out of their nest box and bouncing around the cage. They will soon be big bunnies.

Shotgun, the mother cat, has moved her kittens from where they were easy for children to reach to a place that requires long arms to reach them. Silly cat, she doesn't understand that the more children play with her kittens, the more likely the kittens will find good homes.

Bebe the goose is still on her nest, keeping her eggs warm. She usually leaves around 5:00pm to go eat a bit, have a drink of water, and exercise. After twenty minutes or so she is back on her nest, protecting her eggs. Her babies are due to hatch at the end of this month.

After the Freeze
About half of the garden is planted with lettuce, radishes, beets, peas, carrots, onions, potatoes, zinnias, and marigolds. The asparagus is coming back after the freeze. The daylilies look terrible, worse than they looked in February. There probably won't be any fruit this year. The apples hadn't bloomed before the freeze, but the apples have never bloomed or borne fruit. I'm ready to cut them down and replace them.

The flowering dogwood is now covered with brown, crumpled flowers. The tulips that are left are missing half their petals. Enough of the Koreanspice Viburnum survived to bring in a few flower clusters that filled the house with fragrance.

Generating Text
Icon is a programming language developed at the University of Arizona by Ralph Griswold. It is a great language for doing things with text. Icon is not widely used. The last time I checked there were two people in Kansas who were on the mailing list, and that was at least twenty years ago. Nevertheless, Icon is what I am going to use, because it fits the task so well.

Most programmers are comfortable with only one programming language and try to use that language for every task. I prefer to use a language that best matches the task at hand. I am not an expert in any language but I manage, especially now that I'm retired.

This task will, at first, be broken down into two programs, one to create data files and the other to use those data files to create travesties. The first, Scan, will go through the text and create a list of ngrams. An ngram is a small group of characters, usually two to five, found in the original text, where n is the number of characters in the group. Take the sentence, “See Spot run.” If we set n equal to 2, the ngrams in the sentence are “Se”, “ee”, “e_”, “_S”, “Sp”, “po”, “ot”, “t_”, “_r”, “ru”, “un”, and “n.”. I substituted underscores for spaces to make them easier to see. Also notice that the period at the end of the sentence is included in the last ngram.
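Here is a small sketch of that idea in Icon. It is not the Scan program, only an illustration, and the procedure name ngrams is one I made up for the example. It writes each two-character group of “See Spot run.” on its own line, with underscores in place of spaces.

```icon
# Sketch: generate the ngrams of a string.
# The procedure name ngrams is made up for this example.
procedure ngrams(s, n)
   local i
   # Step through the string one character at a time and
   # hand back the n characters starting at position i
   every i := 1 to *s - n + 1 do
      suspend s[i +: n]
end

procedure main()
   local g
   every g := ngrams("See Spot run.", 2) do
      # map() swaps spaces for underscores so they show up
      write(map(g, " ", "_"))
end
```

The sentence has 13 characters, so with n equal to 2 there are 12 groups, the same twelve listed above.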

It helps to know that, in most personal computers, a 'line' is a paragraph. The actual length of the line on the screen usually depends on the width of the window in which it is displayed.

I'm going to start with a very simple program, Scan0, which simply copies the original file to another file. This is what the Icon code looks like for that program:

# File: scan0.icn
# Subject: Simply copies input file to output file, one line at a time
# Author: The Geezer
# Date: 04/18/07
#
# file copier, a basis for a text scanner
#
#
procedure main()
   # Open files
   lgf := open("Scan0.log", "w")
   inf := open("Sample.txt", "r")
   outf := open("Scan0out.txt", "w")
   write(lgf, "Files opened")
   #
   # Initialize the variables
   LinesRead := 0
   line := ""
   write(lgf, "Variables initialized")
   # Process lines
   while line := read(inf) do
   {
      LinesRead +:= 1
      write(outf, line)
   }
   # Done! Write the number of lines to the log file
   write(lgf, LinesRead)
   # Close the files
   close(inf)
   close(outf)
   write(lgf, "Files closed")
   close(lgf)
   # End of program.
end

In Icon any line starting with '#' is a comment, not a working part of the program. When the program runs, lines beginning with '#' are ignored. So, skipping all the comments, the first line of the program is procedure main(), which is the required name for the main part of an Icon program. The parentheses at the end are for command line arguments, which will be discussed later.

The first job is to open the files for use. A file variable is a name for the place where you put stuff you want to write to the file, or where you go to find what is read from the file. In some programming languages the file name is used instead of a file variable, but that practice can create some problems, so Icon uses file variables. The programmer can use any name he wishes for the file variable. I chose to use lgf for the log file, inf for the input file, and outf for the output file.

First the program opens the log file, with a “w” to tell the program it will be writing the log file. Then the input file is opened with “r” to tell the program that it will be reading that file. Finally, the output file is opened for writing. When the files are opened the program writes a note to the log file that the files have been opened. A log file is not required, but it can be a help when something isn't working right and the programmer can't find the problem.
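Scan0 assumes every file opens without trouble. In Icon, open() fails when it cannot open a file, so a more careful program can check for that. This is only a sketch of one common way to do it, not a change to Scan0, and the wording of the message is my own.

```icon
procedure main()
   local inf
   # If open() fails, the alternative after the | is tried instead:
   # stop() writes the message and ends the program
   inf := open("Sample.txt", "r") | stop("Cannot open Sample.txt")
   write("Sample.txt opened")
   close(inf)
end
```

If Sample.txt is missing, the program stops right there with the message instead of carrying on without an input file.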

Next the program creates a variable I chose to call LinesRead and initializes it to zero. Because a line is a paragraph to the computer, this will keep track of the number of paragraphs processed. The next variable is line, which is used to carry the paragraph from the input file to the output file, and it is initialized to be an empty line. Notice that the operator := is used to assign a value to a variable. The := is used to avoid confusion with =, which is used to compare two values to see if they are equal.
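A tiny example may make the difference plainer. The variable name count is just for illustration. One detail beyond what Scan0 needs: in Icon, = compares numbers, and == is the operator for comparing strings.

```icon
procedure main()
   local count
   count := 3              # assignment: count now holds 3
   if count = 3 then       # comparison: succeeds because count equals 3
      write("count is 3")
   if "cat" == "cat" then  # == compares strings rather than numbers
      write("strings match")
end
```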

Now we come to the actual work of the program. The statement while line := read(inf) do tells the program to do whatever follows every time it can read a line from the input file. When there are no more lines to be read the statement will fail and the program will move on to the next statement.
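The same pattern can be tried without any files at all. In this sketch, which is separate from Scan0, read() is called with no file variable, so it reads standard input, meaning the keyboard or a file redirected into the program. When the input runs out, read() fails, the loop ends, and the program carries on to the next statement. Running out of input is not treated as an error.

```icon
procedure main()
   local line, count
   count := 0
   # read() fails at the end of the input, which ends the loop
   while line := read() do
      count +:= 1
   write(count, " lines read")
end
```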

There are two things to do each time a line is read: increment LinesRead by 1 and write line to the output file. The operator +:= is a very handy thing. It tells the program to increase the value in the variable to the left by whatever number is to the right. The write statement is pretty much self-explanatory: write the paragraph in line to the file indicated by outf.
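Icon has a whole family of these shortcut operators. The little sketch below is only an illustration, with variable names I picked for the example. It shows +:= along with two of its relatives, -:= for subtracting and ||:= for adding text to the end of a string.

```icon
procedure main()
   local n, s
   n := 10
   n +:= 5            # same as n := n + 5
   write(n)           # writes 15
   n -:= 3            # same as n := n - 3
   write(n)           # writes 12
   s := "See Spot"
   s ||:= " run."     # same as s := s || " run."
   write(s)           # writes See Spot run.
end
```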

Once the while statement has finished, the work is done. The only things left to do are to write the number of lines to the log file, which might be helpful if things didn't work correctly, and then to close the files. Those are simple tasks, and then end is the end of the program.

So, there is a program that does nothing worthwhile, but it works. Next week I'll start changing it to bring it a step closer to the desired program, one that creates a data file for a travesty generator.


To subscribe to Farm News by email, unsubscribe, contribute stories, or complain, send an email to FarmNews@GeezerNet.com. The editor reserves the right to steal ideas submitted, rewrite submissions, and sign false names to them whenever it strikes his fancy to do so.