





     


                 UUNNIIXX FFoorr BBeeggiinnnneerrss ---- SSeeccoonndd EEddiittiioonn


                          _B_r_i_a_n _W_. _K_e_r_n_i_g_h_a_n

                 _(_U_p_d_a_t_e_d _f_o_r _4_._3_B_S_D _b_y _M_a_r_k _S_e_i_d_e_n_)


                               _A_B_S_T_R_A_C_T

            This  paper  is  meant  to  help  new  users get
          started  on  the  UNIX(R)  operating  system.   It
          includes:

           o Basics needed for day-to-day use of the system
             -- typing commands, correcting typing mistakes,
             logging in and out, mail, inter-terminal
             communication, the file system, printing files,
             redirecting I/O, pipes, and the shell.

           o Document preparation -- a brief discussion of
             the major formatting programs and macro
             packages, hints on preparing documents, and
             capsule descriptions of some supporting software.

           o UNIX programming -- using the editor,
             programming the shell, programming in C, other
             languages and tools.

           o An annotated UNIX bibliography.


     IINNTTRROODDUUCCTTIIOONN

       From the user's point of view, the UNIX  operating  system
     is  easy  to  learn  and  use, and presents few of the usual
     impediments to getting the job done.  It is  hard,  however,
     for the beginner to know where to start, and how to make the
     best use of the facilities available.  The purpose  of  this
     introduction is to help new users get used to the main ideas
     of the UNIX system and start  making  effective  use  of  it
     quickly.

       You  should  have a couple of other documents with you for
     easy reference as you read this one.  The most important  is
     _T_h_e  _U_N_I_X _P_r_o_g_r_a_m_m_e_r_'_s _M_a_n_u_a_l; it's often easier to tell you
     to read about something in the manual  than  to  repeat  its









     USD:1-2                                   UNIX For Beginners


     contents here.  The other useful document is A Tutorial
     Introduction to the UNIX Text Editor, which will tell you
     how to use the editor to get text -- programs, data,
     documents -- into the computer.

       A word of warning: the UNIX system has become quite  popu-
     lar, and there are several major variants in widespread use.
     Of course details also change with time.   So  although  the
     basic  structure  of UNIX and how to use it is common to all
     versions, there will certainly be a  few  things  which  are
     different  on  your  system from what is described here.  We
     have tried to minimize the problem, but be aware of it.   In
     cases of doubt, this paper describes Version 7 UNIX.

       This paper has five sections:

       1.
       Getting  Started:  How  to log in, how to type, what to do
       about mistakes in typing, how to log out.  Some of this is
       dependent on which system you log into (phone numbers, for
       example) and what terminal you use, so this  section  must
       necessarily be supplemented by local information.

       2.
       Day-to-day  Use: Things you need every day to use the sys-
       tem effectively: generally useful commands; the file  sys-
       tem.

       3.
       Document  Preparation: Preparing manuscripts is one of the
       most common uses for UNIX systems.  This section  contains
       advice,  but not extensive instructions on any of the for-
       matting tools.

       4.
       Writing Programs: UNIX is an excellent system for develop-
       ing programs.  This section talks about some of the tools,
       but again is not a tutorial in any of the programming lan-
       guages provided by the system.

       5.
       A  UNIX  Reading List.  An annotated bibliography of docu-
       ments that new users should be aware of.

     II..  GGEETTTTIINNGG SSTTAARRTTEEDD

     LLooggggiinngg IInn

       You must have a UNIX login name, which you  can  get  from
     whoever  administers your system.  You also need to know the
     phone number, unless your system uses permanently  connected
     terminals.   The  UNIX  system  is capable of dealing with a
     wide variety of terminals: Terminet 300's; Execuport, TI and
     similar  portables;  video  (CRT) terminals like the HP2640,
     etc.; high-priced  graphics  terminals  like  the  Tektronix
     4014;  plotting  terminals like those from GSI and DASI; and
     even the venerable Teletype in its various forms.  But note:
     UNIX  is  strongly oriented towards devices with lower case.
     If your terminal produces only upper case  (e.g.,  model  33
     Teletype,  some  video and portable terminals), life will be
     so difficult that you should look for another terminal.

       Be sure to set the switches appropriately on your  device.
     Switches  that  might need to be adjusted include the speed,
     upper/lower case mode, full duplex,  even  parity,  and  any
     others  that  local  wisdom advises.  Establish a connection
     using whatever magic is needed for your terminal;  this  may
     involve  dialing  a  telephone  call  or  merely  flipping a
     switch.  In either case, UNIX should type ``login:'' at you.
     If  it  types  garbage, you may be at the wrong speed; check
     the switches.  If that fails, push the ``break'' or ``inter-
     rupt''  key a few times, slowly.  If that fails to produce a
     login message, consult a guru.

       When you get a login: message, type your login name in
     lower case.  Follow it by a RETURN; the system will not do
     anything  until  you  type  a  RETURN.   If  a  password  is
     required, you will be asked for it, and (if possible) print-
     ing will be turned off while  you  type  it.   Don't  forget
     RETURN.

       The  culmination of your login efforts is a ``prompt char-
     acter,'' a single character that indicates that  the  system
     is  ready to accept commands from you.  The prompt character
     is usually a dollar sign $ or a percent sign %.  (You may
     also get a message of the day just before the prompt charac-
     ter, or a notification that you have mail.)

     TTyyppiinngg CCoommmmaannddss

       Once you've seen the prompt character, you can  type  com-
     mands, which are requests that the system do something.  Try
     typing

         ddaattee

     followed by RETURN.  You should get back something like

         MMoonn JJaann 1166 1144::1177::1100 EESSTT 11997788

     Don't forget the RETURN after the command, or  nothing  will
     happen.   If  you think you're being ignored, type a RETURN;
     something should happen.  RETURN won't be  mentioned  again,
     but don't forget it -- it has to be there at the end of each
     line.

       Another command you might try is who, which tells you
     everyone who is currently logged in:

         who

     gives something like

         mmbb   ttttyy0011JJaann 1166    0099::1111
         sskkii  ttttyy0055JJaann 1166    0099::3333
         ggaamm  ttttyy1111JJaann 1166    1133::0077

     The  time  is when the user logged in; ``ttyxx'' is the sys-
     tem's idea of what terminal the user is on.

       If you make a mistake typing the command name,  and  refer
     to  a  non-existent command, you will be told.  For example,
     if you type

         wwhhoomm

     you will be told

         wwhhoomm:: nnoott ffoouunndd

     Of course, if you inadvertently type the name of some  other
     command,  it will run, with more or less mysterious results.

     SSttrraannggee TTeerrmmiinnaall BBeehhaavviioorr

       Sometimes you can get into a  state  where  your  terminal
     acts  strangely.   For  example,  each  letter  may be typed
     twice, or the RETURN may not cause a line feed or  a  return
     to  the  left margin.  You can often fix this by logging out
     and  logging back in.[+]  Or you can read the description of
     the command ssttttyy in section 1 of the manual.  To get  intel-
     ligent  treatment  of tab characters (which are much used in
     UNIX) if your terminal doesn't have tabs, type the command

         ssttttyy --ttaabbss

     and the system will convert each tab into the  right  number
     of blanks for you.

     MMiissttaakkeess iinn TTyyppiinngg

       If you make a typing mistake, and see it before RETURN has
     been typed, there are two ways to recover.  The  sharp-char-
     acter  ## erases the last character typed; in fact successive
     uses of ## erase characters back to the beginning of the line
     (but  not beyond).  So if you type badly, you can correct as
     you go:

         dd#atte##e

     is the same as date.[++]

     -----------
     [+] In Berkeley Unix, the command "reset<control-j>"
     will often reset a terminal apparently in a
     strange state because a fullscreen editor crashed.

       The at-sign @@ erases all of the characters typed so far on
     the  current  input  line,  so  if the line is irretrievably
     fouled up, type an @@ and start the line over.

       What if you must enter a sharp or at-sign as part  of  the
     text?   If  you  precede  either ## or @@ by a backslash \\, it
     loses its erase meaning.  So to enter a sharp or at-sign  in
     something,  type  \\##  or  \\@@.  The system will always echo a
     newline at you after your at-sign, even  if  preceded  by  a
     backslash.  Don't worry -- the at-sign has been recorded.

       To  erase  a backslash, you have to type two sharps or two
     at-signs, as in \\####.  The backslash is used  extensively  in
     UNIX to indicate that the following character is in some way
     special.
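
       The erase and kill rules described above can be sketched as
     a small shell function.  This is only an illustration of the
     rules as stated here, not the terminal driver itself: the
     name apply_erase is hypothetical, and the sketch assumes that
     # is the erase character, @ the kill character, and \ the
     escape.

```shell
#!/bin/sh
# apply_erase LINE   (hypothetical helper, for illustration only)
# Prints what the system would receive after applying the erase (#),
# kill (@) and backslash rules to LINE as typed.
apply_erase() {
    input=$1
    out=
    while [ -n "$input" ]; do
        rest=${input#?}          # everything after the first character
        ch=${input%"$rest"}      # the first character itself
        input=$rest
        case $ch in
        '#')
            case $out in
            *\\) out=${out%?}'#' ;;   # \# : a literal sharp replaces the backslash
            *)   out=${out%?} ;;      # erase the last character, if there is one
            esac ;;
        '@')
            case $out in
            *\\) out=${out%?}'@' ;;   # \@ : a literal at-sign replaces the backslash
            *)   out= ;;              # kill everything typed so far on the line
            esac ;;
        *)
            out=$out$ch ;;
        esac
    done
    printf '%s\n' "$out"
}

apply_erase 'dd#atte##e'    # prints: date
apply_erase 'whom@who'      # prints: who
apply_erase 'a\#b'          # prints: a#b
```

     Note that \## comes out empty: the first sharp turns the
     backslash into a literal sharp, and the second erases it,
     which matches the rule for erasing a backslash.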

     RReeaadd--aahheeaadd

       UNIX has full read-ahead, which means that you can type as
     fast  as you want, whenever you want, even when some command
     is typing at you.  If you type  during  output,  your  input
     characters  will  appear  intermixed with the output charac-
     ters, but they will be stored away and  interpreted  in  the
     correct  order.   So you can type several commands one after
     another without waiting for the  first  to  finish  or  even
     begin.

     SSttooppppiinngg aa PPrrooggrraamm

       You can stop most programs by typing the character ``DEL''
     (perhaps called ``delete'' or ``rubout'' on your  terminal).
     The  ``interrupt''  or ``break'' key found on most terminals
     can also be used.[+]  In a few programs, like the text  edi-
     tor,  DEL stops whatever the program is doing but leaves you
     in that program.  Hanging up the phone will stop  most  pro-
     grams.[++]




     -----------
     [++] Many installations set  the  erase  character
     for  display  terminals to the delete or backspace
     key. "stty all" tells you what it actually is.
     [+] In Berkeley Unix, "control-c" is the usual way
     to  stop  programs. "stty all" tells you the value
     of your "intr" key.
     [++]  In  most  modern shells, programs running in
     the background continue running even if  you  hang
     up.

     Logging Out

       The  easiest  way to log out is to hang up the phone.  You
     can also type

         llooggiinn

     and  let  someone else use the terminal you were on.*  It is
     usually not sufficient just to turn off the terminal.   Most
     UNIX  systems  do not use a time-out mechanism, so you'll be
     there forever unless you hang up.

     MMaaiill

       When you log in, you may sometimes get the message

         YYoouu hhaavvee mmaaiill..

     UNIX provides a postal system so you  can  communicate  with
     other users of the system.  To read your mail, type the com-
     mand

         mmaaiill

     The headers of your mail will be printed, in the order of
     their receipt.  A message can be read with the print
     command, or specified directly by number.  Other commands are
     described in the manual.  (Earlier versions of mail do not
     process one message at a time, but are otherwise similar.)

       How do you send mail to someone else?  Suppose it is to go
     to  ``joe'' (assuming ``joe'' is someone's login name).  The
     easiest way is this:

         mmaaiill jjooee
         _n_o_w _t_y_p_e _i_n _t_h_e _t_e_x_t _o_f _t_h_e _l_e_t_t_e_r
         _o_n _a_s _m_a_n_y _l_i_n_e_s _a_s _y_o_u _l_i_k_e _._._.
         _A_f_t_e_r _t_h_e _l_a_s_t _l_i_n_e _o_f _t_h_e _l_e_t_t_e_r
         _t_y_p_e _t_h_e _c_h_a_r_a_c_t_e_r _`_`_._'_'_,
         _a_l_o_n_e _o_n _t_h_e _l_a_s_t _l_i_n_e_,
         _l_i_k_e _s_o_:
         _.

     And that's it.

       For practice, send  mail  to  yourself.   (This  isn't  as
     strange  as  it  might  sound  -- mail to oneself is a handy
     reminder mechanism.)

       There are other ways to send mail -- you can send a
     previously prepared letter, and you can mail to a number of
     people all at once.  For more details, see mail(1).  (The
     notation mail(1) means the command mail in section 1 of the
     UNIX Programmer's Manual.)

     -----------
     * "control-d" and "logout" are other alternatives.

     Writing to other users[+]

       At some point, out of the blue will come a message like

         MMeessssaaggee ffrroomm jjooee ttttyy0077......

     accompanied by a startling beep.  It means that Joe wants to
     talk  to  you, but unless you take explicit action you won't
     be able to talk back.  To respond, type the command

         wwrriittee jjooee

     This establishes a two-way communication path.  Now whatever
     Joe  types  on  his  terminal  will appear on yours and vice
     versa.  The path is slow, rather like talking to  the  moon.
     (If you are in the middle of something, you have to get to a
     state where you can type a command.  Normally, whatever pro-
     gram  you are running has to terminate or be terminated.  If
     you're editing, you can escape temporarily from  the  editor
     -- read the editor tutorial.)

       A  protocol  is  needed to keep what you type from getting
     garbled up with what Joe types.  Typically it's like this:

         Joe types write smith and waits.
         Smith types write joe and waits.
         Joe now types his message (as many lines as he likes).
         When he's ready for a reply, he signals it by typing
         (o), which stands for ``over''.
         Now Smith types a reply, also terminated by (o).
         This cycle repeats until someone gets tired; he then
         signals his intent to quit with (oo), for ``over and
         out''.
         To terminate the conversation, each side must type a
         ``control-d'' character alone on a line.  When the other
         person types his ``control-d'', you will get the message
         EOF on your terminal.



       If  you  write  to  someone  who  isn't  logged in, or who
     doesn't want to be disturbed, you'll be told.  If the target
     is  logged  in  but  doesn't answer after a decent interval,
     simply type ``control-d''.

     -----------
     [+] Although "write" works on Berkeley UNIX, there
     is  a  much  nicer way of communicating using dis-
     play-terminals -- "talk" splits  the  screen  into
     two sections, and both of you can type
     simultaneously (see talk(1)).

     On-line Manual

       The _U_N_I_X _P_r_o_g_r_a_m_m_e_r_'_s _M_a_n_u_a_l is  typically  kept  on-line.
     If  you  get stuck on something, and can't find an expert to
     assist you, you can print on your terminal some manual  sec-
     tion  that  might help.  This is also useful for getting the
     most up-to-date information on a command.  To print a manual
     section,  type ``man command-name''.  Thus to read up on the
     wwhhoo command, type

         mmaann wwhhoo

     and, of course,

         mmaann mmaann

     tells all about the mmaann command.

     CCoommppuutteerr AAiiddeedd IInnssttrruuccttiioonn

       Your UNIX system  may  have  available  a  program  called
     lleeaarrnn, which provides computer aided instruction on the file
     system and basic commands, the editor, document preparation,
     and even C programming.  Try typing the command

         lleeaarrnn

     If  lleeaarrnn exists on your system, it will tell you what to do
     from there.

     IIII..  DDAAYY--TTOO--DDAAYY UUSSEE

     CCrreeaattiinngg FFiilleess ---- TThhee EEddiittoorr

       If you have to type a paper or a letter or a program, how
     do you get the information stored in the machine?  Most of
     these tasks are done with the UNIX ``text editor'' ed.
     Since ed is thoroughly documented in ed(1) and explained in
     A Tutorial Introduction to the UNIX Text Editor, we won't
     spend any time here describing how to use it.  All we want
     it for right now is to make some files.  (A file is just a
     collection of information stored in the machine, a
     simplistic but adequate definition.)

       To create a file called junk with some text in it, do the
     following:

         ed junk   (invokes the text editor)
         a         (command to ``ed'', to add text)
         now type in
         whatever text you want ...
         .         (signals the end of adding text)

     The ``.'' that signals the end of adding text must be at the
     beginning of a line by itself.  Don't forget it, for until
     it is typed, no other ed commands will be recognized --
     everything you type will be treated as text to be added.

       At this point you can do various editing operations on the
     text  you  typed  in,  such as correcting spelling mistakes,
     rearranging paragraphs and  the  like.   Finally,  you  must
     write  the  information  you have typed into a file with the
     editor command ww:

         ww

     eedd will respond with the number of characters it wrote  into
     the file jjuunnkk.

       Until  the ww command, nothing is stored permanently, so if
     you  hang  up  and  go home the information is lost.[+]  But
     after ww the information is there permanently;  you  can  re-
     access it any time by typing

         eedd jjuunnkk

     Type  a  qq  command to quit the editor.  (If you try to quit
     without writing, eedd will print a ?? to remind you.  A  second
     qq gets you out regardless.)

       Now  create  a second file called tteemmpp in the same manner.
     You should now have two files, jjuunnkk and tteemmpp.
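
       The same editor commands can also be handed to ed from a
     script rather than typed at the terminal.  The sketch below
     is one way to do that, under some assumptions that are not
     part of this section: it feeds ed through a shell
     ``here-document,'' it uses mktemp to work in a scratch
     directory, and the function name make_file is hypothetical.
     If ed is not installed, or does not produce the file, the
     sketch falls back to printf so that the two files still
     exist.

```shell
#!/bin/sh
# Work in a scratch directory so none of your own files are touched.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

# make_file NAME TEXT   (hypothetical helper)
# Creates NAME containing the single line TEXT.
make_file() {
    if command -v ed >/dev/null 2>&1; then
        # a = add text, . = end of added text, w NAME = write, q = quit.
        # ed prints the number of characters it wrote.
        ed <<EOF
a
$2
.
w $1
q
EOF
    fi
    # Fallback (an assumption of this sketch): make the file directly
    # if ed was unavailable or did not write it.
    [ -f "$1" ] || printf '%s\n' "$2" > "$1"
}

make_file junk "this is junk"
make_file temp "this is temp"
ls
```

     The ``.'' line and the w command play exactly the roles
     described above; only the source of the typing has changed.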

     WWhhaatt ffiilleess aarree oouutt tthheerree??

       The llss (for ``list'') command lists the  names  (not  con-
     tents)  of  any  of the files that UNIX knows about.  If you
     type

         llss

     the response will be

         jjuunnkk
         tteemmpp

     which are indeed the two files just created.  The names  are
     sorted  into  alphabetical  order  automatically,  but other
     variations are possible.  For example, the command

         llss --tt

     causes the files to be listed in the order in which they
     were last changed, most recent first.  The -l option gives a
     ``long'' listing:

         ls -l

     will produce something like

         -rw-rw-rw-  1 bwk  users 41 Jul 22 2:56 junk
         -rw-rw-rw-  1 bwk  users 78 Jul 22 2:57 temp

     The date and time are of the last change to the file.  The
     41 and 78 are the number of characters (which should agree
     with the numbers you got from ed).  bwk is the owner of the
     file, that is, the person who created it.  users is the name
     of the file's group.  The -rw-rw-rw- tells who has
     permission to read and write the file, in this case everyone.

     -----------
     [+] This is not strictly true -- if you hang up
     while editing, the data you were working on is
     saved in a file called ed.hup, which you can
     continue with at your next session.

       Options can be combined: ls -lt gives the same thing as
     ls -l, but sorted into time order.  You can also name the
     files you're interested in, and ls will list the information
     about them only.  More details can be found in ls(1).

       The use of optional arguments that begin with a minus
     sign, like -t and -lt, is a common convention for UNIX
     programs.  In general, if a program accepts such optional
     arguments, they precede any filename arguments.  It is also
     vital that you separate the various arguments with spaces:
     ls-l is not the same as ls -l.

     PPrriinnttiinngg FFiilleess

       Now that you've got a file of text, how do you print it so
     people can look at it?  There are a host of programs that do
     that, probably more than are needed.

       One  simple  thing is to use the editor, since printing is
     often done just before making changes anyway.  You can say

         eedd jjuunnkk
         11,,$$pp

     eedd will reply with the count of the characters in  jjuunnkk  and
     then  print  all the lines in the file.  After you learn how
     to use the editor, you can be selective about the parts  you
     print.

       There  are  times when it's not feasible to use the editor
     for printing.  For example, there is a limit on  how  big  a
     file ed can handle (several thousand lines).  Secondly, it
     will only print one file at a time, and sometimes  you  want
     to  print  several, one after another.  So here are a couple
     of alternatives.

       First is cat, the simplest of all the printing programs.
     cat simply prints on the terminal the contents of all the
     files named in a list.  Thus

         cat junk

     prints one file, and

         cat junk temp

     prints two.  The files are simply concatenated (hence the
     name ``cat'') onto the terminal.

       pprr produces formatted printouts of files.  As with ccaatt, pprr
     prints all the files named in a  list.   The  difference  is
     that  it  produces headings with date, time, page number and
     file name at the top of each page, and extra lines  to  skip
     over the fold in the paper.  Thus,

         pprr jjuunnkk tteemmpp

     will  print  jjuunnkk neatly, then skip to the top of a new page
     and print tteemmpp neatly.

       pprr can also produce multi-column output:

         pprr --33 jjuunnkk

     prints jjuunnkk in 3-column format.  You can use any  reasonable
     number  in  place  of ``3'' and pprr will do its best.  pprr has
     other capabilities as well; see pprr(1).

       It should be noted that pprr is _n_o_t a formatting program  in
     the  sense of shuffling lines around and justifying margins.
     The true formatters are nnrrooffff and ttrrooffff, which we  will  get
     to in the section on document preparation.

       There  are  also programs that print files on a high-speed
     printer.  Look in your manual under llpprr.
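
       To see the difference between cat and pr for yourself,
     something like the following sketch can be tried.  The
     scratch directory and the file contents are assumptions made
     for the example; the pr step is skipped if pr is not
     installed, and only the first few lines of its output are
     shown because pr pads every page to full length.

```shell
#!/bin/sh
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
printf 'first file\n'  > junk
printf 'second file\n' > temp

# cat: the contents of junk followed directly by the contents of temp.
both=$(cat junk temp)
echo "$both"

# pr: the same text, but with a heading at the top of each page.
if command -v pr >/dev/null 2>&1; then
    pr junk temp | sed -n '1,5p'
fi
```

     The heading that pr prints carries the date, the file name
     and the page number, as described above; cat adds nothing at
     all.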

     SShhuufffflliinngg FFiilleess AAbboouutt

       Now that you have some files in the file system  and  some
     experience in printing them, you can try bigger things.  For
     example, you can move a  file  from  one  place  to  another
     (which amounts to giving it a new name), like this:

         mmvv jjuunnkk pprreecciioouuss

     This  means  that  what  used  to  be ``junk'' is now ``pre-
     cious''.  If you do an llss command now, you will get












     USD:1-12                                  UNIX For Beginners


         pprreecciioouuss
         tteemmpp

     Beware that if you move a file to another one  that  already
     exists, the already existing contents are lost forever.

       If you want to make a copy of a file (that is, to have two
     versions of something), you can use the cp command:

         cp precious temp1

     makes a duplicate copy of precious in temp1.

       Finally, when you get tired of creating and moving files,
     there is a command to remove files from the file system,
     called rm.

         rm temp temp1

     will remove both of the files named.

       You will get a warning message if one of the named files
     wasn't there, but otherwise rm, like most UNIX commands,
     does its work silently.  There is no prompting  or  chatter,
     and error messages are occasionally curt.  This terseness is
     sometimes disconcerting to newcomers, but experienced  users
     find it desirable.
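
       The whole sequence of mv, cp and rm can be replayed safely
     in a scratch directory.  The sketch below assumes mktemp is
     available and invents the file contents; the variable names
     are only there so that the listings can be inspected
     afterwards.

```shell
#!/bin/sh
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
echo "some text" > junk
echo "more text" > temp

mv junk precious       # what used to be junk is now precious
after_mv=$(ls)         # precious, temp

cp precious temp1      # temp1 is a duplicate copy of precious
after_cp=$(ls)         # precious, temp, temp1

rm temp temp1          # removes both; says nothing when it succeeds
after_rm=$(ls)         # precious

# temp is already gone, so this rm complains (the warning is hidden
# here) and reports failure.
if rm temp 2>/dev/null; then
    rm_result=removed
else
    rm_result=failed
fi

echo "$after_mv"
echo "$after_cp"
echo "$after_rm"
echo "$rm_result"
```

     Nothing is printed by mv, cp or rm themselves when all goes
     well, which is the terseness described above.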

     WWhhaatt''ss iinn aa FFiilleennaammee

       So far we have used filenames without ever saying what's a
     legal name, so it's time for  a  couple  of  rules.   First,
     filenames  are  limited to 14 characters, which is enough to
     be  descriptive.[+]  Second, although you can use almost any
     character in a filename, common sense says you should  stick
     to ones that are visible, and that you should probably avoid
     characters that might be used with other meanings.  We  have
     already seen, for example, that in the ls command, ls -t
     means to list in time order.  So if you had a file whose
     name was -t, you would have a tough time listing it by name.
     Besides the minus sign, there  are  other  characters  which
     have  special meaning.  To avoid pitfalls, you would do well
     to use only letters, numbers and  the  period  until  you're
     familiar with the situation.

       On to some more positive suggestions.  Suppose you're typ-
     ing a large document like a book.   Logically  this  divides
     into  many small pieces, like chapters and perhaps sections.
     Physically it must be divided too, for ed will not handle
     really big files.  Thus you should type the document as a
     number of files.  You might have a separate file for each
     chapter, called

         chap1
         chap2
         etc...

     Or, if each chapter were broken into several files, you
     might have

         chap1.1
         chap1.2
         chap1.3
         ...
         chap2.1
         chap2.2
         ...

     You can now tell at a glance where a particular file fits
     into the whole.

     -----------
     [+] In 4.2 BSD the limit was extended to 255
     characters.

       There  are  advantages  to  a systematic naming convention
     which are not obvious to the novice UNIX user.  What if  you
     wanted to print the whole book?  You could say

         pprr cchhaapp11..11 cchhaapp11..22 cchhaapp11..33 ............

     but you would get tired pretty fast, and would probably even
     make mistakes.  Fortunately, there is a shortcut.   You  can
     say

         pprr cchhaapp**

     The  **  means  ``anything  at all,'' so this translates into
     ``print all files whose names begin with cchhaapp'',  listed  in
     alphabetical order.

       This  shorthand  notation is not a property of the pprr com-
     mand, by the way.  It is system-wide, a service of the  pro-
     gram that interprets commands (the ``shell,'' sshh(1)).  Using
     that fact, you can see how to list the names of the files in
     the book:

         llss cchhaapp**

     produces

         cchhaapp11..11
         cchhaapp11..22
         cchhaapp11..33
         ......

     The * is not limited to the last position in a filename --
     it can be anywhere and can occur several times.  Thus

         rm *junk* *temp*

     removes all files that contain junk or temp as any part of
     their name.  As a special case, * by itself matches every
     filename, so

         pr *

     prints all your files (alphabetical order), and

         rm *

     removes all files.  (You had better be very sure that's what
     you wanted to say!)

       The * is not the only pattern-matching feature available.
     Suppose you want to print only chapters 1 through 4 and 9.
     Then you can say

         pr chap[12349]*

     The [...] means to match any of the characters inside the
     brackets.  A range of consecutive letters or digits can be
     abbreviated, so you can also do this with

         pr chap[1-49]*

     Letters can also be used within brackets: [a-z] matches any
     character in the range a through z.

       The ? pattern matches any single character, so

         ls ?

     lists all files which have single-character names, and

         ls -l chap?.1

     lists information about the first file of each chapter
     (chap1.1, chap2.1, etc.).

       Of these niceties, * is certainly the most useful, and you
     should get used to it.  The others are frills, but worth
     knowing.

       If you should ever have to turn off the special meaning of
     *, ?, etc., enclose the entire argument in single quotes, as
     in

         ls '?'

     We'll see some more examples of this shortly.

     WWhhaatt''ss iinn aa FFiilleennaammee,, CCoonnttiinnuueedd

       When  you  first  made  that file called jjuunnkk, how did the
     system know that there wasn't another jjuunnkk  somewhere  else,
     especially since the person in the next office is also read-
     ing this tutorial?  The answer is that generally  each  user
     has  a private _d_i_r_e_c_t_o_r_y, which contains only the files that
     belong to him.  When you log in, you are ``in'' your  direc-
     tory.  Unless you take special action, when you create a new
     file, it is made in the directory that you are currently in;
     this  is most often your own directory, and thus the file is
     unrelated to any other file of  the  same  name  that  might
     exist in someone else's directory.

       The  set  of  all  files is organized into a (usually big)
     tree, with your files  located  several  branches  into  the
     tree.   It is possible for you to ``walk'' around this tree,
     and to find any file in the system, by starting at the  root
     of  the  tree  and walking along the proper set of branches.
     Conversely, you can start where you are and walk toward  the
     root.

       Let's  try  the latter first.  The basic tool is the com-
     mand ppwwdd (``print working  directory''),  which  prints  the
     name of the directory you are currently in.

       Although the details will vary according to the system you
     are on, if you give the command ppwwdd, it will print something
     like

         //uussrr//yyoouurr--nnaammee

     This says that you are currently in the directory yyoouurr--nnaammee,
     which is in turn in the directory //uussrr, which is in turn  in
     the  root  directory  called by convention just //.  (Even if
     it's not called //uussrr on your system, you will get  something
     analogous.   Make  the  corresponding  mental adjustment and
     read on.)

       If you now type

         llss //uussrr//yyoouurr--nnaammee

     you should get exactly the same list of file  names  as  you
     get  from  a  plain llss: with no arguments, llss lists the con-
     tents of the current directory; given the name of  a  direc-
     tory, it lists the contents of that directory.

       Next, try

         llss //uussrr

     This  should  print  a  long series of names, among which is
     your own login name yyoouurr--nnaammee.  On many systems,  uussrr  is  a
     directory  that  contains  the directories of all the normal
     users of the system, like you.

       The next step is to try

         llss //

     You should get a  response  something  like  this  (although
     again the details may be different):

         bbiinn
         ddeevv
         eettcc
         lliibb
         ttmmpp
         uussrr

     This  is a collection of the basic directories of files that
     the system knows about; we are at the root of the tree.

       Now try

         ccaatt //uussrr//yyoouurr--nnaammee//jjuunnkk

     (if jjuunnkk is still around in your directory).  The name

         //uussrr//yyoouurr--nnaammee//jjuunnkk

     is called the ppaatthhnnaammee of the file that you  normally  think
     of  as  ``junk''.   ``Pathname''  has an obvious meaning: it
     represents the full name of the path you have to follow from
     the root through the tree of directories to get to a partic-
     ular file.  It is a universal rule in the UNIX  system  that
     anywhere  you  can  use  an ordinary filename, you can use a
     pathname.
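
       A small experiment shows the rule at work.  This is a
     sketch, not a prescription: it assumes mktemp -d is available
     for a scratch directory, and the contents of junk are
     invented.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
echo "some text" >junk

here=$(pwd)                  # pathname of the current directory
short=$(cat junk)            # the file by its ordinary name
full=$(cat "$here/junk")     # the same file by its pathname
plain=$(ls)                  # contents of the current directory
named=$(ls "$here")          # the same directory, named explicitly

echo "$here/junk contains: $full"

cd / && rm -rf "$dir"
```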

       Here is a picture which may make this clearer:

                               (root)
                                  |
              +---------+---------+---------+---------+
              |         |         |         |         |
             bin       etc       usr       dev       tmp
                                  |
                      +-----------+-----------+
                      |           |           |
                     adam        eve         mary
                                  |           |
                             +----+----+      |
                             |         |      |
                            junk      temp   junk


     Notice that Mary's jjuunnkk is unrelated to Eve's.

       This isn't too exciting if all the files of  interest  are
     in  your own directory, but if you work with someone else or
     on several projects concurrently, it becomes  handy  indeed.
     For example, your friends can print your book by saying

         pprr //uussrr//yyoouurr--nnaammee//cchhaapp**

     Similarly,  you can find out what files your neighbor has by
     saying

         llss //uussrr//nneeiigghhbboorr--nnaammee

     or make your own copy of one of his files by

         ccpp //uussrr//yyoouurr--nneeiigghhbboorr//hhiiss--ffiillee yyoouurrffiillee


       If your neighbor doesn't want you  poking  around  in  his
     files,  or  vice  versa, privacy can be arranged.  Each file
     and directory has  read-write-execute  permissions  for  the
     owner,  a group, and everyone else, which can be set to con-
     trol access.  See llss(1) and cchhmmoodd(1) for details.  As a mat-
     ter of observed fact, most users most of the time find open-
     ness of more benefit than privacy.
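
       As a hedged illustration of arranging privacy, the sketch
     below takes away all access for the group and everyone else
     on a single file.  The file name diary is hypothetical, and
     the symbolic mode go-rwx is only one of the ways to say this
     to chmod; see chmod(1) for the forms your system accepts.

```shell
dir=$(mktemp -d)
echo "private notes" >"$dir/diary"

chmod 644 "$dir/diary"                       # known start: rw-r--r--
before=$(ls -l "$dir/diary" | cut -c1-10)    # permission field from ls -l

chmod go-rwx "$dir/diary"                    # group and others lose all access
after=$(ls -l "$dir/diary" | cut -c1-10)

echo "$before became $after"

rm -rf "$dir"
```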

       As a final experiment with pathnames, try

         llss //bbiinn //uussrr//bbiinn

     Do some of the names look familiar?  When you run a program,
     by  typing  its  name after the prompt character, the system
     simply looks for a file of that  name.   It  normally  looks
     first  in  your  directory  (where it typically doesn't find
     it), then in //bbiinn and finally in //uussrr//bbiinn.  There is nothing
     magic  about  commands like ccaatt or llss, except that they have
     been collected into a couple of places to be  easy  to  find
     and administer.
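
       On many current systems the list of directories to be
     searched is kept in a shell variable named PATH, and the
     shell's command built-in reports which file would be run for
     a given name.  Both are assumptions about your shell rather
     than something described above, and the directories you see
     will probably differ from //bbiinn and //uussrr//bbiinn.

```shell
where=$(command -v ls)       # pathname of the file that runs for ls
echo "ls is the file $where"
echo "directories searched, in order: $PATH"
```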

       What  if  you  work  regularly with someone else on common
     information in his directory?  You could just log in as your
     friend  each time you want to, but you can also say ``I want
     to work on his files instead of my own''.  This is  done  by
     changing the directory that you are currently in:

         ccdd //uussrr//yyoouurr--ffrriieenndd

     (On  some systems, ccdd is spelled cchhddiirr.)  Now when you use a
     filename in something like ccaatt or pprr, it refers to the  file
     in  your  friend's  directory.  Changing directories doesn't
     affect any permissions associated with  a  file  --  if  you
     couldn't  access a file from your own directory, changing to
     another directory won't alter that fact.  Of course, if  you
     forget what directory you're in, type

         ppwwdd

     to find out.

       It is usually convenient to arrange your own files so that
     all the files related to one thing are in a directory  sepa-
     rate  from other projects.  For example, when you write your
     book, you might want to keep all the  text  in  a  directory
     called bbooookk.  So make one with

         mmkkddiirr bbooookk

     then go to it with

         ccdd bbooookk

     then  start typing chapters.  The book is now found in (pre-
     sumably)

         //uussrr//yyoouurr--nnaammee//bbooookk

     To remove the directory bbooookk, type

         rrmm bbooookk//**
         rrmmddiirr bbooookk

     The first command removes all files from the directory;  the
     second removes the empty directory.

       You can go up one level in the tree of files by saying

         ccdd ....

     _`_`....''  is  the name of the parent of whatever directory you
     are currently in.  For completeness, _`_`..'' is  an  alternate
     name for the directory you are in.
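
       The whole sequence can be tried safely in a scratch
     directory.  This is a sketch under assumptions: mktemp -d
     and basename are taken to be available, and the chapter text
     is invented.

```shell
top=$(mktemp -d)
cd "$top" || exit 1

mkdir book                   # make the directory
cd book || exit 1            # go to it
echo "It was a dark and stormy night." >chap1.1
inside=$(basename "$(pwd)")  # last component of the current directory

cd ..                        # up one level, back to the parent
parent=$(pwd)
cd .                         # . is the current directory: no change
same=$(pwd)

rm book/*                    # remove the files first
rmdir book                   # then the empty directory
if [ -d book ]; then gone=no; else gone=yes; fi

cd / && rm -rf "$top"
```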

     UUssiinngg FFiilleess iinnsstteeaadd ooff tthhee TTeerrmmiinnaall

       Most of the commands we have seen so far produce output on
     the terminal; some, like the editor, also take  their  input
     from the terminal.  It is universal in UNIX systems that the
     terminal can be replaced by a file for  either  or  both  of
     input and output.  As one example,

         llss

     makes a list of files on your terminal.  But if you say

         llss >>ffiilleelliisstt

     a  list  of  your  files will be placed in the file ffiilleelliisstt
     (which will be created if it doesn't already exist, or over-
     written if it does).  The symbol >> means ``put the output on
     the following file, rather than on the terminal.''   Nothing
     is  produced on the terminal.  As another example, you could
     combine several files into one by capturing  the  output  of
     ccaatt in a file:

         ccaatt ff11 ff22 ff33 >>tteemmpp


       The  symbol >>>> operates very much like >> does, except that
     it means ``add to the end of.''  That is,

         ccaatt ff11 ff22 ff33 >>>>tteemmpp

     means to concatenate ff11, ff22 and ff33 to the end of whatever is
     already  in  tteemmpp,  instead of overwriting the existing con-
     tents.  As with >>, if tteemmpp doesn't exist, it will be created
     for you.
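
       The difference between the two symbols is easiest to see
     side by side.  In this sketch the file names and their
     one-word contents are made up, and mktemp -d is assumed for
     the scratch directory.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
echo one >f1
echo two >f2

cat f1 f2 >temp        # temp is created and holds both files
cat f1 >temp           # > overwrites: temp now holds only f1
cat f2 >>temp          # >> appends: f2 is added to the end
cat f2 >>new           # >> also creates a file that does not exist

contents=$(cat temp)
created=$(cat new)
echo "$contents"

cd / && rm -rf "$dir"
```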

       In a similar way, the symbol << means to take the input for
     a program from the following file, instead of from the  ter-
     minal.   Thus,  you  could make up a script of commonly used
     editing commands and put them into  a  file  called  ssccrriipptt.
     Then you can run the script on a file by saying

         eedd ffiillee <<ssccrriipptt

     As  another  example,  you can use eedd to prepare a letter in
     file lleett, then send it to several people with

         mmaaiill aaddaamm eevvee mmaarryy jjooee <<lleett


     PPiippeess

       One of the novel contributions of the UNIX system  is  the
     idea  of a _p_i_p_e_.  A pipe is simply a way to connect the out-
     put of one program to the input of another program,  so  the
     two run as a sequence of processes -- a pipeline.

       For example,

         pprr ff gg hh

     will  print  the  files ff, gg, and hh, beginning each on a new
     page.  Suppose you want  them  run  together  instead.   You
     could say

         ccaatt ff gg hh >>tteemmpp
         pprr <<tteemmpp
         rrmm tteemmpp

     but  this is more work than necessary.  Clearly what we want
     is to take the output of ccaatt and connect it to the input  of
     pprr.  So let us use a pipe:









     USD:1-20                                  UNIX For Beginners


         ccaatt ff gg hh || pprr

     The  vertical bar || means to take the output from ccaatt, which
     would normally have gone to the terminal, and put it into pprr
     to be neatly formatted.

       There are many other examples of pipes.  For example,

         llss || pprr --33

     prints  a  list of your files in three columns.  The program
     wwcc counts the number of lines, words and characters  in  its
     input,  and  as  we  saw  earlier, wwhhoo prints a list of the
     people currently logged on, one per line.  Thus

         wwhhoo || wwcc

     tells how many people are logged on (the first number printed
     is the count of lines).  And of course

         llss || wwcc

     counts your files.

       Any program that reads from the terminal can read  from  a
     pipe  instead;  any  program that writes on the terminal can
     drive a pipe.  You can have as many elements in  a  pipeline
     as you wish.

       Many  UNIX  programs  are  written  so that they will take
     their input from one or more files  if  file  arguments  are
     given;  if  no  arguments  are given they will read from the
     terminal, and thus can be used  in  pipelines.   pprr  is  one
     example:

         pprr --33 aa bb cc

     prints files aa, bb and cc in order in three columns.  But in

         ccaatt aa bb cc || pprr --33

     pprr prints the information coming down the pipeline, still in
     three columns.
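
       To compare the temporary-file method with a pipe, here is
     a sketch that uses wc in place of pprr so the result is a
     single number.  The file contents are invented, and the -l
     option (count lines only) and mktemp -d are assumptions.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'alpha\n' >f
printf 'beta\n' >g
printf 'gamma\n' >h

# The long way: a temporary file carries the output of cat to wc.
cat f g h >temp
long=$(wc -l <temp | tr -d ' ')
rm temp

# The pipe does the same job with no temporary file.
short=$(cat f g h | wc -l | tr -d ' ')

# ls writes one name per line into a pipe, so wc -l counts the files.
files=$(ls | wc -l | tr -d ' ')

echo "$long lines either way; $files files"

cd / && rm -rf "$dir"
```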

     TThhee SShheellll

       We have already mentioned once  or  twice  the  mysterious
     ``shell,'' which is in fact sshh(1).  The shell is the program
     that interprets what you type as commands and arguments.  It
     also  looks  after  translating **, etc., into lists of file-
     names, and <<, >>, and || into  changes  of  input  and  output
     streams.

       The  shell  has  other capabilities too.  For example, you
     can run two programs with one command line by separating the
     commands  with  a  semicolon; the shell recognizes the semi-
     colon and breaks the line into two commands.  Thus

         ddaattee;; wwhhoo

     does both commands before returning with a prompt character.

       You can also have more than one program running _s_i_m_u_l_t_a_n_e_-
     _o_u_s_l_y if you wish.  For example, if you are doing  something
     time-consuming,  like  the  editor script of an earlier sec-
     tion, and you don't want to  wait  around  for  the  results
     before starting something else, you can say

         eedd ffiillee <<ssccrriipptt &&

     The ampersand at the end of a command line says ``start this
     command running, then take further commands from the  termi-
     nal  immediately,''  that is, don't wait for it to complete.
     Thus the script will begin, but you can do something else at
     the same time.  Of course, to keep the output from interfer-
     ing with what you're doing on the terminal, it would be bet-
     ter to say

         eedd ffiillee <<ssccrriipptt >>ssccrriipptt..oouutt &&

     which saves the output lines in a file called ssccrriipptt..oouutt.

       When  you  initiate  a  command with &&, the system replies
     with a number called the process  number,  which  identifies
     the  command  in case you later want to stop it.  If you do,
     you can say

         kkiillll pprroocceessss--nnuummbbeerr

     If you forget the process number, the command ppss  will  tell
     you  about everything you have running.  (If you are desper-
     ate, kkiillll 00 will kill all your processes.)   And  if  you're
     curious  about  other  people,  ppss aa will tell you about _a_l_l
     programs that are currently running.
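
       When commands are read from a file rather than typed, the
     shell does not print the process number, but most shells keep
     it in the variable $!.  That variable, the sleep command
     standing in for a long job, and the wait command are all
     assumptions not covered above; treat this as a sketch.

```shell
sleep 30 &                 # start a long-running command in the background
pid=$!                     # process number of that background command
echo "started process $pid"

kill "$pid"                # stop it without waiting for it to finish
wait "$pid" 2>/dev/null    # collect it; the status tells how it ended
status=$?
echo "process $pid ended with status $status"
```

     A status above 128 is the shell's way of saying the command
     was stopped by a signal rather than finishing on its own.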

       You can say

         ((ccoommmmaanndd--11;; ccoommmmaanndd--22;; ccoommmmaanndd--33)) &&

     to start three commands in the background, or you can  start
     a background pipeline with

         ccoommmmaanndd--11 || ccoommmmaanndd--22 &&


       Just as you can tell the editor or some similar program to
     take its input from a file instead of from the terminal, you
     can  tell  the  shell  to read a file to get commands.  (Why
     not? The shell, after all,  is  just  a  program,  albeit  a
     clever  one.)  For instance, suppose you want to set tabs on
     your terminal, and find out the date and who's on the system
     every time you log in.  Then you can put the three necessary
     commands _(ttaabbss, ddaattee,  wwhhoo)  into  a  file,  let's  call  it
     ssttaarrttuupp, and then run it with

         sshh ssttaarrttuupp

     This  says  to run the shell with the file ssttaarrttuupp as input.
     The effect is as if you had typed the contents of ssttaarrttuupp on
     the terminal.

       If  this  is  to be a regular thing, you can eliminate the
     need to type sshh: simply type, once only, the command

         cchhmmoodd ++xx ssttaarrttuupp

     and thereafter you need only say

         ssttaarrttuupp

     to run the sequence of commands.  The cchhmmoodd(1) command marks
     the  file  executable; the shell recognizes this and runs it
     as a sequence of commands.
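
       Both ways of running a command file can be tried in a
     scratch directory.  In this sketch the two commands placed in
     startup are stand-ins chosen so the output is predictable,
     and the file is run as ./startup on the assumption that your
     shell does not search the current directory by default.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

# Put two commands into a file called startup.
echo 'echo "welcome back"' >startup
echo 'ls' >>startup

by_sh=$(sh startup)          # run the shell with the file as input

chmod +x startup             # mark the file executable, once only
direct=$(./startup)          # now the file can be run by name

echo "$direct"

cd / && rm -rf "$dir"
```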

       If you want ssttaarrttuupp to run automatically  every  time  you
     log  in,  create  a  file  in  your  login  directory called
     ..pprrooffiillee, and place in it the line ssttaarrttuupp.  When the  shell
     first  gains  control  when  you  log  in,  it looks for the
     ..pprrooffiillee  file and does whatever commands it finds in it.[+]
     We'll get back to the shell in the section on programming.


     IIIIII.. DDOOCCUUMMEENNTT PPRREEPPAARRAATTIIOONN

       UNIX systems are used extensively  for  document  prepara-
     tion.   There  are  two  major formatting programs, that is,
     programs that produce a text with justified  right  margins,
     automatic page numbering and titling, automatic hyphenation,
     and the like.  nnrrooffff is designed to produce output on termi-
     nals  and  line-printers.   ttrrooffff  (pronounced ``tee-roff'')
     instead drives a phototypesetter, which produces  very  high
     quality  output  on photographic paper.  This paper was for-
     matted with ttrrooffff.

     FFoorrmmaattttiinngg PPaacckkaaggeess

       The basic idea of nnrrooffff and ttrrooffff is that the text  to  be
     formatted  contains  within  it ``formatting commands'' that
     indicate in detail how the formatted text is to  look.   For
     example, there might be commands that specify how long lines
     are, whether to use  single  or  double  spacing,  and  what
     -----------
     [+] The c shell instead reads a file called ..llooggiinn









     UNIX For Beginners                                  USD:1-23


     running titles to use on each page.

       Because  nnrrooffff  and  ttrrooffff are relatively hard to learn to
     use effectively, several ``packages'' of  canned  formatting
     requests  are  available to let you specify paragraphs, run-
     ning titles, footnotes, multi-column output, and so on, with
     little  effort  and without having to learn nnrrooffff and ttrrooffff.
     These packages take  a  modest  effort  to  learn,  but  the
     rewards  for  using  them  are so great that it is time well
     spent.

       In this section, we will  provide  a  hasty  look  at  the
     ``manuscript''  package  known  as --mmss.  Formatting requests
     typically consist of a period and  two  upper-case  letters,
     such  as  ..TTLL, which is used to introduce a title, or ..PPPP to
     begin a new paragraph.

       A document is typed so it looks something like this:

         ..TTLL
         ttiittllee ooff ddooccuummeenntt
         ..AAUU
         aauutthhoorr nnaammee
         ..SSHH
         sseeccttiioonn hheeaaddiinngg
         ..PPPP
         ppaarraaggrraapphh ......
         ..PPPP
         aannootthheerr ppaarraaggrraapphh ......
         ..SSHH
         aannootthheerr sseeccttiioonn hheeaaddiinngg
         ..PPPP
         eettcc..

     The lines that  begin  with  a  period  are  the  formatting
     requests.   For  example, ..PPPP calls for starting a new para-
     graph.  The precise meaning of ..PPPP depends  on  what  output
     device is being used (typesetter or terminal, for instance),
     and on what publication the document will  appear  in.   For
     example,  --mmss  normally assumes that a paragraph is preceded
     by a space (one line in nnrrooffff, 1/2 line in ttrrooffff),  and  the
     first  word  is indented.  These rules can be changed if you
     like, but they are changed by changing the interpretation of
     ..PPPP, not by re-typing the document.

       To  actually  produce  a document in standard format using
     --mmss, use the command

         ttrrooffff --mmss ffiilleess ......

     for the typesetter, and

         nnrrooffff --mmss ffiilleess ......










     USD:1-24                                  UNIX For Beginners


     for a terminal.  The --mmss argument tells ttrrooffff and  nnrrooffff  to
     use the manuscript package of formatting requests.

       There  are  several  similar  packages; check with a local
     expert to determine which ones are in  common  use  on  your
     machine.

     SSuuppppoorrttiinngg TToooollss

       In  addition  to  the basic formatters, there is a host of
     supporting programs that  help  with  document  preparation.
     The list in the next few paragraphs is far from complete, so
     browse through the manual and check with people  around  you
     for other possibilities.

       eeqqnn  and  nneeqqnn let you integrate mathematics into the text
     of a document, in an  easy-to-learn  language  that  closely
     resembles  the  way  you would speak it aloud.  For example,
     the eeqqnn input

         ssuumm ffrroomm ii==00 ttoo nn xx ssuubb ii ~~==~~ ppii oovveerr 22

     produces the output

          $$\sum_{i=0}^{n} x_i = \frac{\pi}{2}$$


       The program ttbbll provides an analogous service for  prepar-
     ing tabular material; it does all the computations necessary
     to  align  complicated  columns  with  elements  of  varying
     widths.

       rreeffeerr  prepares  bibliographic citations from a data base,
     in whatever style is defined by the formatting package.   It
     looks  after  all  the  details  of  numbering references in
     sequence, filling in page and volume  numbers,  getting  the
     author's initials and the journal name right, and so on.

       ssppeellll and ttyyppoo detect possible spelling mistakes in a doc-
     ument.[+]  ssppeellll works by comparing the words in your  docu-
     ment  to  a  dictionary,  printing those that are not in the
     dictionary.  It  knows  enough  about  English  spelling  to
     detect  plurals  and  the  like, so it does a very good job.
     ttyyppoo looks for  words  which  are  ``unusual'',  and  prints
     those.   Spelling mistakes tend to be more unusual, and thus
     show up early when the most unusual words are printed first.

       ggrreepp looks through a set of files for lines that contain a
     particular text pattern (rather like  the  editor's  context
     search does, but on a bunch of files).  For example,


     -----------
     [+] "typo" is not provided with Berkeley Unix.









     UNIX For Beginners                                  USD:1-25


         ggrreepp ''iinngg$$'' cchhaapp**

     will  find  all  lines  that end with the letters iinngg in the
     files cchhaapp**.  (It is almost always a good  practice  to  put
     single  quotes  around  the pattern you're searching for, in
     case it contains characters like ** or $$ that have a  special
     meaning to the shell.)  ggrreepp is often useful for finding out
     in which of a set of files the misspelled words detected  by
     ssppeellll are actually located.

       ddiiffff  prints  a list of the differences between two files,
     so you can compare two versions of  something  automatically
     (which certainly beats proofreading by hand).

       wwcc  counts  the  words,  lines  and characters in a set of
     files.  ttrr translates characters into other characters;  for
     example  it will convert upper to lower case and vice versa.
     This translates upper into lower:

         ttrr AA--ZZ aa--zz <<iinnppuutt >>oouuttppuutt


       ssoorrtt sorts files in a variety of ways; sseedd  provides  many
     of the editing facilities of eedd, but can apply them to arbi-
     trarily long inputs.  aawwkk provides the ability  to  do  both
     pattern  matching  and  numeric  computations, and to conve-
     niently process fields within lines.  These programs are for
     more  advanced  users,  and they are not limited to document
     preparation.  Put them on  your  list  of  things  to  learn
     about.
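
       To give a taste of these three, here is a sketch that runs
     each once over a small invented file of names and quantities.
     The particular sed and awk commands are merely examples of
     the kind of thing each program does, not a summary of them.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'pears 3\napples 10\nfigs 7\n' >stock

sorted=$(sort stock)                                   # lines in order
renamed=$(sed 's/figs/dates/' stock)                   # ed-style substitute
total=$(awk '{ sum += $2 } END { print sum }' stock)   # add the second field

echo "$sorted"
echo "$renamed"
echo "total: $total"

cd / && rm -rf "$dir"
```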

       Most of these programs are either independently documented
     (like eeqqnn and ttbbll), or  are  sufficiently  simple  that  the
     description  in  the  _U_N_I_X  _P_r_o_g_r_a_m_m_e_r_'_s  _M_a_n_u_a_l is adequate
     explanation.

     HHiinnttss ffoorr PPrreeppaarriinngg DDooccuummeennttss

       Most documents go through several  versions  (always  more
     than   you  expected)  before  they  are  finally  finished.
     Accordingly, you should do whatever possible to make the job
     of changing them easy.

       First,  when  you  do  the purely mechanical operations of
     typing, type so that subsequent editing will be easy.  Start
     each  sentence  on  a new line.  Make lines short, and break
     lines at natural places, such  as  after  commas  and  semi-
     colons, rather than randomly.  Since most people change doc-
     uments by rewriting phrases and adding, deleting  and  rear-
     ranging  sentences,  these  precautions simplify any editing
     you have to do later.

       Keep the individual files of a  document  down  to  modest
     size,  perhaps  ten  to fifteen thousand characters.  Larger
     files edit more slowly, and of course if  you  make  a  dumb
     mistake  it's  better  to have clobbered a small file than a
     big one.  Split into files at natural boundaries in the doc-
     ument,  for the same reasons that you start each sentence on
     a new line.

       The second aspect of making change easy is to  not  commit
     yourself to formatting details too early.  One of the advan-
     tages of formatting packages like --mmss is  that  they  permit
     you to delay decisions to the last possible moment.  Indeed,
     until a document is printed, it is not even decided  whether
     it will be typeset or put on a line printer.

       As a rule of thumb, for all but the most trivial jobs, you
     should type a document in terms of a set  of  requests  like
     ..PPPP, and then define them appropriately, either by using one
     of the canned packages (the better way) or by defining  your
     own  nnrrooffff  and ttrrooffff commands.  As long as you have entered
     the text in some systematic way, it can always be cleaned up
     and  re-formatted by a judicious combination of editing com-
     mands and request definitions.

     IIVV..  PPRROOGGRRAAMMMMIINNGG

       There will be no attempt made to teach any of the program-
     ming  languages  available, but a few words of advice are in
     order.  One of the reasons why the UNIX system is a  produc-
     tive programming environment is that there is already a rich
     set of tools available, and facilities like pipes, I/O redi-
     rection,  and  the  capabilities  of the shell often make it
     possible to do a  job  by  pasting  together  programs  that
     already exist instead of writing from scratch.

     TThhee SShheellll

       The  pipe  mechanism  lets you fabricate quite complicated
     operations out of spare parts that already exist.  For exam-
     ple, the first draft of the ssppeellll program was (roughly)

         ccaatt ......     _c_o_l_l_e_c_t _t_h_e _f_i_l_e_s
         || ttrr ......    _p_u_t _e_a_c_h _w_o_r_d _o_n _a _n_e_w _l_i_n_e
         || ttrr ......    _d_e_l_e_t_e _p_u_n_c_t_u_a_t_i_o_n_, _e_t_c_.
         || ssoorrtt      _i_n_t_o _d_i_c_t_i_o_n_a_r_y _o_r_d_e_r
         || uunniiqq      _d_i_s_c_a_r_d _d_u_p_l_i_c_a_t_e_s
         || ccoommmm      _p_r_i_n_t _w_o_r_d_s _i_n _t_e_x_t
                 _b_u_t _n_o_t _i_n _d_i_c_t_i_o_n_a_r_y

     More  pieces  have  been added subsequently, but this goes a
     long way for such a small effort.
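
       A pipeline in the same spirit can be run on a toy example.
     This is not the real ssppeellll: the arguments to ttrr and ccoommmm are
     not given above, so the ones used here are guesses at a
     workable equivalent, and the text and the four-word
     dictionary are invented.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'The quick brown fox.\nThe quikc fox!\n' >text
printf 'brown\nfox\nquick\nthe\n' >dict      # tiny dictionary, already sorted

cat text |                  # collect the files
  tr A-Z a-z |              # fold upper case to lower
  tr -cs a-z '\n' |         # turn each run of non-letters into a newline
  sort |                    # into dictionary order
  uniq |                    # discard duplicates
  comm -23 - dict >unknown  # keep words in the text but not in dict

missing=$(cat unknown)
echo "$missing"

cd / && rm -rf "$dir"
```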

       The editor can be made to do things  that  would  normally
     require  special programs on other systems.  For example, to
     list the first and last lines of each of  a  set  of  files,
     such as a book, you could laboriously type

         eedd
         ee cchhaapp11..11
         11pp
         $$pp
         ee cchhaapp11..22
         11pp
         $$pp
         etc.

     But you can do the job much more easily.  One way is to type

         llss cchhaapp** >>tteemmpp

     to get the list of filenames into a file.   Then  edit  this
     file to make the necessary series of editing commands (using
     the global commands of eedd), and write it into  ssccrriipptt.   Now
     the command

         eedd <<ssccrriipptt

     will  produce  the same output as the laborious hand typing.
     Alternatively (and more easily), you can use the fact that the
     shell  will  perform loops, repeating a set of commands over
     and over again for a set of arguments:

         ffoorr ii iinn cchhaapp**
         ddoo
              eedd $$ii <<ssccrriipptt
         ddoonnee

     This sets the shell variable ii to each file  name  in  turn,
     then  does  the  command.   You can type this command at the
     terminal, or put it in a file for later execution.
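
       The loop can be tried on two invented chapter files.  In
     this sketch sed -n '1p;$p' takes the place of the editor
     script, on the assumption that it is an acceptable stand-in
     for printing the first and last lines; the output of the
     whole loop is collected in a file called summary.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'first line of 1.1\nmiddle\nlast line of 1.1\n' >chap1.1
printf 'first line of 1.2\nlast line of 1.2\n' >chap1.2

for i in chap*
do
     sed -n '1p;$p' "$i"     # first and last line of the file named by i
done >summary

count=$(wc -l <summary | tr -d ' ')
first=$(sed -n 1p summary)
cat summary

cd / && rm -rf "$dir"
```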

     PPrrooggrraammmmiinngg tthhee SShheellll

       An option often overlooked by newcomers is that the  shell
     is  itself  a  programming language, with variables, control
     flow _(iiff--eellssee, wwhhiillee, ffoorr, ccaassee), subroutines, and interrupt
     handling.  Since there are many building-block programs, you
     can sometimes avoid writing a new program merely by  piecing
     together  some  of  the  building  blocks with shell command
     files.
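
       As a very small taste, the sketch below combines a
     subroutine, ccaassee, ffoorr and iiff.  The name classify and the
     list of file names are hypothetical, and the syntax shown is
     that of a current Bourne-style shell.

```shell
# classify is a hypothetical subroutine: it prints a word
# describing the kind of file name it is given.
classify() {
     case "$1" in
     chap*)   echo chapter ;;
     *.out)   echo output ;;
     *)       echo other ;;
     esac
}

count=0
for name in chap1.1 script.out junk chap2.1
do
     if [ "$(classify "$name")" = chapter ]
     then
          count=$((count + 1))      # one more chapter seen
     fi
done

echo "$count chapters"
```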

       We will not go into any details here; examples  and  rules
     can  be found in _A_n _I_n_t_r_o_d_u_c_t_i_o_n _t_o _t_h_e _U_N_I_X _S_h_e_l_l, by S. R.
     Bourne.

     PPrrooggrraammmmiinngg iinn CC

       If you are undertaking anything substantial, C is the only
     reasonable choice of programming language: everything in the
     UNIX system is tuned to it.  The system itself is written in
     C,  as  are most of the programs that run on it.  It is also
     an easy language to use once you get started.  C  is  intro-
     duced  and  fully described in _T_h_e _C _P_r_o_g_r_a_m_m_i_n_g _L_a_n_g_u_a_g_e by
     B. W. Kernighan and D.  M.  Ritchie  (Prentice-Hall,  1978).
     Several  sections  of  the manual describe the system inter-
     faces, that is, how you do I/O and similar functions.   Read
     _U_N_I_X _P_r_o_g_r_a_m_m_i_n_g for more complicated things.

       Most  input and output in C is best handled with the stan-
     dard I/O library, which provides a set of I/O functions that
     exist  in  compatible form on most machines that have C com-
     pilers.  In general,  it's  wisest  to  confine  the  system
     interactions in a program to the facilities provided by this
     library.

       C programs that don't depend too much on special  features
     of UNIX (such as pipes) can be moved to other computers that
     have C compilers.  The list of such machines grows daily; in
     addition  to  the  original PDP-11, it currently includes at
     least Honeywell 6000, IBM 370  and  PC  families,  Interdata
     8/32,  Data  General  Nova  and Eclipse, HP 2100, Harris /7,
     Motorola 68000 family (including machines like Sun Microsys-
     tems  and Apple Macintosh), VAX 11 family, SEL 86, and Zilog
     Z80.  Calls to the standard I/O library will work on all  of
     these machines.

       There  are a number of supporting programs that go with C.
     lliinntt checks C programs for potential  portability  problems,
     and  detects  errors  such  as mismatched argument types and
     uninitialized variables.

       For larger programs (anything whose source is on more than
     one file) make allows you to specify the dependencies among
     the source files and the processing steps needed to  make  a
     new  version;  it then checks the times that the pieces were
     last changed and does the minimal amount of  recompiling  to
     create a consistent updated version.

       The debugger gdb is useful for digging through the dead
     bodies of C programs, but is rather hard  to  learn  to  use
     effectively.   The  most  effective  debugging tool is still
     careful  thought,  coupled  with  judiciously  placed  print
     statements.

       The C compiler provides a limited instrumentation service,
     so you can find out where programs spend their time and what
     parts  are  worth optimizing.  Compile the routines with the
     --ppgg option; after the test run, use ggpprrooff to print an execu-
     tion profile.  The command ttiimmee will give you the gross run-
     time statistics of a program, but they are not  super  accu-
     rate or reproducible.

     OOtthheerr LLaanngguuaaggeess

       If you have to use Fortran, there are two possibilities.
     You might consider Ratfor, which gives you the  decent  con-
     trol structures and free-form input that characterize C, yet
     lets you write code that is still portable to other environ-
     ments.   Bear  in  mind  that  UNIX Fortran tends to produce
     large and relatively  slow-running  programs.   Furthermore,
     supporting programs like gdb, prof, etc., are all virtually
     useless with Fortran programs.  There may also be a  Fortran
     77  compiler on your system.  If so, this is a viable alter-
     native to Ratfor, and has the non-trivial advantage that  it
     is compatible with C and related programs.  (The Ratfor pro-
     cessor and C tools can be used with Fortran 77 too.)

       If your application requires you to translate  a  language
     into a set of actions or another language, you are in effect
     building a compiler, though probably a small one.   In  that
     case, you should be using the yacc compiler-compiler, which
     helps you develop a compiler quickly.  The lex lexical analyzer
     generator does the same job for the simpler languages
     that can be expressed as regular expressions.  It can be
     used by itself, or as a front end to recognize inputs for a
     yacc-based program.  Both yacc and lex require some sophistication
     to use, but the initial effort of learning them can
     be repaid many times over  in  programs  that  are  easy  to
     change later on.

       Most  UNIX  systems  also  make available other languages,
     such as Algol 68, APL,  Basic,  Lisp,  Pascal,  and  Snobol.
     Whether  these are useful depends largely on the local envi-
     ronment: if someone cares about the language and has  worked
     on it, it may be in good shape.  If not, the odds are strong
     that it will be more trouble than it's worth.

     VV..  UUNNIIXX RREEAADDIINNGG LLIISSTT

     GGeenneerraall::

     K. L. Thompson and D. M. Ritchie, The UNIX Programmer's Manual,
     Bell Laboratories, 1978. (PS2:3)[++]  Lists commands,
     system routines and interfaces, file formats,  and  some  of
     the  maintenance  procedures.   You can't live without this,
     although you will probably only need to read section 1.

     D. M. Ritchie and K. L. Thompson, ``The UNIX Time-sharing
     System,'' CACM, July 1974. (PS2:1)[++]  An overview of the
     system, for people interested in operating systems.  Worth
     reading by anyone who programs.  Contains a remarkable number
     of one-sentence observations on how to do things right.

     The Bell System Technical Journal (BSTJ) Special Issue on
     UNIX, July/August, 1978, contains many papers describing
     recent developments, and some retrospective material.

     The 2nd International Conference on Software Engineering
     (October, 1976) contains several papers describing the use
     of the Programmer's Workbench (PWB) version of UNIX.

     -----------
     [+] These documents (previously in Volume 2 of the
     Bell Labs Unix distribution) are provided among
     the "User Supplementary" Documents for 4.3BSD,
     available from the Usenix Association.
     [++] These are among the "Programmer Supplementary"
     Documents for 4.3BSD.  PS1 is Volume 1, PS2
     is Volume 2.

     Document Preparation:

     B. W. Kernighan, ``A Tutorial Introduction to the UNIX Text
     Editor'' (USD:12) and ``Advanced Editing on UNIX,'' (USD:13)
     Bell Laboratories, 1978.[+]  Beginners need the introduction;
     the advanced material will help you get the most out
     of the editor.

     M. E. Lesk, ``Typing Documents on UNIX,'' Bell Laboratories,
     1978. (USD:20)[+]  Describes the -ms macro package, which
     isolates the novice from the vagaries of nroff and troff,
     and takes care of most formatting situations.  If this specific
     package isn't available on your system, something similar
     probably is.  The most likely alternative is the
     PWB/UNIX macro package -mm; see your local guru if you use
     PWB/UNIX.*

     B. W. Kernighan and L. L. Cherry, ``A System for Typesetting
     Mathematics,'' Bell Laboratories Computing Science Tech.
     Rep. 17. (USD:26)[+]

     M. E. Lesk, ``Tbl -- A Program to Format Tables,'' Bell
     Laboratories CSTR 49, 1976. (USD:28)[+]

     J. F. Ossanna, Jr., ``NROFF/TROFF User's Manual,'' Bell
     Laboratories CSTR 54, 1976. (USD:24)[+]  troff is the basic
     formatter used by -ms, eqn and tbl.  The reference manual is
     indispensable if you are going to write or maintain these or
     similar programs.  But start with:

     B. W. Kernighan, ``A TROFF Tutorial,'' Bell Laboratories,
     1976. (USD:25)[+]  An attempt to unravel the intricacies of
     troff.

     -----------
     *The macro package -me is additionally available
     on Berkeley Unix Systems.  -mm is typically not
     available.

     PPrrooggrraammmmiinngg::

     B. W. Kernighan and D. M. Ritchie, _T_h_e  _C  _P_r_o_g_r_a_m_m_i_n_g  _L_a_n_-
     _g_u_a_g_e_,  Prentice-Hall,  1978.  Contains a tutorial introduc-
     tion, complete discussions of all language features, and the
     reference manual.

     B. W. Kernighan and R. Pike, The Unix Programming Environment,
     Prentice-Hall, 1984.  Contains many examples of C programs
     which use the system interfaces, and explanations of
     ``why''.

     B. W. Kernighan and D.  M.  Ritchie,  ``UNIX  Programming,''
     Bell Laboratories, 1978. (PS2:3)[++] Describes how to inter-
     face with the system from C programs:  I/O  calls,  signals,
     processes.

     S.  R.  Bourne,  ``An Introduction to the UNIX Shell,'' Bell
     Laboratories, 1978. (USD:3)[+] An introduction and reference
     manual  for  the  Version 7 shell.  Mandatory reading if you
     intend to make effective use of  the  programming  power  of
     this shell.

     S.  C.  Johnson,  ``Yacc -- Yet Another Compiler-Compiler,''
     Bell Laboratories CSTR 32, 1978. (PS1:15)[++]

     M. E. Lesk, ``Lex -- A Lexical  Analyzer  Generator,''  Bell
     Laboratories CSTR 39, 1975. (PS1:16)[++]

     S. C. Johnson, ``Lint, a C Program Checker,'' Bell Laborato-
     ries CSTR 65, 1977. (PS1:9)[++]

     S. I. Feldman, ``MAKE -- A Program for Maintaining  Computer
     Programs,'' Bell Laboratories CSTR 57, 1977. (PS1:12)[++]

     J.  F. Maranzano and S. R. Bourne, ``A Tutorial Introduction
     to ADB,'' Bell Laboratories CSTR 62, 1977.  (PS1:10)[++]  An
     introduction to a powerful but complex debugging tool.

     S.  I. Feldman and P. J. Weinberger, ``A Portable Fortran 77
     Compiler,'' Bell Laboratories, 1978. (PS1:2)[++] A full For-
     tran 77 for UNIX systems.
