5. Miscellaneous programming
5.1 How do I compare strings using wildcards?
The answer to that depends on what exactly you mean by `wildcards'.
There are two quite different concepts that qualify as `wildcards'. They are:
- Filename patterns: these are what the shell uses for filename expansion (`globbing').
- Regular expressions: these are used by editors, grep, etc. for matching text, but they normally aren't applied to filenames.
5.1.1 How do I compare strings using filename patterns?
Unless you are unlucky, your system should have a function fnmatch() to do filename matching. This generally allows only the Bourne shell style of pattern; i.e. it recognises `*', `[...]' and `?', but probably won't support the more arcane patterns available in the Korn and Bourne-Again shells.
If you don't have this function, then rather than reinvent the wheel, you are probably better off snarfing a copy from the BSD or GNU sources.
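As a rough illustration of the above, here is a minimal sketch of a wrapper around fnmatch(), assuming your system provides the POSIX <fnmatch.h> header. The name matches_pattern is made up for this example; only fnmatch() and FNM_NOMATCH come from the system.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if string matches the shell-style pattern, 0 if it does
 * not, and -1 if fnmatch() reports an error.
 * matches_pattern is an illustrative name, not a standard function.
 */
int matches_pattern(const char *pattern, const char *string)
{
    int rc;

    assert(pattern != NULL && string != NULL);

    rc = fnmatch(pattern, string, 0);   /* 0 means no special flags */
    if (rc == 0)
        return 1;
    if (rc == FNM_NOMATCH)
        return 0;
    return -1;
}
```

With no flags set, `*' will also match `/' and a leading `.'; if you want shell-like behaviour for pathnames, look at the FNM_PATHNAME and FNM_PERIOD flags.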
Also, for the common case of matching actual filenames, look for glob(), which will find all existing files matching a pattern.
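For the filename case, a sketch along the following lines may help, assuming the POSIX <glob.h> interface is available. The function name list_matches is invented for the example; glob(), globfree() and GLOB_NOMATCH are the real interface.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>

/* Print every existing pathname matching pattern to out, one per
 * line. Returns the number of matches, or -1 if glob() reports an
 * error other than "no match".
 * list_matches is an illustrative name, not a standard function.
 */
long list_matches(const char *pattern, FILE *out)
{
    glob_t results;
    size_t i;
    long count;
    int rc;

    assert(pattern != NULL && out != NULL);

    rc = glob(pattern, 0, NULL, &results);
    if (rc == 0)
    {
        for (i = 0; i < results.gl_pathc; i++)
            fprintf(out, "%s\n", results.gl_pathv[i]);
        count = (long) results.gl_pathc;
    }
    else if (rc == GLOB_NOMATCH)
        count = 0;
    else
        count = -1;

    globfree(&results);   /* release whatever glob() allocated */
    return count;
}
```

Calling list_matches("*.c", stdout) would list the C source files in the current directory, much as the shell would expand the pattern.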
5.1.2 How do I compare strings using regular expressions?
There are a number of slightly different syntaxes for regular expressions; most systems use at least two: the one recognised by ed, sometimes known as `Basic Regular Expressions', and the one recognised by egrep, `Extended Regular Expressions'. Perl has its own slightly different flavour, as does Emacs.
To support this multitude of formats, there is a corresponding multitude of implementations. Systems will generally have regexp-matching functions (usually regcomp() and regexec()) supplied, but be wary; some systems have more than one implementation of these functions available, with different interfaces. In addition, there are many library implementations available. (It's common, BTW, for regexps to be compiled to an internal form before use, on the assumption that you may compare several separate strings against the same regexp.)
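To make the compile-once, match-many idea concrete, here is a minimal sketch assuming the POSIX <regex.h> interface (your system's version may differ, as noted above). The function name count_regex_matches is invented for the example. Pass 0 in cflags for Basic Regular Expressions or REG_EXTENDED for Extended ones.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Compile pattern once, then test each of the nstrings strings
 * against it. Returns the number of strings that match, or -1 if
 * the pattern fails to compile.
 * count_regex_matches is an illustrative name, not a standard one.
 */
int count_regex_matches(const char *pattern, int cflags,
                        const char *const *strings, size_t nstrings)
{
    regex_t re;
    size_t i;
    int count = 0;

    assert(pattern != NULL && strings != NULL);

    /* REG_NOSUB: we only want yes/no, not the match positions */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;

    for (i = 0; i < nstrings; i++)
        if (regexec(&re, strings[i], 0, NULL, 0) == 0)
            ++count;

    regfree(&re);
    return count;
}
```

If you need to report why a pattern failed to compile, regerror() will turn the return value of regcomp() into a message.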
One library available for this is the `rx' library, available from the GNU mirrors. This seems to be under active development, which may be a good or a bad thing depending on your point of view :-)
5.2 What's the best way to send mail from a program?
There are several ways to send email from a Unix program. Which is the best method to use in a given situation varies, so I'll present two of them. A third possibility, not covered here, is to connect to a local SMTP port (or a smarthost) and use SMTP directly; see RFC 821.
5.2.1 The simple method: /bin/mail
For simple applications, it may be sufficient to invoke mail (usually `/bin/mail', but could be `/usr/bin/mail' on some systems).
WARNING: Some versions of UCB Mail may execute commands prefixed by `~!' or `~|' given in the message body even in non-interactive mode. This can be a security risk.
Invoked as `mail -s 'subject' recipients...' it will take a message body on standard input, supply a default header (including the specified subject), and pass the message to sendmail for delivery.
This example mails a test message to root on the local system:
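#include <stdio.h>
#include <stdlib.h>   /* for exit() */

#define MAILPROG "/bin/mail"

int main(void)
{
    /* open a pipe to the mailer; the message body goes to its stdin */
    FILE *mail = popen(MAILPROG " -s 'Test Message' root", "w");
    if (!mail)
    {
        perror("popen");
        exit(1);
    }

    fprintf(mail, "This is a test.\n");

    /* pclose() returns the mailer's exit status; nonzero means failure */
    if (pclose(mail))
    {
        fprintf(stderr, "mail failed!\n");
        exit(1);
    }

    return 0;
}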
If the text to be sent is already in a file, then one can do:
system(MAILPROG " -s 'file contents' root </tmp/filename");
These methods can be extended to more complex cases, but there are many pitfalls to watch out for:
- If using system() or popen(), you must be very careful about quoting arguments to protect them from filename expansion or word splitting
- Constructing command lines from user-specified data is a common source of buffer-overrun errors and other security holes
- This method does not allow for CC: or BCC: recipients to be specified (some versions of /bin/mail may allow this, some do not)
5.2.2 Invoking the MTA directly: /usr/lib/sendmail
The mail program is an example of a Mail User Agent, a program intended to be invoked by the user to send and receive mail, but which does not handle the actual transport. A program for transporting mail is called an MTA, and the most commonly found MTA on Unix systems is called sendmail. There are other MTAs in use, such as MMDF, but these generally include a program that emulates the usual behaviour of sendmail.
Historically, sendmail has usually been found in `/usr/lib', but the current trend is to move library programs out of `/usr/lib' into directories such as `/usr/sbin' or `/usr/libexec'. As a result, one normally invokes sendmail by its full path, which is system-dependent.
To understand how sendmail behaves, it's useful to understand the concept of an envelope. This is very much like paper mail; the envelope defines who the message is to be delivered to, and who it is from (for the purpose of reporting errors). Contained in the envelope are the headers and the body, separated by a blank line.
The format of the headers is specified primarily by RFC 822; see also the MIME RFCs.
There are two main ways to use sendmail to originate a message: either the envelope recipients can be explicitly supplied, or sendmail can be instructed to deduce them from the message headers. Both methods have advantages and disadvantages.
5.2.2.1 Supplying the envelope explicitly
The recipients of a message can simply be specified on the command line. This has the drawback that mail addresses can contain characters that give system() and popen() considerable grief, such as single quotes, quoted strings etc. Passing these constructs successfully through shell interpretation presents pitfalls. (One can do it by replacing any single quotes by the sequence single-quote backslash single-quote single-quote, then surrounding the entire address with single quotes. Ugly, huh?)
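For what it's worth, the quoting rule just described can be sketched as follows. The function name shell_quote is invented for this example, and this is only a sketch of the technique, not a guarantee of safety against every shell out there.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc()ed copy of s that is safe to pass as a single
 * word to a Bourne-style shell: each embedded single quote becomes
 * the four characters '\'' and the whole result is wrapped in
 * single quotes. Returns NULL if malloc() fails; the caller must
 * free() the result.
 * shell_quote is an illustrative name, not a standard function.
 */
char *shell_quote(const char *s)
{
    size_t len = 2;   /* opening and closing quote */
    const char *p;
    char *out;
    char *q;

    assert(s != NULL);

    /* first pass: work out how much space is needed */
    for (p = s; *p; p++)
        len += (*p == '\'') ? 4 : 1;

    out = malloc(len + 1);
    if (!out)
        return NULL;

    /* second pass: copy, expanding single quotes as we go */
    q = out;
    *q++ = '\'';
    for (p = s; *p; p++)
    {
        if (*p == '\'')
        {
            memcpy(q, "'\\''", 4);
            q += 4;
        }
        else
            *q++ = *p;
    }
    *q++ = '\'';
    *q = '\0';

    return out;
}
```

The result can then be appended to a command line destined for system() or popen(). Measuring first and allocating exactly avoids the fixed-size buffers that tend to cause overruns in this kind of code.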
Some of this unpleasantness can be avoided by eschewing the use of system() or popen(), and resorting to fork() and exec() directly. This is sometimes necessary in any event; for example, user-installed handlers for SIGCHLD will usually break pclose() to a greater or lesser extent.
Here's an example:
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>   /* for memcpy() */
#include <fcntl.h>
#include <sysexits.h>
/* #include <paths.h> if you have it */

#ifndef _PATH_SENDMAIL
#define _PATH_SENDMAIL "/usr/lib/sendmail"
#endif

/* -oi means "don't treat . as a message terminator"
 * remove ,"--" if using a pre-V8 sendmail (and hope that no-one
 * ever uses a recipient address starting with a hyphen)
 * you might wish to add -oem (report errors by mail)
 */

#define SENDMAIL_OPTS "-oi","--"

/* this is a macro for returning the number of elements in array */

#define countof(a) ((sizeof(a))/sizeof((a)[0]))

/* defined elsewhere -- closes all FDs >= argument
 * (the exact prototype is assumed here)
 */
void closeall(int fd);

/* send the contents of the file open for reading on FD to the
 * specified recipients; the file is assumed to contain RFC822 headers
 * & body, the recipient list is terminated by a NULL pointer; returns
 * -1 if error detected, otherwise the return value from sendmail
 * (which uses <sysexits.h> to provide meaningful exit codes)
 */

int send_message(int fd, const char **recipients)
{
    static const char *argv_init[] = { _PATH_SENDMAIL, SENDMAIL_OPTS };
    const char **argvec = NULL;
    int num_recip = 0;
    pid_t pid;
    int rc;
    int status;

    /* count number of recipients */

    while (recipients[num_recip])
        ++num_recip;

    if (!num_recip)
        return 0;    /* sending to no recipients is successful */

    /* alloc space for argument vector */

    argvec = malloc(sizeof(char*) * (num_recip+countof(argv_init)+1));
    if (!argvec)
        return -1;

    /* initialise argument vector */

    memcpy(argvec, argv_init, sizeof(argv_init));
    memcpy(argvec+countof(argv_init),
           recipients, num_recip*sizeof(char*));
    argvec[num_recip + countof(argv_init)] = NULL;

    /* may need to add some signal blocking here. */

    /* fork */

    switch (pid = fork())
    {
    case 0:   /* child */

        /* Plumbing */
        if (fd != STDIN_FILENO)
            dup2(fd, STDIN_FILENO);

        /* defined elsewhere -- closes all FDs >= argument */
        closeall(3);

        /* go for it: (the cast is needed because execv() wants
         * char *const *, although it does not modify the strings) */
        execv(_PATH_SENDMAIL, (char *const *) argvec);
        _exit(EX_OSFILE);

    default:  /* parent */

        free(argvec);
        rc = waitpid(pid, &status, 0);
        if (rc < 0)
            return -1;
        if (WIFEXITED(status))
            return WEXITSTATUS(status);
        return -1;

    case -1:  /* error */

        free(argvec);
        return -1;
    }
}
5.2.2.2 Allowing sendmail to deduce the recipients
The `-t' option to sendmail instructs sendmail to parse the headers of the message, and use all the recipient-type headers (i.e. To:, Cc: and Bcc:) to construct the list of envelope recipients. This has the advantage of simplifying the sendmail command line, but makes it impossible to specify recipients other than those listed in the headers. (This is not usually a problem.)
As an example, here's a program to mail a file on standard input to specified recipients as a MIME attachment. Some error checks have been omitted for brevity. This requires the `mimencode' program from the metamail distribution.
#include <stdio.h>
#include <stdlib.h>   /* for exit(), atexit() and system() */
#include <unistd.h>
#include <fcntl.h>
/* #include <paths.h> if you have it */

#ifndef _PATH_SENDMAIL
#define _PATH_SENDMAIL "/usr/lib/sendmail"
#endif

#define SENDMAIL_OPTS "-oi"
#define countof(a) ((sizeof(a))/sizeof((a)[0]))

char tfilename[L_tmpnam];
char command[128+L_tmpnam];

/* remove the temporary file on exit */
void cleanup(void)
{
    unlink(tfilename);
}

int main(int argc, char **argv)
{
    FILE *msg;
    int i;

    if (argc < 2)
    {
        fprintf(stderr, "usage: %s recipients...\n", argv[0]);
        exit(2);
    }

    /* note: tmpnam() is open to races; prefer mkstemp() if you have it */
    if (tmpnam(tfilename) == NULL
        || (msg = fopen(tfilename,"w")) == NULL)
        exit(2);
    atexit(cleanup);

    fclose(msg);
    msg = fopen(tfilename,"a");
    if (!msg)
        exit(2);

    /* construct recipient list */

    fprintf(msg, "To: %s", argv[1]);
    for (i = 2; i < argc; i++)
        fprintf(msg, ",\n\t%s", argv[i]);
    fputc('\n',msg);

    /* Subject */

    fprintf(msg, "Subject: file sent by mail\n");

    /* sendmail can add its own From:, Date:, Message-ID: etc. */

    /* MIME stuff */

    fprintf(msg, "MIME-Version: 1.0\n");
    fprintf(msg, "Content-Type: application/octet-stream\n");
    fprintf(msg, "Content-Transfer-Encoding: base64\n");

    /* end of headers -- insert a blank line */

    fputc('\n',msg);
    fclose(msg);

    /* invoke encoding program; it reads our standard input */

    sprintf(command, "mimencode -b >>%s", tfilename);
    if (system(command))
        exit(1);

    /* invoke mailer */

    sprintf(command, "%s %s -t <%s",
            _PATH_SENDMAIL, SENDMAIL_OPTS, tfilename);
    if (system(command))
        exit(1);

    return 0;
}