August 2006


We’re getting a lot of questions about spam, so I thought I’d go over what we are doing about the problem.

Please be aware that the following information is for @mail.usf.edu accounts only. If you have an @eng.usf.edu or @stpt.usf.edu account you can use WebMail, but none of these anti-spam features are available to you.

The Problem

By now everyone has heard of, and received, spam, so I’m not going to explain what it is, but I want to give you some perspective on the size of the problem we are dealing with. We receive around 300,000 email messages on an average day, and we’ve had peaks of over 500,000 per day. That’s a lot of mail, and scanning each message for viruses and spam is very CPU-intensive. Spam scanning is especially hard: because spam comes in nearly infinite variations, thousands of tests have to be run on each message. Up until now, scanning was done on the mail server itself, just before the message was placed in your mailbox. This was sufficient when the mail server was put into production back in 2004, but we were only receiving about 100K messages per day then. For the mail server to handle the increased workload since then, we’ve had to cut down on the number of tests we use to scan for spam, which limited the effectiveness of the filters.

The Solution

Just before Fall semester, we moved to a different architecture: the scanning is done on a separate set of machines (called MailGate), which then hand the messages to the mail server for final delivery. The new system is working really well, and with your help (more on that later), it will get even better. It is not perfect, and some spam will still get through, but it will make a huge difference in the amount of spam you receive. MailGate reduces the amount of spam you receive in a couple of ways:

Blacklisting

The first step in combating spam happens before a message has even been transferred. When an email server tries to contact MailGate to send a message, MailGate checks several blacklists, and if the server is listed, the connection is denied and no mail is transferred. MailGate also denies access to badly misconfigured or non-RFC-compliant mail servers, which are usually spam zombies.
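
To give a feel for how a blacklist check can work, here is a minimal sketch of a DNS-based blacklist (DNSBL) lookup, which is one common kind of blacklist. This post does not say which lists MailGate uses or how it queries them, so the zone name `dnsbl.example.org` and the function names are assumptions made up for illustration. The idea is to reverse the sending server’s IP address, append the blacklist’s zone, and see whether that name resolves.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    # Validate the address, then reverse its octets: 192.0.2.99 -> 99.2.0.192
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def resolves(name: str) -> bool:
    """Return True if the name resolves in DNS (needs network access)."""
    try:
        socket.gethostbyname(name)
        return True
    except socket.gaierror:
        return False


def is_listed(ip: str, zones, resolve=resolves) -> bool:
    """Return True if the address is listed in any of the given zones.

    `resolve` is injectable so the logic can be tried without a network.
    """
    return any(resolve(dnsbl_query_name(ip, zone)) for zone in zones)


# Hypothetical zone name, used only to show the shape of the query.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A listed server would be refused at connection time, which is why this check costs far less than scanning the message itself.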

Virus Scanning

At this point, MailGate looks at the message and determines what (if any) files are attached. All files that are executable on Windows (.exe, .bat, etc.) are automatically rejected. We are doing this because most email-borne viruses use these file formats. If you need to send an executable file for some reason, put it into a “zip” archive to get past this check. If the file is not an executable, it is sent to the virus scanner. All archived files are unpacked at this point and the contents are also scanned. If all of the contents are virus-free, the message is then ready for spam scanning.
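
The attachment check described above can be sketched roughly as follows. This is an illustration only, not MailGate’s actual code: the function name is hypothetical, and the extension list contains just the two examples named in this post (the real list is longer).

```python
import os

# Only the two extensions named in the post; the real list is longer ("etc.").
BLOCKED_EXTENSIONS = {".exe", ".bat"}


def attachment_action(filename: str) -> str:
    """Decide what happens to one attachment, based on its file extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in BLOCKED_EXTENSIONS:
        return "reject"
    # Everything else, including archives, goes on to the virus scanner.
    return "virus-scan"


for name in ("setup.EXE", "report.pdf", "tools.zip"):
    print(name, "->", attachment_action(name))
```

Note that the zip archive is not rejected outright; it is handed to the virus scanner, which unpacks it and scans the contents.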

Rules-Based Spam Scoring

We use SpamAssassin to determine if a message is spam. SpamAssassin (SA) uses thousands of rules and text patterns to make this determination. In addition to SA’s built-in rules, we are also using sets of rules that are updated daily to detect the latest types of spam. We are also using Razor and DCC, which are massive spam databases that messages can be checked against. Each rule has a “spam score” associated with it, and once the message has been tested against all of the rules, the scores of the rules it matched are added up. If this total is greater than 5.0 (this threshold may change at some point), the message is considered spam.
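
As a small worked example of the scoring arithmetic, here is a sketch with three made-up rules. The rule names and their scores are hypothetical and are not real SpamAssassin rules; only the 5.0 threshold comes from this post.

```python
SPAM_THRESHOLD = 5.0  # from the post; it may change at some point

# Hypothetical rules and scores, for illustration only.
RULE_SCORES = {
    "SUBJECT_ALL_CAPS": 1.5,
    "MENTIONS_PRIZE": 2.5,
    "FOUND_IN_SPAM_DATABASE": 2.0,
}


def total_score(matched_rules, rule_scores=RULE_SCORES) -> float:
    """Add up the scores of every rule the message matched."""
    return sum(rule_scores[name] for name in matched_rules)


def is_spam(score: float, threshold: float = SPAM_THRESHOLD) -> bool:
    """A message is spam only if its score is greater than the threshold."""
    return score > threshold


score = total_score(["SUBJECT_ALL_CAPS", "MENTIONS_PRIZE", "FOUND_IN_SPAM_DATABASE"])
print(score, is_spam(score))  # 1.5 + 2.5 + 2.0 = 6.0, which is above 5.0
```

A message that matched only the first two rules would score 4.0 and be delivered normally, and one scoring exactly 5.0 would also pass, since the total has to be greater than the threshold.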

Bayesian Filtering

In addition to the rules-based spam scoring, SA also uses Bayesian filtering to determine the spam score. I’m not going to go into all the details, but basically a Bayesian filter “learns” what you think of as spam and non-spam (“ham” in SA terms). In order for a Bayesian filter to work, however, you must train it. Here’s where you come in. You may not have noticed, but there is a new link in WebMail when you are reading a message: “Mark as Spam”. This link sends the message to MailGate’s Bayesian filter to help train it to see that message as spam. There is a similar link (“Mark as Non-Spam”) on every message in your SPAM folder, which trains the filter to recognize non-spam. Whenever the spam filter misses a spam message, make sure to mark the message as spam, and whenever it mistakenly marks valid mail as spam, make sure to mark it as non-spam.
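
For the curious, here is a toy sketch of the “learning” idea. It is a textbook naive Bayes classifier with add-one smoothing, not SpamAssassin’s actual implementation, which combines word probabilities differently. The class name and the training messages are made up for illustration. Calling `train` plays the role of clicking “Mark as Spam” or “Mark as Non-Spam”.

```python
import math
from collections import Counter


class TinyBayes:
    """A toy word-count classifier; an illustration, not SA's algorithm."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Record the words of one message marked as 'spam' or 'ham'."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        """Estimate the probability that a message is spam."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed prior: how common this kind of message has been so far.
            log_p = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from zeroing the score.
                log_p += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_score[label] = log_p
        # Turn the two log scores into a probability; clamp to avoid overflow.
        difference = min(log_score["ham"] - log_score["spam"], 700.0)
        return 1.0 / (1.0 + math.exp(difference))


bayes = TinyBayes()
bayes.train("win a free prize now", "spam")            # like "Mark as Spam"
bayes.train("meeting notes for monday class", "ham")   # like "Mark as Non-Spam"
print(bayes.spam_probability("free prize"))
print(bayes.spam_probability("class meeting"))
```

With no training at all, this filter answers 0.5 for every message, which is why your clicks on those two links matter.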

Delivery

Once all of these filters are run, the message is finally sent to mail.usf.edu for delivery. If you have spam filtering enabled, messages marked as spam by MailGate will be moved into your SPAM folder; if not, the message is delivered to your mailbox as usual. Again, this will NOT catch every spam! For me, it’s catching about 97% right now, and with more training, it should catch over 99% of the spam.
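
The delivery decision boils down to two yes/no questions, sketched below. The function name and the folder name `INBOX` are assumptions for illustration; the post only names the SPAM folder.

```python
def delivery_folder(marked_as_spam: bool, filtering_enabled: bool) -> str:
    """Pick the folder a message lands in after MailGate has scanned it."""
    if marked_as_spam and filtering_enabled:
        return "SPAM"
    # Spam is still delivered normally if you have not enabled filtering.
    return "INBOX"


print(delivery_folder(marked_as_spam=True, filtering_enabled=True))
print(delivery_folder(marked_as_spam=True, filtering_enabled=False))
```

In other words, MailGate only marks the message; nothing is moved unless you have turned spam filtering on.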

To make sure that you have the spam filtering enabled:

  1. Log in to WebMail
  2. Click on Options
  3. Click on Spam Filtering
  4. Choose the destination for your spam
  5. Click on Update Spam Filter Action

There was an unscheduled outage of service on mail.usf.edu from roughly 1AM to 7AM on 8/30/2006. The root cause of this failure is still being determined, but the impact was that WebMail, POP, and IMAP services were unavailable during that time period. Thanks to our new spam/virus scanning system, all incoming mail was received, scanned, and delivered to your account once the system was operational again, so no mail was lost. We apologize for any inconvenience this may have caused.

You have the right to limit usage and sharing of your USF directory information, where permitted by applicable Federal and state laws, until you graduate or cease to enroll for three consecutive terms.

Privacy Options

Complete privacy means you may not, nor may anyone on your behalf, make telephone or electronic mail inquiries# about your affiliation with USF. It prohibits disclosure of your contact information in USF directories, Commencement and Honors programs or other lists+ made available to the public.

Partial privacy means you may not, nor may anyone on your behalf, make telephone or electronic mail inquiries# about your affiliation with USF. It prohibits disclosure of your contact information in USF directories or other lists+ made available to the public; however, you may be included in Commencement and Honors programs.

Confidentiality means you, or anyone on your behalf, may make telephone or electronic mail inquiries# about your affiliation with USF. It prohibits disclosure of your contact information in USF directories or other lists+ made available to the public.

How to Notify USF

You do not have to elect a form of privacy, but if you have not previously opted for one or do not recall which form we continue to honor while you are eligible, you may view or update your current privacy status at: http://www.registrar.usf.edu/privacy

When to Notify USF

You may exercise your right at any time; however, your request must be executed before the end of the second week of Fall term to prevent your contact information from appearing in USF Directories.

Questions?

E-mail us via the website above, call (813) 974-2000, or visit us at SVC1034 on the Tampa Campus. Office of the Registrar personnel cannot elect or change your privacy option for you over the telephone.

#USF never releases grades, grade point averages or Social Security numbers via telephone or electronic mail.

+Lists made available to the public, printed or electronic, may include non-academic, student-related services.

The spam/virus-scanning system for mail.usf.edu will soon see a major upgrade. In preparation for this upgrade and to increase the amount of available storage for email accounts, some policy changes must be made. Beginning on Wednesday, August 10th, 2006, messages marked as spam and moved to your SPAM folder will be kept for two weeks and then deleted automatically. This change affects your SPAM folder only; messages in any of your other folders will not be affected. If you have any questions, please email us at usg@mailman.acomp.usf.edu or post a comment to this entry.

blog.usf.edu and myweb.usf.edu will be unavailable during the Blackboard maintenance window (12AM-2AM) on Aug 4, 2006. This outage is necessary to accommodate changes needed for the upcoming Blackboard upgrade. No other services (WebMail, mail.usf.edu accounts, etc.) will be affected. We apologize for any inconvenience this may cause.