OK - I just went over Amit's "throttling with a daemon" thread.
I like the job spool architecture, too. I can tell you that the file space part scares me, though, even under normal usage. It's my understanding that the volume of mail going through the system is fairly extraordinary, and I think there would be a significant risk of running out of disk space unless the daemon stayed pretty busy.
Now that I think about it, though, it sort of works this way now, but without the daemon part -- sendmail does the spooling if necessary ahead of executing the sg code, and the spool grows when things heat up, then clears later. There's no throttling there, though.
Throttling does take on a whole new form when a daemon is involved in any manner -- we might end up storing a pretty big hash in memory -- up to a theoretical limit of one entry per user. I guess that's not too bad, though.
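To make the "one entry per user" worry concrete, here's a minimal sketch (in Python, purely illustrative -- the structure and names are my assumptions, not the actual daemon design) of how such a daemon could keep its per-user hash from growing without bound by pruning entries whose interval has expired:

```python
# Illustrative sketch: a throttling daemon's in-memory per-user state,
# bounded by periodically pruning entries older than the interval.
# INTERVAL, note(), and prune() are hypothetical names for this example.

INTERVAL = 60          # seconds; would come from the config file
state = {}             # username -> last-seen time()

def note(username, now):
    """Record activity for a user."""
    state[username] = now

def prune(now):
    """Drop users whose window has expired.

    This bounds the hash at roughly the number of *currently active*
    users rather than the theoretical one-entry-per-user maximum.
    """
    for user in [u for u, t in state.items() if now - t > INTERVAL]:
        del state[user]
```

With a periodic sweep like this, the hash stays proportional to recent activity, which supports the "that's not too bad" intuition.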
If I could alter the subject slightly -- I'm on the verge of adding the much more tedious file/db throttling. The db columns are in place, and I've started introducing the variables to both the old and new code (a pain... I've been working on the Mail::Audit header thing, too, but to no avail).
Here's what I'm planning:
1) implement the database throttling as Syskoll previously described -- introduce an interval and maxcount to the config file (actually, one set for sending and one for receiving), then maintain the database columns in the Users table: update the time to the current time() and reset the count to 0 if there's been a gap greater than the interval, or increment the count if not. Up front, we'd check whether we're still inside the interval and have exceeded the maxcount -- if so, we eat the message and probably go ahead and increment the count anyway.
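The check-and-update logic in 1) can be sketched like so (Python for illustration only; the constant names are hypothetical, and the dict stands in for the time/count columns in the Users table):

```python
# Sketch of the interval/maxcount throttle. Config names and the
# in-memory "users" dict are assumptions standing in for the real
# config file and database columns.

SEND_INTERVAL = 60     # seconds; from the config file
SEND_MAXCOUNT = 10     # max messages allowed inside one interval

users = {}             # username -> (window_start_time, count)

def check_and_update(username, now):
    """Return True if the message may proceed, False if we eat it."""
    last, count = users.get(username, (0, 0))
    if now - last > SEND_INTERVAL:
        # gap greater than the interval: start a fresh window,
        # counting this message as the first in it
        users[username] = (now, 1)
        return True
    if count >= SEND_MAXCOUNT:
        # still inside the interval and over the limit: eat the
        # message, but keep incrementing so the window stays hot
        users[username] = (last, count + 1)
        return False
    users[username] = (last, count + 1)
    return True
```

In the real thing, the read-check-write would be a single SQL UPDATE (or a transaction) against the Users table rather than a dict.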
2) in addition to that, I'd like to specify a higher maxcount that represents the pressure-valve threshold. If the count gets that high, we shift the logic over to a file containing the username and the time() -- no need to count messages at this point, because we're basically under attack. Actually, we'd probably want to name the file after the username (if possible -- there are some funny characters in some of the names; we could use lc() and a lightweight alphanumeric hash to get around both the case and character problems) and put the time() in the file. That way we'd also avoid some contention by having a separate file per user. Before we even create a database connection, we check for one of these files and see whether the interval (actually, probably a second, greater interval) has expired. If not, we just exit -- no database connection. If so, we delete the file and proceed. We'd probably need two directories for the files -- one for sending and one for receiving -- so that the throttling could operate independently. I guess we could just prepend the filenames with s- and r- instead, though.
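Here's a rough sketch of the pressure-valve file check (again Python, again illustrative -- the directory layout, the s-/r- prefixes, and the second interval are assumptions, and the real version would use lc() plus a lightweight hash in Perl):

```python
import os
import re
import tempfile

# Hypothetical pressure-valve sketch. VALVE_DIR stands in for the real
# spool directory; VALVE_INTERVAL is the second, greater interval.

VALVE_DIR = tempfile.mkdtemp()
VALVE_INTERVAL = 600   # seconds

def valve_filename(direction, username):
    # lowercase plus a strict alphanumeric squeeze handles both the
    # case problem and the funny characters in some usernames;
    # the "s-"/"r-" prefix keeps sending and receiving independent
    safe = re.sub(r"[^a-z0-9]", "_", username.lower())
    return os.path.join(VALVE_DIR, f"{direction}-{safe}")

def trip_valve(direction, username, now):
    """Shift a user over to file-based throttling."""
    with open(valve_filename(direction, username), "w") as f:
        f.write(str(now))

def valve_open(direction, username, now):
    """True if we may proceed (and only then open a DB connection)."""
    path = valve_filename(direction, username)
    try:
        with open(path) as f:
            tripped = float(f.read())
    except FileNotFoundError:
        return True                 # no valve file: proceed normally
    if now - tripped > VALVE_INTERVAL:
        os.unlink(path)             # interval expired: clear and proceed
        return True
    return False                    # still under attack: exit, no DB
```

The key property is that the hot path for an attacked user is a single stat/open of one small file -- no database connection at all.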
Anyway, I think this would protect us from two of the major types of existing resource hogs: a) DoS attacks and b) gaming scripters who automate the creation of buzillions of addresses (which amounts to the same thing as a DoS attack, though the motivation is different, I think).
The third type of resource hog is those damn Outlook viruses -- they really get out of control, and they send big messages. It'd be nice to tackle these too (maybe by throttling on the subject line), but I don't want to overcomplicate things.
Since the throttling sits on the sending side, it would also open the way for "sending the first message."
How does this sound?