Page 1 of 1

database changes

PostPosted: Thu Oct 02, 2003 12:10 am
by josh
All,
I'm hopefully taking the next couple of days off, and I hope I'll be able to sit down for awhile and catch up on a few things.

There are a couple of database changes I'd like to make, for starters:

1) add the throttling columns to the user table. I'll look through the old posts, but I remember SysKoll had suggested two columns, one for the last event time and one for the "density" or something. That seemed like the best way to go. We've been getting more and more abuse in this area, and I'd sure like to put in the two phase throttling we spoke about (first database, then file) - this will also open the door for sending messages. We might need two sets of columns, one for sending and one for receiving - not sure.

2) a lot of people want date restricted addresses, so I though we could add a date column (or actually a number column for seconds since the epoch) to enable this.

Re: database changes

PostPosted: Thu Oct 02, 2003 3:54 am
by SysKoll
Regarding point 1: Here is what I wrote in an oooold post about the throttling algorithm for sending a message. I think it applies perfectly to the throttling of incoming messages.

1. Define a frequency for your resource and a time period. Example: if your messages are throttled at 100 messages per hour, then define frequency F=100 and time period TP=1 hour.
2. At first use (in our case, first message sent), initialize a time-stamp T=CurrentTime and a counter C=F.
3. Each time an event occurs (here, the user wants to send a message), do the following:
3a. If C=0, deny the event (here, tell user he cannot send messages anymore)
3b. If C>0, decrement C and allow the event.
3c. calculate D =CurrentTime - T. If D > TP then T=CurrentTime and C=F.

In our case, this means we need two fields per user row, T and C. We need two parameters in the code, F and TP (number of messages per time period).


So you need indeed two more columns, one for T and one for C, for each of the throttling you are planning (emission and reception). That's 4 columns. Then add the address expiration date, that's 5.

Hope you keep a good documentation of your DB scheme. :-)

PostPosted: Thu Oct 02, 2003 4:12 pm
by josh
yeah, 4 new columns on the Users table and one new column on the Emails table (which should be called the DisposableAddresses table, but, hey)

Not much doc on the db tables :( -- at least there are only a few of them.

So, how about

Code: Select all
  ALTER TABLE Users ADD RecThrottleTime BIGINT (20);
  ALTER TABLE Users ADD RecThrottleCount INT (10);
  ALTER TABLE Users ADD SendThrottleTime BIGINT (20);
  ALTER TABLE Users ADD SendThrottleCount INT (10);
  ALTER TABLE Emails ADD ExpireTime BIGINT(20);


I don't think we need indexes on these, because I don't see us querying on them...

BIGINT type?

PostPosted: Thu Oct 02, 2003 5:05 pm
by SysKoll
Yeah, that's good. I was wondering if we should store time stamps as numbers of seconds since Epoch, but the MySQL DATETIME format would probably make calculations more complex and more CPU-hungry.

Why the BIGINT length of 20? Isn't it way overkill?

PostPosted: Thu Oct 02, 2003 8:04 pm
by josh
yeah, I suppose 10 would work. time() is 10 digits, and will be for quite awhile (and I guess our remote progeny can always widen the column at that point)

Make these timestamp columns UNISGNED

PostPosted: Thu Oct 02, 2003 8:28 pm
by SysKoll
Unless you have a very good reason, make the timestamp columns UNSIGNED BIGINT. That will give us until 2038 to change it. By that time, I expect we'll be auctionning off the organs of every spammer caught.

PostPosted: Fri Oct 10, 2003 2:37 pm
by josh
ok, so

Code: Select all
  ALTER TABLE Users ADD RecThrottleTime BIGINT UNSIGNED;
  ALTER TABLE Users ADD RecThrottleCount SMALLINT;
  ALTER TABLE Users ADD SendThrottleTime BIGINT UNSIGNED;
  ALTER TABLE Users ADD SendThrottleCount SMALLINT;
  ALTER TABLE Emails ADD ExpireTime BIGINT UNSIGNED;


I think this has got it...

Looks good

PostPosted: Fri Oct 10, 2003 3:06 pm
by SysKoll
Looks good to me. What I'd like to do is start documented the scheme -- the purpose and semantics of each DB field. As Brooks said in the "Mythical Man Month", show me your data structures and your whole program becomes clear.

Please put the most recent code and SQL scripts on CVS so I can take a whack at this.