1 Cistron Password Caching
5 Thanks to Alan DeKok for the initial idea, to Dr. Bob Pilgrim for
6 implementation strategies, and to Miquel for putting it into cistron.
8 What does this caching do?
10 It will add the capability to cache several of the system files that
11 are used often by cistron radius. These include: /etc/passwd, /etc/shadow
12 (if necessary), and /etc/group.
14 By caching the information in these files and storing it in a VERY efficient
15 hash table, authentication lookups are sped up many times over. So, the
16 short answer is that your authentication requests will be faster if you use
17 caching, assuming you're using System authenication now. Read on for the
22 Compile radiusd as you normally would, and then set 'cache=yes' in your
25 Then, if you want to monitor how the caching is doing (errors and such),
26 'grep HASH /var/log/radius.log'.
28 What are the advantages?
30 Tremendous speed increase in authentication request handling. Orders of
33 What are the disadvantages?
35 As with anything else, there are a couple of trade-offs. However, these are
38 First, memory consumption of your radiusd process will go up somewhat over
39 normal operation. Depending on the size of your files, it could take several
40 megabytes of RAM to build the hash table properly. However, I have gone to
41 great lengths to ensure that there are no memory leaks. The size of the
42 daemon process should stay constant, or grow linearly with the size of your
43 system files. If you notice otherwise, send me an email with lots of detail
44 about your system and I'll see what I can find.
46 Second, you should kill -HUP the main radiusd process every so often out of
47 cron. The reason for this is because the daemon will need to rebuild the
48 cache to get passwords that you have changed for existing users. You will
49 not have to -HUP it for new users in /etc/passwd to get authenticated, as if
50 it can't find the user in the hash, it will automatically revert to the old
51 (slow) lookup methods [getpwnam()]. However, if you change the password for
52 a user, and the hash table contains that user but with the old password, it
53 will simply think that the user is supplying the wrong password and deny
54 them access. One thing I'd like to point out is that rebuilding the hash
55 with the -HUP takes less than a second with our ~1MB /etc/passwd and shadow
56 files. The slowest part of the -HUP is caused by the gethostby*() functions.
57 All in all, a -HUP reconfig is virtually instant on our solaris boxes, and
58 takes about 15 seconds on standard linux 2.x machines.
62 In examining our radius servers, I noticed that there were several things
63 slowing down an authentication requests. Please note that some of these
64 things will depend on what you are doing in your raddb/users file, however,
65 most of the things we were doing were quite common, so I feel that others
66 may have similar setups. In any case, I'll go thru exactly what we were
67 seeing, and how caching fixes it.
69 Authentication Bottlenecks:
70 *Group check items calling getgrnam()
71 *DEFAULT users with System authentication calling getpwnam()
72 *Simultaneous use check reading file /var/log/radutmp
74 First, the auth request comes in, and radius (usually) forks a child to
75 handle it. The child will then check the raddb/users file to see if the user
76 is explicitly listed there. In our case, most the users were matched off of
77 a DEFAULT system entry, meaning they would be looked up in the /etc/passwd
78 and /etc/shadow files. However, first, we had to check for a couple of
79 groups that aren't allowed access. In our case we had two DEFAULT group
80 compare check items, meaning that the getgrnam() system call was called
81 twice for every authentication request. Considering that the /etc/group file
82 is not so large, that wasn't a huge factor. However, the /etc/group file is
83 stored on disk, and was accessed through system calls, meaning it had to be
84 opened and searched until a matching group was found (twice), slowing down
85 authentication requestions somewhat.
87 Ok, so we've established that there's a getgrnam() call for each DEFAULT
88 user you have with a group check item. If you don't have any such DEFAULTs,
89 no problem, you don't have those calls. However, nearly 99% of you probably
90 have a line like the following in your raddb/users file, leading into the
91 next bottleneck: DEFAULT Auth-Type = System, Simultaneous-Use = 1
93 What does that mean? It means, for every auth request that comes in for
94 which we don't have an explicit match in our raddb/users file, lookup the
95 user in /etc/passwd and/or /etc/shadow using getpwnam() and getspwnam()
96 system calls. So what's so bad about that? It works right? Well, yes, but
97 it's inefficient because of the way getpwnam() works.
99 Think for a second about what getpwnam() has to do. It is a linear search
100 algorithm through an unsorted list stored on disk. It is an O(N) algorithm,
101 for you computer science folks (barring any caching build into the function
102 by particular OSs). So, basically, what it does is open the /etc/passwd
103 file, start at line one, compare that user to the user we're looking for,
104 and if there's no match, go to the next line and repeat. That means that on
105 average, every auth request you're taking requires opening and reading half
106 of your /etc/passwd file. Worst case is when a user not in the file is
107 looked up, as you search the entire file without a match. Best case is when
108 the user is the first one in the file, and average is in the middle
111 Are you saying, that doesn't sound so bad? Well, then perhaps caching
112 isn't for you. In our case, we had a ~15,000 line passwd file, and anywhere
113 from 20-40,000 logins daily. The getpwnam() call was taking by far the
114 majority of the radius daemon's time, resulting in fairly poor
115 authentication performance.
117 NOTE: radutmp caching is disabled for now until we can find a better way
118 to do it. Still, this next paragraph indicates why it needs doing.
120 Lastly, if you're using the wonderful simultaneous-use check item in cistron
121 radius, you have another disk lookup to perform on your /var/log/radutmp
122 file (actually, caching of this file just came available in b17. glad
123 somebody else noticed it :) What happens is, if the user is authenticated
124 properly, the daemon then opens the radutmp file, reads until it hits a line
125 matching the user currently requesting authentication, and then either
126 allows or denies access based on the presense or absense of a match. Thus,
127 in the majority of cases, you read the entire file in for every
128 authentication request (assuming that most of your users aren't trying to
129 steal service from you :) .
131 So, we've identified the problems. If you really care about how caching
132 addresses those problems, keep reading.
136 The code will first open your /etc/passwd file and runs through it hashing
137 all the usernames into a numeric index into my hash table. It then creates
138 an entry into the hashtable at that index for every user in the file.
139 Currently, it caches username, password, gid, and uid in the entry. Then, if
140 you have shadow passwords, it will open your /etc/shadow file and put the
141 passwords into the entry corresponding with the appropriate user. It is this
142 hash table that allows us to eliminate the getpwnam() bottleneck. Read below
145 Next, it will read the /etc/group file and build a singly linked list of it.
146 It's not as fast as the hash, but the group file is usually small enough
147 that it wouldn't make a difference. Just brining it into RAM should be good
148 enough. This takes care of the getgrnam() bottleneck.
150 NOTE: Again, radutmp caching is disabled. Disregard this next paragraph.
152 Finally, it reads the /var/log/radutmp file and marks every user listed as
153 logged in with a login variable stored in their hash entry. Then, every time
154 a user logs in or out, that variable is updated so that we can avoid reading
157 So what, right? Well, understand that for every user lookup done on a
158 system, you have currently to read half of your passwd file. Thus, the
159 number of comparisons done per getpwnam() is the number of lines in your
160 passwd file divided by two. So do 'wc -l /etc/passwd', divide that by two,
161 and that's the number of comparisons done per user lookup on your system
164 Using the hash table method, I have achieved user lookups in RAM in near
165 constant time. That is, using my passwd files (everyone's is different,
166 obviously), I could find a user in the hash in 1.0673 comparisons (on
167 average). Being that 1.0 is the lowest number of comparisons possible,
168 that's a pretty significant thing. Again, for the computer science people,
169 the hash function and table are wide enough such that 87.4% of users are
170 stored singly in a bucket, 11.7% are stored with two users in a bucket, 0.8%
171 are stored with 3 users in the bucket, and 0.03% are stored with more than 3
172 users per bucket. That means that 87% of your lookups are done in one
173 comparison, versus the number you got when you did the 'wc -l /etc/passwd' /
174 2 above. This discussion has also neglected the fact that RAM searches are
175 many, many times faster than disk searches. We'll let it speak for itself.
177 But, you don't have to take my word for it. I have a test program that I've
178 written that can show you the difference on your own system between the
179 speed in lookups using the hash and getpwnam(). You can find the program
180 cachetest.c at this end of this file. Simply save it to cachetest.c and
183 gcc -o cachetest ./cachetest.c
185 What it does is read in your passwd file, build a hash of it, and then
186 perform the number of lookups you specify either with getpwnam() or using
187 the hash lookup functions.
189 Before performing tests, please note that the larger the passwd file, the
190 larger the disparity between the two test methods. So you'll get a more
191 accurate depiction of what the code will do for your system by running it
192 on a passwd file on your radius server.
194 I'd recommend doing the following tests with it:
196 # Perform 1,000 lookups using getpwnam()
199 # Perform 1,000 lookups using the hash
202 My guess, no real significant difference in the time taken to complete each
205 # Perform 10,000 lookups using getpwnam()
208 # Perform 10,000 lookups using the hash
211 Unless you've got a real fast machine, you should have noticed a little bit
212 of difference between the two that time. Next test
214 # Perform 50,000 lookups using getpwnam()
217 # Perform 50,000 lookups using the hash
220 Unless you have a cray, I'm real confident you noticed a difference that
221 time. If the 50,000 user lookup didn't extraordinarily long on your system,
224 # Perform 100,000 lookups using getpwnam()
225 ./cachetest 100000 -s
227 # Perform 100,000 lookups using the hash
230 No doubt, you saw the difference here. Unless you want to be old and gray
231 when you finish the getpwnam() tests, I'd recommend not going higher than
232 100,000. Still, it's up to you. Go as high as you want on the hash test.
233 1,000,000 hash lookups are done in about a second on my pentium 400 running
236 How will it help my radius server?
238 Well, assuming you can live with the disadvantages listed above, it'll make
239 it faster. If you haven't run the tests in the above section, I'd recommend
244 As far as I can tell, yes. We've run it on both linux and solaris maches for
245 about two weeks now, and had zero problems. If you notice a bug, by all
246 means, let me know and I'll see if I can get a fix out. However, if you
247 notice system degrading problems with it, you can always turn it off in
248 your radiusd.conf file.
250 Ok, that's it. If you made it this far, I'm impressed. Let me know what you
255 --------------------------------------------------------------------------
258 * cachetest.c: Tests performance difference between user lookups in
259 * the hash table vs. getpwnam()
261 * All users in the passwd/shadow files are stored in a hash table.
262 * the hash lookup is VERY fast, generally 1.0673 comparisons per
263 * lookup. For the unitiated, that's blazing. You can't have less
264 * than one comparison, for example.
266 * (c) 1999 Author - Jeff Carneal, Apex Internet Services, Inc.
269 COMPILE: gcc -o cachetest cachetest.c
279 #include<sys/types.h>
282 /* Misc definitions */
284 #define MAXUSERNAME 20
285 #define HASHTABLESIZE 100000
286 #define PASSWDFILE "/etc/passwd"
288 /* Structure definitions */
290 char *pw_name; /* user name */
291 char *pw_passwd; /* user password */
292 uid_t pw_uid; /* user id */
293 gid_t pw_gid; /* group id */
294 int loggedin; /* number of logins */
295 struct mypasswd *next; /* next */
298 /* Function prototypes */
299 int buildHashTable(void);
300 struct mypasswd *findHashUser(const char *user);
301 int storeHashUser(struct mypasswd *new, int index);
302 int hashUserName(const char *s);
303 int doHashTest(int numusers, unsigned long numlookups);
306 /* Make the tables global since so many functions rely on them */
307 static struct mypasswd *hashtable[HASHTABLESIZE];
308 static struct mygroup *grphead = NULL;
310 char allusers[HASHTABLESIZE][MAXUSERNAME];
313 int main(int argc, char** argv) {
315 unsigned long numlookups=0;
321 if(isdigit(argv[1][0])) {
322 numlookups = atoi(argv[1]);
326 if((argv[2] != NULL) && !strncmp(argv[2],"-s",2)) {
330 memset((char *)allusers, 0, MAXUSERNAME*HASHTABLESIZE);
332 numusers = buildHashTable();
334 printf("HASH: Error building hash table! Quitting...\n");
338 doHashTest(numusers, numlookups);
343 /* Builds the hash table up by storing passwd/shadow fields
344 * in memory. Returns -1 on failure, 0 on success.
346 int buildHashTable(void) {
347 FILE *passwd, *shadow;
348 char buffer[BUFSIZE];
350 char username[MAXUSERNAME];
352 int len, hashindex, numread=0;
353 struct mypasswd *new, *cur, *next;
355 memset((char *)username, 0, MAXUSERNAME);
357 /* Init hash array */
358 memset((struct mypasswd *)hashtable, 0, (HASHTABLESIZE*(sizeof(struct mypasswd *))));
360 if( (passwd = fopen(PASSWDFILE, "r")) == NULL) {
361 printf("HASH: Can't open file %s: %s", PASSWDFILE, strerror(errno));
364 while(fgets(buffer, BUFSIZE , passwd) != (char *)NULL) {
368 /* Get usernames from password file */
369 for(ptr = bufptr; *ptr!=':'; ptr++);
371 if((len+1) > MAXUSERNAME) {
372 printf("HASH: Username too long in line: %s", buffer);
374 strncpy(username, buffer, len);
375 username[len] = '\0';
376 if(numread < HASHTABLESIZE) {
377 strncpy(allusers[numread-1], username, MAXUSERNAME);
380 /* Hash the username */
381 hashindex = hashUserName(username);
382 /*printf("%s:%d\n", username, hashindex);*/
384 /* Allocate space for structure to go in hashtable */
385 if((new = (struct mypasswd *)malloc(sizeof(struct mypasswd))) == NULL) {
386 printf("HASH: Out of memory!");
389 memset((struct mypasswd *)new, 0, sizeof(struct mypasswd));
391 /* Put username into new structure */
392 if((new->pw_name = (char *)malloc(strlen(username)+1)) == NULL) {
393 printf("HASH: Out of memory!");
396 strncpy(new->pw_name, username, strlen(username)+1);
398 /* Put passwords into array, if not shadowed */
399 /* Get passwords from password file (shadow comes later) */
405 #if defined(NOSHADOW)
406 /* Put passwords into new structure (*/
408 if((new->pw_passwd = (char *)malloc(len+1)) == NULL) {
409 printf("HASH: Out of memory!");
412 strncpy(new->pw_passwd, bufptr, len);
413 new->pw_passwd[len] = '\0';
414 #endif /* NOSHADOW */
417 * Put uid into structure. Not sure why, but
418 * at least we'll have it later if we need it
425 strncpy(idtmp, bufptr, len);
427 new->pw_uid = (uid_t)atoi(idtmp);
430 * Put gid into structure.
437 strncpy(idtmp, bufptr, len);
439 new->pw_gid = (gid_t)atoi(idtmp);
442 * We're skipping name, home dir, and shell
443 * as I can't think of any use for storing them
446 /*printf("User: %s, UID: %d, GID: %d\n", new->pw_name, new->pw_uid, new->pw_gid);*/
447 /* Store user in the hash */
448 storeHashUser(new, hashindex);
449 } /* End while(fgets(buffer, BUFSIZE , passwd) != (char *)NULL) { */
457 * Looks up user in hashtable. If user can't be found, returns 0.
458 * Otherwise returns a pointer to the structure for the user
460 struct mypasswd *findHashUser(const char *user) {
462 struct mypasswd *cur;
465 /* first hash the username and get the index into the hashtable */
466 index = hashUserName(user);
468 cur = hashtable[index];
470 while((cur != NULL) && (strcmp(cur->pw_name, user))) {
478 return (struct mypasswd *)0;
482 /* Stores the username sent into the hashtable */
483 int storeHashUser(struct mypasswd *new, int index) {
485 /* store new record at beginning of list */
486 new->next = hashtable[index];
487 hashtable[index] = new;
492 /* Hashes the username sent to it and returns index into hashtable */
493 int hashUserName(const char *s) {
494 unsigned long hash = 0;
497 hash = hash * 7907 + (unsigned char)*s++;
500 return (hash % HASHTABLESIZE);
503 int doHashTest(int numusers, unsigned long numlookups) {
504 int i, numlookedup = 0;
506 /* Use pokey getpwnam() syscall to find users */
508 printf("\nUsing getpwnam() lookup for %d user lookups over %d users found\n\n", numlookups, numusers);
509 while(numlookedup < numlookups) {
510 for(i=0; i<numusers; i++) {
511 if(numlookedup >= numlookups)
514 if(allusers[i] != NULL) {
515 getpwnam(allusers[i]);
521 /* Use blazing fast hash method to find user */
523 printf("\nUsing hash lookup for %d user lookups over %d users found\n\n", numlookups, numusers);
524 while(numlookedup < numlookups) {
525 for(i=0; i<numusers; i++) {
526 if(numlookedup >= numlookups)
529 if(allusers[i] != NULL) {
530 findHashUser(allusers[i]);
540 printf("\nUsage: ./cachetest #lookups [-s]\n");
541 printf("\n\tPerform 50,000 lookups using the hash method");
542 printf("\n\tEx: ./cachetest 50000\n");
543 printf("\n\tPerform 50,000 lookups using the getpwnam() syscall");
544 printf("\n\tEx: ./cachetest 50000 -s\n\n");