mod_perl_tuning(3)                   User Contributed Perl Documentation                  mod_perl_tuning(3)



NAME
       mod_perl_tuning - mod_perl performance tuning

DESCRIPTION
       Described here are examples and hints on how to configure a mod_perl enabled Apache server,
       concentrating on tips for configuration for high-speed performance.  The primary way to achieve
       maximal performance is to reduce the resources consumed by the mod_perl enabled HTTPD processes.

       This document assumes familiarity with Apache configuration directives, some familiarity with the
       mod_perl configuration directives, and that you have already built and installed a mod_perl enabled
       Apache server.  Please also read the mod_perl documentation that comes with mod_perl for programming
       tips.  Some configurations below use features from mod_perl version 1.03 which were not present in
       earlier versions.

       These performance tuning hints are collected from my experiences in setting up and running servers
       for handling large promotional sites, such as The Weather Channel's "Blimp Site-ings" game, the MSIE
       4.0 "Subscribe to Win" game, and the MSN Million Dollar Madness game.

BASIC CONFIGURATION
       The basic configuration for mod_perl is as follows.  In the httpd.conf file, I add configuration
       parameters that make the "http://www.domain.com/programs" URL the base location for all mod_perl
       programs.  Thus, access to "http://www.domain.com/programs/printenv" will run the printenv script, as
       we'll see below.  Also, any *.perl file will be interpreted as a mod_perl program just as if it were
       in the programs directory, and *.rperl will be mod_perl, but without any HTTP headers automatically
       sent; you must do this explicitly.  If you don't want these last two mappings, just leave them out
       of your configuration.

       In the configuration files, I use /var/www as the "ServerRoot" directory, and /var/www/docs as the
       "DocumentRoot".  You will need to change it to match your particular setup.  The network address
       below in the access to perl-status should also be changed to match yours.

       Additions to httpd.conf:

        # put mod_perl programs here
        # startup.perl loads all functions that we want to use within mod_perl
        PerlRequire /var/www/perllib/startup.perl
        <Directory /var/www/docs/programs>
          AllowOverride None
          Options ExecCGI
          SetHandler perl-script
          PerlHandler Apache::Registry
          PerlSendHeader On
        </Directory>

        # like above, but with PerlSendHeader Off
        <Directory /var/www/docs/rprograms>
          AllowOverride None
          Options ExecCGI
          SetHandler perl-script
          PerlHandler Apache::Registry
          PerlSendHeader Off
        </Directory>

        # allow arbitrary *.perl files to be scattered throughout the site.
        <Files *.perl>
          SetHandler perl-script
          PerlHandler Apache::Registry
          PerlSendHeader On
          Options +ExecCGI
        </Files>

        # like *.perl, but do not send HTTP headers
        <Files *.rperl>
          SetHandler perl-script
          PerlHandler Apache::Registry
          PerlSendHeader Off
          Options +ExecCGI
        </Files>

        <Location /perl-status>
          SetHandler perl-script
          PerlHandler Apache::Status
          order deny,allow
          deny from all
          allow from 204.117.82.
        </Location>

       Now, you'll notice that I use a "PerlRequire" directive to load in the file startup.perl.  In that
       file, I include all of the "use" statements that occur in any of my mod_perl programs (either from
       the programs directory, or the *.perl files).  Here is an example:

        #! /usr/local/bin/perl
        use strict;

        # load up necessary perl function modules to be able to call from Perl-SSI
        # files.  These objects are reloaded upon server restart (SIGHUP or SIGUSR1)
        # if PerlFreshRestart is "On" in httpd.conf (as of mod_perl 1.03).

        # only library-type routines should go in this directory.

        use lib "/var/www/perllib";

        # make sure we are in a sane environment.
        $ENV{GATEWAY_INTERFACE} =~ /^CGI-Perl/ or die "GATEWAY_INTERFACE not Perl!";

        use Apache::Registry ();       # for things in the "/programs" URL

        # pull in things we will use in most requests so it is read and compiled
        # exactly once
        use CGI (); CGI->compile(':all');
        use CGI::Carp ();
        use DBI ();
        use DBD::mysql ();

        1;

       What this does is pull in all of the code used by the programs (but does not "import" any of the
       module methods) into the main HTTPD process, which then creates the child processes with the code
       already in place.  You can also put any new modules you like into the /var/www/perllib directory and
       simply "use" them in your programs.  There is no need to put "use lib "/var/www/perllib";" in all of
       your programs.  You do, however, still need to "use" the modules in your programs.  Perl is smart
       enough to know it doesn't need to recompile the code, but it does need to "import" the module methods
       into your program's name space.

       If you only have a few modules to load, you can use the PerlModule directive to pre-load them with
       the same effect.
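
       For example, the modules from the startup.perl file above could be pre-loaded with directives like
       these in httpd.conf (though extra steps such as "CGI->compile(':all')" cannot be expressed this
       way):

        PerlModule CGI
        PerlModule CGI::Carp
        PerlModule DBI
        PerlModule DBD::mysql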

       The biggest benefit here is that the child process never needs to recompile the code, so it is faster
       to start, and the child process actually shares the same physical copy of the code in memory due to
       the way the virtual memory system in modern operating systems works.

       You will want to replace the "use" lines above with modules you actually need.

       Simple Test Program

       Here's a sample script called printenv that you can stick in the programs directory to test the
       functionality of the configuration.

        #! /usr/local/bin/perl
        use strict;
        # print the environment in a mod_perl program under Apache::Registry

        print "Content-type: text/html\n\n";

        print "<HEAD><TITLE>Apache::Registry Environment</TITLE></HEAD>\n";

        print "<BODY><PRE>\n";
        print map { "$_ = $ENV{$_}\n" } sort keys %ENV;
        print "</PRE></BODY>\n";

       When you run this, check the value of the GATEWAY_INTERFACE variable to see that you are indeed
       running mod_perl.

REDUCING MEMORY USE
       As a side effect of using mod_perl, your HTTPD processes will be larger than without it.  There is
       just no way around it, as you have this extra code to support your added functionality.

       On a very busy site, the number of HTTPD processes can grow to be quite large.  For example, on one
       large site, the typical HTTPD process was about 5MB in size.  With 30 of these, all of RAM was
       exhausted, and we started to go to swap.  With 60 of these, swapping turned into thrashing, and the
       whole machine slowed to a crawl.

       To reduce thrashing, it is necessary to limit the maximum number of HTTPD processes to a number
       just larger than what will fit into RAM (in this case, 45).  The drawback is that when the server is
       serving 45 requests, new requests will queue up and wait; however, if you let the maximum number of
       processes grow, the new requests will start to get served right away, but they will take much longer
       to complete.
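
       With Apache 1.x, this cap is set with the "MaxClients" directive in httpd.conf.  A minimal sketch
       for the scenario above (the right number depends on your process size and available RAM):

        # never run more than 45 HTTPD processes at once
        MaxClients 45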

       One way to reduce the amount of real memory taken up by each process is to pre-load commonly used
       modules into the primary HTTPD process so that the code is shared by all processes.  This is
       accomplished by inserting the "use Foo ();" lines into the startup.perl file for any "use Foo;"
       statement in any commonly used Registry program.  The idea is that the operating system's VM
       subsystem will share the data across the processes.

       You can also pre-load Apache::Registry programs using the "Apache::RegistryLoader" module so that the
       code for these programs is shared by all HTTPD processes as well.
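
       A minimal sketch of this in the startup.perl file, pre-compiling the printenv script from the
       example above (add one handler() call, URI first and then filename, per program to pre-load):

        use Apache::RegistryLoader ();
        my $rl = Apache::RegistryLoader->new;
        # compile the script in the parent process so the children share the code
        $rl->handler("/programs/printenv", "/var/www/docs/programs/printenv");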

       NOTE: When you pre-load modules in the startup script, you may need to kill and restart HTTPD for
       changes to take effect.  A simple "kill -HUP" or "kill -USR1" will not reload that code unless you
       have set the "PerlFreshRestart" configuration parameter in httpd.conf to be "On".

REDUCING THE NUMBER OF LARGE PROCESSES
       Unfortunately, simply reducing the size of each HTTPD process is not enough on a very busy site.  You
       also need to reduce the quantity of these processes.  This reduces memory consumption even more, and
       results in fewer processes fighting for the attention of the CPU.  If you can reduce the quantity of
       processes to fit into RAM, your response time improves even more.

       The idea of the techniques outlined below is to offload the normal document delivery (such as static
       HTML and GIF files) from the mod_perl HTTPD, and let it only handle the mod_perl requests.  This way,
       your large mod_perl HTTPD processes are not tied up delivering simple content when a smaller process
       could perform the same job more efficiently.

       In the techniques below where there are two HTTPD configurations, the same httpd executable can be
       used for both configurations; there is no need to build HTTPD both with and without mod_perl compiled
       into it.  With Apache 1.3 this can be done with the DSO configuration -- just configure one httpd
       invocation to dynamically load mod_perl and the other not to do so.
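
       For example, the mod_perl enabled configuration might load the DSO with lines like these (the
       library path varies by installation), while the other configuration simply omits them:

        LoadModule perl_module libexec/libperl.so
        AddModule mod_perl.c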

       These approaches work best when most of the requests are for static content rather than mod_perl
       programs.  Log file analysis becomes a bit of a challenge when you have multiple servers running on
       the same host, since you must log to different files.

       TWO MACHINES

       The simplest way is to put all static content on one machine, and all mod_perl programs on another.
       The only trick is to make sure all links are properly coded to refer to the proper host.  The static
       content will be served up by lots of small HTTPD processes (configured not to use mod_perl), and the
       relatively few mod_perl requests can be handled by the smaller number of large HTTPD processes on the
       other machine.

       The drawback is that you must maintain two machines, and this can get expensive.  For extremely large
       projects, this is the best way to go.

       TWO IP ADDRESSES

       Similar to above, but one HTTPD runs bound to one IP address, while the other runs bound to another
       IP address.  The only difference is that one machine runs both servers.  Total memory usage is
       reduced because the majority of files are served by the smaller HTTPD processes, so there are fewer
       large mod_perl HTTPD processes sitting around.

       This is accomplished using the httpd.conf directive "BindAddress" to make each HTTPD respond only to
       one IP address on this host.  One will have mod_perl enabled, and the other will not.
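
       A sketch of the relevant lines, using hypothetical addresses (substitute your own):

        # httpd.conf for the static-content server (no mod_perl)
        BindAddress 192.168.1.1

        # httpd+perl.conf for the mod_perl server
        BindAddress 192.168.1.2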

       TWO PORT NUMBERS

       If you cannot get two IP addresses, you can also split the HTTPD processes as above by putting one on
       the standard port 80, and the other on some other port, such as 8042.  The only configuration changes
       will be the "Port" and log file directives in the httpd.conf file (and also one of them does not have
       any mod_perl directives).
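
       For example, the mod_perl server's configuration might differ only in lines like these (the
       alternate port and log file names here are arbitrary):

        Port 8042
        ErrorLog logs/error_log.8042
        TransferLog logs/access_log.8042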

       The major flaw with this scheme is that some firewalls will not allow access to the server running on
       the alternate port, so some people will not be able to access all of your pages.

       If you use this approach or the one above with dual IP addresses, you probably do not want to have
       the *.perl and *.rperl sections from the sample configuration above, as this would require that your
       primary HTTPD server be mod_perl enabled as well.

       Thanks to Gerd Knops for this idea.

       USING ProxyPass WITH TWO SERVERS

       To overcome the limitation of the alternate port above, you can use dual Apache HTTPD servers with
       just a slight difference in configuration.  Essentially, you set up two servers just as you would
       with the two-ports-on-one-IP-address method above.  However, in your primary HTTPD configuration you
       add a line like this:

        ProxyPass /programs http://localhost:8042/programs

       Here, your mod_perl enabled HTTPD is running on port 8042 and has only the directory programs
       within its DocumentRoot.  This assumes that you have included the mod_proxy module in your server
       when it was built.
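
       A minimal sketch of what the second server's configuration might contain for this setup (the paths
       follow the directory layout described below):

        Port 8042
        PerlRequire /var/www/perllib/startup.perl
        DocumentRoot /var/www
        <Directory /var/www/programs>
          AllowOverride None
          Options ExecCGI
          SetHandler perl-script
          PerlHandler Apache::Registry
          PerlSendHeader On
        </Directory>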

       Now, when you access http://www.domain.com/programs/printenv it will internally be passed through to
       your HTTPD running on port 8042 as the URL http://localhost:8042/programs/printenv and the result
       relayed back transparently.  To the client, it all seems as if it is just one server running.  This
       can also be used on the dual-host version to hide the second server from view if desired.

       A complete configuration example of this technique uses two HTTPD configuration files: httpd.conf
       for the main server serving all regular pages, and httpd+perl.conf for the mod_perl programs
       accessed in the /programs URL.  The directory structure assumes that /var/www/documents is the
       "DocumentRoot" directory, and the mod_perl programs are in /var/www/programs and /var/www/rprograms.
       I start them as follows:

        daemon httpd
        daemon httpd -f conf/httpd+perl.conf

       Thanks to Bowen Dwelle for this idea.

       SQUID ACCELERATOR

       Another approach to reducing the number of large HTTPD processes on one machine is to use an
       accelerator such as Squid (which can be found at http://squid.nlanr.net/Squid/ on the web) between
       the clients and your large mod_perl HTTPD processes.  The idea here is that Squid will handle the
       static objects from its cache while the HTTPD processes will handle mostly just the mod_perl requests
       once the cache is primed.  This reduces the number of HTTPD processes and thus reduces the amount of
       memory used.

       To set this up, just install the current version of Squid (at this writing, this is version 1.1.22)
       and use the RunAccel script to start it.  You will need to reconfigure your HTTPD to use an alternate
       port, such as 8042, rather than its default port 80.  To do this, you can either change the
       httpd.conf line "Port" or add a "Listen" directive to match the port specified in the squid.conf
       file.  Your URLs do not need to change.  The benefit of using the "Listen" directive is that
       redirected URLs will still use the default port 80 rather than your alternate port, which might
       reveal your real server location to the outside world and bypass the accelerator.
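
       For example, to accept connections on the alternate port while keeping self-referencing and
       redirected URLs on port 80:

        # self-referencing URLs and redirects still use port 80 (the accelerator),
        # but the server actually accepts connections only on port 8042
        Port 80
        Listen 8042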

       In the squid.conf file, you will probably want to add "programs" and "perl" to the "cache_stoplist"
       parameter so that these are always passed through to the HTTPD server under the assumption that they
       always produce different results.
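
       In the squid.conf file, that would be a line like this ("cgi-bin" and "?" are the usual defaults;
       check your Squid version's configuration syntax):

        cache_stoplist cgi-bin ? programs perl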

       This is very similar to the two-port ProxyPass version above, but the Squid cache may offer more
       flexibility for fine-tuning the handling of dynamic documents that do not change on every view.  The
       Squid proxy server also seems to be more stable and robust than the Apache 1.2.4 proxy module.

       One drawback to using this accelerator is that the logfiles will always report access from IP address
       127.0.0.1, which is the local host loopback address.  Also, any access permissions or other user
       tracking that requires the remote IP address will always see the local address.  The following code
       uses a feature of recent mod_perl versions (tested with mod_perl 1.16 and Apache 1.3.3) to trick
       Apache into logging the real client address and giving that information to mod_perl programs for
       their purposes.

       First, in your startup.perl file add the following code:

        use Apache::Constants qw(OK);

        sub My::SquidRemoteAddr ($) {
          my $r = shift;

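          # take the right-most (last) address in X-Forwarded-For: the host
          # that connected to our accelerator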
          if (my ($ip) = $r->header_in('X-Forwarded-For') =~ /([^,\s]+)$/) {
            $r->connection->remote_ip($ip);
          }

          return OK;
        }

       Next, add this to your httpd.conf file:

        PerlPostReadRequestHandler My::SquidRemoteAddr

       This will cause every request to have its "remote_ip" address overridden by the value set in the
       "X-Forwarded-For" header added by Squid.  Note that if you have multiple proxies between the client
       and the server, you want the IP address of the last machine before your accelerator.  This will be
       the right-most address in the X-Forwarded-For header (assuming the other proxies append their
       addresses to this same header, like Squid does).

       If you use Apache with mod_proxy at your frontend, you can use Ask Bjorn Hansen's
       mod_proxy_add_forward module from ftp://ftp.netcetera.dk/pub/apache/ to make it insert the
       "X-Forwarded-For" header.

SUMMARY
       To gain maximal performance of mod_perl on a busy site, one must reduce the amount of resources used
       by the HTTPD to fit within what the machine has available.  The best way to do this is to reduce
       memory usage.  If your mod_perl requests are fewer than your static page requests, then splitting the
       servers into mod_perl and non-mod_perl versions further allows you to tune the amount of resources
       used by each type of request.  Using the "ProxyPass" directive allows these multiple servers to
       appear as one to the users.  Using the Squid accelerator also achieves this effect, but Squid takes
       care of deciding when to access the large server automatically.

       If all of your requests require processing by mod_perl, then the only thing you can really do is
       put a lot of memory in your machine, tweak the perl code to be as small and lean as possible, and
       share the virtual memory pages by pre-loading the code.

AUTHOR
       This document was written by Vivek Khera.  If you need to contact me, just send email to the mod_perl
       mailing list.

       This document is copyright (c) 1997-1998 by Vivek Khera.

       If you have contributions for this document, please post them to the mailing list.  Perl POD format
       is best, but plain text will do, too.

       If you need assistance, contact the mod_perl mailing list at modperl@perl.apache.org first (send
       'subscribe' to modperl-request@apache.org to subscribe). There are lots of people there who can
       help. Also, check the web pages http://perl.apache.org/ and http://www.apache.org/ for explanations
       of the configuration options.

       $Revision: 177689 $ $Date: 2002-03-24 18:57:59 -0800 (Sun, 24 Mar 2002) $



perl v5.8.8                                      2007-07-17                               mod_perl_tuning(3)
