Module ngx_http_perl_module

小牛编辑
2023-12-01

The ngx_http_perl_module module is used to implement location and variable handlers in Perl and to insert Perl calls into SSI.

This module is not built by default; it should be enabled with the --with-http_perl_module configuration parameter.

This module requires Perl version 5.6.1 or higher. The C compiler should be compatible with that used to build Perl.

Known Bugs

The module is experimental; caveat emptor applies.

In order for Perl to recompile the modified modules during reconfiguration, it needs to be built with the parameters -Dusemultiplicity=yes or -Dusethreads=yes. Also, in order for Perl to leak less memory at run time, it needs to be built with the -Dusemymalloc=no parameter. To check the values of these parameters in an already built Perl (preferred values are specified in the example), run:

$ perl -V:usemultiplicity -V:usemymalloc
usemultiplicity='define';
usemymalloc='n';

Note that after rebuilding Perl with the new -Dusemultiplicity=yes or -Dusethreads=yes parameters, all binary Perl modules will have to be rebuilt as well, because they will stop working with the new Perl.

The main process, and then the worker processes, may grow in size after every reconfiguration. If the main process grows to an unacceptable size, the live upgrade procedure can be applied without changing the executable file.

While a Perl module performs a long-running operation, such as resolving a domain name, connecting to another server, or querying a database, other requests assigned to the same worker process will not be processed. It is thus recommended to perform only operations that have a predictable and short execution time, such as accessing the local file system.

The issues described below only affect versions of nginx before 0.6.22.

Data returned by the $r request object methods only has a text value, and the value itself is stored in memory allocated by nginx from its own pools, not by Perl. This reduces the number of copy operations in most cases, but it can lead to errors in some cases. For example, a worker process trying to use such data in a numeric context will terminate with an error (FreeBSD):
nginx in realloc(): warning: pointer to wrong page
Out of memory!
Callback called exit.
or (Linux):
*** glibc detected *** realloc(): invalid pointer: ... ***
Out of memory!
Callback called exit.
The workaround is simple: the method’s value should first be assigned to a variable. For example, the following code
my $i = $r->variable('counter') + 1;
should be replaced by
my $i = $r->variable('counter');
$i++;

Since most strings inside nginx are stored without a terminating null character, they are similarly returned by the $r request object methods (except for the $r->filename and $r->request_body_file methods). Thus, such values cannot be used as filenames and the like. The workaround is similar to the previous case: the value should either be assigned to a variable (this copies the data, which in turn adds the necessary null character) or used in an expression, for example:
open FILE, '/path/' . $r->variable('name');

Example Configuration

http {

    perl_modules perl/lib;
    perl_require hello.pm;

    perl_set $msie6 '

        sub {
            my $r = shift;
            my $ua = $r->header_in("User-Agent");

            return "" if $ua =~ /Opera/;
            return "1" if $ua =~ / MSIE [6-9]\.\d+/;
            return "";
        }

    ';

    server {
        location / {
            perl hello::handler;
        }
    }
}

The perl/lib/hello.pm module:

package hello;

use nginx;

sub handler {
    my $r = shift;

    $r->send_http_header("text/html");
    return OK if $r->header_only;

    $r->print("hello!\n<br/>");

    if (-f $r->filename or -d _) {
        $r->print($r->uri, " exists!\n");
    }

    return OK;
}

1;
__END__

Directives

syntax: perl module::function|'sub { ... }';
default: none
context: location, limit_except

Installs a Perl handler for the given location.
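
As a minimal sketch of the inline form, the handler body can be written directly in the configuration as an anonymous sub. The /inline location and the response text are illustrative only, and the sketch assumes that the OK constant exported by the nginx Perl package is visible to inline code:

```nginx
location /inline {
    # Anonymous handler; the request object is passed as the first argument.
    perl 'sub {
        my $r = shift;

        $r->send_http_header("text/plain");
        return OK if $r->header_only;

        $r->print("hello from an inline handler\n");
        return OK;
    }';
}
```

For anything longer than a few lines, the module::function form with the code kept in a separate .pm file is likely easier to maintain.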

syntax: perl_modules path;
default: none
context: http

Sets an additional path for Perl modules.

syntax: perl_require module;
default: none
context: http

Defines the name of a module that will be loaded during each reconfiguration. Several perl_require directives can be specified.

syntax: perl_set $variable module::function|'sub { ... }';
default: none
context: http

Installs a Perl handler for the specified variable.
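
A variable installed this way can then be used like any other nginx variable. The following sketch shows one possible use; the $msie6 name and the User-Agent pattern are taken from the example configuration, while the 403 response is purely illustrative:

```nginx
http {
    perl_set $msie6 '
        sub {
            my $r = shift;
            # Fall back to an empty string if the header is absent.
            my $ua = $r->header_in("User-Agent") || "";
            return $ua =~ / MSIE [6-9]\.\d+/ ? "1" : "";
        }
    ';

    server {
        location / {
            # An empty string is false for "if"; "1" is true.
            if ($msie6) {
                return 403;
            }
        }
    }
}
```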

Calling Perl from SSI

An SSI command calling Perl has the following format:

<!--# perl sub="module::function" arg="parameter1" arg="parameter2" ...
-->
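
A sketch of the Perl side of such a call is shown below. It assumes that, as with location handlers, the function receives the request object first and that the arg values follow as ordinary arguments. The function name greet and its output are hypothetical, and the code only runs inside nginx built with this module:

```perl
package hello;

use nginx;

# Called from an SSI page as:
#   <!--# perl sub="hello::greet" arg="world" arg="again" -->
sub greet {
    my ($r, @args) = @_;

    # Printed text becomes part of the SSI output.
    $r->print("hello, ", join(" and ", @args), "!\n");

    return OK;
}

1;
__END__
```

The module would be loaded with perl_require, and SSI processing would need to be enabled (ssi on) for the location that serves the page.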

The $r Request Object Methods

$r->args
returns request arguments.
$r->filename
returns a filename corresponding to the request URI.
$r->has_request_body(handler)
returns 0 if there is no body in a request. If there is a body, the specified handler is installed and 1 is returned. After reading the request body, nginx will call the installed handler. Note that the handler function should be passed by reference. Example:
package hello;

use nginx;

sub handler {
    my $r = shift;

    if ($r->request_method ne "POST") {
        return DECLINED;
    }

    if ($r->has_request_body(\&post)) {
        return OK;
    }

    return HTTP_BAD_REQUEST;
}

sub post {
    my $r = shift;

    $r->send_http_header;

    $r->print("request_body: \"", $r->request_body, "\"<br/>");
    $r->print("request_body_file: \"", $r->request_body_file, "\"<br/>\n");

    return OK;
}

1;

__END__
$r->allow_ranges
enables the use of byte ranges when sending responses.
$r->discard_request_body
instructs nginx to discard a request body.
$r->header_in(field)
returns the value of the specified client request header field.
$r->header_only
determines whether the whole response or only its header should be sent to a client.
$r->header_out(field, value)
sets a value for the specified response header field.
$r->internal_redirect(uri)
does an internal redirect to the specified uri. The actual redirect happens after the Perl handler has completed.
$r->print(text, ...)
passes data to a client.
$r->request_body
returns a client request body if it was not written to a temporary file. To ensure that a client request body is in memory, its size should be limited with client_max_body_size, and a sufficient buffer size should be set with client_body_buffer_size.
$r->request_body_file
returns the name of a file with the client request body. At the end of processing, the file needs to be removed. To always write a request body to a file, client_body_in_file_only needs to be enabled.
$r->request_method
returns the client request HTTP method.
$r->remote_addr
returns the client IP address.
$r->flush
immediately sends data to a client.
$r->sendfile(name[, offset[, length]])
sends the specified file content to a client. Optional parameters specify an initial offset and the length of the data to be transmitted. The actual data transmission happens after the Perl handler has completed. Note that when this method is used in a subrequest and sendfile is enabled, the file content will not be passed through the gzip, SSI, and charset filters.
$r->send_http_header([type])
sends the response header to a client. An optional type parameter sets the value of the “Content-Type” response header field. If the value is an empty string, the “Content-Type” header field will not be passed.
$r->status(code)
sets a response code.
$r->sleep(milliseconds, handler)
sets the specified handler and stops request processing for the specified time. In the meantime, nginx continues to process other requests. After the specified time has elapsed, nginx will call the installed handler. Note that the handler function should be passed by reference. In order to pass data between handlers, $r->variable() should be used. Example:
package hello;

use nginx;

sub handler {
    my $r = shift;

    $r->discard_request_body;
    $r->variable("var", "OK");
    $r->sleep(1000, \&next);

    return OK;
}

sub next {
    my $r = shift;

    $r->send_http_header;
    $r->print($r->variable("var"));

    return OK;
}

1;

__END__
$r->unescape(text)
decodes text encoded in the “%XX” form.
$r->uri
returns a request URI.
$r->variable(name[, value])
returns or sets a value of the specified variable. Variables are local to each request.
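
Several of these methods are typically combined in one handler. The following is a minimal sketch, not a definitive implementation: the handler echoes a query argument, and the argument name "name" and the X-Handled-By header are hypothetical. It only runs inside nginx built with this module:

```perl
package hello;

use nginx;

# Hypothetical handler: echoes the "name" query argument as plain text.
sub handler {
    my $r = shift;

    # $r->args returns the raw query string, for example "name=a%20b&x=1".
    my $args = $r->args;
    $args = "" unless defined $args;

    my $name;
    for my $pair (split /&/, $args) {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && $key eq "name";
        # Decode %XX sequences in the value.
        $name = $r->unescape(defined $value ? $value : "");
    }

    # Reject requests without a usable "name" argument.
    unless (defined $name && length $name) {
        return HTTP_BAD_REQUEST;
    }

    # Response header fields are set before the header is sent.
    $r->header_out("X-Handled-By", "perl");
    $r->send_http_header("text/plain");
    return OK if $r->header_only;

    $r->print("name: ", $name, "\n");

    return OK;
}

1;
__END__
```

The query string is split by hand here because the methods listed above return it as a single string.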