HTML 4.0 Sourcebook
Figure 10.13 Perl code extract for decoding FORM data passed to the program via standard input. Differences from the extract in Figure 10.11 are shown in italics. Note that this is not a functional piece of code and that the extracted name and value strings must be placed in a permanent storage location (such as an associative array) for subsequent processing.

    $input = <STDIN>;                # read FORM data from stdin
    chop($input); chop($input);      # chop CR/LF trailing characters:
                                     # recall that the data sent by a client
                                     # is always terminated by a single line
                                     # containing only a CRLF pair. This
                                     # must be removed, since it is not
                                     # part of the message body.

                                     # Check for unencoded equals sign -- if
                                     # there are none, the string didn't
    if( $input !~ /=/ ) {            # come from a FORM, which is an error.
        &pk_error("Query String not from FORM\n");
    }
                                     # If we get to here, all is OK. Now
    @fields = split("&", $input);    # split data into separate name=value
                                     # fields (@fields is an array)

    # Now loop over each of the entries in the @fields array and break
    # them into the name and value parts. Then decode each part to get
    # back the strings typed into the form by the user.
    foreach $one (@fields) {
        ($name, $value) = split("=", $one);       # split, at the equals sign, into
                                                  # the name and value strings. Next,
                                                  # decode the strings.
        $name  =~ s/\+/ /g;                       # convert +'s to spaces
        $name  =~ s/%(..)/pack("c",hex($1))/ge;   # convert URL hex codings to Latin-1
        $value =~ s/\+/ /g;                       # convert +'s to spaces
        $value =~ s/%(..)/pack("c",hex($1))/ge;   # convert URL hex codings to Latin-1

        # What you do now depends on how the program works. If you know that each
        # name is unique (your FORM does not have checkbox or SELECT items that
        # allow multiple name=value strings with the same name) then you can place
        # all the data in an associative array (a useful little perl feature!):

Relative Advantages of GET and POST

The GET and POST methods for handling FORM input have different strengths and weaknesses.
POST is clearly superior if you are sending large quantities of data to the server or data encoded in character sets other than ISO Latin-1. If you are sending small quantities of data, and only ISO Latin-1 characters, the choice is less clear. One useful criterion is whether you want the user to be able to store (bookmark) a URL that returns to this particular resource. If so, you must use the GET method, since the relevant data are placed in the query string portion of the URL, which is stored when the URL is recorded. If, on the other hand, you do not want the user to be able to return quickly to this resource, or you want to hide the FORM content as much as possible, you should use POST.

HTML Encoding of Text Within a FORM

With gateway programs, you often need to place data inside the FORM sent to the client. This might be initial field values assigned to the VALUE attributes of INPUT or OPTION elements or placed within the body of a TEXTAREA element, or it might be state information (information describing the state of the interaction between the user and the server-side application) preserved within the VALUE attributes of TYPE=hidden INPUT elements. In doing so, however, you must remember that the text received by the client will be parsed. This means that any entity or character references embedded in the VALUE (or NAME) strings or within the body of a TEXTAREA element will be automatically expanded into the corresponding ISO Latin-1 characters. For example, if a document sent to a client contains the hidden element

    <INPUT TYPE="hidden" NAME="stuff" VALUE="&lt;BOO&quot;&gt;">

the client will parse the VALUE string and convert it into the string <BOO">. When the FORM containing this hidden element is submitted, the string <BOO"> will be URL-encoded and sent to the server, so that the entity references in the original data are lost.
This is sensible if you recall that, as far as the browser is concerned, entity references and character references are no different from the characters they represent. It can be a problem, however, if the data within the hidden element contain HTML markup, since you often need to keep entity references distinct from the characters they represent; for example, so that simple character strings (&lt;tag&gt;) are not converted into markup tags (<tag>) by the conversion process. Thus, if you need to preserve entity references, you must apply the following encodings to the string, in this order, before placing it within a VALUE or NAME attribute or inside a TEXTAREA element:

    1. Convert every ampersand (&) into the entity reference &amp;.
    2. Convert every double quotation mark (") into the entity reference &quot;.
    3. Convert every greater-than symbol (>) into the entity reference &gt;.
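A minimal Perl sketch of this encoding, assuming the three steps are, in order, & to &amp;, " to &quot;, and > to &gt;. The names encode_for_form and expand_entities are hypothetical helpers invented for illustration; expand_entities is only a rough stand-in for the browser's parsing and handles just four entity references.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# encode_for_form: hypothetical helper (not from the book) that prepares
# a string for a VALUE or NAME attribute or a TEXTAREA body.
sub encode_for_form {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;      # step 1: must run first, otherwise it would
                               # re-encode the & added by steps 2 and 3
    $text =~ s/"/&quot;/g;     # step 2: a raw " would end the attribute early
    $text =~ s/>/&gt;/g;       # step 3: a raw > may end the element early
    return $text;
}

# expand_entities: rough stand-in for what the browser does when it parses
# the attribute value. Assumption: only these four references occur.
sub expand_entities {
    my ($text) = @_;
    $text =~ s/&quot;/"/g;
    $text =~ s/&gt;/>/g;
    $text =~ s/&lt;/</g;
    $text =~ s/&amp;/&/g;      # last, so &amp;gt; becomes &gt; and not >
    return $text;
}

my $original = '&eacute; and "&lt;tag&gt;"';
my $encoded  = encode_for_form($original);

# The element as it would be written into the document sent to the client.
print qq(<INPUT TYPE="hidden" NAME="stuff" VALUE="$encoded">\n);

# What the client holds after parsing, and so what it would submit.
print expand_entities($encoded), "\n";
```

The order of the substitutions matters in both directions: ampersands are encoded first and expanded last, so that an entity reference in the original string comes back unchanged.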
The second and third steps are necessary because any raw double quotation mark (") will prematurely terminate a VALUE or NAME string, while some browsers mistakenly treat an unencoded greater-than symbol (>) as the end of an INPUT element. The first step encodes the leading character of each entity or character reference. For example, the original string &eacute; becomes &amp;eacute;. The client browser converts this back into the string &eacute;, which brings you full circle when the data are returned to the server.

State Preservation in CGI Transactions

In a complex gateway application, a complete session may require a series of interactions between the client and server. Since the HTTP protocol is stateless, the server, and any gateway program on the server, retains no knowledge of any previous transaction. Thus you, the gateway program designer, must build in mechanisms for keeping track of what happened in previous stages. There are two strategies for doing this. The traditional way is to use TYPE=hidden INPUT elements within HTML forms to pass state information back and forth between client and server. A second, newer method is to use Netscape cookies to store state information on the client.
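The hidden-element strategy can be sketched in Perl as follows. This is only an illustration under stated assumptions: the names hidden_fields and parse_form_data and the fields step, user, and comment are hypothetical, each field name is assumed to occur only once, and the submitted string is written by hand to simulate what a browser would return.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hidden_fields: hypothetical helper that turns state values into
# TYPE="hidden" INPUT elements for the next form sent to the client.
sub hidden_fields {
    my (%state) = @_;
    my $html = '';
    foreach my $name (sort keys %state) {
        my $value = $state{$name};
        # Encode &, " and > so the value survives parsing by the browser.
        $value =~ s/&/&amp;/g;
        $value =~ s/"/&quot;/g;
        $value =~ s/>/&gt;/g;
        $html .= qq(<INPUT TYPE="hidden" NAME="$name" VALUE="$value">\n);
    }
    return $html;
}

# parse_form_data: hypothetical helper that decodes URL-encoded FORM data
# into a hash, assuming each name is unique.
sub parse_form_data {
    my ($input) = @_;
    my %form;
    foreach my $pair (split /&/, $input) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        foreach ($name, $value) {              # aliases both variables
            s/\+/ /g;                          # convert +'s to spaces
            s/%([0-9A-Fa-f]{2})/pack("C", hex($1))/ge;   # decode hex codings
        }
        $form{$name} = $value;
    }
    return %form;
}

# Stage 1: the program records where the user is in the session.
print hidden_fields(step => 2, user => 'Ann Lee');

# Stage 2: the hidden values come back with the rest of the FORM data.
my %state = parse_form_data('step=2&user=Ann+Lee&comment=100%25+done');
print "Resuming at step $state{step} for $state{user}\n";
```

Because the state travels inside the form itself, the program needs no storage on the server between stages. The cost is that the state is visible to, and can be altered by, the user, so it should be checked when it comes back.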