


Chapter 13


Windows CGI Scripting


by Wayne Berry

Common Gateway Interface (CGI) was developed so that a Web browser could pass parameters to a Web server, regardless of the platform on which either machine is running. For instance, if the Web browser is running on a Macintosh and the server is running on a UNIX machine, the browser can pass parameters in a form that both machines understand. CGI script is the common name for a program that runs on the server side. The name script comes from the fact that, in the beginning, most server-side implementations were Perl scripts running on a UNIX box. CGI scripts do not have to be Perl scripts; they can be any program, scripted or compiled, that the Web server is allowed to execute.

CGI Scripting—the Client Side


The parameters to a CGI script usually originate in a <FORM></FORM> tag within the page that the browser is displaying. Encapsulated within the <FORM> tag are <INPUT> tags that create 'name = value' parameter pairs; other HTML tags, such as <SELECT>, create pairs as well. When the form is submitted to the server, the browser encodes the parameters according to the CGI standard. It puts all the parameters into a single string without spaces, replacing each space with a + symbol and converting other special symbols to hexadecimal values preceded by a % symbol. The first half of the string is the URL of the CGI script on the server. The second half of the string is the 'name = value' pairs. The two halves are separated by a ? symbol, and the 'name = value' pairs are separated from one another by a & symbol.

Table 13.1 shows examples of HTML input and the CGI encoding that the browser produces for each.

Table 13.1. HTML to CGI.

HTML Input                                  CGI
<INPUT NAME="Item" VALUE="1">               Item=1
<INPUT NAME="FirstName" VALUE="John Doe">   FirstName=John+Doe
<INPUT NAME="Company" VALUE="Al's RVs">     Company=Al%27s+RVs
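The encoding in Table 13.1 can be sketched in a few lines of C++. This is only an illustration of the rules described above, not the code any particular browser uses; the function name UrlEncode is hypothetical, and leaving '.', '-', '_', and '*' unescaped is an assumption about common browser behavior.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: encode one form value for a CGI string.
// Spaces become '+', letters and digits pass through unchanged,
// and every other character becomes %XX (two hexadecimal digits).
// Assumption: '.', '-', '_' and '*' are also left unescaped.
std::string UrlEncode(const std::string& value)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (std::string::size_type i = 0; i < value.size(); ++i)
    {
        unsigned char c = static_cast<unsigned char>(value[i]);
        if (c == ' ')
            out += '+';
        else if (std::isalnum(c) || c == '.' || c == '-' || c == '_' || c == '*')
            out += static_cast<char>(c);
        else
        {
            out += '%';
            out += hex[c >> 4];     // high four bits
            out += hex[c & 0x0F];   // low four bits
        }
    }
    return out;
}
```

Calling UrlEncode("Al's RVs") should produce the Company value shown in the last row of the table.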

Simple Form Examples


Here are several examples of HTML text that can be used to pass parameters to a CGI Script.

If the browser is reading a single input form page that contains these tags:


<FORM ACTION="http://www.myserver.com/scripts/mycgi.exe">

<INPUT TYPE=HIDDEN NAME="Item" VALUE="Oranges">

<INPUT TYPE=SUBMIT>

</FORM>

the server would receive a request of


http://www.myserver.com/scripts/mycgi.exe?Item=Oranges

If the browser is reading a multiple input form page that contains these tags


<FORM ACTION="http://www.myserver.com/scripts/mycgi.exe">

<INPUT TYPE=HIDDEN NAME="Item" VALUE="Oranges">

<INPUT TYPE=HIDDEN NAME="Price" VALUE="2.00">

<INPUT TYPE=SUBMIT>

</FORM>

the server would receive a request of


http://www.myserver.com/scripts/mycgi.exe?Item=Oranges&Price=2.00

Hidden INPUT types like those in the preceding are not shown to the user by the browser. More interesting INPUT types and other tags allow user interaction. If the browser is reading a page that contains these tags (where the INPUT is associated with a text box, and the SELECT is associated with a drop-down list containing two choices: 1.00 or 2.00), you would have code something like this:


<FORM ACTION="http://www.myserver.com/scripts/mycgi.exe">

<INPUT TYPE=TEXT NAME="Item">

<SELECT NAME="Price">

<OPTION>1.00

<OPTION>2.00

</SELECT>

<INPUT TYPE=SUBMIT>

</FORM>

When the user clicks the Submit button, the parameters are sent to mycgi.exe just as they are in the multiple input form's request.
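The way the browser assembles the request in these examples can be sketched as follows. This is a simplified illustration: the names NameValue and BuildRequestUrl are hypothetical, and the sketch assumes that the names and values have already been CGI-encoded.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: one already-encoded 'name = value' pair.
typedef std::pair<std::string, std::string> NameValue;

// Join the form's ACTION URL and its pairs: a '?' separates the URL
// from the first pair, and a '&' separates each pair from the next.
std::string BuildRequestUrl(const std::string& action,
                            const std::vector<NameValue>& pairs)
{
    std::string url = action;
    for (std::vector<NameValue>::size_type i = 0; i < pairs.size(); ++i)
    {
        url += (i == 0) ? '?' : '&';
        url += pairs[i].first;
        url += '=';
        url += pairs[i].second;
    }
    return url;
}
```

Passing the mycgi.exe ACTION URL with the pairs Item=Oranges and Price=2.00 should reproduce the request shown for the multiple input form.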

CGI Scripting—the Server Side


On receiving a request from a browser, the Web server interprets the request and decides whether it is a request for a CGI script or for a static page, such as an HTML text file. If the request is for a CGI script, the server passes the parameters that are referenced in the URL to the CGI script. The CGI script decodes the parameters and uses them to create output that is sent back to the server. The server takes the output and returns it to the browser.

What Makes Up an HTML Page?


As a static text file, HTML pages start with the <HTML> tag and end with the </HTML> tag. When the server reads a static page from the disk, it outputs not only the text file to the browser, but also adds a header at the beginning. The header contains information for the browser that describes the server state and the content of the information following the header. The header contains a status line to indicate that the server transaction completed successfully, and it also contains a line to indicate the format of the information following the header.

When your CGI script executes, it must generate its own header information to be sent back to the server. The capability of sending back the status line allows the script to notify the server whether or not it has run successfully. For now, assume that the CGI script is returning an HTML page to the browser.

The output of a successful execution, header first, might look like this:


Status: 200
Content-type: text/html

<HTML>
<BODY>
Hello World
</BODY>
</HTML>

Notice the blank line between the header and the first <HTML> tag. This blank line is required syntax; without it, the server and the browser cannot tell where the header ends, and the output of your CGI script won't be handled properly. The server returns the output of your CGI script to the client's browser. When the page is viewed as source in the browser, the header isn't displayed. 200 is the standard success code. Table 13.2 contains other codes.

Table 13.2. Header Return Codes and their meanings

Return Code   Meaning
200           Request completed
201           Object created, reason = new URI
202           Asynchronous completion (TBS)
203           Partial completion
204           No information to return
300           Server couldn't decide what to return
301           Object permanently moved
302           Object temporarily moved
303           Redirection with new access method
304           If-modified-since was not modified
400           Invalid syntax
401           Access denied
402           Payment required
403           Request forbidden
404           Object not found
405           Method is not allowed
406           No response acceptable to client found
407           Proxy authentication required
408           Server timed out waiting for request
409           User should resubmit with more information
410           The resource is no longer available
411           Couldn't authorize client
500           Internal server error
501           Required not supported
502           Error response received from gateway
503           Temporarily overloaded
504           Timed out waiting for gateway

Notice that codes in the 200s are used when the action was successfully received, understood, and accepted. Codes in the 300s are used when further action must be taken in order to complete the request. Codes in the 400s are used when the request contains bad syntax or cannot be fulfilled. Codes in the 500s are used when the server failed to fulfill an apparently valid request.
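The grouping by hundreds lends itself to a small helper. The following is only a sketch; the function name StatusClass and the wording of the returned labels are hypothetical and simply paraphrase the paragraph above.

```cpp
#include <string>

// Hypothetical helper: map a return code to the class described above.
// Integer division by 100 keeps only the leading digit of the code.
std::string StatusClass(int code)
{
    switch (code / 100)
    {
    case 2:  return "success";                  // received, understood, accepted
    case 3:  return "further action required";  // for example, a redirection
    case 4:  return "bad request";              // bad syntax or cannot be fulfilled
    case 5:  return "server error";             // server failed on a valid request
    default: return "unknown";
    }
}
```

A script could use such a helper when logging, for instance to record that a 404 belongs to the class of requests that cannot be fulfilled.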

When your CGI script runs into an error, either processing information or accessing resources like SQL Server, it should return both an error status code and some HTML text describing the problem. For instance:


Status: 500
Content-type: text/html

<HTML>
<BODY>
Server Error, please try again later
</BODY>
</HTML>
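Because the success and error responses differ only in the status code and the body text, a script can build both with one routine. The sketch below is an illustration, not part of any server API; the name BuildCgiResponse is hypothetical, and ending the header lines with \r\n follows the convention of the C++ listings later in this chapter.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: build the complete output of a CGI script.
// The header consists of the status line and the content type, and a
// blank line (an empty "\r\n") separates the header from the HTML body.
std::string BuildCgiResponse(int status, const std::string& htmlBody)
{
    std::ostringstream out;
    out << "Status: " << status << "\r\n";
    out << "Content-type: text/html\r\n";
    out << "\r\n";
    out << "<HTML>\n<BODY>\n" << htmlBody << "\n</BODY>\n</HTML>\n";
    return out.str();
}
```

A script would write the returned string to standard output, passing 200 on success or 500 together with a short description of the problem on failure.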

The Development Environment


Before you get started making your own scripts and viewing them, you need to have a development environment. I prefer to have a single machine running NT 3.51 that has the browser, the Web server, and my compiler on it. Debugging takes place on the Web server because that is where the CGI scripts run. With Microsoft Developer Studio, you can debug the CGI scripts you write, so it makes sense to have both the compiler and the Web server on the same machine. Because it is the browser that activates your scripts on the Web server, having the browser on that machine too saves swiveling between two machines. Finally, you need a Web server with low activity. A poorly written script can crash the Web server, causing downtime for other users.

A Word About Security


Writing CGI scripts for Web servers is like allowing everyone to run a program on your machine. The first step to good security is to make sure that users cannot read your script. This isn't a problem if you use a compiled program written in a language such as C++, but it is a concern when you're writing in batch or another type of runtime script. Make sure that the directory that contains your script has EXECUTE permission, but not READ. This allows users to execute the CGI scripts but not read them. A good example is the default IIS (Microsoft's Internet Information Server) scripts directory.

Using Batch Files


The default installation of IIS can execute batch files as CGI scripts. Batch files, considered the scripting language of DOS, are not as powerful as Perl scripts in UNIX. Because batch files lack string handling functions, they are limited in use as CGI scripts. Their only advantage is that they are a runtime language and make good sample programs.

Hello World!


Let's create a "Hello World!" CGI script. Create a batch file named Lst13_1.bat in the scripts directory of your Web server, as shown in Listing 13.1.

Listing 13.1. The Hello World Example


@echo off

REM Header

echo Status: 200

echo Content-type: text/html

echo.

REM Body

echo "<HTML><BODY>Hello World!</BODY></HTML>"

To run the script, type the URL http://MYMACHINE/scripts/Lst13_1.bat into your browser, where MYMACHINE is the name of your computer.

The first thing to notice is that the HTML line is in double quotes. This is because the command interpreter treats < and > as redirection symbols that send the input or output of the echo statement to a file or device. Inside double quotes, the < and > characters are output literally instead of being treated as redirection. The quotes themselves are also output, which is one reason that batch files make poor CGI scripts.

How the CGI Script Accesses the Parameters


The server puts the CGI parameters into the environment variable QUERY_STRING. QUERY_STRING contains the CGI string in its CGI-coded form. It is the responsibility of the CGI script to get the information it needs from QUERY_STRING. Unfortunately, batch files do not have functions for string manipulation. This means that you will be able to view the CGI string but not separate the 'name = value' pairs or translate the CGI hexadecimal values. This problem is another good reason not to use batch files for CGI scripts.

Listing 13.2 allows the user to view the CGI parameters passed to the batch file.

Listing 13.2. A batch file that returns the Query string.


@echo off

REM Header

echo Status: 200

echo Content-type: text/html

echo.

REM Body

echo "<HTML><BODY>Query String: %QUERY_STRING%</BODY></HTML>"

To run the script, type the URL http://MYMACHINE/scripts/Lst13_2.bat into your browser, where MYMACHINE is the name of your computer. Notice that QUERY_STRING doesn't contain a value; the browser displays "Query String:" as the text on the HTML page. Now change the URL to http://MYMACHINE/scripts/Lst13_2.bat?Name=John. The browser now displays "Query String: Name=John" as the text on the HTML page.

Other Parameters Passed to the Server


Besides the parameters passed to the server as part of the URL, the server also puts other information about the browser and the server state in environment variables (see Table 13.3).

Table 13.3. Server parameters.

HTTP variable         Meaning
HTTP_USER_AGENT       The maker and version of the browser. Best used for making adjustments in the layout for different browsers.
HTTP_REFERER          The URL of the page from which the client made the request.
HTTP_CONTENT_TYPE     The content type of the input passed to the page.
HTTP_CONTENT_LENGTH   The length of the content passed to the page.
CONTENT_LENGTH        The length of the content passed to the page.
CONTENT_TYPE          The content type of the input passed to the page.
GATEWAY_INTERFACE     The version of CGI handled by the server.
HTTP_ACCEPT           The content types the browser will accept.
PATH_INFO             The pathname of the URL without the server name.
PATH_TRANSLATED       The full path to the CGI script. This variable is great for getting a location where you can write additional files.
REMOTE_ADDR           The IP address of the client.
REMOTE_HOST           The hostname of the client.
REMOTE_USER           The username supplied by the client and authenticated by the server.
SCRIPT_NAME           The name of the script being executed.
SERVER_NAME           The name of the server. This is good for referencing other pages on this server.
SERVER_PROTOCOL       The version of the HTTP protocol the server is using.
SERVER_SOFTWARE       The Web server name and version.
REQUEST_METHOD        The type of request used, either GET or POST.
QUERY_STRING          The CGI string passed with a GET request.
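A compiled script can read any of these variables through the standard C library. The sketch below is a portable illustration rather than the Win32 approach used in the C++ listings later in this chapter; the helper name EnvOr and the fallback text are hypothetical.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: return the value of an environment variable,
// or the supplied fallback when the server did not set the variable.
// std::getenv returns a null pointer for a variable that does not exist.
std::string EnvOr(const char* name, const std::string& fallback)
{
    const char* value = std::getenv(name);
    return value ? std::string(value) : fallback;
}
```

For example, EnvOr("REMOTE_ADDR", "(not set)") yields the client's IP address when the script is run by the server, and the fallback text when the script is run from a command line.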

Listing 13.3 is an example of a batch file that displays all the major environment variables.

Listing 13.3. Views all the return values from a batch file CGI script.


@echo off

REM Header

echo Status: 200

echo Content-type: text/html

echo.

REM Body

echo "<HTML><BODY>"

echo QUERY_STRING: %QUERY_STRING% "<BR>"

echo ALL_HTTP: %ALL_HTTP% "<BR>"

echo HTTP_USER_AGENT: %HTTP_USER_AGENT% "<BR>"

echo HTTP_REFERER: %HTTP_REFERER% "<BR>"

echo HTTP_CONTENT_TYPE: %HTTP_CONTENT_TYPE% "<BR>"

echo HTTP_CONTENT_LENGTH: %HTTP_CONTENT_LENGTH% "<BR>"

echo HTTP_EXTENSION: %HTTP_EXTENSION% "<BR>"

echo AUTH_TYPE: %AUTH_TYPE% "<BR>"

echo CONTENT_LENGTH: %CONTENT_LENGTH% "<BR>"

echo CONTENT_TYPE: %CONTENT_TYPE% "<BR>"

echo GATEWAY_INTERFACE: %GATEWAY_INTERFACE% "<BR>"

echo HTTP_ACCEPT: %HTTP_ACCEPT% "<BR>"

echo PATH_INFO: %PATH_INFO% "<BR>"

echo PATH_TRANSLATED: %PATH_TRANSLATED% "<BR>"

echo REMOTE_ADDR: %REMOTE_ADDR% "<BR>"

echo REMOTE_HOST: %REMOTE_HOST% "<BR>"

echo REMOTE_USER: %REMOTE_USER% "<BR>"

echo REQUEST_METHOD: %REQUEST_METHOD% "<BR>"

echo SCRIPT_NAME: %SCRIPT_NAME% "<BR>"

echo SERVER_NAME: %SERVER_NAME% "<BR>"

echo SERVER_PORT: %SERVER_PORT% "<BR>"

echo SERVER_PROTOCOL: %SERVER_PROTOCOL% "<BR>"

echo SERVER_SOFTWARE: %SERVER_SOFTWARE% "<BR>"

echo "</BODY></HTML>"

Save the preceding example as Lst13_3.bat in your scripts directory and call it from your wwwroot directory by creating a Lst13_3.htm that looks like this:


<HTML>

<BODY>

<FORM ACTION="http://MYMACHINE/scripts/lst13_3.bat">

Name: <INPUT TYPE=TEXT NAME="Name">

<INPUT TYPE=SUBMIT>

</FORM>

</BODY>

</HTML>

The Difference Between POST and GET


There are two methods that a FORM can use to transmit the parameters to the CGI script: GET and POST. You choose the method in the <FORM> tag by entering METHOD=POST or METHOD=GET. If no METHOD is entered, the form defaults to GET. Both methods send information by way of CGI to the server, but in different ways.

The main difference between POST and GET is the way in which the script receives the CGI parameters. With GET, you get the parameters through QUERY_STRING. With POST, the parameters are piped into the CGI script through standard input (stdin). Another difference is that GET can support only 255 characters in the CGI string, whereas POST has no such limit.

Change Lst13_3.htm's FORM to read


<FORM ACTION="http://MYMACHINE/scripts/Lst13_3.bat" METHOD=POST>

Now, try resubmitting the form to Lst13_3.bat. Notice that QUERY_STRING isn't filled in. Also notice that CONTENT_LENGTH has a value of 0 with the GET method and a nonzero value, such as 10, with the POST method. Look at the address box at the top of your browser: with a GET request the CGI parameters appeared, but with a POST request they don't. This matters when passing confidential information from one page to another. POST also gives a cleaner look to your Web page. Finally, note that REQUEST_METHOD is POST and not GET. The differences between POST and GET are listed in Table 13.4.

Table 13.4. Differences between POST and GET.

POST                                                        GET
Doesn't display parameters in the address line of the URL   Displays parameters in the address line of the URL
Doesn't set QUERY_STRING                                    Sets QUERY_STRING
Sets CONTENT_LENGTH                                         Doesn't set CONTENT_LENGTH
Unlimited CGI string length                                 Limited to 255 characters in CGI string

Batch files have no way of reading standard input (stdin). By using CGI scripts created in C++, you can read standard input and separate the 'name = value' pairs passed in by the form.

Hello World in C++


To create a C++ CGI script, I use VC++ 4.1 and MFC. In Microsoft Windows, all C++ CGI scripts must be console applications. It's important to remember that more than one user might be using your application at a time. For every user executing the script, a new instance of the application is started.

To create a C++ CGI script, follow these steps:

  1. Open Microsoft Developer Studio.
  2. Choose File | New.
  3. Choose Project Workspace from the New list box and click OK.
  4. From the Type list box choose Console Application.
  5. Name the project Lst13_4.
  6. Click Create.
  7. You now need to include MFC. Select Build from the menu bar, then select Settings. In the General tab, choose "Use MFC in a Shared Dll (mfc40.dll)" from the Microsoft Foundation Classes select box. Click OK.
  8. Open a blank text file by selecting File from the menu bar, clicking on New, and then selecting Text File.
  9. Select File | Save and save the file as Lst13_4.cpp.
  10. All that is left is to include the code file in the project. Select Insert from the menu bar and then select Insert Files into Project. Select Lst13_4.cpp and then press Add. Then insert the text in Listing 13.4 into Lst13_4.cpp.

Listing 13.4. A C++ Hello World Example


// lst13_4.cpp

#include <stdio.h>

int main( int argc, char *argv[ ], char *envp[ ] )

{

// Header

printf("Status: 200\r\n");

printf("Content-type: text/html\r\n");

printf("\r\n");

// Body

printf("<HTML><BODY>Hello World!</BODY></HTML>\n");

}
  11. Now compile the project. Make sure the configuration selected is Lst13_4 - Win32 Debug. Select Build from the menu bar and then select Build Lst13_4.exe.
  12. Copy Lst13_4.exe from the debug directory to the scripts directory of your IIS.
  13. To view your CGI script, open your browser and enter this line as the URL: http://MYMACHINE/scripts/Lst13_4.exe, where MYMACHINE is the name of your computer.

In step 7, you chose the shared mfc40.dll instead of a static library because it cuts down on operating overhead. Sharing DLLs makes for smaller executables because the MFC code is in a DLL, not bound to your executable. If more than one person is using your application, then more than one instance of your application is loaded into memory, but only one shared mfc40.dll is loaded. Also, the smaller your CGI script, the quicker it will load, allowing the page to be sent to the user faster. The thing to remember about using shared DLLs is that you need to copy mfc40.dll to the server along with your CGI script. The preceding example assumes that your compiler and your server are on the same machine, which means that mfc40.dll is already in your system path and you do not need to copy it.

In step 7, you chose to use MFC, yet the code used as an example didn't reference MFC. The preceding example is intended to be used as a generic example of how to create a CGI script. Other examples in this chapter will reference MFC.

Make sure to compile the debug configuration. CGI Scripts will run in both debug and release. For debugging purposes, you need to have a debug build. Final products should be in a release build because release executables are smaller and take less time for the Web server to load.

More advanced users of Microsoft Developer Studio may want to create their projects within the scripts directory of the IIS. The advantage of this is that you don't have to copy the executable (Step 12). When naming the project, select the scripts directory as the location instead of the default projects directory. When you compile your executable, the CGI script will be built into the scripts directory. The browser URL will also be different:


MYMACHINE/scripts/Lst13_4/debug/Lst13_4.exe

where MYMACHINE is the name of your computer.

CGI scripts are no different than regular console applications. They can be run from a DOS command line and will display exactly the same information that they send to the server. In fact, the way to send information to the server is to write to standard output (stdout), just like a console application. Running the script from the command line is one way of debugging your executable.

Notice the header section of Listing 13.4. Each header line ends with both a carriage return \r and a new line character \n; these are required. Also note that the required blank line after the header is represented by a \r\n on its own.

Getting the CGI Parameters in C++


Create another console application as you did in Listing 13.4, call it Lst13_5, and use the source from Listing 13.5. Compile it and copy it to your server's scripts directory.

Listing 13.5. An example of viewing the query string.


// lst13_5.cpp

#include <stdio.h>

#include <afx.h>

int main( int argc, char *argv[ ], char *envp[ ] )

{

// Header

printf(_T("Status: 200\r\n"));

printf(_T("Content-type: text/html\r\n"));

printf(_T("\r\n"));

// Body

printf(_T("<HTML><BODY>"));

DWORD  dwBufferSize=50;

LPTSTR szQuery = new TCHAR[dwBufferSize];

// Start with an empty string in case QUERY_STRING is not set

szQuery[0] = _T('\0');

GetEnvironmentVariable(_T("QUERY_STRING"),szQuery, dwBufferSize);

printf(_T("QueryString: %s"),szQuery);

printf(_T("</BODY></HTML>\n"));

delete [] szQuery;

}

Copy Listing 13.3's HTML form from Lst13_3.htm to Lst13_5.htm and change the Form attributes to read


<FORM ACTION="http://MYMACHINE/scripts/Lst13_5.exe" METHOD=GET>

Load Lst13_5.htm in your browser, type a name, and submit the data to your C++ CGI script. The query string should represent the name you typed.

With the power of C++, you can take the query string passed by the server and resolve the 'name = value' pairs of CGI into C variables that you can use. In addition, you can retrieve the information from standard input (stdin) and process post methods.

First, you must make a form that sends interesting data to your CGI script. Save the code in Listing 13.6 as lst13_6.htm in the wwwroot directory of your server.

Listing 13.6. Reading CGI parameters.


<HTML>

<BODY>

<FORM ACTION="http://MYMACHINE/scripts/Lst13_6.exe" METHOD=GET>

Name: <INPUT TYPE=TEXT NAME="Name"><BR>

Age: <INPUT TYPE=TEXT NAME="Age"><BR>

<INPUT TYPE=SUBMIT>

</FORM>

</BODY>

</HTML>

Create, compile, and copy to the server a CGI script called Lst13_6.exe that contains the following source code:


// Lst13_6.cpp

#include <stdio.h>

#include <afx.h>

// Global Cache for Post

// You can only read from

// Standard In Once

TCHAR *szCache=NULL;

TCHAR ConvertHex(TCHAR cHigh, TCHAR cLow)

{

static const TCHAR szHex[] = _T("0123456789ABCDEF");

LPCTSTR pszLow;

LPCTSTR pszHigh;

TCHAR cValue;

// Find the Values in the Hex String

pszHigh = _tcschr(szHex, (TCHAR) _totupper(cHigh));

pszLow = _tcschr(szHex, (TCHAR) _totupper(cLow));

// If both Values Exist Then Calculate the Value

// Based off of the string

if (pszHigh && pszLow)

{

cValue = (TCHAR) (((pszHigh - szHex) << 4) + (pszLow - szHex));

return (cValue);

}

return('?');

}

// Returns the String

LPVOID TranslateCGI(LPTSTR pszString)

{

LPTSTR pszIndex = pszString;

LPTSTR pszReturn = pszString;

// unescape special characters

while (*pszIndex)

{

// Translate '+' to Spaces

if (*pszIndex == _T('+'))

*pszReturn++ = _T(' ');

// Translate Hex Strings to characters

else if (*pszIndex == _T('%') && pszIndex[1] && pszIndex[2])

{

// Both hex digits are present, so it is safe to read and skip them

*pszReturn++=ConvertHex(pszIndex[1], pszIndex[2]);

pszIndex+=2;

}

// or just copy the character

else

*pszReturn++ = *pszIndex;

pszIndex++;

}

// Terminate the End

*pszReturn = '\0';

return (LPVOID) pszString;

}

DWORD GetValue(LPTSTR szCGI, LPTSTR szName, LPTSTR szValue, DWORD dwValueSize)

{

LPTSTR    szIndex;

LPTSTR    szEnd;

DWORD    dwReturnSize=0;

// Find The Name in the Query String

szIndex=_tcsstr(szCGI,szName);

// Error: The Name part of the Name value pair doesn't exist

if (!szIndex)

return (0);

// Increase the pointer passed the Name and Get to the Value

szIndex+=_tcslen(szName)+1;

// Find the End of the Value by looking for the '&'

szEnd=_tcschr(szIndex,_T('&'));

// if we find a '&' set it as the end

if (szEnd)

(*szEnd)='\0';

// Remove the CGI Syntax

TranslateCGI(szIndex);

// Calculate the Value Size

dwReturnSize=_tcslen(szIndex);

// Chop the Value if bigger than the Allocation of Value

if (szValue && dwValueSize && (dwReturnSize>=dwValueSize))

szIndex[dwValueSize-1]=_T('\0');

// Assign the Value if there is allocated space

// if no space has been allocated then the caller

// is just looking for the string size

if (szValue)

_tcscpy(szValue,szIndex);

// If we are going to return the size of

// Allocated space we might as well

// include the Null

return (dwReturnSize+1);

}

// Returns The Length of szValue on successful execution else returns 0

DWORD GetMethod(LPTSTR szName, LPTSTR szValue, DWORD dwValueSize)

{

DWORD    dwBufferSize=0;

DWORD    dwReturnSize=0;

LPTSTR    szQuery=NULL;

// Call GetEnvironmentVariable To get the buffer size

dwBufferSize=GetEnvironmentVariable(_T("QUERY_STRING"),szQuery,dwBufferSize);

// Error: QUERY_STRING doesn't exist

if (!dwBufferSize)

return(0);

// Allocate the Space needed

szQuery = new TCHAR[dwBufferSize];

// Call Again

dwBufferSize=GetEnvironmentVariable(_T("QUERY_STRING"),szQuery,dwBufferSize);

// Get the Value From the Query String

dwReturnSize=GetValue(szQuery,szName,szValue,dwValueSize);

delete [] szQuery;

return (dwReturnSize);

};

// Returns The Content Length on successful execution else returns 0

DWORD GetContentLength()

{

DWORD    dwBufferSize=0;

LPTSTR    szContentLength=NULL;

DWORD    dwContentLength;

// Call GetEnvironmentVariable to get the buffer size

dwBufferSize=GetEnvironmentVariable(_T("CONTENT_LENGTH"),

szContentLength,dwBufferSize);

// Error: CONTENT_LENGTH doesn't exist

if (!dwBufferSize)

return(0);

// Allocate the Need Space

szContentLength = new TCHAR[dwBufferSize];

// Call Again

dwBufferSize=GetEnvironmentVariable(_T("CONTENT_LENGTH"),

szContentLength,dwBufferSize);

// Change the String to a usable form

dwContentLength=(DWORD)_ttoi(szContentLength);

delete [] szContentLength;

return(dwContentLength);

};

// Returns The Length of szValue on successful execution else returns 0

DWORD PostMethod(LPTSTR szName, LPTSTR szValue, DWORD dwValueSize)

{

DWORD    dwBufferSize;

LPTSTR    szContentType=NULL;

DWORD    dwContentTypeSize=0;

LPTSTR    szPost;

DWORD    dwReturnSize=0;

UINT    nCount;

if (szCache)

{

dwBufferSize=_tcslen(szCache);

// Allocate Some Memory for szPost Plus the NULL

szPost=new TCHAR[dwBufferSize+1];

_tcscpy(szPost,szCache);

}

else

{

// Look at the CONTENT_TYPE to see if it is a POST

// Call GetEnvironmentVariable to get the buffer size

dwContentTypeSize=GetEnvironmentVariable(_T("CONTENT_TYPE"),

szContentType,dwContentTypeSize);

// Error: CONTENT_TYPE doesn't exist

if (!dwContentTypeSize)

return(0);

// Allocate the Need Space

szContentType = new TCHAR[dwContentTypeSize];

// Call Again

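GetEnvironmentVariable(_T("CONTENT_TYPE"),szContentType,dwContentTypeSize);

// Compare, then release the buffer so that no return path leaks it

BOOL bIsForm=!_tcscmp(szContentType,_T("application/x-www-form-urlencoded"));

delete [] szContentType;

if (bIsForm)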

{

// Figure out the Size of the String

dwBufferSize=GetContentLength();

if (!dwBufferSize)

return(0);

// Declare the Memory for the String plus the NULL

szPost = new TCHAR[dwBufferSize+1];

nCount=0;

// Read the Standard In

while (!feof(stdin) && (nCount<dwBufferSize))

{

szPost[nCount++]=(TCHAR)_fgetchar();

}

szPost[nCount]=_T('\0');

// Cache the CGI String

szCache=new TCHAR[dwBufferSize+1];

_tcscpy(szCache,szPost);

}

else

{

// Not a POST so return unsuccessfull

return(0);

}

}

// We now have the CGI String, Lets Get the Value

// Get the Value From the Post String

dwReturnSize=GetValue(szPost,szName,szValue,dwValueSize);

delete [] szPost;

// The size returned by GetValue() already

// includes the Null

return (dwReturnSize);

};

// Returns The Length of szValue on successful execution else returns 0

DWORD GetParameter (LPTSTR szName, LPTSTR szValue, DWORD dwValueSize)

{

DWORD    dwBufferSize=0;

LPTSTR    szRequestMethod=NULL;

// Look at the environment variable REQUEST_METHOD

// Call GetEnvironmentVariable to get the buffer size

dwBufferSize=GetEnvironmentVariable(_T("REQUEST_METHOD"),

szRequestMethod,dwBufferSize);

// Error: REQUEST_METHOD doesn't exist

if (!dwBufferSize)

return(0);

// Allocate the Need Space

szRequestMethod = new TCHAR[dwBufferSize];

// Call Again

dwBufferSize=GetEnvironmentVariable(_T("REQUEST_METHOD"),

szRequestMethod,dwBufferSize);

// It's has to be POST or GET

if (!_tcscmp(szRequestMethod,_T("GET")))

{

delete [] szRequestMethod;

return(GetMethod(szName,szValue,dwValueSize));

}

if (!_tcscmp(szRequestMethod,_T("POST")))

{

delete [] szRequestMethod;

return(PostMethod(szName,szValue,dwValueSize));

}

delete [] szRequestMethod;

return(0);

};

int main( int argc, char *argv[ ], char *envp[ ] )

{

DWORD  dwValueSize=0;

LPTSTR szValue=NULL;

// Header

_tprintf(_T("Status: 200\r\n"));

_tprintf(_T("Content-type: text/html\r\n"));

_tprintf(_T("\r\n"));

// Body

_tprintf(_T("<HTML><BODY>\n"));

// Find out how big the name parameter is going to be

dwValueSize=GetParameter(_T("Name"),szValue,dwValueSize);

// Allocate enough space for The Value of Name

szValue=new TCHAR[dwValueSize];

// Get The Value Again this time with a big enough buffer

dwValueSize=GetParameter(_T("Name"),szValue,dwValueSize);

// Display the Name

if (dwValueSize)

_tprintf(_T("Name : %s\n"),szValue);

delete [] szValue;

// Reset so the size query for Age doesn't reuse the freed buffer

szValue=NULL;

dwValueSize=0;

_tprintf(_T("<BR>\n"));

// Do it all Again for Age

dwValueSize=GetParameter(_T("Age"),szValue,dwValueSize);

szValue=new TCHAR[dwValueSize];

dwValueSize=GetParameter(_T("Age"),szValue,dwValueSize);

if (dwValueSize)

_tprintf(_T("Age : %s\n"),szValue);

_tprintf(_T("</BODY></HTML>\n"));

delete [] szValue;

// Clean the Cache

if (szCache)

delete [] szCache;

return(0);

}

The main() function calls GetParameter() for two parameters, Name and Age, which are the values passed in by the form. Each parameter is fetched with two calls: the first returns the buffer size needed, and the second copies the value into the allocated buffer. If no value is present or there is an error, GetParameter() returns zero; otherwise, it returns the character length of the value plus one for the terminating null.

With GetParameter(), the value of the named variable is returned no matter what method is used in the form. GetParameter() looks at the environment variable REQUEST_METHOD to figure out whether the method is a POST or a GET. If it is a POST, PostMethod() is called. In PostMethod(), the environment variable CONTENT_TYPE is checked to make sure that the post is coming from a form. CONTENT_LENGTH is also checked so that a string of the right size can be allocated. The CGI string is then read from standard input as a file would be read. Notice that PostMethod() caches the CGI string read from standard input, because standard input can be read only once. If GetParameter() detects that the GET method was used, GetMethod() is executed. GetMethod() loads the CGI string from QUERY_STRING. Try experimenting with the FORM method by changing the method in lst13_6.htm to POST. Type the URL of lst13_6.htm into your browser's address box; you will need to refresh to pick up the changes. Now, try resubmitting the data.

Both GetMethod() and PostMethod() call GetValue() after the CGI string is acquired. GetValue() separates the value that is associated with the name from the CGI string. After the value is separated, it's passed to TranslateCGI(). TranslateCGI() changes + to spaces and translates hexadecimal characters.
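For comparison, the translation step can also be written with std::string, which removes the manual pointer arithmetic. This is a sketch of the same idea, not a replacement for the listing; the names HexValue and UrlDecode are hypothetical, and unlike ConvertHex(), which returns '?' for invalid digits, this version copies a malformed % sequence through unchanged.

```cpp
#include <string>

// Hypothetical helper: value of one hexadecimal digit, or -1 if invalid.
static int HexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: change '+' to a space and %XX to the character
// with hexadecimal code XX. A '%' that is not followed by two valid
// hex digits is copied as is, so the input is never read past its end.
std::string UrlDecode(const std::string& encoded)
{
    std::string out;
    for (std::string::size_type i = 0; i < encoded.size(); ++i)
    {
        char c = encoded[i];
        if (c == '+')
            out += ' ';
        else if (c == '%' && i + 2 < encoded.size()
                 && HexValue(encoded[i + 1]) >= 0
                 && HexValue(encoded[i + 2]) >= 0)
        {
            out += static_cast<char>(HexValue(encoded[i + 1]) * 16
                                     + HexValue(encoded[i + 2]));
            i += 2;   // skip the two hex digits just consumed
        }
        else
            out += c;
    }
    return out;
}
```

Decoding Al%27s+RVs with this routine should give back the company name from Table 13.1.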

Notice that TCHAR and the _t generic-text functions were used wherever possible. This allows the code to be compiled for either MBCS or Unicode.

It's important to consider the length of the strings you allocate to hold the incoming data. Remember that real users mean real data. Consider an input tag that sends in first names. On the Web, users from all over the world can access your page, so their first names might be longer than you expect. If the data overruns the memory you have allocated, the CGI script could crash, possibly taking the server down with it. To guard against this, you can limit the incoming data by using the MAXLENGTH attribute on input tags, or you can handle values of any length, as GetParameter() does. Also, make sure that users cannot send in parameters that make your CGI script crash. Because users can type anything in the URL address of their browser, make sure to test all possibilities. For instance, if the standard URL address and CGI string look like this:


http://MYMACHINE/scripts/mycgi.exe?Name=John

and a user types this instead:


http://MYMACHINE/scripts/mycgi.exe?&Name&Name=&&&&&&

can your CGI script handle it without crashing?
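One way to stay safe with input like this is to walk the CGI string pair by pair and skip anything that is not a well-formed name=value pair. The sketch below is a hypothetical stand-in for GetValue(), written in portable C++ with std::string so that no fixed-size buffer can be overrun; the name FindCgiValue and its signature are assumptions made for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for GetValue(): looks up `name` in a raw CGI string.
// Returns true and stores the still-encoded value if a "name=value" pair
// with exactly that name is present; otherwise returns false and leaves
// `value` untouched.
bool FindCgiValue(const std::string& cgi, const std::string& name,
                  std::string& value)
{
    std::size_t start = 0;
    while (start <= cgi.size()) {
        // A pair ends at the next & or at the end of the string.
        std::size_t end = cgi.find('&', start);
        if (end == std::string::npos)
            end = cgi.size();
        assert(end >= start);

        const std::string pair = cgi.substr(start, end - start);
        const std::size_t eq = pair.find('=');

        // Skip empty pairs, pairs without =, and pairs with an empty name.
        if (eq != std::string::npos && eq > 0 &&
            pair.compare(0, eq, name) == 0) {
            value = pair.substr(eq + 1);
            return true;
        }
        start = end + 1;
    }
    return false;
}
```

For the malformed string shown above, the lookup of Name succeeds with an empty value (from the Name= pair), and a lookup of any other name simply fails; neither case reads outside the string.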

Debugging C++ CGI Scripts


One of the disadvantages of using C++ CGI scripts is that you have to debug them. The optimal way to test CGI scripts is to put them into the scripts directory and have the server call them. This way, they are called just as they would be in practice. Microsoft Developer Studio has an excellent debugging environment, but in order to use the debugger, Microsoft Developer Studio has to call the executable you're testing. Here lies the dilemma: either the server calls the CGI script without a debugger, or Microsoft Developer Studio calls the CGI script without the benefit of the server.

The Web Server Debugging

If the server calls the CGI script and there is an ASSERT or an access violation, the process running the CGI script hangs. This usually means that the server never returns a header, leaving the browser waiting until it times out. An ASSERT or an access violation hangs because it tries to display a message box without a valid window handle. The only solution is to kill the process. With the server calling the CGI script, other errors, such as returning the wrong output, are equally hard to debug. For instance, if the CGI script returns no output, the browser displays the screen shown in Figure 13.1.

Figure 13.1. Bad return.

This type of screen leaves the programmer no information for debugging. Inserting a DebugBreak() into the CGI script hangs the process too, just as an ASSERT or an access violation does.

The solution is to redirect the output for ASSERTs, warnings, and errors to an output file. The StartDebugging() and StopDebugging() procedures in Listing 13.7 create a file called error.log, and all warnings, errors, and ASSERTs are written to it. StartDebugging() opens the file and redirects the output. StopDebugging() closes the file handle. Add StartDebugging() to the beginning of main() and StopDebugging() to the end of main().

Listing 13.7. Redirecting debugging information to a file.


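#include <windows.h>
#include <crtdbg.h>

// Handle to error.log; INVALID_HANDLE_VALUE while the file is not open.
static HANDLE hFile = INVALID_HANDLE_VALUE;

void StartDebugging()
{
#ifdef _DEBUG
    // OPEN_ALWAYS creates the file if needed; an existing file is opened
    // without being truncated.
    hFile = CreateFile("error.log", GENERIC_WRITE,
                       FILE_SHARE_WRITE, NULL, OPEN_ALWAYS,
                       FILE_ATTRIBUTE_NORMAL, NULL);
    // CreateFile() returns INVALID_HANDLE_VALUE, not NULL, on failure.
    if (hFile != INVALID_HANDLE_VALUE)
    {
        // Send warnings, errors, and ASSERTs to the file.
        _CrtSetReportMode(_CRT_WARN, _CRTDBG_MODE_FILE);
        _CrtSetReportFile(_CRT_WARN, hFile);
        _CrtSetReportMode(_CRT_ERROR, _CRTDBG_MODE_FILE);
        _CrtSetReportFile(_CRT_ERROR, hFile);
        _CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE);
        _CrtSetReportFile(_CRT_ASSERT, hFile);
    }
    _RPT0(_CRT_WARN, "Start Debug Reporting\r\n");
#endif
}

void StopDebugging()
{
#ifdef _DEBUG
    _RPT0(_CRT_WARN, "Stop Debug Reporting\r\n");
    if (hFile != INVALID_HANDLE_VALUE)
        CloseHandle(hFile);
#endif
}

int main(int argc, char *argv[], char *envp[])
{
    StartDebugging();
    // The Code
    StopDebugging();
    return 0;
}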

Notice that _RPT0() can be used to write to error.log. You can use this macro to mark the entrances and exits of procedures, and the related _RPT1() through _RPT4() macros, which take format arguments, to output variables and other debugging information. The error file is not created in release builds, and it shouldn't be used when multiple processes are running.
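Marking entrances and exits by hand is easy to forget, so one option is to let a small object do it. The sketch below is an assumption-laden illustration, not part of the listing: it swaps the CRT report macros for a plain std::ostream so that it stays portable, and the names ScopeTrace and ParseAge are hypothetical. In a debug build on Windows, the two stream writes could be replaced with calls to the report macros instead.

```cpp
#include <cassert>
#include <cstdlib>
#include <ostream>
#include <sstream>   // std::ostringstream, useful as an in-memory log
#include <string>

// Hypothetical helper: writes "Enter <name>" when constructed and
// "Leave <name>" when destroyed, so every exit path is logged.
class ScopeTrace
{
public:
    ScopeTrace(std::ostream& log, const std::string& name)
        : log_(log), name_(name)
    {
        assert(!name_.empty());
        log_ << "Enter " << name_ << "\r\n";
    }

    ~ScopeTrace()
    {
        log_ << "Leave " << name_ << "\r\n";
    }

    ScopeTrace(const ScopeTrace&) = delete;
    ScopeTrace& operator=(const ScopeTrace&) = delete;

private:
    std::ostream& log_;
    std::string name_;
};

// Hypothetical procedure that logs its entrance, one variable, and its exit.
int ParseAge(std::ostream& log, const std::string& text)
{
    ScopeTrace trace(log, "ParseAge");
    const int age = std::atoi(text.c_str());
    log << "  age=" << age << "\r\n";
    return age;
}
```

Passing an open std::ofstream as the log sends the same lines to a file; because the destructor runs on every return path, an early return still produces its Leave line.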

Tip


Unless you precede the error filename with the full path, the file will be created in the scripts directory. This could be a security issue if your users can read the scripts directory. Either write the file outside the Web space or make sure that the directory security is set to execute only.


Microsoft Developer Studio

The other option for debugging your C++ CGI script is to call the CGI script from Microsoft Developer Studio. The advantage of this, compared to having the Web server call the script, is that you can set breakpoints and view variables with Microsoft Developer Studio. Problems start to arise, however, when the input to your CGI script is considered. Because the CGI script retrieves its input from environment variables set by the server, those environment variables need to be set before you can debug. Microsoft Developer Studio doesn't have an easy way to set environment variables, so either the programmer needs to set the variables in the System Properties or the variables need to be set in the program. Setting the variables in the System Properties requires the programmer to open the Control Panel, change the variables, and restart Microsoft Developer Studio. It's easier to set the variables in the program, recompile the program, and run it from the debugger. Listing 13.8 shows an example of setting the variables in the program.

Listing 13.8. Inserts for setting the variables.


int main(int argc, char *argv[], char *envp[])
{
    // Fake the values that the Web server would normally provide.
    SetEnvironmentVariable("QUERY_STRING", "Name=John+Doe&Age=25");
    SetEnvironmentVariable("REQUEST_METHOD", "GET");
    …

This is also a great deal of trouble because every time the strings change, you need to recompile.
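One way to soften this, if your debugger lets you specify program arguments, is to let a debug build take its fake environment from the command line, so that changing a test string means editing the arguments rather than the code. The sketch below is an assumption, not the chapter's method: the helper names are hypothetical, and it presumes the script reads its variables through the C runtime (getenv()). It treats every argument of the form NAME=VALUE as an environment assignment and ignores the rest.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Sets one environment variable for this process. Uses _putenv_s() with
// Microsoft's C runtime and setenv() elsewhere.
static bool SetVariable(const std::string& name, const std::string& value)
{
#ifdef _WIN32
    return _putenv_s(name.c_str(), value.c_str()) == 0;
#else
    return setenv(name.c_str(), value.c_str(), 1) == 0;
#endif
}

// Hypothetical helper: applies each "NAME=VALUE" argument (argv[0] is
// skipped) and returns how many variables were set. Only the first = splits
// the argument, so values may contain = themselves.
int ApplyDebugEnvironment(int argc, const char* const argv[])
{
    assert(argc <= 1 || argv != nullptr);
    int applied = 0;
    for (int i = 1; i < argc; ++i) {
        const std::string arg(argv[i]);
        const std::size_t eq = arg.find('=');
        if (eq == std::string::npos || eq == 0)
            continue;                       // not an assignment; ignore it
        if (SetVariable(arg.substr(0, eq), arg.substr(eq + 1)))
            ++applied;
    }
    return applied;
}
```

A debug build would call ApplyDebugEnvironment(argc, argv) at the top of main(), inside #ifdef _DEBUG, and be started with arguments such as REQUEST_METHOD=GET and "QUERY_STRING=Name=John+Doe&Age=25".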

The POST method poses another problem. Microsoft Developer Studio does not allow you to pipe standard input into a program that you are running in the debugger. Because the Web server sends POST data to the program through standard input, there is really no way to test the POST method with Microsoft Developer Studio.

There is no surefire, easy method for testing C++ CGI scripts with either the Web server or Microsoft Developer Studio. These issues get resolved with ISAPI Server Extensions because Microsoft Developer Studio can handle Server Extensions DLLs better.

Summary


The Common Gateway Interface (CGI) is a standardized parameter-passing syntax. CGI scripts are programs that are executed by the Web server, that read a CGI parameter string, and that output to the client. Because the data returned to the browser can differ for every execution, CGI scripts create dynamic Web pages. A CGI script can be written in any language the Web server can execute. The server passes parameters to the CGI script with environment variables and standard input, and the CGI script passes the output to the server with standard output. A good CGI script can read the CGI string passed in by both the GET and POST methods. Although debugging CGI scripts is not straightforward, you can use them to create dynamic and interactive Web pages quickly.
