Category Archives: PHP

How to use the file_get_contents() function to make an HTTP request from PHP

In a previous post I talked about using the HttpRequest object and functions in the PECL_HTTP extension to make HTTP requests from PHP. In some cases you may be limited to using functionality built into the PHP core. The file_get_contents() function has fewer features than the PECL_HTTP extension but it is built into PHP 4.3 and up. Here is an example of using it to retrieve the landing page at www.example.com:

<?php
 
echo file_get_contents("http://www.example.com");
 
?>

Someone hit that easy button.

The file_get_contents() function, as well as many other PHP file functions, is built on a streams abstraction layer largely conceived by Mr. Wez Furlong. This abstraction layer is what enables many of the PHP file functions to access network resources. Given this functionality, “file” seems a misnomer.

The file_get_contents() function uses an HTTP GET request but what if you want to do a POST without using cURL or the PECL_HTTP extension? Furlong posted an article here on how to do just that.

This next code example uses the file_get_contents() function again but this time a few options are set first using the stream_context_create() function:

<?php
 
$http_options = stream_context_create(array(
    'http' => array(
        'user_agent' => "Mark's Browser",
        'max_redirects' => 3)));
echo file_get_contents("http://www.example.com", false, $http_options);
 
?>

Note that the array passed to the stream_context_create() function can also be used to specify a POST method, which is how Furlong does so in his blog post.
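As a rough sketch of what such a POST context might look like (this is not Furlong's code; the build_post_context() helper, the field names, and the form.php URL are made up for illustration), the 'method', 'header', and 'content' keys of the 'http' array carry the request details:

```php
<?php

// Build a stream context for a form-encoded POST request.
// build_post_context() and the field names are hypothetical, for illustration.
function build_post_context(array $fields, $timeout_seconds = 10)
{
    // Encode the fields as name=value pairs joined by "&".
    $body = http_build_query($fields);

    return stream_context_create(array(
        'http' => array(
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            'content' => $body,
            'timeout' => $timeout_seconds,
        ),
    ));
}

$context = build_post_context(array('name' => 'Mark', 'lang' => 'PHP'));

// To send the request (the URL here is a placeholder):
// echo file_get_contents("http://www.example.com/form.php", false, $context);
```

The request itself is left commented out so the sketch can be run without touching the network.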

There is yet another way to make an HTTP request from PHP that I haven’t covered: the built-in cURL functions. I will cover these in a separate blog post.

How to make your PHP application check for its dependencies

The very informative phpinfo() function

The phpinfo() function displays just about everything you want to know about your PHP installation. It includes info on all your PECL and PEAR modules so it is a quick way to check what’s installed. It will also tell you useful web server information including any query strings you pass. To display all this good info just point your browser at a page on your server that contains the following code:

<?php
 
phpinfo();
 
?>

That’s it!

Automatic dependency checking

The phpinfo() function will give us a page that displays a lot of info. You can pass it bitwise constants to narrow down the information displayed but what if we want to check for specific items?
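As a brief aside on those bitwise constants, here is a minimal sketch that limits phpinfo() to the general and module sections and captures the output in a string rather than printing it. The choice of INFO_GENERAL and INFO_MODULES is just an example; other INFO_* constants can be combined the same way.

```php
<?php

// Capture only the general and module sections instead of the full page.
ob_start();
phpinfo(INFO_GENERAL | INFO_MODULES);
$info = ob_get_clean();

// On the command line the output is plain text; under a web server it is HTML.
echo strlen($info) . " bytes of phpinfo() output captured\n";
```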

In my humble opinion, when developing software or anything in general, it is a good idea to design things so that the end user will not even need a manual because the user interface is obvious. When something doesn’t work, it should be equally obvious how to fix it.

If you write a PHP application that others will install and use, it is a good idea to check for dependencies when they try to use the application. This way even if they don’t read your documentation they will quickly know why the software is not working.

Using phpversion(), PHP_VERSION, and version_compare() to check the PHP version

To get the core PHP version you can use either of the following methods:

<?php
 
echo phpversion();
echo "<br/>or<br/>";
echo PHP_VERSION;
 
?>

The above code should output something like this:

5.2.6-2ubuntu4
or
5.2.6-2ubuntu4

If you are using Ubuntu or some other distribution, you will note that some additional stuff is tacked on to the version number (e.g. “-2ubuntu4”). This makes a comparison to your expected version a little tricky but you can use a substr()/strpos() combo to get what you need. There is an easier way to do the comparison though. The version_compare() function understands “PHP-standardized” version strings, so we can do something like this:

<?php
 
if (version_compare(PHP_VERSION, '5.0.0', '<')) {
    echo 'You are using PHP version ' . PHP_VERSION . '. ' .
      'This program requires PHP version 5.0.0 or higher.';
} else {
    echo 'You are using PHP 5.0.0 or higher. You are all set!';
}
 
?>

Now you can check the PHP version and notify the user if it is not the minimum required version.
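To make the two approaches concrete, here is a small sketch that works on a hard-coded version string (taken from the sample output above) instead of PHP_VERSION, so the result does not depend on the machine it runs on. The variable names are only for illustration.

```php
<?php

// A distribution-style version string, hard-coded so the result is predictable.
$reported = '5.2.6-2ubuntu4';

// The substr()/strpos() approach: keep everything before the first hyphen.
$dash = strpos($reported, '-');
$core = ($dash === false) ? $reported : substr($reported, 0, $dash);
echo $core . "\n"; // 5.2.6

// version_compare() copes with the suffix without any trimming.
var_dump(version_compare($reported, '5.0.0', '>=')); // bool(true)
var_dump(version_compare($reported, '5.3.0', '>=')); // bool(false)
```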

The documentation for each function at www.php.net lists the PHP versions that contain the function in the upper left-hand corner:

[Image: version information for the substr() function on php.net]

You can use this to learn what versions of PHP include the functions you are using in your code to help identify your minimum PHP version requirement.

Using get_loaded_extensions() to check for extensions

The get_loaded_extensions() function returns an array of the loaded PHP extensions, which you can use to check if a specific extension is installed. Use it in combination with the in_array() function to check if the extension you require is loaded. In this example I check if the PECL_HTTP module is installed:

<?php
 
if (in_array("http",get_loaded_extensions())){
    echo 'The PECL_HTTP module is installed. '.
      'You are all set!';
} else {
    echo 'The PECL_HTTP module is not installed. '.
      'Please install it.';
}
 
?>
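If your application needs several extensions, the same idea can be wrapped in a loop. The sketch below uses the built-in extension_loaded() function, which checks one extension by name; the missing_extensions() helper and the list of names passed to it are hypothetical.

```php
<?php

// Return the names from $required that are not loaded.
// missing_extensions() is a hypothetical helper name.
function missing_extensions(array $required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// 'standard' is part of every PHP build; 'http' is the PECL_HTTP extension.
$missing = missing_extensions(array('standard', 'http'));
if (count($missing) > 0) {
    echo 'Please install: ' . implode(', ', $missing);
} else {
    echo 'All required extensions are loaded.';
}
```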

You can also use the phpversion() function to check if an extension is loaded and, if so, its version. This code example not only checks if the PECL_HTTP module is installed, but also checks its version:

<?php
 
if (!phpversion('http')){
    echo 'The PECL_HTTP module is not installed. '.
      'Please download it from '.
      '<a href="http://pecl.php.net/package/pecl_http">here</a> '.
      ' and install it.';
} else {
    if (version_compare(phpversion('http'),'1.6.0','>=')){
        echo 'The PECL_HTTP extension is installed and '.
          'version 1.6.0 or higher. You are all set!';
    } else {
        echo 'Please upgrade your PECL_HTTP extension to '.
          'version 1.6.0 or higher. You can download it '.
          '<a href="http://pecl.php.net/package/pecl_http">here'.
          '</a>.';
    }
}
 
?>

Use function_exists() to check for individual functions

So far the methods for checking dependencies have been somewhat broad. They check that the script has a certain version of PHP or extensions installed and that will likely be good enough in most cases. If you really want to be thorough you can also check if specific functions are available using the function_exists() function. In this example I check that the http_get() function, which is part of the PECL_HTTP extension, is there before I use it. If it is not, I fall back to the built-in file_get_contents() function, which has fewer features.

<?php
 
if (function_exists("http_get")) {
    echo 'Using the http_get():<br/>' .
        http_parse_message(http_get("http://www.example.com"))->body;
} else {
    echo 'Using the file_get_contents():<br/>' . 
        file_get_contents("http://www.example.com");
}
 
?>

Check for include files

Here is a simple way to check for include files. It doesn’t verify their content but you can at least make sure they are there:

<?php
 
if (!file_exists('httptest.php')){
    die('The httptest.php include file is missing.');
} else {
    require_once('httptest.php');
}
 
?>
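For more than one include file, a loop along these lines may help. It is only a sketch: missing_includes() is a made-up helper name, and like file_exists() it resolves relative paths against the current working directory rather than searching the include_path.

```php
<?php

// Return the files from $files that are missing or unreadable.
// missing_includes() is a hypothetical helper name.
function missing_includes(array $files)
{
    $missing = array();
    foreach ($files as $file) {
        if (!is_file($file) || !is_readable($file)) {
            $missing[] = $file;
        }
    }
    return $missing;
}

$missing = missing_includes(array('httptest.php'));
if (count($missing) > 0) {
    echo 'Missing include files: ' . implode(', ', $missing);
} else {
    echo 'All include files are present.';
}
```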

Wrap up

Checking dependencies is an important part of building robust software and hopefully the above techniques will help accomplish that. Even if your end user is very technical, they will likely appreciate a good dependency checking mechanism that quickly tells them what’s missing and saves them time. If your software will be used by non-technical users you might want to automatically and gracefully downgrade your software feature set instead of generating errors and asking them for something they won’t know how to do. Usability is king!

PHP HttpRequest class options and notes

In a recent post I talked about the PECL_HTTP extension for PHP. In this post I will cover a few of the options you can set for the HttpRequest object and related functions.

The PECL_HTTP extension allows you to set a number of options when you make a request. Usually you put the options in a key=>value array and pass the array as an argument to the request functions (e.g. http_get(), http_post_data(), http_request(), etc.) or assign the array to the HttpRequest object using the setOptions() method. Here is a code example using the HttpRequest object:

$http_req = new HttpRequest("http://www.example.com");
$http_req->setOptions(array('timeout'=>10,'useragent'=>"MyScript"));

In the above code the timeout was set to 10 seconds and the user agent, which is a request header that identifies the browser to the server, is set to “MyScript”. I am going to cover just a few of the options but a full list of all the request options can be found here.

timeout

The timeout option specifies the maximum amount of time in seconds that the request may take to complete. Set this too high and the HTTPD process that PHP is running in could be stalled for quite a while waiting for a request that may never complete. If you set it too low you might have problems with sites that are just slow to respond. This may require some tweaking depending on what you are doing. If you are making requests to a web server in Taiwan you might want to set the timeout a bit higher. The default timeout does not appear to be documented in the HttpRequest options page on PHP.NET (that I could find) but if you look at the http_request_api.c file in the PECL_HTTP source code, it looks like it is 0L, meaning no timeout at all:

HTTP_CURL_OPT(CURLOPT_TIMEOUT, 0L);

This indicates it will wait forever unless you explicitly set a timeout so it might be a good idea to set one! I put together a page that makes an HTTP request and then another one that will sleep for some number of seconds that I can test against.

Here is the code for the page that will sleep:

<?php

echo "Sleeping.";
sleep(30);

?>

Here is the code for the page that will make the HTTP request:

<?php
$http_req = new HttpRequest("http://localhost/alongtime.php");
$http_req->setOptions(array('timeout'=>10, 'useragent'=>"MyScript"));
try {
    $http_req->send();
} catch (HttpException $ex) {
    if (isset($ex->innerException)){
        echo $ex->innerException->getMessage();
        exit;
    } else {
        echo $ex;
        exit;
    }
} //end try block
echo $http_req->getResponseBody();
?>

When I pull up the page that makes the HTTP request in my browser I get the following error:

Timeout was reached; Operation timed out after 10000 milliseconds with 0 bytes received (http://localhost/alongtime.php)

If I don’t set the timeout option at all, the page responds 30 seconds later since it will wait forever or at least the 30 second sleep time on the target page.

connecttimeout

The connecttimeout option indicates the maximum amount of time in seconds that the request may take just connecting to the server. This does not include the time it takes for the server to process and return the data for the request. The same considerations apply as for timeout, although the number should be considerably lower since it only covers establishing the connection and not the whole request. Again, the default value is not documented but if you look at the http_request_api.c file in the PECL_HTTP source code, it looks like it is 3 seconds:

HTTP_CURL_OPT(CURLOPT_CONNECTTIMEOUT, 3);

dns_cache_timeout

One of the interesting features of the PECL_HTTP extension is that it will cache DNS lookups. Some of the Windows operating systems do this but many of the Linux distributions do not by default. By the way, if you want to clear your cached DNS lookup entries on a Windows box use the command “ipconfig /flushdns”. If you are making multiple requests to the same site, DNS lookup caching should provide a significant performance advantage because a round trip to the DNS server isn’t required for every request. The dns_cache_timeout option sets the number of seconds that will pass before the cached DNS lookup results expire and a new DNS lookup is performed. Again, the default value is not documented but if you look at the http_request_api.c file in the PECL_HTTP source code, it looks like it is 60 seconds, which is probably fine for most applications:

HTTP_CURL_OPT(CURLOPT_DNS_CACHE_TIMEOUT, 60L);

redirect

The redirect option determines how many redirects the request will follow before it returns with a response. The default is 0 (this IS documented), which may not work in many situations because some applications respond with one or two redirects for authentication, etc. If you set this too high your application may get bounced around too many times and never return. I have not tried it but you could probably put someone in a redirect loop. Anyway, a value of around 4 or 5 should be adequate for most applications I would imagine.

useragent

The useragent option allows you to specify a different User-Agent request header to send to the server than the default which is “PECL::HTTP/x.y.z (PHP/x.y.z)” where x.y.z are the versions.

I made a little one-liner test page that returns the user agent info sent to the server:

<?php

echo $_SERVER['HTTP_USER_AGENT'];

?>

If I make an HTTP request to this page using the HttpRequest object without setting the useragent I get:

PECL::HTTP/1.6.2 (PHP/5.2.6-2ubuntu4)

If I do something like this:

$http_req->setOptions(array('timeout'=>10, 'useragent'=>"Mark’s Browser"));

I will get:

Mark’s Browser

The reason I bring this up is that some applications you make a request to may respond differently depending on your user agent setting. In some cases you may need to spoof a specific browser to get what you are after.
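Pulling the options covered in this post together, one way to avoid relying on undocumented defaults is to always pass explicit values. The sketch below is only an illustration: build_request_options() is a hypothetical helper, and the default values in it are my own picks based on the discussion above, not values mandated by the extension.

```php
<?php

// Merge caller overrides onto explicit defaults for the options covered above.
// build_request_options() and the chosen default values are illustrative.
function build_request_options(array $overrides = array())
{
    $defaults = array(
        'timeout'           => 10,   // seconds for the whole request
        'connecttimeout'    => 3,    // seconds to establish the connection
        'dns_cache_timeout' => 60,   // seconds to keep cached DNS lookups
        'redirect'          => 4,    // maximum redirects to follow
        'useragent'         => 'MyScript',
    );
    // Later arrays win, so any key in $overrides replaces the default.
    return array_merge($defaults, $overrides);
}

$options = build_request_options(array('timeout' => 30));

// With PECL_HTTP loaded, the array can be passed straight to the request:
// $http_req = new HttpRequest("http://www.example.com");
// $http_req->setOptions($options);
```

The HttpRequest lines are commented out so the sketch also runs on a machine without the extension.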

Conclusion

As mentioned before, there are many more HttpRequest options. I just covered a few notable ones that I have some limited experience with.

How to: PECL HTTP request exception and error handling

In a previous post, we created a simple PHP page that told us if http://www.example.com is up or down by using the PECL HTTP extension to make an HTTP request to the site and look for the string “example” in the response. Here is the code for our test page, httptest.php:

<?php

$http_req = new HttpRequest("http://www.example.com");
$http_req->setOptions(array('timeout'=>10,'useragent'=>"MyScript"));
$http_req->send();

if (stripos($http_req->getResponseBody(), "example") === false){
    echo "The page is down!";
} else {
    echo "The page is up!";
}

?>

The problem with this code is that there is no error handling. Below are a few examples of what can go wrong and the resulting errors:

DNS Lookup Failure

If the DNS lookup fails the page will return the following error:

Fatal error: Uncaught exception 'HttpInvalidParamException' with message 'Empty or too short HTTP message: ''' in /var/www/httptest.php:12 inner exception 'HttpRequestException' with message 'Couldn’t resolve host name; Couldn’t resolve host 'www.somewebsitethatdoesnotexist.com'
(http://www.somewebsitethatdoesnotexist.com/)' in /var/www/httptest.php:4 Stack trace: #0 /var/www/httptest.php(12): HttpRequest->send() #1 {main} thrown in /var/www/httptest.php on line 12

Since www.example.com is a valid DNS name I used “www.somewebsitethatdoesnotexist.com” instead to demonstrate what happens with an invalid name or failed DNS lookup. Note the “inner exception” that says “Couldn’t resolve host name”. More on “inner exceptions” in a bit. This is not very pretty for a diagnostic page.

Connection Failure

In this example I again used “www.somewebsitethatdoesnotexist.com” but I added the following entry to the /etc/hosts file on the server:

10.10.10.10 www.somewebsitethatdoesnotexist.com

Now the name will resolve using the /etc/hosts file but this is not a valid IP for any machine on my network so I see this error:

Fatal error: Uncaught exception 'HttpInvalidParamException' with message 'Empty or too short HTTP message: ''' in /var/www/httptest.php:12 inner exception 'HttpRequestException' with message 'Timeout was reached; connect() timed out! (http://www.somewebsitethatdoesnotexist.com/)' in /var/www/httptest.php:4 Stack trace: #0 /var/www/httptest.php(12): HttpRequest->send() #1 {main} thrown in /var/www/httptest.php on line 12

Again we have an inner exception buried in all of that telling me that the connection timed out.

404 Error

In this example I put in “http://localhost/notarealpage.php” for the URL. This will connect to the local Apache server, but that page doesn’t exist so the server will return a 404 (file not found) error. The server responded, but since we are not checking the response code our code just tells us the page is down. That is true, but it would be useful to know that it is because the page is missing!

The page is down!

If the server responds OK we will get a 200 status code. We should handle any other response appropriately.

Handle the exceptions

The first thing we can do is put a try-catch block around our code and try catching the HttpException as shown in the example section of the documentation for the HttpRequest::send method:

<?php

$http_req = new HttpRequest("http://www.example.com");
$http_req->setOptions(array('timeout'=>10,'useragent'=>"MyScript"));

try {
    $http_req->send();
} catch (HttpException $ex) {
    echo $ex->getMessage();
}

if (stripos($http_req->getResponseBody(), "example") === false){
    echo "The page is down!";
} else {
    echo "The page is up!";
}

?>

If there is a time out or connection failure the HttpException is caught and we see this:

Empty or too short HTTP message: ''The page is down!

Hmm… that is not very informative and the same error is displayed for both a name lookup failure and a connection timeout. We can also try changing:

echo $ex->getMessage();
to
echo $ex;

Now we get this:

exception 'HttpInvalidParamException' with message 'Empty or too short HTTP message: ''' in /var/www/httptest.php:16 inner exception 'HttpRequestException' with message 'Couldn’t resolve host name; Couldn’t resolve host 'www.ssomewebsitethatdoesnotexist.com'
(http://www.ssomewebsitethatdoesnotexist.com/)' in /var/www/httptest.php:5 Stack trace: #0 /var/www/httptest.php(16): HttpRequest->send() #1 {main}The page is down!

That is painfully ugly but at least we get the “Couldn’t resolve host name” message in there so we know what went wrong. Still, we can do better.

In addition to putting a try-catch around the send() method you probably should surround all of your HttpRequest code with a try-catch that eventually catches “Exception” to be safe.

The undocumented inner exception

The HttpException object, which is not really documented all that well as of this writing, has something completely undocumented called an inner exception. The inner exception is a more detailed exception that is nested inside the HttpException object. We can check if an inner exception is set and if so, display that instead:

<?php

$http_req = new HttpRequest("http://www.example.com");
$http_req->setOptions(array('timeout'=>10,'useragent'=>"MyScript"));

try {
    $http_req->send();
} catch (HttpException $ex) {
    if (isset($ex->innerException)){
        echo $ex->innerException->getMessage();
        exit;
    } else {
        echo $ex;
        exit;
    }
}

if (stripos($http_req->getResponseBody(), "example") === false){
    echo "The page is down!";
} else {
    echo "The page is up!";
}

?>

Now we get just the part we are interested in:

Couldn’t resolve host name; Couldn’t resolve host 'www.ssomewebsitethatdoesnotexist.com'
(http://www.ssomewebsitethatdoesnotexist.com/)

If the lookup is OK but we get a connection timeout we now see this:

Timeout was reached; connect() timed out! (http://www.somewebsitethatdoesnotexist.com/)

If no inner exception is detected the HttpException is echoed.
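The same "dig for the most detailed message" idea can be sketched as a small helper. This is an illustration under a couple of assumptions: innermost_message() is a made-up name, and besides the innerException property it also follows getPrevious(), the nesting mechanism that core PHP exceptions have offered since PHP 5.3. The demonstration uses plain Exception objects so it runs without the PECL_HTTP extension.

```php
<?php

// Walk down a chain of nested exceptions and return the deepest message.
// Checks the PECL_HTTP-style innerException property first, then falls back
// to getPrevious(), which core exceptions have offered since PHP 5.3.
function innermost_message(Exception $ex)
{
    while (true) {
        if (isset($ex->innerException) && $ex->innerException instanceof Exception) {
            $ex = $ex->innerException;
        } elseif ($ex->getPrevious() instanceof Exception) {
            $ex = $ex->getPrevious();
        } else {
            return $ex->getMessage();
        }
    }
}

// Simulate a nested failure with plain exceptions.
$inner = new Exception("Couldn't resolve host name");
$outer = new Exception('Empty or too short HTTP message', 0, $inner);

echo innermost_message($outer); // Couldn't resolve host name
```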

Check status codes

Sometimes the server may be responding but we do not get a 200 status. This could be because of a redirect, security error, missing page, or a 500 server error. The HttpRequest object provides a getResponseCode() method so we can check what the response was and handle it appropriately. If redirects are followed the last received response is used. For this example we will simply echo out some of the common status codes if we don’t get a 200:

<?php

$http_req = new HttpRequest("http://www.example.com/blah");
$http_req->setOptions(array('timeout'=>10,'useragent'=>"MyScript"));

try {
    $http_req->send();
} catch (HttpException $ex) {
    if (isset($ex->innerException)){
        echo $ex->innerException->getMessage();
        exit;
    } else {
        echo $ex;
        exit;
    }
}

$response_code = $http_req->getResponseCode();

switch ($response_code){
    case 200:
        break;
    case 301:
        echo "301 Moved Permanently";
        exit;
    case 401:
        echo "401 Unauthorized";
        exit;
    case 404:
        echo "404 File Not Found";
        exit;
    case 500:
        echo "500 Internal Server Error";
        exit;
    default:
        echo "Did not receive a 200 response code from the server";
        exit;
}

if (stripos($http_req->getResponseBody(), "example") === false){
    echo "The page is down!";
} else {
    echo "The page is up!";
}

?>

This handles a few of the more common response/status codes. To test the code we can put in a valid host but a bogus page (e.g. http://www.example.com/blah). If everything works right we now see the following response from our diagnostic page:

404 File Not Found
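If the list of status codes grows, a lookup table can be easier to maintain than a long switch. The following is one possible sketch, not part of the diagnostic page above: describe_status() is a hypothetical helper, and the fallback wording for each status class is my own.

```php
<?php

// Map a status code to a short description, falling back to the status class.
// describe_status() is a hypothetical helper name.
function describe_status($code)
{
    $known = array(
        200 => '200 OK',
        301 => '301 Moved Permanently',
        401 => '401 Unauthorized',
        404 => '404 File Not Found',
        500 => '500 Internal Server Error',
    );
    if (isset($known[$code])) {
        return $known[$code];
    }
    if ($code >= 300 && $code < 400) {
        return $code . ' (redirect)';
    }
    if ($code >= 400 && $code < 500) {
        return $code . ' (client error)';
    }
    if ($code >= 500 && $code < 600) {
        return $code . ' (server error)';
    }
    return $code . ' (unexpected status)';
}

echo describe_status(404) . "\n"; // 404 File Not Found
echo describe_status(503) . "\n"; // 503 (server error)
```

In the diagnostic page, the value returned by getResponseCode() would be passed to the helper whenever it is not 200.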

Final Notes

Our little diagnostic page can now handle most of the errors it will likely encounter when it attempts to test our example site, example.com. If we wanted to take this a bit further we could add a database back-end that maintains a list of multiple sites we would like to test. Going a step beyond that, we could execute this PHP script from the command line via a cron job that runs every 5 minutes. We could then have the script send an e-mail when a site was down with the problem indicated in the message. If we wanted to maintain some uptime stats we could log the outages to a database and generate uptime/SLA reports daily, weekly, yearly, etc.

In reality, I would just use something like IPSentry or Nagios to conserve effort for future generations but it was nice to think about. ;) Happy coding!

How to: Find your php.ini file and enable PHP error reporting

On some distributions PHP error reporting or display of errors is disabled by default as a security precaution. This is a good idea for production systems because errors may reveal useful information to undesirables. In a development environment on the other hand, it is generally useful to see your errors. ;) If error display is disabled you may just see a blank browser window/empty page when you expect an error. To enable errors and error display, find your php.ini file and make sure the following lines are set:

;Show all errors, except for notices and coding standards warnings
error_reporting = E_ALL & ~E_NOTICE

display_errors = On

On Ubuntu you can find the php.ini file here:
/etc/php5/apache2/php.ini

On other distributions try:
/etc/php.ini

On Windows you might find it here:
c:\windows\php.ini

If you are running XAMPP it will be in the php folder off the XAMPP root.

You will need to restart Apache (or IIS as the case may be) so your changes will take effect:

On Ubuntu:

sudo /etc/init.d/apache2 restart

On other distributions you might try:

sudo /etc/init.d/httpd restart
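
If you are not sure which php.ini is in use, or you cannot edit it, PHP can help from inside a script. The sketch below uses the built-in php_ini_loaded_file() function to report the loaded file and then applies the same two settings at runtime for the current script only. Keep in mind that runtime settings cannot reveal parse errors in the script that sets them, since that script never starts running.

```php
<?php

// Ask PHP which php.ini it actually loaded (false if none was loaded).
$ini_path = php_ini_loaded_file();
echo 'Loaded php.ini: ' . ($ini_path === false ? '(none)' : $ini_path) . "\n";

// Turn on error display for this script only, without editing php.ini.
error_reporting(E_ALL & ~E_NOTICE);
ini_set('display_errors', '1');
```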