How to use the PECL HTTP (PECL_HTTP) Extension to make HTTP requests from PHP

PECL HTTP is a feature-rich PHP extension that allows you to make HTTP and HTTPS (SSL) requests from your PHP code and handle the responses. If you are not familiar with PECL, it is a library of extensions to add functionality to PHP. It uses the same package and delivery system as PEAR.

Many distributions do not install PECL_HTTP by default even if you install PHP. If you try to use one of the PECL_HTTP objects or functions (e.g. http_get()) without the extension installed, you will likely get something like this:

Fatal error: Call to undefined function http_get()

If this error comes up but you think you installed the PECL_HTTP package and it shows up in phpinfo(), then it is possible your PECL_HTTP install failed and did not get cleaned up so phpinfo() still sees it. This may happen if you didn’t install the cURL source library dependency first (see below).
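A quick way to confirm that the extension is really usable, rather than just listed in phpinfo(), is to ask PHP directly. This is a minimal sketch, assuming the version 1 API used in this post (http_get() and the HttpRequest class); the helper name pecl_http_status() is made up for illustration.

```php
<?php

// Hypothetical helper: reports whether the pieces of PECL_HTTP used in this
// post are actually callable, not just listed as installed.
function pecl_http_status()
{
    return array(
        'extension_loaded' => extension_loaded('http'),
        'http_get'         => function_exists('http_get'),
        'HttpRequest'      => class_exists('HttpRequest'),
    );
}

// Print one yes/no line per check.
foreach (pecl_http_status() as $check => $ok) {
    echo $check . ': ' . ($ok ? 'yes' : 'no') . "\n";
}
```

If any line prints "no", the extension is probably not loaded into the PHP that served the page, whatever the package manager says.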

So let’s pretend we own the site http://www.example.com (see RFC 2606). We want to build a PHP diagnostic page that checks whether www.example.com returns the string “example” somewhere in the page and reports whether the page is up or down. First we need to install the PECL_HTTP extension. For details on how to install a PECL extension see my post “How to install a PHP PECL extension/module on Ubuntu”. For now I am going to assume that the php-pear and php5-dev packages have already been installed. These instructions are based on an Ubuntu install:

  • Install the libcurl3-openssl-dev package. The PECL_HTTP extension requires some of the cURL source libraries so we will have to install the cURL library package first:
    sudo apt-get install libcurl3-openssl-dev

    If you don’t install the cURL source library package you will likely see the following error when you attempt to install the PECL_HTTP extension:

    checking for curl/curl.h... not found
    configure: error: could not find curl/curl.h
    ERROR: ‘/tmp/pear/temp/pecl_http/configure --with-http-curl-requests --with-http-zlib-compression=1 --with-http-magic-mime=no --with-http-shared-deps’ failed
  • Install the PECL_HTTP module with the following command:
    sudo pecl install pecl_http

    The installer may ask you about some specific options but unless you really know what you want, you can probably just hit enter one or more times to accept all the defaults. If all goes well, the module should download, build, and install.

  • Once the install is complete, it will probably ask you to add an “extension=http.so” line to your php.ini file. Open up the php.ini file in your favorite text editor and add the line under the section labeled “Dynamic Extensions”. On Ubuntu the php.ini file is typically located in the /etc/php5/apache2 folder:
    sudo nano /etc/php5/apache2/php.ini
  • Now that the php.ini file has been updated, Apache will need to be restarted so the new extension will be loaded:
    sudo /etc/init.d/apache2 restart

    That should restart Apache on Ubuntu but if that doesn’t work you can try:

    sudo /etc/init.d/httpd restart

At this point hopefully the PECL_HTTP extension is installed, so now we can create a PHP script that will make an HTTP request to http://www.example.com and display the results. For this example I will use the http_get() function, whose first argument is a string containing the URL. I created a file named httptest.php (using “sudo nano /var/www/httptest.php”) with the following code in the /var/www folder (the default HTTP root on an Ubuntu server):

<?php

echo http_get("http://www.example.com");

?>

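Or you could use the http_request() function instead to do the same thing. Its first argument is a predefined constant for the request method (HTTP_METH_GET, HTTP_METH_POST, etc.) and its second argument is a string containing the URL: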

<?php

echo http_request(HTTP_METH_GET,"http://www.example.com");

?>

When the page is opened in a web browser (e.g. http://sparky/httptest.php) it returns something like this:

HTTP/1.1 200 OK Date: Sun, 04 Jan 2009 22:41:54 GMT Server: Apache/2.2.3 (CentOS) Last-Modified: Tue, 15 Nov 2005 13:24:10 GMT ETag: "b80f4-1b6-80bfd280" Accept-Ranges: bytes Content-Length: 438 Connection: close Content-Type: text/html; charset=UTF-8

You have reached this web page by typing "example.com", "example.net", or "example.org" into your web browser.

These domain names are reserved for use in documentation and are not available for registration. See RFC 2606, Section 3.

That’s it. Those are some pretty quick one-liners if we are fine with the default options. This time we’ll do something similar but use the HttpRequest object instead and set a timeout and a different user agent:

<?php

$http_req = new HttpRequest("http://www.example.com");
$http_req->setOptions(array('timeout' => 10, 'useragent' => 'MyScript'));
$http_req->send();
echo $http_req->getRawResponseMessage();

?>

The output is the same as the previous two commands but this time the server could have taken up to ten seconds to respond before the request timed out. In addition, we sent the string “MyScript” to the server in the User-Agent header. If you don’t want the HTTP response headers included in the output, you can use the getResponseBody() method instead:

<?php

$http_req = new HttpRequest("http://www.example.com");
$http_req->setOptions(array('timeout' => 10, 'useragent' => 'MyScript'));
$http_req->send();
echo $http_req->getResponseBody();

?>

This outputs:

You have reached this web page by typing "example.com", "example.net", or "example.org" into your web browser.

These domain names are reserved for use in documentation and are not available for registration. See RFC 2606, Section 3.

No response headers this time. In fact, it looks as though you typed http://www.example.com in the browser.
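To make the difference between the raw message and the body a little more concrete, here is a rough sketch that splits a raw response by hand. It does not use PECL_HTTP at all; the $raw string is a shortened, hard-coded stand-in for what getRawResponseMessage() returned above, and the variable names are my own.

```php
<?php

// A shortened, hard-coded stand-in for a raw HTTP response message.
$raw = "HTTP/1.1 200 OK\r\n"
     . "Content-Type: text/html; charset=UTF-8\r\n"
     . "Connection: close\r\n"
     . "\r\n"
     . "<html><body>example</body></html>";

// Headers and body are separated by the first blank line (CRLF CRLF).
list($head, $body) = explode("\r\n\r\n", $raw, 2);

// The first header line is the status line: protocol, code, reason phrase.
$lines = explode("\r\n", $head);
list($protocol, $code, $reason) = explode(' ', $lines[0], 3);

echo "Status: $code\n";
echo "Body: $body\n";
```

In practice getResponseBody() does this work for you, so the sketch is only meant to show what is being stripped away.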

I could set some URL query parameters using the setQueryData() if example.com was a dynamic page that accepts arguments but I am pretty sure it does not.
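To show what such query parameters would look like in the URL, here is a small sketch that uses PHP’s built-in http_build_query() rather than PECL_HTTP itself. The parameter names (q and page) are invented for the example; example.com does not actually accept them.

```php
<?php

// Hypothetical query parameters; example.com does not really use these.
$params = array('q' => 'pecl http', 'page' => 2);

// http_build_query() URL-encodes each value and joins the pairs with "&".
$url = 'http://www.example.com/?' . http_build_query($params);

echo $url . "\n"; // http://www.example.com/?q=pecl+http&page=2
```

With the extension installed, the same array should be something you can hand to setQueryData() instead of building the URL by hand.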

For the purpose of our example it doesn’t look like we have gotten very far but PHP is now getting a hold of the response data before we see it so we are halfway there. Now all we need to do is search for the string “example” and return some type of indicator letting us know that the example.com page is up or down:

<?php

$http_req = new HttpRequest("http://www.example.com");
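$http_req->setOptions(array('timeout' => 10, 'useragent' => 'MyScript'));
$http_req->send();

/*Note: The stripos() function returns false if it doesn't find an upper or lower case version of the string we are looking for. It returns 0 when the match is at the very start of the body, so we compare against false with === instead of using the ! operator.*/
if (stripos($http_req->getResponseBody(), "example") === false){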
    echo "The page is down!";
} else {
    echo "The page is up!";
}

?>

If everything is working correctly we will see:

The page is up!

If example.com is broken or doesn’t return the string “example” our test page returns:

The page is down!
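One detail of the stripos() test is worth a closer look: stripos() returns the position of the match, and position 0 is treated as false by a loose check, so a body that starts with the search string would be reported as down unless the result is compared strictly against false. The sketch below illustrates that, and also treats a thrown exception as “down”. It is only a sketch under my own assumptions: body_contains() and page_status() are made-up helper names, and the fetcher is a stand-in closure so the code runs without the extension installed.

```php
<?php

// Hypothetical helper: returns true when $needle appears anywhere in $body,
// ignoring case. The strict !== false comparison matters because stripos()
// returns 0 (which is falsy) when the match is at the very start.
function body_contains($body, $needle)
{
    return stripos($body, $needle) !== false;
}

// Hypothetical wrapper: $fetch is any callable that returns the page body
// as a string. A thrown exception is treated the same as a page that is down.
function page_status($fetch, $needle)
{
    try {
        $body = call_user_func($fetch);
    } catch (Exception $e) {
        return "The page is down!";
    }
    return body_contains($body, $needle) ? "The page is up!" : "The page is down!";
}

// Stand-in fetcher so the sketch runs without the extension installed.
$fake_fetch = function () {
    return "Example Domain";
};

echo page_status($fake_fetch, "example") . "\n"; // The page is up!
```

In a real page the closure would create the HttpRequest, call send(), and return getResponseBody().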

This is great and all but you may have noticed there is no error handling to speak of which isn’t good. I will talk about HTTP request/PECL_HTTP error handling in a separate post but until then, happy HTTPing!

How to install a PHP PECL extension/module on Ubuntu

PHP PECL extensions provide additional functionality over the base PHP install. You can browse the available PHP PECL extensions at the PECL repository. The following steps show how to install a PECL extension/module on Ubuntu using the PECL_HTTP extension as an example and assume that you already have Apache 2 and PHP 5 installed:

  • First, you will need to install PEAR via apt-get to get the necessary package and distribution system that both PEAR and PECL use. From a shell prompt enter:
    sudo apt-get install php-pear

    You will be prompted to confirm the install. Just press “y” and enter. If all goes well you should see it download and install the php-pear package.

    Note: “sudo” is used to provide the super user privileges necessary for the following command. So in this case the command “apt-get install php-pear” is being executed with super user privileges by preceding it with “sudo”. Unless configured otherwise, you will normally be prompted to enter a password when you use sudo. This is usually the same password that you logged in with.
  • Now you will need to install the php5-dev package to get the necessary PHP5 source files to compile additional modules. Enter the following from a shell prompt:
    sudo apt-get install php5-dev

    If you do not install the php5-dev package and try to install a PECL extension using “pecl install”, you will get the following error:

    sh: phpize: not found
    ERROR: `phpize’ failed
  • The PECL_HTTP extension requires an additional dependency package to be installed. You can probably skip this for other extensions:
    sudo apt-get install libcurl3-openssl-dev
  • Now we are finally ready to actually install the extension. From a shell prompt enter the following but substitute “pecl_http” with the name of the PECL extension you are installing:
    sudo pecl install pecl_http

    The installer may ask you about some specific options for the extension you are installing. You can probably just hit enter one or more times to accept all the defaults unless you want to set specific options for your implementation. If all goes well, the module should download, build, and install.

  • Once the install is complete, it will probably ask you to add an “extension=” line to your php.ini file. Open up the php.ini file in your favorite text editor and add the line under the section labeled “Dynamic Extensions”. On Ubuntu the php.ini file is typically located in the /etc/php5/apache2 folder:
    sudo nano /etc/php5/apache2/php.ini

    In this example, the pecl_http extension install asked me to add “extension=http.so”.

  • Now that the php.ini file has been updated, Apache will need to be restarted so the new extension will be loaded:
    sudo /etc/init.d/apache2 restart

    That should restart Apache on Ubuntu but if that doesn’t work you can try:

    sudo /etc/init.d/httpd restart

If all went well your PECL extension should now be working. You might want to write a PHP test page that will test the basic functionality of the extension to make sure everything is working OK. You can use this later to check that all your required extensions are installed and working when you deploy to a new server. In this example where I installed the PECL_HTTP module, I might write a PHP page that uses the extension’s HttpRequest class to go get a page and return the results.
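As a starting point for such a test page, here is a minimal sketch that only checks whether a list of extensions is loaded; it does not exercise their functionality. The list in $required and the helper name missing_extensions() are assumptions for illustration, so substitute whatever your application needs.

```php
<?php

// Hypothetical checklist; list whatever extensions your application needs.
$required = array('http', 'curl');

// Returns the names from $required that PHP does not currently have loaded.
function missing_extensions($required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

$missing = missing_extensions($required);
echo $missing
    ? 'Missing: ' . implode(', ', $missing) . "\n"
    : "All required extensions are loaded.\n";
```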

That’s it. For the next extension install you can skip the steps to install the php-pear and php5-dev packages.

Book Review: Smart and Gets Things Done: Joel Spolsky’s Concise Guide to Finding the Best Technical Talent

3 out of 5 stars

In my previous post I reviewed Joel Spolsky’s Joel on Software:… (I will spare you the full title). In this review I will be talking about one of the followup books to that, Smart and Gets Things Done: Joel Spolsky’s Concise Guide to Finding the Best Technical Talent. There are also a couple other Joel books I have not read yet that are worth noting:

A lot of the content in Smart and Gets Things Done seems to overlap with Joel’s other books. I understand that most of the books are just a rehash of his blog but I guess I was a little disappointed that there was duplication of content between his books. For about $12, however, this book is still a pretty good value, particularly if you don’t already have his other books.

In this book, Joel explains how to get the best programmers but it seems to lean more towards hiring the best college grads. Joel argues that the best programmers are so highly productive that it is worth the extra pay and effort to bring them in as opposed to a mediocre programmer. I agree with this to some degree. I am not sure I totally agree with some of his techniques for evaluating who the best programmers are though.

Joel is an advocate of spending some extra dollars on perks for his interviewees and employees. One example is that he has a uniformed limo service pick up interviewees at the airport. At first this sounds ludicrous but the more I thought about it, the more it started to make sense. If you have gone through the labor of filtering all those resumes and conducting all those phone interviews to find the best candidates to interview in person then perhaps it is worth it to give them the treatment to sell the job. I checked and uniformed limo service from JFK to Manhattan runs less than $150. Considering a typical NYC IT salary, that is a drop in the bucket if it will help land a top notch programmer.

The book mentions purchasing a $900 chair for employees. I am not sure I could sit in a $900 chair without feeling like a total snob but a $300 one seems to make sense. A comfortable programmer is likely a productive one. Nothing says you don’t care about your employees like an old, broken, stained, $100 office chair. The other thing the book mentions is giving employees dual monitors and top end computers. I think this is good advice and it probably isn’t as expensive as you might think. For example, say you spend less than $1000 to buy a new office chair, second monitor, dual head video card, and an extra 2GB stick of RAM for each employee and you expect those extras to last three years. That is less than $28 a month per employee ($1000 spread over 36 months). I think that is a small price to pay for happy employees that feel valued. If that chair is super comfy you might even make up the cost in overtime work because they won’t be in such a hurry to get out of the chair at the end of the day. ;)

There are a few things in the book that I disagree with and this might be just because I don’t have enough experience to know better yet. The first is that the book says incentive pay doesn’t work. I disagree. I already talked about this in my review of Joel on Software and I won’t delve into it further here.

Another item in the book I don’t agree with is the concept that you don’t need an idea to build a software company. I suppose you don’t but it probably helps! I don’t have an MBA but I think that it is good business practice to identify a discrepancy or problem and build a product to fulfill it. Good programmers are great and all but I don’t think “best working conditions” + “best programmers” + “best software” always equals profit. You could build the best software but it won’t be profitable if the market is too small (you need the best sales people to pull that off). Perhaps I just misunderstood the first part of chapter 1.

One of the suggested interview questions to help separate the wheat from the chaff is a pointer recursion question. I think it is more important to evaluate the skills that the interviewee claims to have. If they put in their resume that they have experience in a specific language then ask them to write something in that language. An outstanding web developer may never have touched pointers before because they simply never had to. Yes, there are leaky abstraction cases but typically those result in looking something up on Google rather than re-writing a module in C. Also, just because someone understands recursion and pointers doesn’t mean that they will be the best programmer for the job. They might have no understanding of object oriented languages because all they have used is a procedural language such as C, although admittedly that is less likely in this day and age.

The book goes on to say that understanding pointers in C is more of an aptitude than a skill. Over the last year I have had a crash course in C/C++ and based on my own experience I would argue that it is not the case that only a few people have an aptitude for pointers. The problem is that pointers are often poorly explained in many references and the syntax is a bit deceiving because the * symbol has a completely different meaning depending on whether you are declaring a pointer or using it (“this is a pointer” and then “dereference this pointer”). If 90% of a computer science class isn’t getting something (as mentioned in the book) then I would say the professor should consider a different instruction strategy. Fortunately I had a pretty good instructor.

Despite some of my misgivings I think this book is worth the money especially if you don’t have any of the Joel on Software books already. There are many helpful tips including where to go looking for candidates, how to help employees feel at home in your organization, and how to turn around an existing team that is on the rocks.

Book Review: Joel on Software…

4 out of 5 stars

The full book title is actually Joel on Software: And on Diverse and Occasionally Related Matters That Will Prove of Interest to Software Developers, Designers, and Managers, and to Those Who, Whether by Good Fortune or Ill Luck, Work with Them in Some Capacity.

I felt the title was a bit too long for my blog post title so please excuse the abbreviated version. ;) Other than the title and a few other points which I will cover shortly, I think it is a very good book written by someone who obviously has years of experience in the software industry. The author is Joel Spolsky and the content mostly consists of a series of short essays from his blog at http://www.joelonsoftware.com. Although you can read most of the content on his blog I think it is worth owning the printed book.

There are some technical sections in the book but it mostly focuses on software development and management in general. This is the kind of advice you would get from an experienced mentor. Of particular interest is the Joel Test. This is a list of 12 yes or no questions regarding things your software team should be doing to produce better code. The more items for which you can answer yes, the higher your team’s score. In my experience, such as it is, this is a pretty good test but there are a few things that I feel are missing in the case of web application development. More on this shortly.

Joel talks about five worlds of software development:

  1. Shrinkwrap
  2. Internal
  3. Embedded
  4. Games
  5. Throwaway

I never really thought about it this way but it makes a lot of sense to do so. Depending on the type of development you are doing you will have wildly different priorities. As an example, Joel points out that embedded software often can never be upgraded, so its quality requirements are much higher than those of throwaway software that will only be used once to massage an input file. An awareness of the business model you are developing for should help sort your priorities.

There were a few things in the book I disagreed with. One example is that Joel believes that incentive pay is harmful. I think that managed correctly, it is not. One way to handle this is to simply reward employees privately for going above and beyond. In other words, don’t tell employees you are going to reward them for doing X or you will run into the kind of problems that Joel describes. Instead, just thank them with some reward after the fact and explain that this won’t always be the case so there is no expectation or set formula for them to work around. This will make employees feel more appreciated for going the extra mile and they will likely continue to do so even if there is no guarantee they will be rewarded again. This is kind of a pay it forward incentive.

I think employee ownership in the company is another good incentive. An individual employee may not have much control over the stock price of the company but it does provide the employee some justification to themselves as to why they should go above and beyond. After all, if you are hiring smart people, they will be smart enough to ask themselves why they are putting in that extra overtime to improve the quality of your product. Unless you are managing a company that feeds starving children in Africa, smart employees will feel better about working overtime on a weekend if they are benefiting in some remote way and not just making some rich stock holders or owners even richer while their own salary remains flat.

There are a few items that I think are missing from the Joel Test that are important for the web application world:

  • Are high availability, reliability, scalability, and performance part of your spec?
  • Do you load test?
  • Do you write automated unit and integration tests?

High Availability, Reliability, Scalability, and Performance

I have a little over a year of web application development experience but I have been supporting web applications for well over ten years now and if nothing else I have learned that you need to plan for availability, reliability, scalability and performance:

  • High availability ensures that your web application will be available when your users attempt to access your web application.
  • Reliability specifies how often your web application will work correctly. Just because your application is highly available doesn’t necessarily mean it always works right.
  • Scalability planning ensures that if your site is suddenly Slashdotted you can quickly add capacity or if your user base grows you can shard accounts across multiple database servers. A common bottleneck in web applications is data writes and you can get to a point where no single server will have enough write throughput for a high volume web site. Since this is a bottleneck on the database side, adding application servers is useless. A spec and/or design document needs to include a plan for eventually distributing data across multiple database servers for DB write intensive applications. It is not unheard of for a dot com to re-engineer a good portion of their software base in a big hurry to handle growth. Rushed software updates will likely create disgruntled employees and a poor user experience.
  • Performance indicates how quickly a web application will respond to a request. This should include internet lag time and take into consideration where users will be accessing the site from. If you are going to have a large user base in Australia, for example, it might be a good idea to consider implementing a content delivery network with a point of presence in Australia.
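To give a rough idea of what sharding accounts across database servers can mean in code, here is a deliberately simple sketch. The host names are placeholders and shard_for() is a hypothetical helper; it uses plain modulo on a numeric account ID, which is only one of several possible schemes.

```php
<?php

// Hypothetical shard list; the host names are placeholders.
$shards = array('db0.example.com', 'db1.example.com', 'db2.example.com');

// Picks a database server for an account by taking the account ID modulo
// the number of shards, so writes are spread across all servers.
function shard_for($account_id, $shards)
{
    return $shards[$account_id % count($shards)];
}

echo shard_for(42, $shards) . "\n"; // db0.example.com
```

The catch with plain modulo is that adding a server later changes where most accounts map to, which is one reason this kind of plan belongs in the design document early rather than being bolted on in a hurry.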

I think it is important to include very specific numbers for each of these items in the spec because each will strongly determine the level of effort and ultimately cost of a project. As you move closer to 100% availability and reliability, costs will likely go up exponentially for both development time and/or hardware. Scalability planning needs to be in the spec so design time can be allocated for it on the project plan. Each incremental improvement will likely require more development time and/or hardware.
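As an illustration of why the specific number matters, the following sketch converts an availability target into the hours of downtime it allows per year. The function name allowed_downtime_hours() is my own, and the calculation assumes a 365-day year with no allowance for scheduled maintenance.

```php
<?php

// Hours of downtime per 365-day year allowed by an availability target
// given as a percentage (for example 99.9).
function allowed_downtime_hours($availability_percent)
{
    return round((100 - $availability_percent) / 100 * 365 * 24, 2);
}

// Print the yearly downtime budget for a few common targets.
foreach (array(99, 99.9, 99.99) as $target) {
    echo $target . '% -> ' . allowed_downtime_hours($target) . " hours per year\n";
}
```

Going from 99% to 99.9% shrinks the yearly budget from roughly 87.6 hours to roughly 8.76 hours, which is the kind of jump that changes both the design and the hardware bill.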

Load Testing

Load testing is critical to assessing performance bottlenecks in web applications. Your application’s performance may be spectacular with one user but what about 5000 users generating 500 requests a second to your database driven web application? It is good to know before it happens so you can plan accordingly and verify you are actually bottlenecking on over utilized hardware and not a “false bottleneck”. I define a false bottleneck as a situation where none of your server hardware is fully utilized and yet you can’t get any more throughput. This can be caused by a number of things such as a propagation delay, uncompressed images and JavaScript using up all the network bandwidth, or even something like a sleep(2) that someone put in the code for debugging and forgot to remove.

I believe optimizing your code without load test data can be a time sink. If you are optimizing your code without any data you might spot a Shlemiel the Painter’s Algorithm and then spend six hours fixing it, resulting in a few microseconds of execution time saved. That sounds great but if you had done some load testing and monitored your stats, you might have noticed that a table scan on your database server is costing you over half a second on each transaction and could be fixed in 10 minutes with a single create/alter index statement. Load testing helps show your actual bottlenecks versus your perceived ones.

Automated Unit and Integration Tests

Automated testing is essential for the same reasons as a daily build… to ensure that no one has broken anything. You can use automated unit and integration testing to do a full regression test of your software every day or even prior to every check-in. Daily builds will just identify if anyone has broken the build while automated testing will tell you if everything still works like it should.

Unit tests focus on individual functions or units of code while integration tests verify that your application’s components are all working together per the requirements. A unit test, for example, will tell you if your circleArea() function is returning the correct value for given inputs. An integration test will tell you if data submitted via a form is being stored in the database correctly. Unit tests are good at helping you identify broken units of code at a more granular level while integration tests evaluate the overall functionality of your system as a whole. There are unit testing frameworks to facilitate authoring unit tests for all popular programming languages. In many cases unit testing frameworks can also be effectively implemented to do integration testing as well.
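To make the circleArea() example concrete, here is a minimal sketch of what a unit test for it might look like. It is hand-rolled so that it runs on its own; a real project would more likely use a testing framework. The implementation of circleArea() and the check_close() helper are my own assumptions for illustration.

```php
<?php

// The function under test, as named in the text above.
function circleArea($radius)
{
    return M_PI * $radius * $radius;
}

// Minimal hand-rolled check; a real project would use a testing framework.
// Floats are compared with a small tolerance rather than ===.
function check_close($expected, $actual, $label)
{
    $ok = abs($expected - $actual) < 1e-9;
    echo ($ok ? 'PASS' : 'FAIL') . ": $label\n";
    return $ok;
}

check_close(0.0, circleArea(0), 'radius 0 has zero area');
check_close(M_PI, circleArea(1), 'radius 1 has area pi');
check_close(4 * M_PI, circleArea(2), 'radius 2 has area 4*pi');
```

An integration test would look different: it would submit a form and then query the database to confirm the row was stored, which needs a running application rather than a single function.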

There is some debate over the line between unit testing and integration testing but I honestly don’t care too much. The ideal goal is that you can execute a single command to do a full regression test of your entire solution. This initially increases your development time but will more than make up for itself over the long run if your software is going to be around for a while and have many revisions. I have seen a few projects where fixing one bug introduces another or adding a new feature breaks an existing feature. Automated testing will bring these cases to light quickly before you deploy or get too far down the development cycle.

Conclusion

Despite a few omissions and a few things that I disagreed with, I give this book 4 out of 5 stars. Joel obviously has a lot of experience and I learned a lot. Even if you are an experienced developer you might find some valuable insights in this book.