PHP4 memory usage

Mar. 9th, 2006 | 11:23 am
mood: awake
music: PJ Harvey

This is a complex topic, so I decided to write a practical guide. If it's useful, I will probably post it somewhere more accessible.

Consider the following script:
print memory_get_usage() . "\n";
$a = "a";
print memory_get_usage() . "\n";
$b = 1;
print memory_get_usage() . "\n";
$c = "b";
print memory_get_usage() . "\n";

The output is
13912
13984
14032
14088

The readings go up by 72, 48 and 56 bytes as $a, $b and $c are assigned. 56 bytes is the standard memory usage of a PHP variable, regardless of whether it holds an integer, a float or a string, although individual readings can be a little off, as the 48 bytes for $b shows. $a appears to use 72 bytes because of some sort of startup cost. Of course, longer strings will need extra space.
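
As a quick check, the per-assignment cost can be worked out by subtracting each reading from the next. The sketch below only redoes that arithmetic on the four numbers printed above; the names $readings and $deltas are mine, chosen for illustration.

```php
<?php
// The four readings printed by the script above (PHP 4 figures).
$readings = array(13912, 13984, 14032, 14088);
$names = array('$a', '$b', '$c');

// Cost of each assignment = this reading minus the previous one.
$deltas = array();
for ($i = 1; $i < count($readings); $i++) {
    $deltas[$names[$i - 1]] = $readings[$i] - $readings[$i - 1];
}

foreach ($deltas as $name => $bytes) {
    print "$name: $bytes bytes\n";
}
// $a: 72 bytes
// $b: 48 bytes
// $c: 56 bytes
```

Going by the numbers as listed, the step for $b comes out at 48 bytes rather than 56.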

This has a serious impact when you have large amounts of data, where each element is small. For example, an array of integers
array(1, 2, 3)

takes 264 bytes to create. In C, the same data could be stored with the same precision in 12 bytes (three 32-bit words), or in as little as 3 bytes if one byte per value is enough. That's the price of flexibility.

What does this mean? It means that a lot of memory can be saved by packing data into single PHP variables, such as strings. A serialized array will usually take considerably less space than the array itself, because it is a single PHP variable holding a single string. In fact, it takes only 56 bytes to store the serialized version of the 264-byte array above.
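
Here is a minimal sketch of the packing idea, using the standard serialize() and unserialize() functions. The variable names are mine, and I have left out memory figures on purpose: the byte counts in this post were measured on PHP 4 and other versions will report different numbers.

```php
<?php
// Illustrative only: shows the packing round trip, not exact memory costs.
$data = array(1, 2, 3);

// Pack the whole array into a single string variable.
$packed = serialize($data);
print $packed . "\n";          // a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}
print strlen($packed) . "\n";  // 30

// Unpack only when the values are actually needed.
$restored = unserialize($packed);
print_r($restored);
```

The trade-off is CPU time: every access to the packed data needs an unserialize() call, and the full array exists again for as long as $restored is alive.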

Array overhead


When recording memory usage, remember that arrays are implemented as hash tables. Every now and then, adding an element to an array will cause an additional chunk of memory to be allocated, rather than the usual 56 bytes. As a result, arrays use around 60-62 bytes per element.

Demonstration program:
$array = array();
$i = 0;
$last_mem = null;
$cur_mem = null;
$diff = null;
print memory_get_usage() . "\n";
$last_mem = memory_get_usage();
while ($i < 1000) {
  $array[] = $i;
  $cur_mem = memory_get_usage();
  $diff = $cur_mem - $last_mem;
  $last_mem = $cur_mem;
  if ($diff == 56) {
    print ".";
  } else {
    print "\n$diff ";
  }
  $i++;
}
print "\n";
print memory_get_usage() . "\n";

And the output:
16784

40 
40 ......
88 .......
120 ...............
184 ...............................
312 ...............................................................
568 ...............................................................................................................................
1080 ...............................................................................................................................................................................................................................................................
2104 .......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
76864


Each dot represents an additional 56 bytes. Any increase which was not 56 bytes is shown as the actual byte increase. I assume that these jumps are increases in the hash table size. The first two elements cost only 40 bytes to add, interestingly.
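
If you subtract the usual 56 bytes from each jump after the first two, the extra cost doubles every time: 32, 64, 128 and so on up to 2048 bytes. That fits a table that doubles in size whenever it fills up. The sketch below is a guess at such a model, not PHP's actual allocator: it assumes the table starts with 8 slots, doubles when full, and pays 4 bytes per new slot. The function name and all three parameters are my own assumptions.

```php
<?php
// Hypothetical model: 8 initial slots, doubling when full,
// 4 bytes per new slot, on top of the usual 56 bytes per element.
function predict_jumps($elements, $initial_slots = 8, $slot_bytes = 4, $element_bytes = 56) {
    $jumps = array();
    $slots = $initial_slots;
    for ($n = 1; $n <= $elements; $n++) {
        if ($n > $slots) {
            // Table is full: doubling it adds $slots new slots.
            $jumps[$n] = $element_bytes + $slots * $slot_bytes;
            $slots *= 2;
        }
    }
    return $jumps;
}

$jumps = predict_jumps(1000);
foreach ($jumps as $n => $bytes) {
    print "element $n: $bytes bytes\n";
}
// element 9: 88 bytes
// element 17: 120 bytes
// element 33: 184 bytes
// element 65: 312 bytes
// element 129: 568 bytes
// element 257: 1080 bytes
// element 513: 2104 bytes
```

These predicted sizes match the jumps in the output above from 88 onwards. The model says nothing about why the first two elements cost only 40 bytes.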


The following section was not written by me; I found it here:

Beware of Infinity


"The only major error I know of in the PHP documentation is where the database documents say that you don't have to free result sets after use, since PHP will free the memory automatically when the script terminates. This is bad advice all around, but especially bad for libraries.

Always free large amounts of memory when you're through using it. Database result sets are a good example. (Use the preferred function, such as pg_free_result().) Variables that end up containing large arrays or long strings should be unset() when you know you won't need them again.

This seems to contradict what I said earlier about not having to worry about pointers and memory management, but it's important. No matter how big and powerful newer computers get, they're still working with limited resources. And you never know how many times a particular script might legitimately call your class methods. You might be surprised how quickly repeated database result sets can eat up all the memory on a web server. Large arrays and long string values aren't as bad because the memory is garbage collected when they go out of scope, but keep your eyes open for problems. Database connections can cause trouble as well, since most database servers have a limit to the number of connections that can be made. If possible, reuse the same connection. If not, be sure to close the connection when you're done with it."
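
The advice in the quote about large variables can be tried out with a rough sketch like the one below. The array is just a stand-in for a big result, the variable names are mine, and the exact byte counts depend on the PHP version and build. For a database result you would call the driver's own function instead, such as the pg_free_result() mentioned in the quote.

```php
<?php
// Sketch only: measures memory before, during and after holding a large array.
$before = memory_get_usage();

// Stand-in for a large result: an array of 100,000 integers.
$rows = range(1, 100000);
$during = memory_get_usage();

// Release the data as soon as it is no longer needed,
// instead of waiting for the script to end.
unset($rows);
$after = memory_get_usage();

print "held by the array: " . ($during - $before) . " bytes\n";
print "released by unset(): " . ($during - $after) . " bytes\n";
```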
