Unpacking binary data in PHP
Reading and writing this kind of binary data was traditionally done in languages like C or even assembler, but most higher-level languages have these capabilities too, and yes, even PHP: meet pack and unpack. If you are just like me, you are curious enough to want to know how it works, and you might even pick up an optimization trick or two along the way. When dealing with binary data in PHP, there are two main functions that you cannot live without.
The pack function takes a format string and a list of values and converts them into a binary string; unpack takes a format string and a binary string and converts it into an array. Both work more or less the same way and use the same format codes. With unpack you have to add a key name after each format code, since the output is an associative array. The binary format for PNG files is documented on the internet. When viewing a PNG file in a hex viewer or editor, you will see that the first 8 bytes are always the same.
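To make the difference between the two functions concrete, here is a minimal round trip. The values and the key names (short, long, byte) are made up for illustration; the format codes n, N and C are the standard ones for a big-endian 16-bit value, a big-endian 32-bit value and a single unsigned byte.

```php
<?php
// pack(): PHP values in, binary string out.
// 'n' = unsigned 16-bit big-endian, 'N' = unsigned 32-bit big-endian, 'C' = unsigned byte.
$binary = pack('nNC', 0x1234, 0x01020304, 255);

// unpack(): binary string in, associative array out.
// Every format code is followed by the key it should be stored under,
// and the fields are separated by '/'.
$fields = unpack('nshort/Nlong/Cbyte', $binary);

echo bin2hex($binary), PHP_EOL; // 123401020304ff
print_r($fields);               // short => 4660, long => 16909060, byte => 255
```

Note that the keys are entirely your own choice; unpack only cares about the format code in front of each one.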
You cannot really read it, so we have to unpack the data from it. As an example, you can check that the first byte is actually 0x89 (the high bit is set) and that the next three bytes spell PNG. You should check the other bytes of the signature as well. After the signature, the file consists of chunks, and each chunk is formatted the same way: a 4-byte length, a 4-byte chunk type, the chunk data, and a 4-byte CRC. Before reading the chunk data, we must read the chunk length. So the first thing we do is read the first 8 bytes of the chunk, or 2 dwords actually: the length and the type. Once you can read one chunk, you can read them all. Depending on the chunk type, you can actually unpack the data and display or use that information as well.
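The signature check and the chunk loop described above could look roughly like the sketch below. It is not the original code from this post: the helpers makeChunk and readPngChunks are hypothetical names, the sample "PNG" is built in memory so the example runs without a file (it has no image data, so it will not display), and the offset argument to unpack assumes PHP 7.1 or newer.

```php
<?php
const PNG_SIGNATURE = "\x89PNG\r\n\x1a\n";

// Hypothetical helper: builds one chunk (length, type, data, CRC).
// The CRC covers the type and data fields, not the length.
function makeChunk(string $type, string $data): string
{
    return pack('N', strlen($data)) . $type . $data . pack('N', crc32($type . $data));
}

// Hypothetical helper: returns every chunk as ['type', 'data', 'crcOk'].
function readPngChunks(string $png): array
{
    if (strlen($png) < 8) {
        throw new InvalidArgumentException('Too short to be a PNG');
    }
    // Check the high-bit byte and the "PNG" letters, then the full 8 bytes.
    $sig = unpack('Chighbit/a3signature', $png);
    if ($sig['highbit'] !== 0x89 || $sig['signature'] !== 'PNG'
        || substr($png, 0, 8) !== PNG_SIGNATURE) {
        throw new InvalidArgumentException('Not a PNG signature');
    }

    $chunks = [];
    $offset = 8;
    $total  = strlen($png);
    while ($offset + 12 <= $total) {
        // Two dwords: a big-endian length followed by the 4-byte type.
        $head = unpack('Nlength/a4type', $png, $offset);
        $end  = $offset + 12 + $head['length'];
        if ($end > $total) {
            throw new RuntimeException('Truncated chunk ' . $head['type']);
        }
        $data = substr($png, $offset + 8, $head['length']);
        $crc  = unpack('Ncrc', $png, $end - 4)['crc'];
        $chunks[] = [
            'type'  => $head['type'],
            'data'  => $data,
            'crcOk' => $crc === crc32($head['type'] . $data),
        ];
        $offset = $end;
    }
    return $chunks;
}

// Sample data: a 1x1, 8-bit RGB header, one text chunk and the end marker.
$png = PNG_SIGNATURE
    . makeChunk('IHDR', pack('NNC5', 1, 1, 8, 2, 0, 0, 0))
    . makeChunk('tEXt', "Comment\0hello")
    . makeChunk('IEND', '');

$chunks = readPngChunks($png);
foreach ($chunks as $chunk) {
    printf("%s (%d bytes)\n", $chunk['type'], strlen($chunk['data']));
}

// Depending on the chunk type, the data itself can be unpacked too.
$ihdr = unpack('Nwidth/Nheight/Cbitdepth/Ccolortype', $chunks[0]['data']);
printf("%dx%d, bit depth %d\n", $ihdr['width'], $ihdr['height'], $ihdr['bitdepth']);
```

For a real file you would replace the in-memory sample with file_get_contents on your own image, or read chunk by chunk with fread if the file is large.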
A PNG file often contains more than the image itself, including things like the last time it was written and a lot of text chunks. Since these chunks are not needed for displaying the PNG correctly, and they only take up space, you could write a program that removes them from the PNG. This is a trick that many image compressors use to achieve smaller files without changing even 1 byte of the actual image data. Binary data is handled differently depending on your CPU: some store the most significant byte of a multi-byte number first (big-endian), others the least significant byte first (little-endian).
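The chunk-removing program mentioned above can be sketched in a few lines. This is only an illustration under some assumptions: stripChunks is a hypothetical name, the default list contains only the text chunks (tEXt, zTXt, iTXt) and the modification time chunk (tIME), and the test data is again built in memory rather than read from a real image. Other optional chunks can affect how the image is rendered, so they are deliberately left alone here.

```php
<?php
// Hypothetical helper: copies a PNG and skips every chunk whose type is in $drop.
function stripChunks(string $png, array $drop = ['tEXt', 'zTXt', 'iTXt', 'tIME']): string
{
    if (substr($png, 0, 8) !== "\x89PNG\r\n\x1a\n") {
        throw new InvalidArgumentException('Not a PNG signature');
    }
    $out    = substr($png, 0, 8); // keep the signature as-is
    $offset = 8;
    $total  = strlen($png);
    while ($offset + 12 <= $total) {
        $head = unpack('Nlength/a4type', $png, $offset);
        $size = 12 + $head['length']; // length + type + data + CRC
        if ($offset + $size > $total) {
            throw new RuntimeException('Truncated chunk ' . $head['type']);
        }
        if (!in_array($head['type'], $drop, true)) {
            // Copy the chunk byte for byte, so its CRC stays valid.
            $out .= substr($png, $offset, $size);
        }
        $offset += $size;
    }
    return $out;
}

// Build a small test file in memory (no image data, so it will not display).
$chunk = static function (string $type, string $data): string {
    return pack('N', strlen($data)) . $type . $data . pack('N', crc32($type . $data));
};
$png = "\x89PNG\r\n\x1a\n"
    . $chunk('IHDR', pack('NNC5', 1, 1, 8, 2, 0, 0, 0))
    . $chunk('tEXt', "Comment\0hello")
    . $chunk('IEND', '');

$stripped = stripChunks($png);
printf("before: %d bytes, after: %d bytes\n", strlen($png), strlen($stripped));
```

Because the kept chunks are copied unchanged, nothing has to be recalculated; the only thing that changes is which chunks end up in the output.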
When reading a word or dword from binary data, make sure you know in which endianness the data is written, otherwise you might end up with incorrect data. Especially when you want to write binary data, make sure you think of everything. Things can get very complicated, and miswriting a single byte can corrupt your whole image.
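A small sketch shows what goes wrong when you pick the wrong byte order. The byte string is made up for illustration; the format codes are the standard ones: n and N are always big-endian, v and V are always little-endian, and S and L follow whatever the machine uses. PNG stores all its integers big-endian, which is why N is the code to use for chunk lengths.

```php
<?php
// The same value written in both byte orders.
$bigEndian    = pack('n', 0x1234); // bytes 12 34
$littleEndian = pack('v', 0x1234); // bytes 34 12
echo bin2hex($bigEndian), ' ', bin2hex($littleEndian), PHP_EOL; // 1234 3412

// The same four bytes read in both byte orders give very different numbers.
$bytes    = "\x00\x00\x01\x00";
$asBig    = unpack('Nvalue', $bytes)['value']; // 256
$asLittle = unpack('Vvalue', $bytes)['value']; // 65536
echo $asBig, ' ', $asLittle, PHP_EOL;

// 'S' uses the byte order of the machine, so comparing it with 'v'
// tells you what your own CPU does.
$machineIsLittleEndian = pack('S', 1) === pack('v', 1);
echo $machineIsLittleEndian ? 'little-endian' : 'big-endian', PHP_EOL;
```

When a file format defines its byte order, always use the explicit codes instead of the machine-dependent ones, so the code behaves the same on every CPU.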
There are a lot of libraries out there that can do these things way better than you ever will. This blog post was published over two years ago. That is a long time in the development world! The story here may not be relevant, complete or secure. Code might be incomplete or obsolete, and even my current vision on the subject might have completely changed. So please do read further, but use it with caution.