Saturday, March 12, 2011

SimpleXMLElement and EntityRef XML parser

I am using the PHP class SimpleXMLElement to take care of parsing some XML data that I am sourcing from 3rd parties. It had been working well for a while, but I just discovered an error that was popping up frequently. This error was "XML parser error : EntityRef: expecting ';'".

This error comes about as a result of XML input data being improperly encoded. Two data sources I was using had encoded things like "&", "<" and ">" by leaving off the semi-colon. In other words, the ampersand had been encoded as "&amp" instead of "&amp;". SimpleXMLElement doesn't like this and throws a warning fest.

To fix the problem, I added a line before calling SimpleXMLElement:
$xmldata = preg_replace('/&(amp|lt|gt)([^;])/', '&$1;$2', $xmldata);
$obj_xml = new SimpleXMLElement($xmldata);

The preg_replace fixes the encoding problem by adding the missing semi-colon for you. Just a note, this will only fix the encoding for the three characters I specified above ("&", "<" and ">"). If there are others that are causing problems, you'll have to add them to the pattern in the first argument of preg_replace().
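If you want to try this in isolation, here is a minimal sketch of the same idea. The sample XML string is made up for illustration. The pattern is a small variation on the one above: it uses a lookahead, (?!;), instead of capturing the next character, so it also catches two broken entities in a row (such as "&amp&amp"). It also asks libxml to collect parse errors rather than print warnings.

```php
<?php
// Hypothetical sample input: several entities are missing their semi-colons.
$xmldata = '<items><item>Fish &amp Chips</item>'
         . '<item>1 &lt 2 &amp&amp 3 &gt; 2</item></items>';

// Append a semi-colon to amp, lt and gt wherever one does not already follow.
// The lookahead consumes no characters, so back-to-back entities are all fixed.
$fixed = preg_replace('/&(amp|lt|gt)(?!;)/', '&$1;', $xmldata);

// Collect libxml errors in a list instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$obj_xml = simplexml_load_string($fixed);

if ($obj_xml === false) {
    // Still broken: show what the parser complained about.
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
    libxml_clear_errors();
} else {
    echo (string) $obj_xml->item[0], "\n"; // prints "Fish & Chips"
    echo (string) $obj_xml->item[1], "\n"; // prints "1 < 2 && 3 > 2"
}
```

Entities that are already correct (like the "&gt;" in the sample) are left alone, so it should be safe to run the replacement on feeds that are only partly broken.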

Here is another blog/article that helped me discover the underlying issue.

Wednesday, March 2, 2011

Computer Backups

I use Backblaze to back up my computer's files. Let me explain why...

It is always important to have your files backed up, and to have them backed up in a location that is secure: away from your computer and locked and/or encrypted. There are a few big players in the online backup service sector. Carbonite and Mozy are both good examples.

It is important to have all my files backed up, especially on my work computer. If something were to happen, like fire or theft, I need to have a backup copy of all my files so I can get back to work as soon as possible. The great thing about online back-up services is that your back-up happens automatically, whenever you have an internet connection. The more automated the better!

I've been using Carbonite for the past couple of years for my laptop. It has worked pretty well. I like how they add a little icon in your file explorer on top of each folder and file that is backed up. It is a great visual cue that helps me to see exactly what is backed up and what is not.

HOWEVER, over the past few months my computer fan has been turning on really loud, even when I am not using the computer. Looking closer I found that Carbonite's software was eating around 50% of my processor, causing my computer to heat up and the fan to turn on. As soon as I disabled Carbonite, the CPU utilization fell and the fan turned off. That's not right!

Well, I finally broke down and contacted their technical support. First, their technical support sucks. They have a crummy web interface, their emails are formatted all funny, and the people on the other end take forever to get back to you. But the worst part is that after I jumped through their hoops and they finally got back to me, all they said is that I use my computer "too much." Seriously!? It runs at 50% even when I am not using my computer.

What a waste of my time.

SO, now I am using Backblaze. I heard about them a while back through this Slashdot article about how they had made their hardware design public. Very cool: open-source hardware. Now that's an innovative company. And, as a double bonus, they will encrypt my files in a way that is extra safe. Only I can open my files. (I have a feeling those tech guys at Carbonite can browse my personal files whenever they'd like.) Last, but not least, when I need to restore my backed-up files, they will mail me a physical disk with all my files on it. Awesome!

And, I must mention this because I am a web designer, the Backblaze website is WAY better looking. Nice work, guys!

Thursday, February 24, 2011

Storing Passwords in Databases

In the last few weeks/months there have been a couple of high-profile computer breaches in the news. One was at Plenty of Fish, an online dating website. The second was at HBGary, a computer security company (ah, the irony).

Plenty of Fish made a major security mistake. They stored passwords in their database in plain text. This means, if you had access to their database (legitimately or through a SQL injection vulnerability) you could see anyone's password. And this is typically a big security no-no. Passwords should always be hashed before being stored in a database.

Hashing is a one-way transformation: it turns a password (or any character string) into a fixed-length value that cannot be turned back into the original. So, when a user logs in, you hash their entered password and compare it against the hashed password in the database. If they match, the password is the same. It is simple AND safe.

Two particular hashes are quite popular: SHA1 and MD5. They are beginning to be a bit dated now, though, and MD5 has been shown to have real weaknesses (practical collision attacks). Some of the more recent hashes are SHA256 and SHA512. (In PHP, the hash() function implements them.)
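Here is a minimal sketch of that hash-and-compare login check using PHP's hash() function. The password strings and the function name are made up for illustration, and hash_equals() assumes PHP 5.6 or newer.

```php
<?php
// Hypothetical stored value: in a real site this would come from the database.
$stored_hash = hash('sha256', 'correct horse');

// Hash what the user typed and compare the two digests.
function password_matches($entered, $stored_hash)
{
    $entered_hash = hash('sha256', $entered);
    // hash_equals() compares the strings in constant time.
    return hash_equals($stored_hash, $entered_hash);
}

var_dump(password_matches('correct horse', $stored_hash)); // bool(true)
var_dump(password_matches('wrong guess', $stored_hash));   // bool(false)
echo strlen($stored_hash), "\n"; // 64 (hex characters in a SHA-256 digest)
```

The database only ever holds the 64-character digest, never the password itself.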

Over at HBGary, the "security" firm, they actually did use hashes to store passwords. They used the common MD5 hashing algorithm. The problem with their implementation is that they added nothing to the password before hashing it. This is an issue because people have created massive online collections of pre-computed hashes for strings of (typically) between 1 and 12 characters. So, if you have a hash, and you're a hacker, you look up the hash in this table and (if it is a normal-sized password) it will likely be found.

The best way to make this look-up table irrelevant is to add a set of known characters to the password before the hashing takes place. This technique, called adding "salt" to the password, creates an extra-long input that will not show up in a pre-computed look-up table. Why? Because the number of possible inputs grows exponentially with length, so it would take more than a lifetime to compute and store tables that long. A constant salt already defeats the generic tables; a different random salt for each user, stored next to the hash, is better still, because an attacker then has to attack every password separately.
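A rough sketch of salt & hash in PHP might look like the following. The function names and the record layout are hypothetical, the salt here is random per user rather than a single constant, and random_bytes() assumes PHP 7 or newer.

```php
<?php
// Build the values to store for a new password: a random salt and the
// SHA-256 digest of salt + password. Both go into the database.
function make_password_record($password)
{
    $salt = bin2hex(random_bytes(16)); // 32 hex characters
    return [
        'salt' => $salt,
        'hash' => hash('sha256', $salt . $password),
    ];
}

// At login: re-hash the entered password with the stored salt and compare.
function check_password($entered, array $record)
{
    $candidate = hash('sha256', $record['salt'] . $entered);
    return hash_equals($record['hash'], $candidate);
}

$a = make_password_record('hunter2');
$b = make_password_record('hunter2');

var_dump($a['hash'] === $b['hash']);      // bool(false): same password, different salts
var_dump(check_password('hunter2', $a));  // bool(true)
var_dump(check_password('hunter3', $a));  // bool(false)
```

On PHP 5.5 and newer, the built-in password_hash() and password_verify() functions handle the salting for you and use a deliberately slow algorithm, so they are probably the better choice for a real site.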

If you are building a website, and you have passwords to store, remember: salt & hash. Mmmm, sounds tasty!

Friday, February 18, 2011

Mac crash / Mac blue screen flash during use (not at startup)

A few days ago my MacBook Pro (running OSX 10.6.6) started having some weird issues. I would be working on the computer, and all of a sudden the screen would flash blue, and then a few seconds later show my blank desktop background, and then finally the dock would come back. All within a few seconds. All of my applications would be closed as if it had restarted.

This has happened about five times, all when I was using different programs. The first time it happened, it was after I dragged a file to trash; after that it happened while using iPhoto and it had frozen up on me; and another time while using Safari. Most recently, the blue screen flash occurred after waking the computer from sleep and trying to close the browser windows in Google Chrome.

I've researched a lot of Mac forums and apparently this is a common problem, and I saw posts from 2007 that were never resolved; I still haven't found a forum with a clear answer.

Anyways, I reviewed the console from the crash this morning, and saw that there was quite a bit of activity all occurring around the time of the blue screen flash, like "windowserver port death" and a whole lot of errors and warnings. This narrowed it down to a possible problem with the window server. I took it in to the Apple store nearby and spoke with a Mac Genius. I showed him the console with the errors, and he suspected it might be a permissions error causing the system to crash and then reboot (looks like it is logging out and then re-logging in at rapid pace).

The Mac Genius attached an external hard drive and rebooted from that and ran permissions repair. He recommended doing this every 3 months or so since it is easy to corrupt permissions. I had never done this before, and it took about 12 minutes. He also said he would recommend doing this at home by booting from the Install DVD I got with the computer. To do this, you have to press the 'option' key during startup in order to have the option to boot from the DVD. (I guess this is a change from pressing "C" with older models.)

Here's how to reboot and repair permissions from the OSX install DVD:

  • Insert install DVD, restart computer, and hold down the 'option' key during startup.
  • This will give the options of booting from the hard drive or the DVD – select DVD
  • The reboot could take up to 5 minutes or so.
  • When install screen comes up, go to "utilities" in the top menu bar and select "Disk Utility."
  • Click on the main hard drive (not the subfolder/partition), and click on the "Repair Disk Permissions" button, allowing the process to complete. This could take a while, especially if never done before. (It could take up to 3 hours, but mine took 12 minutes.)
  • Quit Mac OSX installer from the menu bar and restart.
We were thinking this should clear up the problem, but if not, the second resort is to archive & reinstall OSX (archive will save all your data), and a third resort is to erase and install (after full backup of all data, of course). We'll see what happens! Hopefully it will work. That would be nice, since it is rather annoying to get the blue screen flash in the middle of doing something and lose any data you were working on.

Update 5/17/11
Well, after many months, many disk permissions repairs and many more blue screen crashes, we decided to go back to the Genius Bar. It was evident that the problem was not fixed. When we returned they ran a test on the hard drive. It apparently failed the test to some degree because they replaced the hard drive. Perhaps it was a hardware problem all along? We'll see.

-- This is a guest post by my wife Susan

Tuesday, February 15, 2011

VSFTPD & SELinux

This can be a fun combo to work with. SELinux, or Security Enhanced Linux, is the life of any party. And Google searches about SELinux-related problems make it pretty evident that very few people have taken the time to understand how this program works. I ran across numerous people simply suggesting that you turn off SELinux if it is getting in the way.

Well, I didn't buy into this wholesale approach to getting things to work. Besides, I want a secure system and if SELinux is going to help me in the long run, I want it enabled.

VSFTPD, or Very Secure FTP Daemon, is a pretty standard FTP server. You can install it on CentOS systems (or RHEL, for that matter) by running "yum install vsftpd" from the command line. Once you get it installed you can make changes to the configuration file at "/etc/vsftpd/vsftpd.conf" using a file editor. Also, to make sure it is always running in the background run "chkconfig vsftpd on" and "service vsftpd restart" from the command line.

Make sure the configuration file is set up properly. Only give FTP access to a limited number of users (never include root), disable anonymous access and make sure users can only access the files they need (I recommend chroot-ing them).
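As a rough sketch of what that could look like in /etc/vsftpd/vsftpd.conf (the option names are standard vsftpd settings, but the user-list path is an assumption based on the CentOS default, so check it against your own install):

```
# No anonymous logins; local system accounts only.
anonymous_enable=NO
local_enable=YES

# Lock local users into their home directories.
chroot_local_user=YES

# Treat the user list as an allow-list: only names in this file may log in.
# Do not put root in it.
userlist_enable=YES
userlist_deny=NO
userlist_file=/etc/vsftpd/user_list
```

Restart the service after editing the file so the changes take effect.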

This is only half the battle. Now that VSFTPD is running, how do you allow users to access their home directories? SELinux usually will get in the way of this. Well, there is a SELinux setting for this called "ftp_home_dir" and this will allow users to access their home directory via FTP. To set this, from the command line run
setsebool -P ftp_home_dir 1

Be sure to also check file permissions and file ownership, if you run into problems. A user who does not own a file can only write to it if the file's group or "other" permission bits allow it.

If this fails to grant you FTP access where you need it, or your set-up is slightly different, you can always allow the FTP daemon full access to all files by running
setsebool -P allow_ftpd_full_access 1

This is granting a bit more power to the FTP daemon than is necessary, but it is much better than just disabling SELinux altogether.

By the way, here is a great intro to CentOS & SELinux.

Wednesday, February 9, 2011

Calculating Popularity

I am working on a new site that will have a sort of ranking system. It will be used to list a series of resources that visitors can rate and score. This scoring will then drive what resources are listed first and what resources are listed last.

Well, how in the world do you calculate ratings? I want to make sure new sites get a chance to rank high (so that old sites don't stay at the top forever). I also want to make sure that sites with a low number of votes aren't put at the top just because both of the 2 people who voted gave the max. And the list goes on: the more you think about it, the more there is to it.

Well, this blog entry does a great job of explaining different ranking algorithms and how they work. I give it a great score and a high rank. ;)

Thursday, January 13, 2011

Database connection (mysql_connect) taking a long time

After setting up a new database, the connection from the web servers was really slow. We had enough combined traffic to slow down the response of the PHP function mysql_connect() to between 5 and 20 seconds. But the current load on the MySQL server wasn't that high... something fishy was going on.

What I found was that the database server was trying to do a reverse DNS look-up on the IP address of every incoming connection, and each connection had to wait for that look-up to finish. You can disable this by using the following option, as explained here.

--skip-name-resolve

"Do not resolve host names when checking client connections. Use only IP addresses. If you use this option, all Host column values in the grant tables must be IP addresses or localhost. See Section 7.9.8, "How MySQL Uses DNS"." --> http://dev.mysql.com

Anyway, you can make this change in the configuration file to be loaded at start-up (it doesn't have to be a command-line option). Simply add "skip-name-resolve" on a new line in the [mysqld] section of the /etc/my.cnf file and restart your DB server. Voilà!
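For reference, the relevant part of the file would look something like this (a sketch only; your my.cnf will have other settings in the same section):

```
# /etc/my.cnf
[mysqld]
# Use IP addresses only; skip the reverse DNS look-up on each connection.
skip-name-resolve
```

As the quote above says, once this is on, every Host value in the grant tables has to be an IP address or localhost, so check your existing users before restarting.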