I found a problem with DHCP in the stock CentOS Amazon EC2 AMIs which, currently in us-west-2b, causes a high rate of instance startup failures. The root cause is most likely on Amazon's side, but the problem doesn't seem to affect Ubuntu or Amazon Linux.<br />
<br />
The reason for the difference is the timeout and retry defaults. CentOS uses a 60 second timeout and a 300 second retry, which are also the upstream dhclient defaults. Ubuntu patches dhclient to change the defaults to a 300 second timeout and a 60 second retry.<br />
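<br />
To see why swapping those two numbers matters, here is a rough sketch of a simplified model, not dhclient's actual code. It assumes the client solicits a lease for `timeout` seconds, sleeps for `retry` seconds, and repeats. It ignores the backoff between individual packets and any way the init scripts may limit the client to a single attempt. The function names are made up for illustration.<br />

```python
def attempt_windows(timeout, retry, horizon):
    """Return (start, end) intervals, in seconds since the client started,
    during which the modeled client is actively asking for a lease.

    Simplified model (an assumption, not dhclient's real logic):
    try for `timeout` seconds, sleep for `retry` seconds, repeat.
    """
    windows = []
    start = 0
    while start < horizon:
        end = min(start + timeout, horizon)
        windows.append((start, end))
        start += timeout + retry
    return windows


def seconds_trying(timeout, retry, horizon):
    """Total seconds spent actively trying within the first `horizon` seconds."""
    return sum(end - start for start, end in attempt_windows(timeout, retry, horizon))


if __name__ == "__main__":
    horizon = 360  # look at the first six minutes after the client starts
    for name, timeout, retry in (("CentOS", 60, 300), ("Ubuntu", 300, 60)):
        windows = attempt_windows(timeout, retry, horizon)
        total = seconds_trying(timeout, retry, horizon)
        print(f"{name}: trying during {windows}, {total}s of {horizon}s")
```

Under this model, a DHCP server that only starts answering a couple of minutes in is missed by the 60/300 settings for the whole first cycle, while the 300/60 settings are still asking when it comes up.<br />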
<br />
I don't advise patching dhclient, but adding a /etc/dhcp/dhclient.conf with the options below has the same effect. I have tested this solution and it works great.<br />
<br />
timeout 300;<br />
retry 60;<br />
<br />
Another alternative is setting PERSISTENT_DHCLIENT=yes in /etc/sysconfig/network-scripts/ifcfg-eth0. This has the same effect, but is slower. It causes dhclient to go into the background while retrying, which allows the system to continue booting. As a result, anything that depends on networking fails because the network hasn't come up yet, so it is a bad choice.