While experimenting with OpenNebula and trying to build a public cloud with its EC2 interface, I encountered the following problem:
[rogierm@cloudtest3 one]$ econe-upload /home/rogierm/test.img
/usr/lib/ruby/1.8/rdoc/ri/ri_options.rb:53: uninitialized constant RI::Paths (NameError)
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
from /usr/lib/ruby/1.8/rdoc/usage.rb:72
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
from /usr/local/one/bin/econe-upload:61
I fixed this problem by adding the following line (above the other require statements) to econe-upload. The same fix applies to any other command that gives the same error:
require 'rdoc/ri/ri_paths'
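If several commands need the same change, the edit can be scripted. The following is only a minimal sketch: the function name add_ri_paths_require is hypothetical (it is not part of OpenNebula), and it assumes the script contains at least one line that starts with "require ". It prints the patched script to standard output instead of modifying the file, so the result can be reviewed first.

```shell
# Print the script given as $1 with the extra require line inserted
# directly above the first existing require statement.
# (add_ri_paths_require is a hypothetical helper name.)
add_ri_paths_require() {
    awk -v line="require 'rdoc/ri/ri_paths'" '
        !inserted && /^require / { print line; inserted = 1 }
        { print }
    ' "$1"
}

# Usage (review the output before replacing the original):
# add_ri_paths_require /usr/local/one/bin/econe-upload > /tmp/econe-upload.patched
```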
OpenQRM uses dropbear for communication and the exchange of messages between the server and the appliances. When something goes wrong in this communication, OpenQRM can’t function correctly: it can’t reach the appliances for status updates and commands. These communication problems are often caused by a misconfiguration in dropbear. The most common problem is a misconfiguration of the public and private dropbear keys.
The keys should be synchronized between the server and the appliance. On the server, print the public key with the following command:
[root@localhost log]# /usr/lib/openqrm/bin/dropbearkey -t rsa -f /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -y
Public key portion is:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgwCBvwSO7vBBL2avDMds...pVn root@localhost.localdomain
Fingerprint: md5 65:ca:5b:3b:05:c3:61:6d:fb:75:2f:c0:d2:7e:02:cf
Copy the ssh-rsa public key line into /root/.ssh/authorized_keys on the appliance.
Now communication should be established.
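The copy step can be sketched as two small shell functions. This is an illustration under assumptions, not an OpenQRM tool: the function names are hypothetical, and the sketch assumes the public key is the only line of the dropbearkey output that starts with "ssh-rsa ", as in the output shown above.

```shell
# Read `dropbearkey -y` output on stdin and print only the public key line.
# (extract_public_key and append_key_once are hypothetical helper names.)
extract_public_key() {
    grep '^ssh-rsa '
}

# Append key "$1" to file "$2" unless exactly that line is already present.
append_key_once() {
    touch "$2"
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Usage on the server (paths as in the text above):
# key=$(/usr/lib/openqrm/bin/dropbearkey -t rsa \
#         -f /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -y | extract_public_key)
# Then, on the appliance:
# append_key_once "$key" /root/.ssh/authorized_keys
```

Appending only when the line is missing keeps authorized_keys free of duplicates if the step is repeated.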
The OpenQRM event log shows error messages like the following when this communication problem occurs:
openqrm-cmd-queue ERROR executing command with token 64d478dcac6670e5fb000e7c4954863b : /usr/lib/openqrm/bin/dbclient
Aug 26 23:19:45 localhost httpd: openQRM resource-monitor: (update_info) Processing statistics from resource 2
Aug 26 23:19:48 localhost logger: openQRM-cmd-queu: Running Command with token 64d478dcac6670e5fb000e7c4954863b 1. retry : /usr/lib/openqrm/bin/dbclient -I 0 -K 10 -y -i /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -p 1667 root@192.168.42.243 "/usr/lib/openqrm/bin/openqrm-cmd /usr/lib/openqrm/plugins/xen/bin/openqrm-xen post_vm_list -u openqrm -p openqrm"
Aug 26 23:19:52 localhost logger: openQRM-cmd-queu: ERROR executing command with token 64d478dcac6670e5fb000e7c4954863b 2. retry : /usr/lib/openqrm/bin/dbclient -I 0 -K 10 -y -i /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -p 1667 root@192.168.42.243 "/usr/lib/openqrm/bin/openqrm-cmd /usr/lib/openqrm/plugins/xen/bin/openqrm-xen post_vm_list -u openqrm -p openqrm" -----
Aug 26 23:19:52 localhost logger: Host '192.168.42.243' key accepted unconditionally.
Aug 26 23:19:52 localhost logger: (fingerprint md5 64:d5:c7:8e:7a:11:08:3f:43:bc:3c:2b:bf:4a:c8:ce)
Aug 26 23:19:52 localhost logger: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: /usr/lib/openqrm/bin/dbclient: connection to root@192.168.42.243:1667 exited: remote closed the connection
OpenQRM uses dropbear for communication between the OpenQRM server and the appliances. Dropbear is basically a simple version of SSH, so it uses host keys, which are cached in /root/.ssh/known_hosts. Dropbear uses a different host key than sshd, but ssh and dropbear share the known_hosts file, and ports are not recorded in this file.
When you ssh into the appliance from the OpenQRM server once, the ssh host key is cached in the known_hosts file. When OpenQRM then wants to connect to the appliance, dropbear checks the known_hosts file for the cached host key. The file contains the ssh host key instead of the dropbear host key, so dropbear aborts the connection: the host keys don’t match, which could also be the result of a security compromise.
To solve the problem, remove the entry for the host from /root/.ssh/known_hosts.
Aug 24 23:24:26 localhost logger: openQRM-cmd-queu: Running command with token 34b3e7ddd93ffa548d34ccea1e4aa7e5 : /usr/lib/openqrm/bin/dbclient -I 0 -K 10 -y -i /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -p 1667 root@192.168.42.235 "/usr/lib/openqrm/bin/openqrm-cmd openqrm_server_set_boot local 1 00:00:5A:11:21:B7 0.0.0.0"
Aug 24 23:24:26 localhost logger: openQRM-cmd-queu: ERROR while running command with token bc7c6de1b59370dd8019bcae2d7bfa45 : /usr/lib/openqrm/bin/dbclient -I 0 -K 10 -y -i /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -p 1667 root@192.168.42.235 "/usr/lib/openqrm/bin/openqrm-cmd openqrm_server_set_boot local 1 00:00:5A:11:21:B7 0.0.0.0" ----- /usr/lib/openqrm/bin/dbclient: connection to root@192.168.42.235:1667 exited:
Aug 24 23:24:26 localhost logger:
Aug 24 23:24:26 localhost logger: Host key mismatch for 192.168.42.235 !
Aug 24 23:24:26 localhost logger: Fingerprint is md5 65:ca:5b:3b:05:c3:61:6d:fb:75:2f:c0:d2:7e:02:cf
Aug 24 23:24:26 localhost logger: Expected md5 a8:e5:d4:62:36:d2:98:b2:c3:74:a9:0c:d5:d1:56:f9
Aug 24 23:24:26 localhost logger: If you know that the host key is correct you can
Aug 24 23:24:26 localhost logger: remove the bad entry from ~/.ssh/known_hosts
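Removing the stale entry can be done by hand in an editor, or with OpenSSH's ssh-keygen -R where that is installed. As a minimal sketch of what the removal amounts to, the hypothetical function below prints a known_hosts file without the entries for one host. It assumes plain (unhashed) entries, where the first field is a comma-separated list of host names or addresses.

```shell
# Print known_hosts file "$2" without the entries whose host field names "$1".
# Assumes unhashed entries of the form "host1,host2 keytype key [comment]".
# (without_known_host is a hypothetical helper name.)
without_known_host() {
    awk -v host="$1" '
        {
            n = split($1, names, ",")
            keep = 1
            for (i = 1; i <= n; i++)
                if (names[i] == host) keep = 0
            if (keep) print
        }
    ' "$2"
}

# Usage (keep a backup of the original file):
# cp /root/.ssh/known_hosts /root/.ssh/known_hosts.bak
# without_known_host 192.168.42.235 /root/.ssh/known_hosts.bak > /root/.ssh/known_hosts
```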
On a new Xen server I encountered the following error while starting a fully virtualized guest:
[root@resource1 xen]# xm create test-vps.cfg
Using config file "./test-vps.cfg".
VNC= 1
Error: Unable to connect to xend: Name or service not known. Is xend running?
This was caused by a name resolution problem. I solved it by adding the hostname and IP address of the server to /etc/hosts.
After this change the guest booted without problems.
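The /etc/hosts change can be sketched as a check followed by an append. The function names and the example address below are hypothetical; the sketch takes the hosts file as a parameter so it can be tried on a copy first.

```shell
# Succeed if name "$2" is listed as a hostname or alias in hosts file "$1".
# (hosts_file_has_name and add_host_entry are hypothetical helper names.)
hosts_file_has_name() {
    awk -v name="$2" '
        $1 ~ /^#/ { next }                      # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$1"
}

# Append "ip<TAB>name" to hosts file "$1" unless the name is already listed.
add_host_entry() {
    hosts_file_has_name "$1" "$3" || printf '%s\t%s\n' "$2" "$3" >> "$1"
}

# Usage (the address is only an example value):
# add_host_entry /etc/hosts 192.0.2.10 resource1
```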
After a yum upgrade of one of our CentOS 5 Xen servers, xend would not start properly. The logs contained the following error messages.
xend-debug.log:
Xend started at Wed Aug 26 18:15:57 2009.
sysctl operation failed -- need to rebuild the user-space tool set?
Exception starting xend: (13, 'Permission denied')
xend.log:
[2009-08-26 18:15:57 3310] ERROR (SrvDaemon:347) Exception starting xend ((13, 'Permission denied'))
Traceback (most recent call last):
File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvDaemon.py", line 339, in run
servers = SrvServer.create()
File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvServer.py", line 251, in create
root.putChild('xend', SrvRoot())
File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvRoot.py", line 40, in __init__
self.get(name)
File "/usr/lib/python2.4/site-packages/xen/web/SrvDir.py", line 82, in get
val = val.getobj()
File "/usr/lib/python2.4/site-packages/xen/web/SrvDir.py", line 52, in getobj
self.obj = klassobj()
File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvNode.py", line 30, in __init__
self.xn = XendNode.instance()
File "/usr/lib/python2.4/site-packages/xen/xend/XendNode.py", line 752, in instance
inst = XendNode()
File "/usr/lib/python2.4/site-packages/xen/xend/XendNode.py", line 87, in __init__
self.other_config["xen_pagesize"] = self.xeninfo_dict()["xen_pagesize"]
File "/usr/lib/python2.4/site-packages/xen/xend/XendNode.py", line 741, in xeninfo_dict
return dict(self.xeninfo())
File "/usr/lib/python2.4/site-packages/xen/xend/XendNode.py", line 685, in xeninfo
info['xen_scheduler'] = self.xenschedinfo()
File "/usr/lib/python2.4/site-packages/xen/xend/XendNode.py", line 675, in xenschedinfo
sched_id = self.xc.sched_id_get()
Error: (13, 'Permission denied')
After some investigation this was quite easy to solve. The yum upgrade updated the kernel and modified grub.conf, so after the reboot the new Xen kernel entry was booted. However, that entry loaded a Xen hypervisor image (xen.gz) that did not match the installed Xen tools. This is easily fixed by changing grub.conf to boot the correct Xen hypervisor. See the examples below for the exact change.
The grub.conf after the yum update that caused the problem:
title CentOS (2.6.18-128.7.1.el5xen)
root (hd0,0)
kernel /xen.gz-2.6.18-128.7.1.el5
module /vmlinuz-2.6.18-128.7.1.el5xen ro root=/dev/VolGroup00/LogVol00
module /initrd-2.6.18-128.7.1.el5xen.img
The changed grub.conf that fixed the problem:
title CentOS (2.6.18-128.7.1.el5xen)
root (hd0,0)
kernel /xen.gz-3.3.1
module /vmlinuz-2.6.18-128.7.1.el5xen ro root=/dev/VolGroup00/LogVol00
module /initrd-2.6.18-128.7.1.el5xen.img
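After a kernel update it can help to list which hypervisor image each boot entry loads. The sketch below is a hypothetical helper, assuming the grub.conf layout shown above, where each entry has a "title" line followed by a "kernel" line that names the xen.gz image.

```shell
# For each boot entry in grub.conf "$1", print "<title> => <image on the kernel line>".
# (list_xen_hypervisors is a hypothetical helper name.)
list_xen_hypervisors() {
    awk '
        $1 == "title" {
            title = $2
            for (i = 3; i <= NF; i++) title = title " " $i
        }
        $1 == "kernel" { print title " => " $2 }
    ' "$1"
}

# Usage:
# list_xen_hypervisors /boot/grub/grub.conf
```

For the corrected entry above this prints "CentOS (2.6.18-128.7.1.el5xen) => /xen.gz-3.3.1", which can then be compared with the installed Xen tools version.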
There are a couple of differences between IPv6 and IPv4 address allocation.
With IPv4, the prefix length varies from subnet to subnet, which makes renumbering subnets painful and costly (for example, imagine renumbering an IPv4 subnet from /28 to /29 or vice versa).
With IPv4, the size of the allocation varies with the size of the site, which makes migrating from one ISP to another very painful, for example.
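To make the /28 versus /29 example concrete: the number of addresses covered by an IPv4 prefix is 2^(32 - prefix length), so a /28 covers 16 addresses and a /29 only 8. A small sketch (the function name is hypothetical):

```shell
# Number of IPv4 addresses covered by a prefix of length "$1" (0 to 32).
# (ipv4_prefix_size is a hypothetical helper name.)
ipv4_prefix_size() {
    echo $(( 1 << (32 - $1) ))
}

ipv4_prefix_size 28   # prints 16
ipv4_prefix_size 29   # prints 8
```

Moving a subnet from /28 to /29 therefore halves the block, so some hosts end up outside the subnet and need new addresses.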
A SAN is often implemented as a dedicated network that is considered to be secure. However, by its nature a SAN is a shared network. This involves some serious security risks that should be evaluated when using an iSCSI-based SAN. Some vendors consider an iSCSI network safe when it is implemented as a dedicated switched network (Dell EqualLogic. Securing storage area networks with iSCSI. EqualLogic Inc., 2008.). They consider it virtually impossible to snoop or inject packets in a switched network. We all know this is not the case. If it were, why would we use firewalls, IDSs and tons of other security measures? Even if iSCSI runs on an isolated network, and only the management interfaces of the storage devices are connected to a shared/general-purpose network, security is only as good as that of the hosts connected to the dedicated network. A single compromised host connected to the dedicated iSCSI network can attack the storage devices to get access to the LUNs of other hosts.
When implementing an iSCSI network you should be aware of the security risks that this imposes on the environment. To estimate the risk, awareness of the methods that can be used to secure iSCSI is paramount. The iSCSI protocol allows for the following security measures to prevent unintended or unauthorized access to storage resources:
Because iSCSI setups are generally shared environments, access to the storage elements (LUNs) by unauthorized initiators should be blocked. Authorization is implemented by means of the iQN (iSCSI Qualified Name), the node name of the initiator, which can be compared to a MAC address. During an audit, storage systems must demonstrate controls to ensure that a server under one regime cannot access the storage assets of a server under another.
Typically, iSCSI storage arrays explicitly map initiators to specific target LUNs; an initiator authenticates not to the storage array, but to the specific storage asset it intends to use.
As an added security method, the iSCSI protocol allows initiators and targets to use CHAP to authenticate each other. This prevents simple access by spoofing the iQN. And last, because iSCSI runs on IP, IPSec can be used to secure and encrypt the data flowing between the client (initiator) and the storage server (target).
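It helps to keep in mind that the iQN used for authorization is nothing more than a text string with a fixed layout: "iqn.", a year and month, a reversed domain name and an optional ":" followed by a free-form identifier. The sketch below checks that layout with a deliberately simplified pattern; the function name is hypothetical and the check says nothing about whether the initiator is who it claims to be.

```shell
# Succeed if "$1" has the general shape of an iSCSI Qualified Name:
#   iqn.<yyyy-mm>.<reversed domain>[:<identifier>]
# Simplified pattern for illustration only; looks_like_iqn is a hypothetical name.
looks_like_iqn() {
    printf '%s\n' "$1" |
        grep -Eq '^iqn\.[0-9]{4}-(0[1-9]|1[0-2])\.[a-z0-9][a-z0-9.-]*(:.+)?$'
}

# Usage:
# looks_like_iqn 'iqn.2001-04.com.example:storage.disk1' && echo "well-formed"
```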
Now that we know there are multiple ways to secure access to the storage resources, you might conclude that iSCSI must be safe and secure to use. Unfortunately, that is not a given. There are several flaws in the iSCSI security design:
Because iQNs are manually configured in the iSCSI driver on the client, they are easy to change. To get access to a LUN that is only protected by an iQN restriction, you can sniff the communication to get the iQN, or guess it, as it is often a default string (e.g. iqn.1991-05.com.microsoft:hostname), then configure the iSCSI driver to use this name and get access to the LUN.
CHAP is basically the only authentication mechanism that iSCSI vendors support, although the iSCSI protocol allows for other mechanisms such as Kerberos. CHAP is not known for strong security on shared networks: it is vulnerable to dictionary attacks, spoofing and reflection attacks. Because the security issues with CHAP are well known, the RFC even mentions ways to deal with its limitations (http://tools.ietf.org/html/rfc3720#section-8.2.1).
While IPSec could stop or reduce most of the security issues outlined above, it is hard to implement and manage. Therefore not many administrators will feel the need to use it. It should not only be possible to build a secure network, it should also be easy.
To reduce the risk, and make your iSCSI network as safe as possible, you should do the following:
Vendors and distributors should also enable authentication by default, and add other authentication mechanisms to the iSCSI target and initiator software.
In general it is a good idea to configure password aging as part of your password/security policy. In some cases, however, this might cause unexpected problems. I’ve seen cases where an expired password prevented a machine from booting. In this specific case the cause was a service that ran as the user with the expired password. In general you should not run services as a normal user account, but sometimes you just have to deal with things you can’t change. The documentation generally states that to disable password aging you have to edit the /etc/shadow file and remove the part where the password age is stored. This is quite error prone. If you do it this way, be sure to use vipw to prevent errors in this critical file. To disable password aging I recommend simply using chage, the same command that is used to enable it:
# chage -m 0 -M 99999 -E -1 username
Check the settings before and after. Before disabling password aging:
# chage -l username
Minimum: 7
Maximum: 90
Warning: 7
Inactive: -1
Last Change: Jun 26, 2009
Password Expires: Sep 24, 2009
Password Inactive: Never
Account Expires: Never
After disabling password aging:
# chage -l username
Minimum: 0
Maximum: 99999
Warning: 7
Inactive: -1
Last Change: Jun 26, 2009
Password Expires: Never
Password Inactive: Never
Account Expires: Never
As a note, please only disable password aging when there is no other way to fix the problem.
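The values that chage reports are stored in colon-separated fields of /etc/shadow: field 4 is the minimum age, field 5 the maximum age and field 8 the account expiry date (empty when the account never expires). The following read-only sketch prints those fields for one user. The function name is hypothetical, and reading /etc/shadow requires root.

```shell
# Print "min max expire" for user "$2" from shadow-format file "$1".
# An empty expiry field is shown as "never".
# (password_aging_fields is a hypothetical helper name.)
password_aging_fields() {
    awk -F: -v user="$2" '
        $1 == user { print $4, $5, ($8 == "" ? "never" : $8) }
    ' "$1"
}

# Usage (as root):
# password_aging_fields /etc/shadow username
```

With the settings from the chage command above, the expected result for the user is a minimum of 0, a maximum of 99999 and no expiry date.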
We have a combination of Cisco 2500 terminal servers (oldies) and some Avocent ACS terminal servers. All our Cisco kit authenticates against a tacacs server (tac_plus) and I want to include the Avocent in the same central user-management infrastructure.
The Avocent manual includes some commands to configure it to authenticate against different back-ends. The tacacs commands and options are all explained, but these commands did not give me a working setup. Below I outline the steps in a small how-to for setting up tacacs authentication on an Avocent terminal server.
[root@hostname root]# CLI
- Thanks for using the CLI -
This interface allows you to easily modify configurations to customize
and define the functionality of your unit.
Some basic and useful keys are:
up/down arrow - navigates up/down in the command history
tab (once/twice) - shows the next possible option(s)
Other hints:
Put quotes around strings that contain spaces.
Please refer to the Reference Guide for other special keys and
additional information on how to use this interface.
Press TAB to see the list of available options.
cli>
cli>config physicalports all access authtype TacacsPlusDownlocal
cli>config security authentication authtype tacasdownlocal
cli>config security authentication tacplusauthsvr1 10.x.x.x
cli>config security authentication tacplussecret T@C@CSk3y
cli>config runconfig
cli>config savetoflashadf
With the setup described above I was not able to successfully log in to the Avocent with a valid tacacs user. The following entries were written to the tacacs log file:
Thu Jul 30 18:29:16 2009 [23176]: pap-login query for 'testuser' ssh from hostname.domain rejected
Thu Jul 30 18:29:16 2009 [23176]: login failure: testuser hostname.domain (10.x.x.x) ssh
user = testuser {
default service = permit
name = "Test User"
login = cleartext "password"
pap = cleartext "password"
service = exec {
priv-lvl = 15
}
}
Thu Jul 30 18:47:12 2009 [1541]: authorization query for 'rancid' ssh from cltsp-ts01.ams-spa rejected
Yesterday I read an article in the Washington Post about a big security breach at Network Solutions in which more than 500,000 credit and debit cards were stolen. Network Solutions acknowledged this security incident on their site. They claim the cause of the incident was malicious code that was uploaded to a platform supporting their merchants’ sites. How this was possible, and how it could lead to the possible theft of half a million credit card numbers, is unclear.
I’ve been involved in several PCI projects to help our customers become PCI compliant, as required by the credit card issuing companies. While the need for most of the procedures and measures required by PCI is clear, some seem useless, costly and/or superfluous to implement. But after incidents like these, it is a lot easier to explain to customers the point of the measures PCI requires of companies handling credit card data. Basically everything is aimed at protecting the credit card data, and at making sure that in case of a security incident all the audit trails needed to investigate the cause and source of the attack are available. At least that is how I look at it.
For instance, PCI requires you to audit the integrity of the files present on the system. This should include not only OS files, but also the (web) application code. It is premature to speculate whether this ‘malicious code’ could have been detected by running a properly configured host-based IDS on the platform, such as Tripwire or Samhain.
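The basic idea behind file integrity checking can be sketched with checksums: record a baseline while the system is in a known-good state, and compare against it later. This is only an illustration of the principle and not a replacement for a tool like Tripwire or Samhain. The function names are hypothetical, the sketch assumes GNU coreutils' sha256sum, and it does not notice files that were added after the baseline was taken.

```shell
# Record SHA-256 checksums of all regular files under directory "$1"
# in baseline file "$2". Store the baseline outside "$1".
# (create_baseline and verify_baseline are hypothetical helper names.)
create_baseline() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 ) > "$2"
}

# Verify directory "$1" against baseline "$2" (absolute path).
# Prints the files whose checksum changed or that are missing, and
# returns non-zero if there is any difference.
verify_baseline() {
    ( cd "$1" && sha256sum --quiet -c "$2" )
}

# Usage (paths are example values):
# create_baseline /var/www/html /root/baseline.sha256
# verify_baseline /var/www/html /root/baseline.sha256 || echo "files changed"
```

The baseline itself must be protected, for example by keeping it on read-only media or on another host, because an attacker who can modify it can hide the changes.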
Secondly, PCI requires you to establish roles that have access to a production platform to upload code. Staff members should be part of a role that authorizes them for the access they need to do their job, and access should be restricted to allow only this traffic. Not only should access be locked down, but audit trails of all activity should be available upon request. These audit trails include not only who logged in to the systems, but also network IDS logs (e.g. Snort), the commands that were executed (e.g. sudo), the output of these commands (e.g. rootsh) and the reports of host-based IDSs. Together these tools should give an auditor good insight into the activity on a (compromised) server. SELinux could also be a big help in restricting access. Strangely, PCI does not require or advise the use of SELinux, while it does require the use of application-level firewalls (e.g. mod_security in Apache). But this is a different discussion.
To secure the credit card data itself, PCI requires that all credit card numbers be stored in encrypted form. Manual access to the keys that decrypt this data should not be possible. Based on this you can infer that the credit card data was compromised by some sort of ‘man-in-the-middle’ attack: the malicious code could have intercepted the data after it was decrypted in the web server, on leaving the SSL tunnel, and before it was encrypted and stored in the database. But this is just speculation, of course…
I’m looking forward to more details on this incident. I hope this is made publicly available so we can learn from the mistakes that were made.