Fixing a Ceph Mon map after disaster!

Ceph's weakest link is configuration. Once a cluster is deployed it is incredibly durable and will survive most mistakes without punishment. However, adding a monitor that the other machines cannot reach can yield a very broken cluster that cannot be managed.

For example, if you add a new monitor and the automatically detected IP (Ansible or Kolla) isn't correct, possibly a loopback or other assigned IP, you will lose the ability to use the ceph tools on the cluster because of a broken monitor map config.

So here's what you need to know, in a nutshell, to fix it.

  1. Stop your monitors
  2. Export a monitor map from the last known good monitor
  3. Edit the monitor map to fix the broken entry
  4. Repeat this for all the monitors that were “working”.
  5. Inject the monitor maps on those monitors
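  6. Start the monitors and check that they form a quorum.

# Export the monitor map from the last known good monitor and inspect it
ceph-mon -c /etc/ceph/cluster-name-ceph.conf -i MONITOR_NAME --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap
# Remove the broken entry and confirm it is gone
monmaptool --rm bad-host-entry /tmp/monmap
monmaptool --print /tmp/monmap
# Inject the fixed map (the monitor must be stopped), then fix ownership
ceph-mon -c /etc/ceph/cluster-name-ceph.conf -i MONITOR_NAME --inject-monmap /tmp/monmap
chown -R ceph:ceph /var/lib/ceph/mon/cluster-monitor-name/
# Start the monitors again
systemctl start ceph-mon.target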
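
Since a typo here hits the monitor store directly, it can help to print the whole sequence for one monitor and read it over before running anything. The helper below is only a sketch: `run`, `monmap_fix_plan` and the `DRY_RUN` variable are made-up names, not part of Ceph, and the ceph-mon and monmaptool invocations are simply the ones used above.

```shell
# Sketch: print (or run) the monmap repair steps for one monitor.
# DRY_RUN=1 (the default here) only echoes the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: monmap_fix_plan CONF MONITOR_NAME BAD_ENTRY [MONMAP_FILE]
monmap_fix_plan() {
    local conf mon bad map
    conf="$1"; mon="$2"; bad="$3"; map="${4:-/tmp/monmap}"
    if [ -z "$conf" ] || [ -z "$mon" ] || [ -z "$bad" ]; then
        echo "usage: monmap_fix_plan CONF MONITOR_NAME BAD_ENTRY [MONMAP_FILE]" >&2
        return 1
    fi
    # Each step only runs if the previous one succeeded.
    run ceph-mon -c "$conf" -i "$mon" --extract-monmap "$map" &&
    run monmaptool --print "$map" &&
    run monmaptool --rm "$bad" "$map" &&
    run monmaptool --print "$map" &&
    run ceph-mon -c "$conf" -i "$mon" --inject-monmap "$map"
}
```

Calling `monmap_fix_plan /etc/ceph/cluster-name-ceph.conf MONITOR_NAME bad-host-entry` prints the five commands prefixed with `+`. Setting `DRY_RUN=0` executes them instead, which should only be done with the monitor stopped.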


CentOS 8: disable NetworkManager for the last time…

Well, Red Hat has made it clear they're going to enforce their horrible application NetworkManager for a role that has been handled fine for 25 years by some basic text files… so let's disable it one last time before EL9.

sudo dnf install -y network-scripts
sudo systemctl disable --now firewalld NetworkManager
sudo systemctl enable network && sudo systemctl start network
sudo touch /etc/sysconfig/disable-deprecation-warnings
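
With NetworkManager out of the way, the legacy network service reads its settings from ifcfg files under /etc/sysconfig/network-scripts/. As a rough sketch of what one of those basic text files looks like, the helper below prints a static IPv4 config. The function name `make_ifcfg`, the device name and the addresses are placeholders, not values from this post.

```shell
# usage: make_ifcfg DEVICE IPADDR PREFIX GATEWAY
# Prints a minimal static ifcfg file to stdout.
make_ifcfg() {
    cat <<EOF
DEVICE=$1
BOOTPROTO=none
ONBOOT=yes
NM_CONTROLLED=no
IPADDR=$2
PREFIX=$3
GATEWAY=$4
EOF
}

# Example with placeholder values; review the output before writing it:
#   make_ifcfg eth0 192.0.2.10 24 192.0.2.1 | sudo tee /etc/sysconfig/network-scripts/ifcfg-eth0
```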

Reboot your macOS Java experience

Most of us DevOps engineer nerds have to deal with old Java versions for things like IPMI and network device configuration tools… New versions of Java literally refuse to run them due to security restrictions… soooo I present to you.. a solution!

  1. Install OpenJDK from AdoptOpenJDK: https://adoptopenjdk.net/index.html
  2. Install OpenWebStart https://openwebstart.com/download/

Once you’ve done that, simply open the jnlp file with “OpenWebStart” and it will just “work”. You may need to open the settings app for OpenWebStart and select the OpenJDK JVM over Horacle java.

OpenStack: allow IPv6 into your instances

The rules below allow IPv6 traffic into your instances. By default OpenStack only allows IPv6 between members of the same security group.

openstack security group list
+--------------------------------------+---------+------------------------+----------------------------------+------+
| ID                                   | Name    | Description            | Project                          | Tags |
+--------------------------------------+---------+------------------------+----------------------------------+------+
| 9817adcc-e504-479b-97e5-cc884c17d3dc | default | Default security group | 3d1f410104144004a65e74c7d8fa2612 | []   |
| a45b2351-331c-4a71-ab42-10f3f04364f6 | default | Default security group | 877dce1df8ea4f8ba6a28803ef40f0dd | []   |
| afe5f296-1971-49a9-9ef4-f6af98bd83f9 | default | Default security group | 26ef6bf16c1544699b6de2639f006950 | []   |
| def5f039-c5bc-4f0e-a2e0-3ffe537f9bee | default | Default security group |                                  | []   |
| e38fe90f-70b1-4411-984b-ace5b6f04530 | default | Default security group | 776cf3cded7541419baeef3002ebf742 | []   |
+--------------------------------------+---------+------------------------+----------------------------------+------+
# Add the rules to the security group for project "776cf3cded7541419baeef3002ebf742"
openstack security group rule create --ethertype ipv6 --protocol ipv6-icmp --ingress e38fe90f-70b1-4411-984b-ace5b6f04530
openstack security group rule create --ethertype ipv6 --protocol tcp --ingress  e38fe90f-70b1-4411-984b-ace5b6f04530
openstack security group rule create --ethertype ipv6 --protocol udp --ingress  e38fe90f-70b1-4411-984b-ace5b6f04530
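
To apply the same three rules to every group in the listing rather than just one, the IDs can be fed into a loop. The sketch below assumes the client's `-f value -c ID` output options for getting bare IDs. `ipv6_rule_cmds` is a made-up helper name, and it only prints the commands so they can be reviewed first.

```shell
# Read security group IDs (one per line) on stdin and print the three
# IPv6 ingress rule commands for each. Blank lines are skipped.
ipv6_rule_cmds() {
    local sg
    while IFS= read -r sg; do
        [ -n "$sg" ] || continue
        echo "openstack security group rule create --ethertype ipv6 --protocol ipv6-icmp --ingress $sg"
        echo "openstack security group rule create --ethertype ipv6 --protocol tcp --ingress $sg"
        echo "openstack security group rule create --ethertype ipv6 --protocol udp --ingress $sg"
    done
}

# Example: review the output first, then append "| sh" to apply it.
#   openstack security group list -f value -c ID | ipv6_rule_cmds
```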

MacOS keybound Audio Switcher

I constantly switch between speakers and headset.. like 5-6 times per day for conference calls and such…. and changing the audio source was super annoying.. SO THIS BECAME A THING.

Install switchaudio-osx from brew

$ brew install switchaudio-osx

Then you can list the audio devices available on your system to edit the code in the next step.

$ /opt/homebrew/Cellar/switchaudio-osx/*/SwitchAudioSource -a
G533 Gaming Headset
Logitech Webcam C930e
DELL U2713HM
HP Z38c
HP Z38c
G533 Gaming Headset
External Headphones
Mac Studio Speakers

Open Automator and create a new Quick Action.

Using macOS “Automator” and “switchaudio-osx” from Brew I was able to automate switching to my Logitech G533 headset automagically. I have this set up on a key binding for Control+F13.

The code:

on run {input, parameters}
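	-- Edit this path to match your install: Homebrew lives under /opt/homebrew on Apple Silicon and /usr/local on Intel Macs
	set theSwitch to "/usr/local/Cellar/switchaudio-osx/EDIT-ME/SwitchAudioSource"
	set theSource to do shell script theSwitch & " -c"
	try
		-- Replace the device names below with the names SwitchAudioSource -a prints on your system
		if theSource = "Built-in Output" then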
			do shell script theSwitch & " -t output -s \"G533 Gaming Headset\""
			do shell script theSwitch & " -t input -s \"G533 Gaming Headset\""
			display notification "Audio switched to G533 Headset." with title " Audio Input/Output Switcher"
		else
			do shell script theSwitch & " -t output -s \"Built-in Output\""
			do shell script theSwitch & " -t input -s \"Built-in Microphone\""
			display notification "Audio switched to Internal iMac Devices." with title " Audio Input/Output Switcher"
		end if
	end try
	return input
end run
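
The same toggle can also be written as a plain shell script, which is handy for trying out device names from a terminal before wiring them into Automator. This is a sketch, not the Quick Action above: `pick_next_device`, `toggle_audio` and the `SWITCH` variable are made-up names, and the two device names are taken from the listing earlier, so swap in your own.

```shell
# Decide which device to switch to: if CURRENT is the headset, go back
# to the fallback device; otherwise switch to the headset.
# usage: pick_next_device CURRENT HEADSET FALLBACK
pick_next_device() {
    if [ "$1" = "$2" ]; then
        echo "$3"
    else
        echo "$2"
    fi
}

# Toggle the output device. SWITCH may hold the full path to
# SwitchAudioSource if it is not on PATH (an assumption to adjust).
toggle_audio() {
    local switch current next
    switch="${SWITCH:-SwitchAudioSource}"
    current="$("$switch" -c)" || return 1
    next="$(pick_next_device "$current" "G533 Gaming Headset" "Mac Studio Speakers")"
    "$switch" -t output -s "$next"
}
```

Unlike the AppleScript, this version only switches the output device. A second call with `-t input` would cover the microphone.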

Building OpenStack Kolla Images from Source

There are several ways to get images for Kolla. You can use Docker Hub, you can deploy them from a local private registry, and you can build them yourself, either as binary (RPMs/packages) or as the combination known as source.

#Source Build

#OpenStack Basics
kolla-build --registry your.dockerrepo.com:4000 --push -t source fluentd kolla-toolbox cron chrony memcached mariadb rabbitmq dnsmasq keepalived haproxy -T 16 --tag train

# Projects
kolla-build --registry your.dockerrepo.com:4000 --push -t source nova keystone cinder tgtd iscsid glance neutron openvswitch masakari placement aodh ironic horizon octavia manila heat watcher -T 16 --tag train

I was having issues with kolla-build --push, so after all the images are built I push them to my private registry myself.

# There are probably docker-specific commands for this, but it works.
docker images | grep your.dockerrepo.com | awk '{print $1}' | xargs -I {} docker push {}:train
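
A slightly stricter variant only pushes images that actually carry the wanted tag, using docker's `--format` option to print `repository:tag` pairs instead of scraping the table. A sketch, with `select_images` as a made-up helper name and the registry and tag taken from the commands above:

```shell
# Read "repository:tag" lines on stdin and print only those that start
# with REGISTRY/ and end with :TAG.
# usage: select_images REGISTRY TAG
select_images() {
    awk -v reg="$1" -v tag="$2" '
        index($0, reg "/") == 1 &&
        substr($0, length($0) - length(tag)) == ":" tag
    '
}

# Example:
#   docker images --format '{{.Repository}}:{{.Tag}}' \
#     | select_images your.dockerrepo.com:4000 train \
#     | xargs -n 1 docker push
```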

How to repackage your initrd for a newer kernel on CentOS 7

The CentOS 7 kernel is really old… some hardware requires a newer kernel; Intel VROC, for example, requires kernel 4.15+ to work properly…

This guide assumes you've installed a kernel from the CentOS AltArch kernel repo: http://mirror.centos.org/altarch/7/kernel/x86_64/

#Download the latest pxe initramfs (initrd.img, not vmlinuz)
wget http://mirror.centos.org/centos-7/7/os/x86_64/images/pxeboot/initrd.img -O /tmp/pxeinitrd.img

#Make a directory
mkdir /tmp/pxeinitrd
cd /tmp/pxeinitrd

#Extract the initramfs into the folder
/usr/lib/dracut/skipcpio /tmp/pxeinitrd.img | xzcat | cpio -idmv

#Remove the old kernel modules
rm -rf lib/modules/*

#Copy in the kernel modules from your new kernel, in this case it's 4.19.84-300.el7.x86_64 from CentOS AltArch
rsync -r /lib/modules/4.19.84-300.el7.x86_64/* lib/modules/4.19.84-300.el7.x86_64/

#Compile the ramdisk..
find . 2>/dev/null | cpio -c -o | xz -9 --format=lzma > /tmp/initrd.4.19.84-300.el7.x86_64.img

#Grab your files and stick em in your pxe directory
cp /tmp/initrd.4.19.84-300.el7.x86_64.img /var/lib/tftpboot/images/centos7/initrd.img
cp /boot/vmlinuz-4.19.84-300.el7.x86_64 /var/lib/tftpboot/images/centos7/vmlinuz
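
A mismatch between the modules inside the image and the kernel you boot only shows up as missing drivers at PXE time, so a quick check of the extracted tree before repacking can save a round trip. A small sketch: `check_module_tree` is a made-up helper, and looking for modules.dep is an extra sanity test rather than part of the procedure above.

```shell
# usage: check_module_tree ROOT KERNEL_VERSION
# Succeeds only if ROOT/lib/modules holds exactly one entry,
# KERNEL_VERSION, and that directory contains modules.dep.
check_module_tree() {
    local root kver found
    root="$1"; kver="$2"
    if [ ! -f "$root/lib/modules/$kver/modules.dep" ]; then
        echo "missing $root/lib/modules/$kver/modules.dep" >&2
        return 1
    fi
    found="$(ls "$root/lib/modules")"
    if [ "$found" != "$kver" ]; then
        echo "unexpected entries in $root/lib/modules: $found" >&2
        return 1
    fi
}

# Example, run after the rsync step:
#   check_module_tree /tmp/pxeinitrd 4.19.84-300.el7.x86_64
```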

Make your ElasticSearch Fly!

Just run this to reduce the write workload of your cluster… (this isn't safe for critical data.. fine for logging etc.)

curl -XPUT 'http://127.0.0.1:9200/_all/_settings?preserve_existing=true' -H 'Content-Type: application/json' -d '{
"index.number_of_replicas" : "0",
"index.translog.durability" : "async",
"index.refresh_interval" : "60s"
}'
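
If the cluster later ends up holding data you care about, the same endpoint can put the values back. The sketch below prints a JSON body with Elasticsearch's default values for these three settings (1 replica, `request` durability, `1s` refresh). `es_default_settings` is a made-up helper name, and the example URL leaves out `preserve_existing` so the earlier values are overwritten.

```shell
# Print a settings body that restores the Elasticsearch defaults for
# the three settings changed above.
es_default_settings() {
    cat <<'EOF'
{
"index.number_of_replicas" : "1",
"index.translog.durability" : "request",
"index.refresh_interval" : "1s"
}
EOF
}

# Example (curl reads the body from stdin with -d @-):
#   es_default_settings | curl -XPUT 'http://127.0.0.1:9200/_all/_settings' \
#     -H 'Content-Type: application/json' -d @-
```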

XIO Passwords

Default Password for EMC XtremIO:
XtremIO Management Server (XMS)

  • Username: xmsadmin
    password: 123456 (prior to v2.4)
    password: Xtrem10 (v2.4+)

XtremIO Management Secure Upload

  • Username: xmsupload
    Password: xmsupload

XtremIO Management Command Line Interface (XMCLI)

  • Username: tech
    password: 123456 (prior to v2.4)
    password: X10Tech! (v2.4+)

XtremIO Management Command Line Interface (XMCLI)

  • Username: admin
    password: 123456 (prior to v2.4)
    password: Xtrem10 (v2.4+)

XtremIO Graphical User Interface (XtremIO GUI)

  • Username: tech
    password: 123456 (prior to v2.4)
    password: X10Tech! (v2.4+)

XtremIO Graphical User Interface (XtremIO GUI)

  • Username: admin
    password: 123456 (prior to v2.4)
    password: Xtrem10 (v2.4+)

XtremIO Easy Installation Wizard (on storage controllers / nodes)

  • Username: xinstall
    Password: xiofast1

XtremIO Easy Installation Wizard (on XMS)

  • Username: xinstall
    Password: xiofast1

Basic Input/Output System (BIOS) for storage controllers / nodes

  • Password: emcbios

Basic Input/Output System (BIOS) for XMS

  • Password: emcbios

Objective

  • You want to add 12 additional SSDs to an existing Starter X-Brick (10TB with only 5TB installed) in your environment (fewer than 12 is not supported, although it is technically possible).

Prerequisites

  • A valid support contract with Dell-EMC; you will need to access documentation that requires a valid login to Dell-EMC support.
  • X-Brick is fully functional and connected to an XMS.
  • Have a copy of the default passwords for XtremIO, I cannot list them here due to the Dell-EMC partner agreement. The accounts you will be using are: tech and xmsadmin. You have to access Dell-EMC support and search for Article number “332100”.
  • Have access to the XtremIO Management System (XMS).
  • The “EMC XtremIO Storage Array Software Installation and Upgrade Guide”, “Chapter 9, Expanding a 10TB Starter X-Brick (5TB)” from Dell-EMC support covers this in detail. This procedure covers the mechanism from both the UI and SSH. I had problems with the UI method and was forced to use the SSH procedure.

Step 1 – Install the additional SSD drives into the X-Brick Chassis

  • Open the rack that houses the X-Brick you want to add storage to.
  • Remove the 12 plastic SSD fillers from slots 13 to 24.
  • Install the 12 SSD drives into slots 13 to 24.

Step 2 – Login to the XMS UI

  • The XMS UI will be used to track the SSD drives being brought online via the Alerts & Events screen.
  • The SSH session in Step 3 will be used to issue the commands to bring each SSD online. It takes approximately 3 minutes per SSD.
  • Access the XMS UI by opening a browser and entering https://<XMS IP address> from your JumpBox/Laptop. Download the Java applet and launch it. Accept any Java warnings and log in as “tech” (get default password from Dell-EMC support). Because this is a configured XMS instance, you should see the X-Brick cluster in the UI.
  • Select the Inventory pane of the UI, select the Table View and then select the SSD object. The new SSDs should have a “DPG State” of “Not in DPG” and “Lifecycle State” of “Uninitialized”. The existing SSDs will be “In DPG” and “Healthy” respectively.
  • Make a note of the X-Brick ID, the DPG ID and the DPG “Useful SSD Space” and “User Space”.
  • Keep the XMS UI open with the Alerts & Events window selected. This is how the status of each SSD addition will be tracked.

Step 3 – Initialize each SSD and bring Online

  • Open PuTTY and SSH to the XMS IP address, log in as “xmsadmin” and then as username “tech” (get default password from Dell-EMC support).
  • Use the command “show-ssds” to get the SSD list of the X-Brick, including the WWN identifiers. The WWN identifier for each slot will be used in the following steps.
  • Starting from Slot 13, sequentially execute the following commands. Use the X-Brick, DPG and WWN IDs recorded earlier.
  • Use the command “add-ssd brick-id="<X-Brick ID>" ssd-uid="<SSD-WWN>" is-foreign-xtremapp-ssd” to initialize the SSD in the X-Brick. My use-case had SSDs from another X-Brick, so I had to force the command by using the “is-foreign-xtremapp-ssd” flag.
  • Use the command “assign-ssd dpg-id="<DPG ID>" ssd-uid="<SSD-WWN>"” to add the SSD to the Data Protection Group (DPG).
  • Check the XMS Alerts and Events UI to track the percentage of completion for this task.
  • As each event completes (it will turn Green with a “Cleared” state), proceed to the next slot, until Slot 24 is reached and completed.
  • Select the Inventory pane of the XMS UI, select the Table View and then select the SSD object. All 25 SSDs should have a “DPG State” of “In DPG” and a “Lifecycle State” of “Healthy”.
  • Then select the Data Protection Groups object and verify the DPG “Useful SSD Space” and “User Space” have doubled.
  • The XMS Dashboard will also show a doubling of Physical Capacity.
  • Your XtremIO X-Brick solution is now ready to provide additional storage services: EMC XtremIO – Provisioning a LUN.
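
Because the same two XMCLI commands are repeated for each of the 12 slots, it can help to generate them up front on your JumpBox and paste them one pair at a time. A sketch only: `xms_ssd_cmds` is a made-up helper, the IDs are placeholders, and the command syntax is copied from Step 3 (drop the foreign flag if your SSDs did not come from another X-Brick).

```shell
# Read SSD WWNs (one per line, slot 13 first) on stdin and print the
# add-ssd / assign-ssd pair for each. Blank lines are skipped.
# usage: xms_ssd_cmds BRICK_ID DPG_ID < wwn-list.txt
xms_ssd_cmds() {
    local brick dpg wwn
    brick="$1"; dpg="$2"
    while IFS= read -r wwn; do
        [ -n "$wwn" ] || continue
        printf 'add-ssd brick-id="%s" ssd-uid="%s" is-foreign-xtremapp-ssd\n' "$brick" "$wwn"
        printf 'assign-ssd dpg-id="%s" ssd-uid="%s"\n' "$dpg" "$wwn"
    done
}
```

Wait for each SSD's event to clear in Alerts & Events before pasting the next pair, as described in Step 3.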