Your cluster on fire? MDS won’t start?
Set the beacon grace globally, or on both the MON and the MDS.
ceph config set mon mds_beacon_grace 360
ceph config set mds mds_beacon_grace 360
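Since 360 is an emergency value, it can help to have the "raise it" and "put it back" steps side by side. A minimal sketch: the function names and the DRY_RUN switch are made up for illustration, and `ceph config rm` is used on the assumption that removing the override is how you want to return to the default.

```shell
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 runs them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Raise the beacon grace on both daemon types while the cluster recovers.
# $1 is the grace in seconds (defaults to the 360 used above).
raise_beacon_grace() {
  for who in mon mds; do
    run ceph config set "$who" mds_beacon_grace "${1:-360}"
  done
}

# Drop the overrides again once the MDS is stable.
reset_beacon_grace() {
  for who in mon mds; do
    run ceph config rm "$who" mds_beacon_grace
  done
}
```

Call `raise_beacon_grace` to preview the commands, then `DRY_RUN=0 raise_beacon_grace` to apply them, and `DRY_RUN=0 reset_beacon_grace` after things calm down.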
The Ceph defaults for recovery sleep are a little too aggressive for most devices. These values give you a more reasonable recovery speed that does not tank the system as hard but still yields a quick, stable recovery.
ceph config set osd osd_recovery_sleep_hdd 0.25
ceph config set osd osd_recovery_sleep_ssd 0.05
ceph config set osd osd_recovery_sleep_hybrid 0.10
Sometimes you have failures that cannot be fixed, e.g. EC 2+1 with 2 drives failing (btw, this was the recommended default EC profile in 14.x). You should use 8+3 at minimum to prevent this!
Warning: everything below guarantees data loss on the affected PG.
ceph pg PGID query | jq .acting
# Stop OSD related to PG, figure out the shard id of the pg, generally it's .s0, .s1, .s2 depending on your EC config.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --pgid PGID.s1/2 --force --op remove
# Restart the osd, wait for it to attempt to peer, stop it then mark it complete.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --pgid PGID.s1/2 --op mark-complete
# Tell the customer your mistake is acceptable..
ceph pg 13.df mark_unfound_lost delete
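Figuring out the shard id is the fiddly part. One way is to ask the stopped OSD which PGs it actually holds (`--op list-pgs`) and filter for the PG in question. A small sketch: `pg_shards` is a hypothetical helper, and it assumes list-pgs prints EC shards as the PG id followed by `s` and the shard number (e.g. `13.dfs1`). Check the real output on your cluster before trusting it.

```shell
# pg_shards PGID
# Reads `ceph-objectstore-tool --op list-pgs` output on stdin and prints only
# the entries for PGID itself or for its shards (PGID followed by sN).
pg_shards() {
  awk -v pg="$1" '
    $0 == pg { print; next }
    index($0, pg "s") == 1 && substr($0, length(pg) + 2) ~ /^[0-9]+$/ { print }
  '
}
```

With the OSD stopped, something like `ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --op list-pgs | pg_shards 13.df` should show which shard of the PG lives on that OSD.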
There is lots of poor documentation around the interwebs for this… here are the required packages to make this useful. If you want the normal driver with Xorg, just remove headless from the package name.
apt install linux-headers-$(uname -r) -y
apt install nvidia-headless-470-server nvidia-utils-470-server libnvidia-encode-470-server -y
This took forever to find, saving here for others.
This is a placeholder for me comprehending how video encoding works… I’ll update/edit as I become more familiar. Please don’t assume I have any idea what I’m talking about.
But basically you have a GOP (group of pictures), and that GOP has a specified number of frames. The GOP size is independent of the frame rate: a 30 FPS video has 30 frames per second of data, but the GOP size can be a different number.
So let’s say you have a GOP size of 30 and a frame rate of 30 FPS. Ignoring B-frames, you will then have 29 P-frames per I-frame, so 3 seconds of video (90 frames) contains 87 P-frames and 3 I-frames. With a GOP size of 90 at the same frame rate you only get one I-frame every 3 seconds: 1 I-frame and 89 P-frames.
I-frames are the ENTIRE picture; P-frames only carry what changed relative to an earlier frame. More I-frames = more bandwidth.
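The frame arithmetic is easy to check with a few lines of shell. This is only a sketch: `gop_counts` is a made-up helper, and it assumes a stream with no B-frames and an I-frame at the start of every GOP.

```shell
# gop_counts GOP_SIZE TOTAL_FRAMES
# Prints "I_FRAMES P_FRAMES" for TOTAL_FRAMES frames of video.
# Assumes no B-frames and that every GOP starts with exactly one I-frame.
gop_counts() {
  gop=$1
  total=$2
  i_frames=$(( (total + gop - 1) / gop ))   # one I-frame per started GOP
  p_frames=$(( total - i_frames ))          # everything else is a P-frame
  echo "$i_frames $p_frames"
}

gop_counts 30 90   # 3 seconds at 30 FPS, GOP size 30 -> prints: 3 87
gop_counts 90 90   # same 3 seconds, GOP size 90    -> prints: 1 89
```

A larger GOP size means fewer full pictures in the same stretch of video, which is where the bandwidth saving comes from.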
https://en.wikipedia.org/wiki/Video_compression_picture_types
https://www.wowza.com/docs/how-to-encode-source-video-for-wowza-streaming-cloud#configure
https://ipvm.com/reports/test-i-frame-rate
https://www.axis.com/files/whitepaper/wp_bit_rate_66275_en_1512_hi.pdf
Watch frames in realtime
ffprobe -rtsp_transport tcp -i 'rtsp://root:PASSWORD@CAMERA_IP:554/axis-media/media.amp?streamprofile=ayet' -show_frames | grep -E 'pict_type=I|coded_picture_number'
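Eyeballing the grep output works, but counting the frames between I-frames gives you the GOP size directly. A rough sketch: `gop_lengths` is a hypothetical helper, and it assumes ffprobe's default output format where each frame has a `pict_type=X` line.

```shell
# gop_lengths
# Reads `ffprobe -show_frames` output on stdin and prints the number of frames
# in each GOP once the next I-frame shows up.
# If the stream is joined mid-GOP, the first number printed is a partial GOP.
gop_lengths() {
  awk -F= '
    $1 == "pict_type" {
      if ($2 == "I" && count > 0) { print count; count = 0 }
      count++
    }
  '
}
```

Pipe the same ffprobe command into it instead of grep, e.g. `ffprobe -rtsp_transport tcp -i "$RTSP_URL" -show_frames | gop_lengths`, where `RTSP_URL` is a placeholder for your camera URL. A camera set to GOP 30 should print a steady stream of 30s.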
Ceph’s weakest link is configuration. Once a cluster is deployed it is incredibly durable and will survive most mistakes without punishment. However, adding a monitor that is unreachable from all machines can yield a very broken cluster that cannot be managed.
For example, if you add a new monitor and the automatically detected IP (ansible or kolla) isn’t correct, possibly a loopback or other assigned IP, you will lose the ability to use the ceph tools on the cluster because of a broken monitor map config.
So here’s what you need to know in a nutshell to fix it.
ceph-mon -c /etc/ceph/cluster-name-ceph.conf -i MONITOR_NAME --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap
monmaptool --rm bad-host-entry /tmp/monmap
monmaptool --print /tmp/monmap
ceph-mon -c /etc/ceph/cluster-name-ceph.conf -i MONITOR_NAME --inject-monmap /tmp/monmap
chown ceph:ceph -R /var/lib/ceph/mon/cluster-monitor-name/
systemctl start ceph-mon.target
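The same steps, wrapped so you can preview them before touching a live monitor. This is a sketch, not a tested tool: `fix_monmap`, its argument order and the DRY_RUN switch are invented here. It also adds two steps that are not in the list above: stopping the monitor first (the monmap should only be extracted or injected while the mon is not running) and keeping a copy of the original map.

```shell
# DRY_RUN=1 (the default here) only prints each step; DRY_RUN=0 executes it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# fix_monmap CONF MON_NAME BAD_ENTRY MON_DIR
# The body runs in a subshell so the variables do not leak into your session.
fix_monmap() (
  conf=$1; mon=$2; bad=$3; mondir=$4
  map=/tmp/monmap
  run systemctl stop ceph-mon.target                 # mon must not be running
  run ceph-mon -c "$conf" -i "$mon" --extract-monmap "$map"
  run cp "$map" "$map.orig"                          # untouched copy, just in case
  run monmaptool --rm "$bad" "$map"
  run monmaptool --print "$map"                      # check the bad entry is gone
  run ceph-mon -c "$conf" -i "$mon" --inject-monmap "$map"
  run chown -R ceph:ceph "$mondir"
  run systemctl start ceph-mon.target
)
```

For example, `fix_monmap /etc/ceph/cluster-name-ceph.conf MONITOR_NAME bad-host-entry /var/lib/ceph/mon/cluster-monitor-name/` prints the eight commands. Prefix it with `DRY_RUN=0` once the output looks right.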
Well, Red Hat has made it clear they’re going to enforce their horrible application NetworkManager for a role that has been fine for 25 years as some basic text files… so let’s disable it one last time before EL9.
sudo dnf install -y network-scripts
sudo systemctl disable --now firewalld NetworkManager
sudo systemctl enable network && sudo systemctl start network
sudo touch /etc/sysconfig/disable-deprecation-warnings
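For anyone who hasn’t seen them, the "basic text files" are per-interface ifcfg files under /etc/sysconfig/network-scripts/. A minimal static example is below. The interface name and every address are placeholders (documentation range), so adjust them to your network.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0  (interface name is a placeholder)
DEVICE=eth0
TYPE=Ethernet
# Static addressing, no DHCP
BOOTPROTO=none
# Bring the interface up at boot
ONBOOT=yes
# Placeholder addresses, replace with your own
IPADDR=192.0.2.10
PREFIX=24
GATEWAY=192.0.2.1
DNS1=192.0.2.53
# Keep NetworkManager away from this interface if it ever gets re-enabled
NM_CONTROLLED=no
```

After editing the file, `sudo systemctl restart network` picks up the change.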