Not a bad way to spend a Friday afternoon. Here are my raw notes. I write these tech notes up for myself, and this is not a bad addition:
Fri Mar 31 14:56:32 EDT 2017 LESSON: SIMPLE ACTIVE-ACTIVE HTTPD CLUSTER
This is based on the RedHat manual "Linux 7 High Availability Add-On
Administration", except that this follows an active-active setup and assumes
there is a cluster filesystem available.
PACKAGES NEEDED:
wget - needed by pacemaker for status checks (supposedly "curl" is also
supported, but it must be selected via the OCF resource's client= option)
lynx - OPTIONAL; to test status yourself.
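For reference, on a RHEL/CentOS 7 node the packages can be pulled in with yum
(a minimal sketch; assumes the standard repos):
yum install -y wget
yum install -y lynx    # optional, only for checking the status page by hand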
ASSUMED CONDITIONS:
- httpd is already installed; furthermore it is enabled and running as stock
via systemd
- there is a clustered gfs filesystem on /volumes/data1 (for common
html content)
PART 1: HTTPD
Set up the document root as desired. In this example, the html documents
are rooted at /volumes/data1/www, which is common to all nodes. In the config
file /etc/httpd/conf/httpd.conf:
DocumentRoot "/volumes/data1/www"
<Directory "/volumes/data1/www">
AllowOverride None
Require all granted
</Directory>
Put some data in the directory; in this simple example there would be a file
named /volumes/data1/www/index.html:
<html>
<body>
<h1>hello</h1>
This is a test website from JondZ
</body>
</html>
At the end of the config, put the following; this is used by pacemaker to check
status.
<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
Allow from 127.0.0.1
</Location>
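(Side note: the Order/Deny/Allow lines above are httpd 2.2 style and only work
on 2.4 through mod_access_compat; on a pure 2.4 config the equivalent, as a
sketch, would be:)
<Location /server-status>
SetHandler server-status
Require local
</Location>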
Use lynx to check that it works:
lynx http://127.0.0.1/server-status
When satisfied that things are working, disable httpd activation by systemd;
the service will be managed by pacemaker instead.
systemctl disable httpd
systemctl stop httpd
PART 2: LOGROTATE
Edit the file /etc/logrotate.d/httpd and modify the "postrotate" section:
# This is the old stuff. Comment this out.
# Since httpd is going to be managed
# by pacemaker, not by systemd, this is no longer valid:
#
# /bin/systemctl reload httpd.service > /dev/null 2>/dev/null || true
#
# This is the correct line that RedHat recommends. Note that
# the PID file is produced by pacemaker (or httpd itself?) and this is
# probably valid only as long as httpd is not managed by systemd.
#
/usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -c \
"PidFile /var/run/httpd.pid" -k graceful > /dev/null 2>/dev/null \
|| true
#
# This is how I personally respawn apache on old production systems,
# but this NO LONGER RELIABLY WORKS (needs testing).
#
# /sbin/apachectl graceful > /dev/null 2>/dev/null && true
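For context, the whole /etc/logrotate.d/httpd stanza then looks roughly like
this (a sketch based on the stock RHEL 7 file, with only the postrotate line
swapped out):
/var/log/httpd/*log {
    missingok
    notifempty
    sharedscripts
    delaycompress
    postrotate
        /usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -c \
            "PidFile /var/run/httpd.pid" -k graceful > /dev/null 2>/dev/null || true
    endscript
}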
Test logrotate. First make sure that /var/run/httpd.pid is current.
Then force rotations with "logrotate -f /etc/logrotate.conf". Also watch the
PID changes on the httpd processes (in a separate terminal you could run
watch -n1 "ps -efww | grep httpd" and watch the PIDs being replaced).
PART 3: PCS RESOURCE ENTRY
I added the pcs resource as follows:
pcs resource create batwww apache \
    configfile="/etc/httpd/conf/httpd.conf" \
    statusurl="http://127.0.0.1/server-status" clone
The option "clone" makes the httpd run an all nodes (instead of just one
instance).
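To check, something like the following should show the apache clone instances
started on every node (a sketch; actual output not captured here):
pcs status resources
crm_mon -1 | grep -i apache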
JondZ 201703
Friday, March 31, 2017
Thursday, March 30, 2017
Old Fashioned
I value my data: I possess an organizer that has no Internet connection, and I use a tape drive for backup.
Unlike modern gadgets, I do not have to recharge my organizer every day. It also does not require a backlight, so it is easier on my eyes. I also do not trust the "cloud". I had an Android-based cell phone password organizer a while ago: not any more.
Tape is very cheap, and I do not have to worry about replacing spinning disks every 2-5 years. Tape is still the least expensive option and is extremely easy to use. I can just type this in the morning:
screen
tar cvfpb /dev/st1 128 files...
Contrary to popular myths, tape drives are actually fast. I have a slow server (an Athlon 5350 motherboard) and slow disk (actually QLA iSCSI against a NetGear NAS device), and I measure tape speed at 20 to 30 megabytes per second. It sounds as if my server cannot deliver the bytes fast enough, resulting in a motor pause every few seconds. That implies that the LTO-3 tape drive is capable of more throughput. In my case, I might also use an LTO-1 drive for smaller jobs just to keep the motor humming nicely.
[Photo: Tape drive (LTO-3), bought from eBay]
[Photo: A Palm organizer]
Wednesday, March 22, 2017
learning experiments on gfs2 clustering: no-quorum-policy, interleave, ordered
It has been roughly a week of a gfs2 (Global File System 2) crash course in my personal study of clustered filesystems. Here are in-depth experiment results on three detail points that are mentioned in the RedHat manual:
Point 1: set no-quorum-policy to freeze
Point 2: when creating dlm and clvmd clones, set interleave=true
Point 3: when creating dlm and clvmd clones, set ordered=true
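(For context: these options are set when the clones are created. A sketch of
the creation commands, roughly following the RedHat manual; the resource names
dlm and clvmd here are assumptions:)
pcs resource create dlm ocf:pacemaker:controld op monitor interval=30s on-fail=fence clone interleave=true ordered=true
pcs resource create clvmd ocf:heartbeat:clvm op monitor interval=30s on-fail=fence clone interleave=true ordered=true
pcs constraint order start dlm-clone then clvmd-clone
pcs constraint colocation add clvmd-clone with dlm-clone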
Experiments and explanations:
Point 1: What does "no-quorum-policy=freeze" do?
To differentiate "freeze" from the alternatives, a gfs2 cluster filesystem is tested with the following two settings:
pcs property set no-quorum-policy=stop
pcs property set no-quorum-policy=freeze
With "stop", the resources are stopped, resulting in the gfs2 filesystems being unmounted (because the filesystems are just services).
With "freeze", I/O is blocked, until the problem is corrected. Specifically, commands like this are frozen:
ls > /path/to/gfs2/filesystem/sample-output.txt
When the problem gets fixed and the cluster becomes quorate again, the command resumes normally.
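(How the quorum loss was induced is not recorded in these notes; one way to
reproduce it, as an assumption, is to stop the cluster stack on enough nodes
and then bring it back:)
pcs cluster stop node2 node3     # drop below quorum; I/O on the gfs2 mount freezes
pcs cluster start node2 node3    # quorate again; the frozen command resumes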
Point 2: interleave=true
This is the parameter that caused me much grief for a day or so. When I had my first successful gfs2 clustered filesystem configured, I was disappointed that the filesystems were being unmounted whenever nodes re-entered the cluster. I found the answer by searching the web: with interleave=false, ALL instances of the dlm and/or clvmd clones need to restart before ANY gfs2 mount can start.
So basically, if a resource2 clone depends on a resource1 clone and interleave=false, then ALL instances of resource1 have to be present before ANY instance of resource2 can start. This results in the gfs2 filesystems being unmounted and re-mounted (in our example, resource2 is the gfs2 mount and resource1 is the dlm/clvmd clone).
Thanks to the person who posted that explanation, which I found via Google.
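(If the clones already exist, the meta attribute can also be changed in place;
a sketch, with dlm-clone and clvmd-clone as assumed clone names:)
pcs resource meta dlm-clone interleave=true
pcs resource meta clvmd-clone interleave=true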
Point 3: ordered=true
I have no observable difference to report; it seems to make no difference either way. I have tested both true and false, and the dlm/clvmd processes seem to start the same way on all nodes.
JondZ 201603
Monday, March 13, 2017
DISCARD EFFECT ON THIN VOLUMES
Notes by JondZ
2017-03-14
This note was prompted by my need to use SNAPPER to protect a massive amount of data. This morning I realized the very good space-saving effect of discard; when dealing with terabytes of data it is good to save as much space as possible.
In this example the thin POOL is tp1 and the thin VOLUME of interest is te1. It is like this because I am merely testing out a configuration that already exists.
These are dumped-out unedited notes.
INITIAL CONDITIONS
------------------------------------------------------------------------
te1 is a 1-Gig (thin) disk mounted on /volumes/te1.
The actual, physical thin volume POOL is sized at 10.35 Gigs right now.
lvs -a | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 4.77
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 22.62 15.28
[tp1_tdata] bmof Twi-ao---- 10.35g
[tp1_tmeta] bmof ewi-ao---- 8.00m
root@epike-OptiPlex-7040:/volumes/te1#
EFFECT 1: A 500-MEG FILE INSERTED
-------------------------------------------------------------------
Notice the increase in usage of "te1", now up to 52.36. The thin pool tp1 usage increased as well, to 27.22.
dd if=/dev/zero of=500MFILE21 bs=1024 count=500000
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs -a | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 52.36
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 27.22 18.26
[tp1_tdata] bmof Twi-ao---- 10.35g
[tp1_tmeta] bmof ewi-ao---- 8.00m
root@epike-OptiPlex-7040:/volumes/te1#
EFFECT 2: THE 500-MEG FILE REMOVED
---------------------------------------------------------------------
Removing the file did not reduce the thin volume usage. The pool use percentages are unchanged as well.
root@epike-OptiPlex-7040:/volumes/te1# rm 500MFILE21
root@epike-OptiPlex-7040:/volumes/te1# df -h -P .
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/bmof-te1 976M 1.3M 908M 1% /volumes/te1
root@epike-OptiPlex-7040:/volumes/te1#
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs -a | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 52.45
te1-snapshot1 bmof Vri---tz-k 1.00g tp1 te1
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 27.23 18.46
[tp1_tdata] bmof Twi-ao---- 10.35g
[tp1_tmeta] bmof ewi-ao---- 8.00m
root@epike-OptiPlex-7040:/volumes/te1#
EFFECT 3: fstrim
-----------------------------------------------------------------
FSTRIM will reclaim space on the thin volume AND the thin pool:
root@epike-OptiPlex-7040:/volumes/te1# fstrim -v /volumes/te1
/volumes/te1: 607.2 MiB (636727296 bytes) trimmed
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs -a | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 4.77
te1-snapshot1 bmof Vri---tz-k 1.00g tp1 te1
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 27.23 18.55
[tp1_tdata] bmof Twi-ao---- 10.35g
[tp1_tmeta] bmof ewi-ao---- 8.00m
Well, it does not show here, but I recall that the thin POOL usage also goes down. Perhaps the snapshot gets in the way? It was created automatically while I was composing this text.
There, much better:
root@epike-OptiPlex-7040:/volumes/te1# snapper -c te1 delete 1
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs -a | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 4.77
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 22.62 15.28
[tp1_tdata] bmof Twi-ao---- 10.35g
[tp1_tmeta] bmof ewi-ao---- 8.00m
root@epike-OptiPlex-7040:/volumes/te1# fstrim -v /volumes/te1
/volumes/te1: 239.4 MiB (251031552 bytes) trimmed
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs -a | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 4.77
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 22.62 15.28
[tp1_tdata] bmof Twi-ao---- 10.35g
[tp1_tmeta] bmof ewi-ao---- 8.00m
root@epike-OptiPlex-7040:/volumes/te1#
The numbers are down to 4.77 percent consumed on the thin VOLUME and 22.62 percent on the thin POOL.
EFFECT 4: mount with DISCARD option automatically reclaims THIN space
----------------------------------------------------------------------
This example demonstrates that thin volume space is automatically reclaimed
and returned to the POOL, without needing to manually run fstrim.
root@epike-OptiPlex-7040:/volumes/te1# mount -o remount,discard /dev/mapper/bmof-te1
root@epike-OptiPlex-7040:/volumes/te1# !dd
dd if=/dev/zero of=500MFILE24 bs=1024 count=500000
500000+0 records in
500000+0 records out
512000000 bytes (512 MB, 488 MiB) copied, 0.553593 s, 925 MB/s
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs -a | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 52.39
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 27.22 18.26
[tp1_tdata] bmof Twi-ao---- 10.35g
[tp1_tmeta] bmof ewi-ao---- 8.00m
root@epike-OptiPlex-7040:/volumes/te1# rm 500MFILE24
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs -a | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 52.39
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 27.22 18.26
[tp1_tdata] bmof Twi-ao---- 10.35g
[tp1_tmeta] bmof ewi-ao---- 8.00m
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs -a | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 4.79
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 22.62 15.28
[tp1_tdata] bmof Twi-ao---- 10.35g
[tp1_tmeta] bmof ewi-ao---- 8.00m
root@epike-OptiPlex-7040:/volumes/te1# ls
lost+found
root@epike-OptiPlex-7040:/volumes/te1#
-----------------------------------------------------------------------
But does the reclamation work through snapshot layers? Well, it would be difficult to test all combinations, but let's at least verify that the space is reclaimed when the snapshots are deleted.
First, remount with the discard option:
root@epike-OptiPlex-7040:~# !mount
mount -o remount,discard /dev/mapper/bmof-te1
root@epike-OptiPlex-7040:~#
the initial conditions are:
te1 bmof Vwi-aotz-- 1.00g tp1 4.77
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 22.62 15.28
Ok, so LV is 4.77 percent, LV POOL is 22.62 percent.
So..consume space.
root@epike-OptiPlex-7040:/volumes/te1# !dd
dd if=/dev/zero of=500MFILE24 bs=1024 count=500000
500000+0 records in
500000+0 records out
512000000 bytes (512 MB, 488 MiB) copied, 0.559463 s, 915 MB/s
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 52.37
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 27.22 18.26
root@epike-OptiPlex-7040:/volumes/te1#
Snapshot, then consume some more space:
root@epike-OptiPlex-7040:/volumes/te1# snapper -c te1 create
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 52.45
te1-snapshot1 bmof Vri---tz-k 1.00g tp1 te1
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 27.23 18.36
root@epike-OptiPlex-7040:/volumes/te1#
root@epike-OptiPlex-7040:/volumes/te1# !dd:p
dd if=/dev/zero of=500MFILE24 bs=1024 count=500000
root@epike-OptiPlex-7040:/volumes/te1# dd if=/dev/zero of=200mfile bs=1024 count=200000
200000+0 records in
200000+0 records out
204800000 bytes (205 MB, 195 MiB) copied, 0.211273 s, 969 MB/s
root@epike-OptiPlex-7040:/volumes/te1# !snapper
snapper -c te1 create
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 71.53
te1-snapshot1 bmof Vri---tz-k 1.00g tp1 te1
te1-snapshot2 bmof Vri---tz-k 1.00g tp1 te1
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 29.08 19.82
root@epike-OptiPlex-7040:/volumes/te1#
Then remove the files. The numbers should not go down since there are snap volumes.
root@epike-OptiPlex-7040:/volumes/te1# rm 200mfile 500MFILE24
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 4.78
te1-snapshot1 bmof Vri---tz-k 1.00g tp1 te1
te1-snapshot2 bmof Vri---tz-k 1.00g tp1 te1
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 29.08 20.12
OK, so I stand corrected: the LVM VOLUME usage went down, but the LVM POOL usage did not.
That actually makes sense, since the snapshots still hold the space.
What happens when the snapshots are removed? Is the space reclaimed into the thin POOL?
root@epike-OptiPlex-7040:/volumes/te1# snapper -c te1 delete 1
root@epike-OptiPlex-7040:/volumes/te1# snapper -c te1 delete 2
root@epike-OptiPlex-7040:/volumes/te1# !lvs
lvs | grep tp1
te1 bmof Vwi-aotz-- 1.00g tp1 4.78
te2 bmof Vwi-aotz-- 1.00g tp1 97.66
te3 bmof Vwi-aotz-- 1.00g tp1 4.75
te4 bmof Vwi-aotz-- 3.00g tp1 42.32
tp1 bmof twi-aotz-- 10.35g 22.62 15.28
root@epike-OptiPlex-7040:/volumes/te1#
It is!!! When the snapshot volumes are removed, the space is reclaimed into the thin pool.
CONCLUSION:
--------------------
When working with thin volumes, use the DISCARD mount option, even (or especially) when not using SSDs.
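As a sketch of what that looks like in /etc/fstab (the ext4 filesystem type is
an assumption; adjust to whatever the volume actually uses):
/dev/mapper/bmof-te1  /volumes/te1  ext4  defaults,discard  0  2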
OTHER TESTS
-----------
I tested mounting normally, consuming space, and then remounting with the discard option. What happens is that the space is not automatically reclaimed just by mounting; fstrim needs to run, and snapshots need to be deleted. Still, there is no harm, and in fact an advantage, in adding the "discard" option in fstab even for existing (thin volume) mounts.
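(For volumes mounted without discard, a scheduled fstrim covers the same
ground; a sketch of a weekly cron script, assuming weekly is often enough:)
#!/bin/sh
# /etc/cron.weekly/fstrim-te1 -- reclaim thin-pool space once a week (chmod 755)
fstrim /volumes/te1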
JondZ 20170314