This is how to set up a dummy interface on RedHat/CentOS 8.x. I cannot make the old-style init scripts in /etc/sysconfig/network-scripts/ work anymore for the dummy network interface.
It's all NetworkManager now--here is what I pieced together from surfing the web. This one stays up even after a reboot:
nmcli connection add type dummy ifname dummy0 con-name dummy0
nmcli con mod dummy0 ipv4.address 192.168.8.88/32
nmcli con mod dummy0 ipv4.method manual autoconnect yes
nmcli con up dummy0
nmcli con show
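To confirm the interface came up with the right address, a couple of quick checks (a sketch; dummy0 is the connection name used above):
ip addr show dummy0
nmcli -g ipv4.addresses connection show dummy0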
Monday, September 16, 2019
Old DDS/4mm Tape drive speed issues
Ok, so I bought a tape drive from eBay. It's a Dell PowerVault 100T DDS4 tape drive (internally a Seagate STD1400LW). I really like the 4mm tape format because the cartridges are so small--just perfect for freezing my usual data output (some notes, documents).
At first I thought I bought a lemon, since the drive was super slow. As usual I played around with block sizes. Normally I leave the block size at 512, which is the default I usually find tape drives set up with. I usually would do this:
mt -f /dev/st1 setblk 512 # if necessary
tar cvfp /dev/st1 ....
Or, sometimes, if I want the default tar blocking factor to be the same as the hardware block size:
mt -f /dev/st1 setblk 10240 # that's 20 tar blocks of 512 bytes each
tar cvfp /dev/st1 ...
But the tape drive was SUPER SLOW no matter what I tried... until I tried 4096 (the page size?) on a whim. And this drive just flew! So this is how to make this drive spin fast:
mt -f /dev/st1 setblk 4096 # page size
tar cvfpb /dev/st1 8 ...
tar tvfb /dev/st1 8 .....
What a surprise that was... and I have been using tapes for years.
--------------UPDATE-------------
20190918
The slow speed is for READING BACK DATA, not writing it. For some reason the tape drive stalls delivering data on transfers > 4k. Therefore, to read back data, specify transfer buffers of 4k or less:
mbuffer_style# mbuffer -s 4096 -i /dev/st2 | tar tvf -
normal_tar# tar xvfpb /dev/st2 8
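A quick way to see the effect for yourself is a raw-read benchmark at two buffer sizes (a sketch; /dev/st1 and a freshly written tape are assumed):
mt -f /dev/st1 rewind
dd if=/dev/st1 of=/dev/null bs=4096 count=100000 # should run at full speed per the notes above
mt -f /dev/st1 rewind
dd if=/dev/st1 of=/dev/null bs=65536 count=6250 # reportedly stalls on this drive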
JondZ 20190917
Wednesday, July 03, 2019
pacemaker fail count timing experiments
Occasionally we are troubled by failure alerts; since I installed the check_crm nagios monitor plugin the alerting seems more 'sensitive'. I have come to understand that pacemaker needs manual intervention when things fail. The thing that especially fails on our site is the VMware fencing---every few weeks one or two logins to VMware fail, forcing me to log in and issue a "pcs resource cleanup" to reset the failure.
I am doing this experiment now to understand the parameters that I need to adjust on a live system. These observations are done on a RH7.6 cluster, with a dummy shell script as a resource.
dummy shell script, configured to fail:
#! /bin/bash
exit 1 # <-- this is removed when we want the result to succeed
while :
do
    date
    sleep 5
done > /tmp/do_nothing_log
OBSERVATIONS 1:
- CLUSTER CONDITIONS: cluster property "start-failure-is-fatal" to defaults (true).
- RESOURCE CONDITIONS: defaults
- RESULT: node1 is tried ONCE, node2 is tried ONCE, then nothing is tried again. When a resource fails, the fail count is immediately set to INFINITY (1000000). This is why the documentation says "the node will no longer be allowed to run the failed resource" until manual intervention happens.
- CLUSTER CONDITIONS: "start-failure-is-fatal" to FALSE ("pcs property set start-failure-is-fatal=false; pcs resource cleanup")
- RESOURCE CONDITIONS: defaults
- RESULT: the resource is restarted on node1 nonstop (to infinity?). No restart appears to be attempted on the other node.
- CLUSTER CONDITIONS: "start-failure-is-fatal" to FALSE ("pcs property set start-failure-is-fatal=false; pcs resource cleanup")
- RESOURCE CONDITIONS: migration-threshold=10 ("pcs resource update resname meta migration-threshold=10; pcs resource cleanup")
- RESULT: Resource is retried 10 times on node1, then retried 10 times on node2, then retried no longer.
- CLUSTER CONDITIONS: start-failure-is-fatal=false, cluster-recheck-interval=180.
- RESOURCE CONDITIONS: migration-threshold=10 and failure-timeout=2min ("pcs resource update resname meta failure-timeout=2min")
- RESULT: Resource is retried 10 times on node1, then 10 times on node2. Errors are cleared after 2 minutes. After that, the resource is tried ONCE on node1 but 10 times on node2 every cluster-recheck-interval (3 minutes). That's because the error condition is gone but the counters do not necessarily reset (though sometimes they do on the other node while the resource is being tried on one node).
- I am unable to apply migration-threshold and failure-timeout as cluster-wide defaults at this moment; they still seem to be properties of the individual resources.
- Update resource meta on resname as usual, regardless of whether it is part of a group or not; behavior should be as expected (the group proceeds from one resource to the next in the list anyway).
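For reference, a consolidated sketch of the settings that behaved best in the observations above (resname is the placeholder resource name used throughout):
pcs property set start-failure-is-fatal=false
pcs resource update resname meta migration-threshold=10 failure-timeout=2min
pcs resource cleanup resname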
JondZ 20190703
Friday, March 15, 2019
pacemaker unfencing errors
While testing pacemaker clustering with iscsi (on redhat 8 beta) I came upon this error:
Pending Fencing Actions:
* unfencing of rh-8beta-b pending: client=pacemaker-controld.2603, origin=rh-8beta-b
It took me almost the whole morning to understand how to clear the error. Since the stonith resource includes the clause "meta provides=unfencing", the fencing agent is expected to handle unfencing, meaning we should simply reboot the node (rh-8beta-b in this case).
RedHat documentation explains this as well: " ...The act of booting in this case implies that unfencing occurred..."
Wednesday, October 31, 2018
VDO Troubles
Once in a while, on very large VDO volumes, storage just disappears. In my case I had one 42TB volume that just refused to boot. Here are some of the symptoms:
- Boot dropped into a single-user shell
- "vdo start" says the vdo volume was already started, but there is no entry in /dev/mapper/.
Further observation implies that systemd killed the vdo startup process. What happened was that vdo was interrupted during its long boot-up---a timing issue.
A working solution for me was to include a timeout value in /etc/fstab for the particular vdo volumes:
# /etc/fstab
/dev/vgsample/lvsample /somedirectory ext4 defaults,discard,x-systemd.requires=vdo.service,x-systemd.device-timeout=10min 0 0
Working so far, survived a few reboots.
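Two quick sanity checks after editing /etc/fstab (a sketch using standard tools):
mount -a # any fstab syntax error shows up immediately
journalctl -b -u vdo.service # watch the long vdo start on the next boot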
JondZ 20181031
Sunday, September 02, 2018
Clustering legacy websites with keepalived and haproxy.
We had a legacy website we could not get rid of; it was very stable on Debian lenny 5.0. After much grief I managed to compile an RPM version of PHP 5.2, which it needed --- not PHP 5.3, not 5.anything, it HAS to be 5.2. Finally I was able to get it running on CentOS 6.
I also clustered the filesystem using gfs2. I did not remember until yesterday that I had configured this server to also serve an SMB share. That was like 5 years ago and users are still on it! So I also had to use clustered Samba (CTDB). Finally I got it working again.
I was able to get this thing clustered using keepalived and haproxy cookie tricks--basically haproxy tricks each user's browser into connecting to one and the same backend webserver, while different users connect to different backends. That's basically active/active already. I just finished and tested this right now, even on a holiday weekend.
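For the record, cookie-based backend pinning in haproxy looks roughly like this (a sketch; the backend name, server names, and addresses are placeholders, not our actual config):
backend legacy_web
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server webserver1 10.0.0.11:80 check cookie ws1
    server webserver2 10.0.0.12:80 check cookie ws2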
So what we have now is
---ext ip ----- frontserver1/frontserver2 ------- webserver1/webserver2
The webservers themselves also host MySQL. Basically a four-pack HA setup. I tested rebooting nodes and the websites continue to work.
It's good; unfortunately a lot of IP addresses were allocated--about 7 instead of just one. But everything's virtual so I guess that's ok.
JondZ
Tuesday, August 28, 2018
a day at the office, spare rh ha license
Ok, so I finally got HA/GFS2 nodes at work. I'm getting a lot of mileage out of the licenses I was allowed to use, more than I expected. First of all, a RH "unit" actually counts as one half when run as a virtual machine. This enabled me to run a 6-pack stack consisting of 2 routers (keepalived/haproxy), 2 fileservers (GFS2), and 2 PGSQL (active/passive) nodes.
Today I discovered that the RH Resilient Storage (GFS2) license already includes the HA (pacemaker) license. I called RH support to verify and make sure. So, I can take that license and build corosync quorum (Q) devices. Our 2-node clusters are going to be upgraded with Q devices.
It is a nice day.
Wednesday, May 02, 2018
vdo in an lvm sandwich
I have finished migrating one of the backup servers that we have (where I work) from a standard LVM setup to one that has a VDO engine. It turns out the compression ratio was 41%, or something like 10TB physical to 25T logical blocks--or something like that, if I understood the vdostats output correctly.
Someday I might post a quick how-to here. I already posted a maintenance document on our internal website, so I am a little tired right now.
Anyhow, what I ended up doing was an LVM-over-VDO-over-LVM setup like this:
-------------------------------------------------------
LVM disks -- actual exposed disks -- /data1, /data2, etc.
-------------------------------------------------------
VGS (actual usable volume group, vg1 for example)
-------------------------------------------------------
vdo1 (dedup/compression/zero-elimination)
-------------------------------------------------------
Logical Volume /dev/vg0base/disk1
-------------------------------------------------------
VGS (vg0base for example)
-------------------------------------------------------
PHYSICAL DISKS
-------------------------------------------------------
What I like about this (except for the added complexity) is that the scheme doesn't really change existing setups too much. Provisioning is done in a normal way starting from the vg1 volume group above. After set up I can pretty much forget about the lower levels and do the usual disk creation such as "lvcreate -n disk1 -L 1G vg1".
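Building that stack bottom-up looks roughly like this (a sketch; device names and sizes are placeholders, using the RHEL vdo manager syntax):
pvcreate /dev/sdb /dev/sdc
vgcreate vg0base /dev/sdb /dev/sdc
lvcreate -n disk1 -l 100%FREE vg0base
vdo create --name=vdo1 --device=/dev/vg0base/disk1 --vdoLogicalSize=25T
pvcreate /dev/mapper/vdo1
vgcreate vg1 /dev/mapper/vdo1
lvcreate -n disk1 -L 1G vg1 # normal provisioning from here on up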
The advantages of this setup are as follows:
1. Physical disks can be added if needed. Unfortunately I had to write a maintenance manual (where I work) because all the elements of the stack need to be expanded as well.
2. The VDO layer is easy to expand. If I want to over-provision I can simply expand the logical size of vdo1. The upper layers will follow the size (though not automatically--maintenance manual again).
Since there was existing data already (about 25T) I had to bootstrap from a small existing space and slowly expand the vdo layer. Fortunately the backup data were contained in separate partitions (/volumes/data{1,2,3,4}) so I was able to incrementally migrate them, though the whole process took me a week to accomplish.
jondz
Wednesday, February 28, 2018
host to host nc/tar tricks
Here are some tricks to transfer files between hosts using tar. In the old days, by habit, I would just do something like tar cvfp - . | ssh host "cd /path/to && tar xvfp -", but here are even simpler approaches, assuming you have open terminals on both hosts.
CASE 1: SENDER IS LISTENING
Here are some commands that will work; the sender command needs to be typed first then it will block until a consumer appears on the other side of the pipe.
sender% tar cvf - . | nc -l -p 1234
sender% tar cvf >(nc -l -p 1234) . # process substitution
receiver% tar xvf - < /dev/tcp/ip-of-sender/1234
receiver% tar xvf <( dd obs=10240 < /dev/tcp/ip-of-sender/1234)
The magic number 10240 is 512*20: the tar default record size of 20 blocks of 512 bytes.
CASE 2. RECEIVER IS LISTENING
This is the case when you are on the receiving end and want to initiate a receive before switching over to the sending terminal. Type the receiver command first and IO will block until there is a sender:
receiver% nc -l -p 1234 | tar xvf -
sender% tar cvf - . > /dev/tcp/host-or-ip-of-receiver/1234
sender% < /dev/tcp/host-or-ip/1234 tar cvf - .
sender% tar cvf >(cat > /dev/tcp/host-or-ip/1234) .
NOTES:
Sometimes it is necessary to add "-N" option to nc, depending on platform/distro.
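To verify a transfer end to end, checksums can be computed on both sides of the pipe without a second pass; a sketch using bash process substitution, following case 1 above:
sender% tar cvf - . | tee >(sha256sum 1>&2) | nc -l -p 1234
receiver% nc ip-of-sender 1234 | tee >(sha256sum 1>&2) | tar xvf -
The two sums printed on stderr should match.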
Edward 20180228
Saturday, December 16, 2017
tape drive over iscsi problems
One of the problems with running a tape drive as an iscsi target is that there was no software I found that worked. I tried IETD, TGT, and of course TARGETCLI. After thinking and googling about this problem for about a day or so I decided to see if I could patch the python code on which targetcli runs. I am surprised I can still code!!! This took me perhaps half an hour to figure out; it has been a long time since I wrote anything.
This file is ... rtslib/utils.py
-----------PATCH 1 of 2 --------------------------------------
1. In function convert_scsi_path_to_hctl
OLD:
    try:
        hctl = os.listdir("/sys/block/%s/device/scsi_device"
                          % devname)[0].split(':')
    except:
        return None
    return [int(data) for data in hctl]
NEW:
    try:
        hctl = os.listdir("/sys/block/%s/device/scsi_device"
                          % devname)[0].split(':')
        return [int(data) for data in hctl]
    except OSError:
        pass
    try:
        hctl = os.listdir("/sys/class/scsi_tape/%s/device/scsi_device"
                          % devname)[0].split(':')
        return [int(data) for data in hctl]
    except OSError:
        pass
    return None
-----------PATCH 2 of 2 --------------------------------------
In function convert_scsi_hctl_to_path
OLD:
    for devname in os.listdir("/sys/block"):
        path = "/dev/%s" % devname
        hctl = [host, controller, target, lun]
        if convert_scsi_path_to_hctl(path) == hctl:
            return os.path.realpath(path)
NEW:
    for devname in os.listdir("/sys/block"):
        path = "/dev/%s" % devname
        hctl = [host, controller, target, lun]
        if convert_scsi_path_to_hctl(path) == hctl:
            return os.path.realpath(path)
    try:
        for devname in os.listdir("/sys/class/scsi_tape"):
            path = "/dev/%s" % devname
            hctl = [host, controller, target, lun]
            if convert_scsi_path_to_hctl(path) == hctl:
                return os.path.realpath(path)
    except OSError:
        pass
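To sanity-check the patched helper interactively, something like this should work (hedged: the module path and calling convention here are from the rtslib version I patched and may differ on other versions):
python -c "from rtslib import utils; print(utils.convert_scsi_path_to_hctl('/dev/st0'))"
A patched install should print the [host, controller, target, lun] list for the tape drive instead of None.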
Friday, December 01, 2017
pgsql archive command notes
This is a critique of documented "archive_command" usage in pgsql. There is an example which says:
archive_command = 'test ! -f /destination/%f && cp %p /destination/%f'
I wouldn't use this. The problem is a full disk: I have tested cp to produce short (partial) files in that case. cp does exit with a nonzero code, but the partial file is left sitting under the final name, so every subsequent attempt fails the "test ! -f __" check and the segment never gets archived cleanly.
What I would do is use rsync. In its default setting it writes to a temporary file and renames it into place, so it does not leave short files under the final name:
archive_command = 'test ! -f /destination/%f && rsync %p /destination/%f'
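For belt and suspenders, the command can be wrapped in a small script; a sketch (the script path and destination directory are placeholders):
#! /bin/bash
# hypothetical /usr/local/bin/archive_wal.sh, called as: archive_wal.sh %p %f
set -eu
src="$1" # %p: path to the WAL segment, relative to the data directory
name="$2" # %f: the file name only
dest="/destination/$name"
test ! -f "$dest" || exit 1 # already archived: refuse to overwrite
rsync "$src" "$dest" # rsync renames a temp file into place on success
Then in postgresql.conf: archive_command = '/usr/local/bin/archive_wal.sh %p %f'.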
ep
Sunday, October 15, 2017
drbd and lvm: so many combinations
On vacation after major dental surgery, I am currently learning and testing these four DRBD/LVM combinations and thinking about which one I would use on a real production setup.
1. DRBD over plain device
2. DRBD over LVM
3. LVM over DRBD
4. LVM over DRBD over LVM
1. DRBD over plain device. This puts actual device names such as sdb1 in the drbd configuration. I don't like that. There are ways around this, such as using multipath or using /dev/disk/by-id. I haven't tested those yet with drbd, but the point is the actual device names are in the configuration files and they had better agree with the real devices (after years of uptime and changeover of sysadmins :).
2. DRBD over LVM. This puts an abstraction layer at the lowest level and avoids having to place actual device names in drbd resource files. For example:
/etc/drbd.d/some-resource.res
resource __ {
    ...
    ...
    device /dev/drbd0
    disk /dev/vg/lvdisk0
    ...
}
There you go, no /dev/sdb1 or whatever in the disk configuration. This avoids problems arising from devices switching device names on reboot.
3. LVM over DRBD
As the name implies, this puts the flexibility of provisioning on top of the drbd layer, where it is closer to the application. It makes typical provisioning such as disk allocation, destruction, extension and shrinking much easier. However, I still do not like writing device names in the config files...
4. LVM over DRBD over LVM.
LVM over DRBD over LVM is probably the most flexible solution. There are no actual device names in the DRBD configuration, and LVM is very resilient across machine restarts due to its auto-detection of metadata in whatever order the physical disks come up. With this combination I can rearrange the physical backing storage and at the same time have the flexibility of LVM on the upper layer. The only issue is having to ADJUST STUFF IN /etc/lvm/lvm.conf.
in /etc/lvm/lvm.conf
# filter example --
# /dev/vd* on the physical layer,
# /dev/drbd* on the drbd layer
filter = [ "a|^/dev/vd.*|", "a|drbd.*|", "r|.*/|" ]
write_cache_state = 0
use_lvmetad = 0
Just a few lines of config. This is fine. The problem is having to remember what all these configuration lines mean after 2 years of uptime...
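For completeness, combination 4 built bottom-up looks roughly like this (a sketch; volume group, LV, and resource names are placeholders):
# lower LVM layer (the backing store)
pvcreate /dev/vdb
vgcreate vglower /dev/vdb
lvcreate -n drbdback0 -L 10G vglower
# the drbd resource's "disk" points at /dev/vglower/drbdback0
drbdadm create-md r0
drbdadm up r0
# upper LVM layer, created on the primary node
pvcreate /dev/drbd0
vgcreate vgupper /dev/drbd0
lvcreate -n data0 -L 5G vgupper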
---
JondZ 20171015
Wednesday, October 11, 2017
reducing lvm drbd disk size
Here is a snippet of my notes for reducing drbd disk size (assuming the physical device is on LVM, which can be resized).
Just remember that a drbd device is a container and HAS METADATA. Therefore think of it like a filesystem. Also, this procedure will only work if the disks are ONLINE (the disks are attached, and drbd is running).
In this example, a filesystem has only 100 megs worth of data; we want to shrink the physical store down from 500 to about 120 megs.
WARNING: This procedure can be destructive if done wrong.
1. Note the filesystem's consumed size. For this example the filesystem contains 100M worth of data. Shrink the filesystem. Note that -M resizes to minimum size.
umount /dev/drbd0
fsck -f /dev/drbd0
resize2fs -M /dev/drbd0
At this point the filesystem on /dev/drbd0 should be at the minimum (i.e., close to the consumed size---about 100 MB in this example). If you are not sure, mount the filesystem again and use "df", or use tune2fs (if ext4), to MAKE SURE.
2. Resize the drbd device. Make sure it is larger than the filesystem size, because drbd also uses disk space for metadata!
drbdadm -- --size=110M resize r0
If you would type in "lsblk" at this point, drbd0 should show about 110M.
3. Shrink the physical backing device to a bit larger than the drbd device:
on the first node (drb7): lvresize -L 120M /dev/drt7/disk1
on the second node (drb8): lvresize -L 120M /dev/drt8/disk1
4. Size up the drbd device to use up all available LV space:
drbdadm resize r0
5. Finally, size up the filesystem:
resize2fs /dev/drbd0
6. Mount and verify that the filesystem is indeed about 120 megs.
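A quick verification sketch for that last step (standard tools; the mount point is a placeholder):
mount /dev/drbd0 /mnt
df -h /mnt # filesystem size
lsblk /dev/drbd0 # device size as the kernel sees it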
Friday, October 06, 2017
Stress testing drbd online verification
DRBD is so nice. It is really very nice to have this skill available--I would use it personally, at home or in office production. It is a very practical real-world talent to be able to string together 2 computers with a network cable and replicate disks from one to the other automatically.
I just stress tested the "online verification" procedure. Basically I wanted to see how I would formulate a recovery procedure for a corrupted disk. In summary this is what I did---
1. Configure checksum method for online verification.
2. Perform online verification to compare disks. This is as simple as typing out "drbdadm verify <resource>" and watching the logs in /var/log/messages.
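Step 1 amounts to a small addition to the resource file; a sketch, assuming resource r0 (sha1 is one of the supported verify algorithms):
# /etc/drbd.d/r0.res (fragment)
resource r0 {
    net {
        verify-alg sha1;
    }
    ...
}
Then load it with "drbdadm adjust r0" before running the verify.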
STRESS TEST. To make sure I would recover from a failed disk I tested out this scenario:
3. On node1, stop drbd (systemctl stop drbd)
4. Force a disk corruption, for example dd if=/dev/zero of=/dev/vdb1
5. start drbd (systemctl start drbd)
RECOVERY PROCEDURE: Here is what I came up with as a procedure.
6. drbdadm verify r0 # r0 is the resource name
At this point I would notice the disk corruption in /var/log/messages.
7. On the "bad" node:
drbdadm secondary r0
drbdadm invalidate r0
That is the summary of the procedure (perhaps with some minor detail I forgot about). After the "invalidate" instruction the disk should sync again. Just make sure that the correct disk on the correct node is identified and invalidated.
-------
JondZ
Thursday, October 05, 2017
drbd diskless mode
I am still scratching my head over this one--that it is actually possible. Sure, I ran diskless stuff like iSCSI with special hardware cards before, but drbd?
I detached the disk and then made the resource active. So basically the node, without a disk, is talking to a node with a disk and pretending that the disk is local:
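The two commands involved, roughly (a sketch; r0 is the resource name used elsewhere in these notes):
drbdadm detach r0 # drop the local backing disk; the node goes Diskless
drbdadm primary r0 # promote; I/O is now served entirely by the peer's disk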
[root@drb6 tmp]# drbdadm status
r0 role:Primary
disk:Diskless
drb5 role:Secondary
peer-disk:UpToDate
[root@drb6 tmp]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/drbd0 1014612 33860 964368 4% /mnt/tmp
On this node there is no local backing store for drbd0, yet /dev/drbd0 is, for all practical purposes, a normal block device.
That is amazing...
jondz
Wednesday, October 04, 2017
My first DRBD cluster test
Here is my first cluster. It took me the WHOLE MORNING to figure out (I misunderstood the meaning of the "clone-node-max" property). Anyhow this is a 4-node active/passive drbd storage cluster.
In this example, only the Primary (pcs "Master") can use the block device at any one time. The nodes work correctly in that nodes are promoted/demoted as expected when they leave/enter the cluster.
I will have to re-do this entire thing from scratch to make sure I can do it again and keep notes (so many things to remember!). I will also enable some service here to use the block device: maybe an nfs or LIO iSCSI server or something.
Here are my raw notes and a sample "pcs status" output:
------------RAW NOTES-- SORT IT OUT LATER ----------
pcs resource create block0drb ocf:linbit:drbd drbd_resource=r0
pcs resource master block0drbms block0drb master-max=1 master-node-max=1 clone-max=4
# pcs resource update block0drbms clone-node-max=3 THIS IS WRONG--SHOULD BE 1 BECAUSE ONLY 1 CLONE SHOULD RUN ON EACH NODE (see below later)
pcs resource update block0drbms meta target-role='Started'
pcs resource update block0drbms notify=true
[root@drb3 cores]# systemctl disable drbd
Removed symlink /etc/systemd/system/multi-user.target.wants/drbd.service.
pcs resource update block0drb meta target-role="Started"
pcs resource update block0drb drbdconf="/etc/drbd.conf"
pcs property set stonith-enabled=false
pcs resource update block0drbms clone-node-max=1
pcs resource enable block0drbms
Also (info from the web), fix wrong permissions if needed:
--- chmod 777 some file in /var if needed ----
chmod 777 /var/lib/pacemaker/cores
---------- EXAMPLE PCS STATUS OUTPUT --------
[root@drb3 ~]# pcs status
Cluster name: drbdemo
Stack: corosync
Current DC: drb2 (version 1.1.16-12.el7_4.2-94ff4df) - partition with quorum
Last updated: Wed Oct 4 11:38:16 2017
Last change: Wed Oct 4 11:30:36 2017 by root via cibadmin on drb4
4 nodes configured
4 resources configured
Online: [ drb1 drb2 drb3 drb4 ]
Full list of resources:
Master/Slave Set: block0drbms [block0drb]
Masters: [ drb2 ]
Slaves: [ drb1 drb3 drb4 ]
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
[root@drb3 ~]#
Wednesday, May 03, 2017
Random Ansible stuff -- commenting out variables
I am currently learning Ansible, because I realized I had to have a way to simultaneously configure many servers: I was up to using 6 (six) virtual CentOS servers to learn glusterfs, and manually configuring each one was getting troublesome. Anyhow, I think I am a week into this already.
Today's lesson: how to replace variables with comments on top. This is a personal favorite style of mine, specifically in the form:
# Previous value was VAR=value changed 20170504
VAR=newvalue
I use this pattern A LOT, in fact on all of my configuration changes whenever I can.
Here are two examples from my personal tests.
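Both examples reference a mod_timestamp_long variable; its definition is not shown in this post, but a plausible sketch using Ansible's built-in date facts would be:
- hosts: all
  vars:
    mod_timestamp_long: "{{ ansible_date_time.iso8601_basic_short }}"
which yields values like 20170504T001157.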
One way is to simply use newlines:
This task:
====================================================
- name: positional backrefs embedded newline hacks
  lineinfile:
    dest: /tmp/testconfig.cfg
    regexp: '^(TESTCONFIGVAR9)=(.*)'
    line: '# \1 modified {{mod_timestamp_long}}
      \n# \1 = (old value was) \2
      \n\1=newvalue'
    backrefs: yes
    state: present
====================================================
Produces this output:
====================================================
# TESTCONFIGVAR9 modified 20170504T001157
# TESTCONFIGVAR9 = (old value was) test
TESTCONFIGVAR9=newvalue
====================================================
Another way is to split up the task:
These tasks:
====================================================
- name: another attempt at custom mod notes, step 1
  lineinfile:
    dest: /tmp/testconfig.cfg
    regexp: '^(TESTCONFIGVAR6=.*)'
    line: '# OLD VALUE: \1 {{ mod_timestamp_long }}'
    backrefs: yes
- name: another attempt at custom mod notes, step 2
  lineinfile:
    dest: /tmp/testconfig.cfg
    insertafter: '# OLD VALUE: '
    line: 'TESTCONFIGVAR6=blahblahblah'
====================================================
Results in this output:
====================================================
# OLD VALUE: TESTCONFIGVAR6=test 20170504T001157
TESTCONFIGVAR6=blahblahblah
====================================================
If anybody is reading this, I am open to suggestions (since I am still learning this at the moment).
JondZ Thu May 4 00:16:35 EDT 2017
Wednesday, April 05, 2017
today's random thoughts
As I write this my home server is down; I was learning glusterfs when I accidentally rebooted the Xen server that was holding all my virtual machines. It has been down a few minutes now, which is unusual, so the server may have crashed.
Anyhow--
Today's lesson is: FIX HOSTNAMES FIRST before setting up glusterfs. GlusterFS needs good working hostname resolution in order to work; it is miserable with broken DNS.
It also does not help that somewhere along the line, something, or somebody (*cough* ISP *cough*) modifies DNS queries and returns some far-off IP address on failed resolutions.
Specifically make sure these work and actually point to your servers:
node-testing-1.yoursubdomain.domain.net
node-testing-2.yoursubdomain.domain.net
node-testing-3.yoursubdomain.domain.net
node-client-test.yoursubdomain.domain.net
ALSO make sure these work and actually point to your servers (this is the part where something in the DNS query path might return some random IP address, making the gluster server contact some unknown far-off host):
node-testing-1
node-testing-2
node-testing-3
node-client-test
The way I did this was put "yoursubdomain.domain.net" in the "search" parameter of /etc/resolv.conf. Others will probably just put the entries in /etc/hosts. Whatever works.
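For the /etc/hosts route, a sketch (the addresses are placeholders for your own):
192.168.1.11 node-testing-1.yoursubdomain.domain.net node-testing-1
192.168.1.12 node-testing-2.yoursubdomain.domain.net node-testing-2
192.168.1.13 node-testing-3.yoursubdomain.domain.net node-testing-3
192.168.1.20 node-client-test.yoursubdomain.domain.net node-client-test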
By the way, configuring the search parameter in resolv.conf differs between Debian- and RedHat-derived distributions. For Debian-derived it is best to install "resolvconf" and put a keyword in /etc/network/interfaces; for RedHat-derived it is easier to just use "nmtui" or put a keyword in /etc/sysconfig/network-scripts/whatever/ifcfg-whatever
My server is back online...thank you for reading this.
JondZ 20170505
Friday, March 31, 2017
Experiment on learning active-active httpd
Not a bad way to spend a Friday afternoon... Here are my raw notes. I make up these tech notes for myself and this is not a bad addition:
Fri Mar 31 14:56:32 EDT 2017 LESSON: SIMPLE ACTIVE-ACTIVE HTTPD CLUSTER
This is based on the RedHat manual "Linux 7 High Availability Add On
Administration" except that this follows an active-active setup and assumes
there is a cluster filesystem available.
PACKAGES NEEDED:
wget - needed by pacemaker (for status checks; supposedly "curl" is also
supported and must be specified by the ocf client= option)
lynx - OPTIONAL; to test status yourself.
ASSUMED CONDITIONS:
- httpd is already installed; furthermore it is enabled and running as stock
via systemd
- there is a clustered gfs filesystem on /volumes/data1 (for common
html content)
PART 1: HTTPD
Set up the document root as desired. In this example, the html documents
are rooted at /volumes/data1/www and are common to all nodes. In the config
file /etc/httpd/conf/httpd.conf:
DocumentRoot "/volumes/data1/www"
<Directory "/volumes/data1/www">
AllowOverride None
Require all granted
</Directory>
Put some data on the directory; in this simple example there would be a file
named /volumes/data1/www/index.html
<html>
<body>
<h1>hello</h1>
This is a test website from JondZ
</body>
</html>
At the end of the config, put the following; this is used by pacemaker to check
status.
<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
Allow from 127.0.0.1
</Location>
Use lynx to check that it works:
lynx http://127.0.0.1/server-status
When satisfied that things are working, disable httpd activation by systemd;
the service will be managed by pacemaker instead.
systemctl disable httpd
systemctl stop httpd
PART 2: LOGROTATE
Edit the file /etc/logrotate.d/httpd and modify the "postrotate" section:
# This is the old stuff. Comment this out.
# Since httpd is going to be managed
# by pacemaker, not by systemd, this is no longer valid:
#
# /bin/systemctl reload httpd.service > /dev/null 2>/dev/null || true
#
# This is the correct line that RedHat recommends. Note that
# the PID file is produced by pacemaker (or httpd itself?) and is
# probably true only as long as httpd is not managed by systemd.
#
/usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -c \
"PidFile /var/run/httpd.pid" -k graceful > /dev/null 2>/dev/null \
|| true
#
# This is how I personally respawn apache on old production systems
# but this NO LONGER RELIABLY WORKS (need testing).
#
# /sbin/apachectl graceful > /dev/null 2>/dev/null && true
Test logrotate. First of all make sure that /var/run/httpd.pid is current.
Then force rotations by "logrotate -f /etc/logrotate.conf". Also watch the
pid changes on the httpd processes (on a separate terminal you could say
watch -n1 "ps -efww | grep httpd") and watch the PIDs being replaced.
PART 3: PCS RESOURCE ENTRY
I added the pcs resource as follows:
pcs resource create batwww apache \
    configfile="/etc/httpd/conf/httpd.conf" \
    statusurl="http://127.0.0.1/server-status" clone
The option "clone" makes the httpd run an all nodes (instead of just one
instance).
JondZ 201703
Thursday, March 30, 2017
Old Fashioned
I value my data: I possess an organizer that has no internet connection, and I use a tape drive for backup.
Unlike modern gadgets, I do not have to recharge my organizer every day. It also does not require a backlight, so it is easier on my eyes. I also do not trust the "cloud". I had an android-based cell phone password organizer a while ago: not any more.
Tape is very cheap and I do not have to worry about having to replace spinning disks every 2-5 years. Tape is still the least expensive option and is extremely easy to use. I can just type this in the morning:
screen
tar cvfpb /dev/st1 128 files...
Contrary to popular myth, tape drives are actually fast. I have a slow server (an Athlon 5350 motherboard) and slow disk (actually QLA iSCSI to a NetGear NAS device), and I measure tape speed at 20 to 30 megabytes per second. It sounds as if my server cannot deliver the bytes fast enough, resulting in a motor pause every few seconds. That implies the LTO-3 tape drive is capable of more throughput. I might also use an LTO-1 drive for smaller jobs just to keep the motor humming nicely.
[Photo: Tape drive (LTO-3) bought from ebay.]
[Photo: A Palm Organizer]