Old 5th August 2013
J65nko

Pre-installation steps

For demonstration purposes, two memory (RAM) disks will be used.

Code:
# make md_create

Creating Memory disk devices /dev/md1 /dev/md2:
Memory disk devices:
md1     swap     2048M
md2     swap     2048M
crw-r-----  1 root  operator    0, 162 Aug  5 16:50 /dev/md1
crw-r-----  1 root  operator    0, 163 Aug  5 16:50 /dev/md2
crw-------  1 root  wheel       0,  63 Aug  5 14:58 /dev/mdctl
-------------end of md_create --------------------
Running diskinfo(8) on these disks:

Code:
# make diskinfo
if [ -e /dev/md1 ] ; then diskinfo -v /dev/md1 ; fi
/dev/md1
        512             # sectorsize
        2147483648      # mediasize in bytes (2.0G)
        4194304         # mediasize in sectors
        0               # stripesize
        0               # stripeoffset

if [ -e /dev/md2 ] ; then diskinfo -v /dev/md2 ; fi
/dev/md2
        512             # sectorsize
        2147483648      # mediasize in bytes (2.0G)
        4194304         # mediasize in sectors
        0               # stripesize
        0               # stripeoffset

-------------end of diskinfo --------------------
It is not always necessary to edit the Makefile to make it do something different.
To use this target on one of my disks, which reports a 512-byte sector size to the OS but has a stripesize of 4096, I can override the makefile variable DISKS on the command line:
Code:
# make DISKS=/dev/ada1 diskinfo

if [ -e /dev/ada1 ] ; then diskinfo -v /dev/ada1 ; fi
/dev/ada1
        512             # sectorsize
        2000398934016   # mediasize in bytes (1.8T)
        3907029168      # mediasize in sectors
        4096            # stripesize
        0               # stripeoffset
        3876021         # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        S1E160MR        # Disk ident.

-------------end of diskinfo --------------------
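The same "default unless overridden" idea can be sketched in plain sh. This is only an analogy to the make override, not the actual Makefile: the default list and the function name list_existing_disks are assumptions made for illustration.

```shell
# Hypothetical stand-in for the makefile default; an assignment in the
# environment (DISKS=/dev/ada1 sh script.sh) replaces it.
DISKS="${DISKS:-/dev/md1 /dev/md2}"

# Print only those entries of $DISKS that exist, mirroring the
# "if [ -e ... ]" guard used by the diskinfo target.
list_existing_disks() {
    for disk in $DISKS; do
        if [ -e "$disk" ]; then
            echo "$disk"
        fi
    done
}

list_existing_disks
```

With make itself no such construct is needed: a VAR=value argument on the command line takes precedence over the assignment inside the makefile.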
After this intermezzo we start the real pre-installation work by creating 4K aligned GPT partitions.

WARNING: the following command will remove all partitioning information from the disks as defined in the makefile. Please make sure that this is correct and double-check the settings you configured by running # make show. You will lose data if the settings refer to the wrong disk.

Code:
# make partition

if gpart show /dev/md1 ; then  gpart destroy -F /dev/md1 ; fi
gpart: No such geom: /dev/md1.
gpart create -s gpt /dev/md1
md1 created
NR=$( echo /dev/md1 | tr -c -d '0-9' ) ; gpart add  -b 40 -s 128k -t freebsd-boot -l mdboot${NR} /dev/md1 ; gpart add              -t freebsd-zfs  -l mdisk_${NR}  /dev/md1
md1p1 added
md1p2 added
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 /dev/md1
bootcode written to md1
gpart show /dev/md1
=>     34  4194237  md1  GPT  (2.0G)
       34        6       - free -  (3.0k)
       40      256    1  freebsd-boot  (128k)
      296  4193975    2  freebsd-zfs  (2G)
[snip]
ls -l /dev/gpt
crw-r-----  1 root  operator    0, 170 Aug  5 16:56 mdboot1
crw-r-----  1 root  operator    0, 176 Aug  5 16:56 mdboot2
crw-r-----  1 root  operator    0, 168 Aug  5 16:56 mdisk_1
crw-r-----  1 root  operator    0, 174 Aug  5 16:56 mdisk_2
-------------end of partition --------------------
Here we see that the partitions start at sectors 40 and 296. A quick command-line modulo calculation with bc(1) confirms that the start sector lies on an 8 x 512 = 4096 byte boundary. As an alternative check, a division shows no fractional part.

Code:
$ echo '296 % 8' | bc
0
$ echo 'scale=4 ; 296 / 8' | bc  
37.0000
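The same check can be wrapped in a small sh function so that several start sectors are tested in one go. This is a minimal sketch; the function name is_4k_aligned is made up here, and sector 34 (the first usable GPT sector in the output above) is included only as a counter-example.

```shell
# Succeed when a start sector lies on a 4096-byte boundary.
# $1 = start sector, $2 = sector size in bytes (defaults to 512).
is_4k_aligned() {
    start_sector=$1
    sector_size=${2:-512}
    [ $(( start_sector * sector_size % 4096 )) -eq 0 ]
}

for s in 34 40 296; do
    if is_4k_aligned "$s"; then
        echo "sector $s: aligned"
    else
        echo "sector $s: NOT aligned"
    fi
done
```

Sectors 40 and 296 pass, while sector 34 does not, which is why the boot partition is created with -b 40 instead of starting at the first free sector.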
Creating the gnop(8) devices with 4K sectors:

Code:
# make gnop4k

Creating gnop devices with 4K sectors .....
for X in /dev/md1 /dev/md2 ; do  NR=$( echo ${X} | tr -c -d '0-9' ) ;
 gnop create -S 4096 /dev/gpt/mdisk_${NR} ; done

ls -l /dev/gpt
crw-r-----  1 root  operator    0, 170 Aug  5 16:56 mdboot1
crw-r-----  1 root  operator    0, 176 Aug  5 16:56 mdboot2
crw-r-----  1 root  operator    0, 168 Aug  5 16:56 mdisk_1
crw-r-----  1 root  operator    0, 166 Aug  5 17:38 mdisk_1.nop
crw-r-----  1 root  operator    0, 174 Aug  5 16:56 mdisk_2
crw-r-----  1 root  operator    0, 172 Aug  5 17:38 mdisk_2.nop

gnop list
Geom name: gpt/mdisk_1.nop
WroteBytes: 0
ReadBytes: 131072
Writes: 0
Reads: 21
Error: 5
WriteFailProb: 0
ReadFailProb: 0
Offset: 0
Providers:
1. Name: gpt/mdisk_1.nop
   Mediasize: 2147311616 (2G)
   Sectorsize: 4096
   Mode: r0w0e0
Consumers:
1. Name: gpt/mdisk_1
   Mediasize: 2147315200 (2G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 151552
   Mode: r0w0e0

[snip]

gnop status
           Name  Status  Components
gpt/mdisk_1.nop     N/A  gpt/mdisk_1
gpt/mdisk_2.nop     N/A  gpt/mdisk_2
-------------end of gnop4k --------------------
Note the gnop GEOM provider with a sector size of 4096.
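The byte values in the gnop list output can be cross-checked against the gpart show output with a bit of shell arithmetic. The input numbers below are copied from the output above; that the provider size is the consumer size rounded down to a whole number of 4096-byte sectors is my reading of the numbers, not something the output states.

```shell
sector=512
part_start=296          # start sector of md1p2 according to gpart show
part_sectors=4193975    # length of md1p2 in 512-byte sectors

# Byte offset of the partition on the disk (Stripeoffset of the consumer).
stripe_offset=$(( part_start * sector ))

# Size of the partition in bytes (Mediasize of the consumer).
consumer_bytes=$(( part_sectors * sector ))

# Largest multiple of 4096 that fits in the consumer.
provider_bytes=$(( consumer_bytes / 4096 * 4096 ))

echo "stripeoffset:       $stripe_offset"
echo "consumer mediasize: $consumer_bytes"
echo "provider mediasize: $provider_bytes"
```

The three results, 151552, 2147315200 and 2147311616, match the Stripeoffset and the two Mediasize values reported by gnop list.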
A ZFS storage pool with these freshly created 4K devices is the next step.

Code:
# make pool4k

Creating zpool with 4K gnop devices
if [ -f /tmp/zpool.cache ] ; then mv /tmp/zpool.cache /tmp/zpool.cache.prev ; fi
ls -l /dev/gpt/*nop
crw-r-----  1 root  operator    0, 166 Aug  5 17:38 mdisk_1.nop
crw-r-----  1 root  operator    0, 172 Aug  5 17:38 mdisk_2.nop

# ---- creating a ZFS mirror pool  without automatically mounting it (-m none) .....
zpool create -f -o cachefile=/tmp/zpool.cache -m none  super mirror /dev/gpt/mdisk_*nop

zpool list super
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
super  1.98G   560K  1.98G     0%  1.00x  ONLINE  -

zpool status super
  pool: super
 state: ONLINE
  scan: none requested
config:

        NAME                 STATE     READ WRITE CKSUM
        super                ONLINE       0     0     0
          mirror-0           ONLINE       0     0     0
            gpt/mdisk_1.nop  ONLINE       0     0     0
            gpt/mdisk_2.nop  ONLINE       0     0     0

errors: No known data errors

zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
super   468K  1.95G   144K  none
-------------end of pool4k --------------------
Export the pool:

Code:
# make export
Exporting super ......
---------------------------------
zpool export super
---------------------------------
Showing import status of super ......
---------------------------------
zpool import
   pool: super
     id: 4218734048000312785
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

        super                ONLINE
          mirror-0           ONLINE
            gpt/mdisk_1.nop  ONLINE
            gpt/mdisk_2.nop  ONLINE
-------------end of export --------------------
Destroy the gnop devices.

Code:
# make gnop_destroy
Destroy the 4K gnop devices
for X in /dev/md1 /dev/md2 ; 
  do  NR=$( echo ${X} | tr -c -d '0-9' ) ;
   gnop destroy /dev/gpt/mdisk_${NR}.nop ;
done

ls -l /dev/gpt
total 0
crw-r-----  1 root  operator    0, 170 Aug  5 16:56 mdboot1
crw-r-----  1 root  operator    0, 176 Aug  5 16:56 mdboot2
crw-r-----  1 root  operator    0, 168 Aug  5 16:56 mdisk_1
crw-r-----  1 root  operator    0, 174 Aug  5 16:56 mdisk_2
[snip]
-------------end of gnop_destroy --------------------
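Both the gnop4k and gnop_destroy loops derive the label suffix from the device name with tr(1). A minimal sketch of that step in isolation, with disk_nr as a made-up helper name:

```shell
# Keep only the digits of a device name: tr -c -d '0-9' deletes every
# character that is NOT a digit, including the trailing newline.
disk_nr() {
    echo "$1" | tr -c -d '0-9'
}

for dev in /dev/md1 /dev/md2 /dev/ada1; do
    printf '%s -> mdisk_%s\n' "$dev" "$(disk_nr "$dev")"
done
```

Because every digit in the string is kept, a device path with digits in more than one place would yield all of them concatenated, so this approach assumes simple names such as /dev/md1 or /dev/ada1.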
After the gnop devices have gone the way of the dodo, we now import our pool.

Code:
# make import

Import the pool
zpool import -o cachefile=/tmp/zpool.cache super
zpool list super
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
super  1.98G   720K  1.98G     0%  1.00x  ONLINE  -
---------------------------------
zpool status super
  pool: super
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        super       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            md1p2   ONLINE       0     0     0
            md2p2   ONLINE       0     0     0

errors: No known data errors
---------------------------------
mount
/dev/ada0s3a on / (ufs, local, noatime, soft-updates)
devfs on /dev (devfs, local, multilabel)
tmpfs on /tmp (tmpfs, local)
-------------end of import --------------------
The pool is ONLINE but not yet mounted.
Note that for some reason the import uses the partition names (md1p2 and md2p2) instead of the GPT labels. I have only noticed this when testing with memory disks. With real disks the import uses the labels:

Code:
# ls -l /dev/gpt
total 0
crw-r-----  1 root  operator    0, 173 Jul 26 04:41 boot_1
crw-r-----  1 root  operator    0, 182 Jul 26 04:41 boot_2
crw-r-----  1 root  operator    0, 171 Jul 26 04:41 disk_1
crw-r-----  1 root  operator    0, 180 Jul 26 04:41 disk_2

# make import
Import the pool
zpool import -o cachefile=/var/tmp/zpool.cache super
zpool status
  pool: super
 state: ONLINE
  scan: none requested
config:

        NAME            STATE     READ WRITE CKSUM
        super           ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            gpt/disk_1  ONLINE       0     0     0
            gpt/disk_2  ONLINE       0     0     0
Make sure that the imported pool, although created with 4K sector gnop devices, has retained the proper ashift value.

Code:
#  make chk_ashift

Verify that ashift value is 12 (2^12 = 4096 ; 2^9 = 512)
=========================================================
zdb -C -U /tmp/zpool.cache super

MOS Configuration:
        version: 28
        name: 'super'
        state: 0
        txg: 65
        pool_guid: 4218734048000312785
        hostid: 556313802
        hostname: 'althusser.utp.xnet'
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 4218734048000312785
            children[0]:
                type: 'mirror'
                id: 0
                guid: 16715166418299641908
                metaslab_array: 30
                metaslab_shift: 24
                ashift: 12
                asize: 2142502912
                is_log: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 5669787805994321979
                    path: '/dev/md1p2'
                    phys_path: '/dev/md1p2'
                    whole_disk: 1
                    create_txg: 4
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 10458960657003691384
                    path: '/dev/md2p2'
                    phys_path: '/dev/md2p2'
                    whole_disk: 1
                    create_txg: 4
====================================
zdb -C -U /tmp/zpool.cache super | grep ashift
                ashift: 12
-------------end of chk_ashift --------------------
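For scripting this check, the grep can be turned into a test that fails when the value is not 12. This is a sketch only; the helper names ashift_is_4k and sector_size_for_ashift are made up and are not part of the Makefile.

```shell
# Read "zdb -C" style text on stdin. Succeed only if at least one
# "ashift:" line is present and every such line has the value 12.
ashift_is_4k() {
    awk '$1 == "ashift:" { seen = 1; if ($2 != 12) bad = 1 }
         END { exit (seen && !bad) ? 0 : 1 }'
}

# Translate an ashift value into the sector size in bytes (2^ashift).
sector_size_for_ashift() {
    echo $(( 1 << $1 ))
}

sector_size_for_ashift 12
sector_size_for_ashift 9
```

The output of zdb -C -U /tmp/zpool.cache super would be piped into ashift_is_4k, and the exit status then tells a script whether to continue. The two calls at the end print 4096 and 512, the same values as in the banner of the chk_ashift target.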
That looks fine. With a 4K-optimized ZFS pool consisting of 4K-aligned GPT partitions, we can resume the original procedure outlined by Vermaden by setting options that will be inherited by the child ZFS datasets.

Code:
# make zfs_options
zfs set mountpoint=none super
zfs set checksum=fletcher4 super
zfs set atime=off super

zpool list super
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
super  1.98G   780K  1.98G     0%  1.00x  ONLINE  -

zpool status super
  pool: super
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        super       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            md1p2   ONLINE       0     0     0
            md2p2   ONLINE       0     0     0

errors: No known data errors
zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
super   504K  1.95G   144K  none
-------------end of zfs_options -------------------
For handling the boot environments with Vermaden's 'beadm' utility we create the following ZFS datasets and set the appropriate options:

Code:
# make zfs_fs
---------------------------------
zfs create                    super/ROOT
zfs create -o mountpoint=/mnt super/ROOT/default
zpool set bootfs=super/ROOT/default  super
zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
super                912K  1.95G   144K  none
super/ROOT           288K  1.95G   144K  none
super/ROOT/default   144K  1.95G   144K  /mnt
---------------------------------
mount
/dev/ada0s3a on / (ufs, local, noatime, soft-updates)
devfs on /dev (devfs, local, multilabel)
tmpfs on /tmp (tmpfs, local)
super/ROOT/default on /mnt (zfs, local, noatime, nfsv4acls)
-------------end of zfs_fs --------------------
As you can see, we now have a ZFS dataset mounted on /mnt.

In the original procedure the ZFS swap space is configured after rebooting into the new system, but let us do it now.

Code:
# make zfs_swap

zfs create -V 256m   super/swap
zfs set org.freebsd:swap=on super/swap
zfs set checksum=off        super/swap
zfs set sync=disabled       super/swap
zfs set primarycache=none   super/swap
zfs set secondarycache=none super/swap

zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
super                265M  1.69G   144K  none
super/ROOT           288K  1.69G   144K  none
super/ROOT/default   144K  1.69G   144K  /mnt
super/swap           264M  1.95G    72K  -
-------------end of zfs_swap --------------------
With a ZFS pool and datasets in place, and super/ROOT/default mounted on /mnt, we are now ready to unpack the FreeBSD installation sets.

Last edited by J65nko; 5th August 2013 at 11:13 PM. Reason: Added WARNING