Linux Best Practices

Hello Guys,

Let’s investigate this issue on Exadata At Customer – or Exadata Machine:

[oracle@tech01 ~]$ sqlplus / as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Mon May 18 16:15:50 2020
Version 19.7.0.0.0
Copyright (c) 1982, 2020, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup nomount pfile='/home/oracle/init.ora';
ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpbitsper
SQL> quit
Disconnected

The solutions is:

Reduce parameters processes, sessions, transactions and open_cursors for all databases. Follow the maximum used on each database to avoid to use much more than the database needs – and yes put some extra value since your database can grow. Update sysctl.conf parameters for your linux accommodate all databases. See the steps below:

=================================================================11) Check results for max limits on database used
=================================================================
column rname	format a30 	    heading 'Resource Name';
column curu	format 999,999,990	heading 'Current|Utilization' just c;
column maxu	format 999,999,990	heading 'Maximum|Utilization' just c;
column inita    format a10      heading 'Initial|Allocation'  just c;
column lim      format a10      heading 'Limit'               just r;
set lines 200
select inst_id, resource_name         rname
     , current_utilization   curu
     , max_utilization       maxu
     , initial_allocation    inita
     , limit_value           lim
from  gv$resource_limit  ;
select inst_id, resource_name         rname
     , current_utilization   curu
     , max_utilization       maxu
     , initial_allocation    inita
     , limit_value           lim
from  gv$resource_limit
WHERE resource_name in ('processes','sessions');
SQL> set lines 200
SQL> SQL> select inst_id, resource_name         rname
  2       , current_utilization   curu
  3       , max_utilization       maxu
  4       , initial_allocation    inita
  5       , limit_value           lim
  6  from  gv$resource_limit
  7
SQL> /
                                            Current      Maximum     Initial
   INST_ID Resource Name                  Utilization  Utilization  Allocation      Limit
---------- ------------------------------ ------------ ------------ ---------- ----------
         1 processes                               165          186       2048       2048
         1 sessions                                170          208       3100       3100
         1 enqueue_locks                            76           98      36542      36542
         1 enqueue_resources                        62           87      14684  UNLIMITED
         1 ges_procs                               153          173       2051       2051
         1 ges_ress                             11,663       87,988      69331  UNLIMITED
         1 ges_locks                               343          547      99571  UNLIMITED
         1 ges_cache_ress                       12,072       16,162          0  UNLIMITED
         1 ges_reg_msgs                            176        4,119       6126  UNLIMITED
         1 ges_big_msgs                             39          360       3826  UNLIMITED
                        0            0       1000       1000
         1 gcs_resources                        47,386       47,460     810054  UNLIMITED
         1 gcs_shadows                          33,188       34,349     810054  UNLIMITED
         1 smartio_overhead_memory             855,160    1,026,192          0  UNLIMITED
         1 smartio_buffer_memory               343,416    1,486,280          0  UNLIMITED
         1 smartio_metadata_memory                   0           56          0  UNLIMITED
         1 smartio_sessions                          3            4          0  UNLIMITED
         1 dml_locks                                 0           21      13640  UNLIMITED
         1 temporary_table_locks                     0            5  UNLIMITED  UNLIMITED
         1 transactions                              0            5       3410  UNLIMITED
.....
108 rows selected.
SQL>

Oracle suggested:

SEMMNI should be increased to accomodate more semaphores.
1. Query the current semaphore values in the kernel
     # /sbin/sysctl -a | grep sem
2. Modify SEMMNI value in the /etc/sysctl.conf.
From
kernel.sem = 250 32000 100 128
To
kernel.sem = 250 32000 100 200
3. # /sbin/sysctl -p

What worked for me:

### ANDRE ROCHA

# kernel.sem = 1024 60000 1024 256 -> original

kernel.sem = 1512 1200000 1512 1024

My original recipe:

=================================================================
List of best practices: ANDRE ROCHA
=================================================================1) Configure huge pages - Doc ID 361323.1  401749.1
2) Disable Transparent HugePages - Doc ID 1557478.1
3) Check if system using hugepage have this parameter USE_LARGE_PAGES enabled Doc ID 1392497.1
4) Check swap is set correct and swapiness - Doc ID 2240180.1 2404462.1  - vm.swappiness=10
5) Network Considerations - Doc ID 811306.1
6) Collect data for sar output and print graph to get swap memory results
7) About async IO, please check on aio.max-nr and fs.aio-nr.
8) Revise Kernel parameters with best practices/recommendations
9) Set/check kernel parameters can affect SGA size and shared pool
10) Check results of cluvfy for packages, kernel limits and general check results
14) Appendice of list of commands to be executed/get results
https://sarchart.weebly.com/howto.html
=================================================================Linux Kernel Lowmem Pressure Issues and Related Kernel Structures (Doc ID 452326.1)
ALERT: Disable Transparent HugePages on SLES11, RHEL6, RHEL7, OL6, OL7, and UEK2 and above (Doc ID 1557478.1)
HugePages on Linux: What It Is... and What It Is Not... (Doc ID 361323.1)
System Hangs Due To Process Kswapd0 (Doc ID 2240180.1)
Why is SWAP being used instead of available physical memory? (Doc ID 2404462.1)
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/s-memory-tunables
A low swappiness value is recommended for database workloads. For example, for Oracle databases,
Red Hat recommends a swappiness value of 10. vm.swappiness=10
Top 5 Issues That Cause Node Reboots or Evictions or Unexpected Recycle of CRS (Doc ID 1367153.1)
If the platform is Linux, set up hugepages and set kernel parameter vm.min_free_kbytes to reserve 512MB.
Setting hugepages is probably the single most important thing to do on Linux. Note that memory_target can not
be set when using hugepages.
ALERT: Disable Transparent HugePages on SLES11, RHEL6, RHEL7, OL6, OL7, and UEK2 and above (Doc ID 1557478.1)
Because Transparent HugePages are known to cause unexpected node reboots and performance problems with RAC,
Oracle strongly advises to disable the use of Transparent HugePages. In addition, Transparent Hugepages may
cause problems even in a single-instance database environment with unexpected performance problems or delays.
As such, Oracle recommends disabling Transparent HugePages on all Database servers running Oracle.
Note: on UEK2 and above, check the existence of the /sys/kernel/mm/transparent_hugepage/ directory.
If this directory does not exist, then the transparent hugepage is removed from the kernel, so there is no need to
disable the transparent hugepage
For Oracle Linux 7, Oracle published MOS document 2066217.1
Oracle Linux 7 - How to disable Transparent HugePages for RHCK kernel?
Also, add the following line to /etc/default/grub file:
transparent_hugepage=never
cat /etc/grub.conf
title Oracle Linux Server (2.6.32-300.25.1.el6uek.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-300.25.1.el6uek.x86_64 ro root=LABEL=/ transparent_hugepage=never
        initrd /initramfs-2.6.32-300.25.1.el6uek.x86_64.img
=================================================================1) Configure huge pages - Doc ID 361323.1
=================================================================https://oracle-base.com/articles/linux/configuring-huge-pages-for-oracle-on-linux-64
Implement HugePages for the database instances.  HugePages provides a method to have larger page sizes and it will
lock the SGA into physical memory thus eliminating the need for system page table lookups for the SGA.
This is especially important on systems with high memory allocations due to eliminate the management overhead
associated with the such configurations.  HugePages is also recommended in environments where server stability
issues are evident – various spins/hangs, which are not attributed to Oracle Clusterware or other known OS issues.
To compute size, see Document 401749.1. For more information, see Document 361323.1. In 11gR1 and above, you must disable
AMM to use HugePages as directed in Document 749851.1.
$ grep Huge /proc/meminfo
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
$
Execute hugepages_setting script:
$ ./hugepages_setting.sh
Recommended setting: vm.nr_hugepages = 305
$
Edit file and add the result for the script hugepages_settings.sh: vm.nr_hugepages
/etc/sysctl
Run the following command as the "root" user.
# sysctl -p
To be permanet, edit the "/etc/grub.conf" file, adding "hugepages=306" to the end of the kernel
line for the default kernel and reboot.
You can now see the HugePages have been created, but are currently not being used.
Check the HugePages information again.
$ grep Huge /proc/meminfo
AnonHugePages:         0 kB
HugePages_Total:     306
HugePages_Free:       98
HugePages_Rsvd:       93
HugePages_Surp:
Hugepagesize:       2048 kB
$
$
If you prefer, you can set these parameters to a value just below the size of physical memory of
the server. This way you can forget about it, unless you add more physical memory.
Update Oracle database
ALTER SYSTEM SET use_large_pages=only SCOPE=SPFILE;
SHUTDOWN IMMEDIATE;
STARTUP;
=================================================================2) Setting Shell Limits for the Oracle User
=================================================================
https://docs.oracle.com/cd/E51773_01/doc.12105/e51150/dbinstall.htm
To improve the performance of the software, you must increase the shell limits for the oracle user.
Add the following lines to the /etc/security/limits.conf file:
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
If not already present, add the following lines to the /etc/pam.d/login file:
session required /lib64/security/pam_limits.so
session required pam_limits.so
Depending on the oracle user's default shell, you need to make changes to it. For the Bourne, Bash, or Korn shell, add the following lines to the bottom of the /etc/profile file:
if [ $USER = "oracle" ]; then
    if [ $SHELL = "/bin/ksh" ]; then
        ulimit -p 16384
        ulimit -n 65536
    else
        ulimit -u 16384 -n 65536
    fi
fi
Add the following entries into the "/etc/security/limits.conf" script or
"/etc/security/limits.d/99-grid-oracle-limits.conf" script, where the setting is at least the size
of the HugePages allocation in KB (HugePages * Hugepagesize).
In this case the value is 306*2048=626688.
* soft memlock 626688
* hard memlock 626688
=================================================================2) Disable Transparent HugePages - Doc ID 1557478.1
=================================================================
If you are running RedHat/OEL 6/7, SLES 11/12 or UEK2 kernels, be sure to disable Transparent HugePages to prevent
performance problems and Node/Instance evictions.  Our long standing recomendation of using general HugePages still stands
(THP is a different implementation than general HugePages).  See Document 1557478.1 for additional details.
You can check the current setting using the following command, which is displaying the default value of "enabled=[always]".
# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
For Oracle Linux 6 the preferred method to disable Transparent HugePages is to add "transparent_hugepage=never" to the kernel boot line in the "/boot/grub/grub.conf" file.
title Oracle Linux Server (2.6.39-400.24.1.el6uek.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.39-400.24.1.el6uek.x86_64 ro root=/dev/mapper/vg_ol6112-lv_root rd_NO_LUKS  KEYBOARDTYPE=pc KEYTABLE=uk
LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16  rd_NO_DM rd_LVM_LV=vg_ol6112/lv_swap rd_LVM_LV=vg_ol6112/lv_root rhgb quiet numa=off
transparent_hugepage=never
        initrd /initramfs-2.6.39-400.24.1.el6uek.x86_64.img
The server must be rebooted for this to take effect.
Alternatively, add the following lines into the "/etc/rc.local" file and reboot the server.
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
   echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
   echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
Whichever method you choose, remember to check the change has work after reboot.
# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
#
=================================================================3) Check if system using hugepage have this parameter USE_LARGE_PAGES enabled Doc ID 1392497.1
=================================================================show parameter USE_LARGE_PAGES
=================================================================4) Check swap is set correct and swapiness - Doc ID 2240180.1 2404462.1  - vm.swappiness=10
=================================================================
Free -h
Take a backup of the current sysctl.conf file to be used in case of rollback
cp /etc/sysctl.conf /root/sysctl.conf_TICKETNUMBER
Review current swappiness configuration
cat /proc/sys/vm/swappiness
60
Change the current in-memory setting for vm.swappiness
echo 10 > /proc/sys/vm/swappiness
Making Permanent the mv.swappiness
# echo 'vm.swappiness=10' >> /etc/sysctl.conf
Confirm the new swappiness value
 # cat /proc/sys/vm/swappiness
---Result 10.
=================================================================5) Network Considerations - Doc ID 811306.1
=================================================================
RAC and Oracle Clusterware Best Practices and Starter Kit (Linux) (Doc ID 811306.1)
As workload dictates, rmem_max and wmem_max kernel parameters should be increased beyond the default 256kb.  These values determine how much kernel buffer memory is allocated per socket opened for network reads and writes:
net.core.rmem_default=262144
net.core.rmem_max=4194304 (for 11g and all RDS implementations)
net.core.rmem_max=2097152 (for 10g)
net.core.wmem_default=262144
net.core.wmem_max=1048576 (with RDS use at least 2097152)
Changing Kernel Parameter Values - sysctl.conf for 19c
https://docs.oracle.com/en/database/oracle/oracle-database/19/ladbi/changing-kernel-parameter-values.html#GUID-FB0CC366-61C9-4AA2-9BE7-233EB6810A31
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 2097152
kernel.shmmax = 4294967295
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
=================================================================6) Collect data for sar output
=================================================================
### All Days available on SAR
DT=$(ls /var/log/sa/sa[0-9][0-9] | tr '\n' ' ' | sed 's/\/var\/log\/sa\/sa/ /g')
>/tmp/sar-$(hostname)-multiple.txt
for i in $DT; do
LC_ALL=C sar -A -f /var/log/sa/sa$i >> /tmp/sar-$(hostname)-multiple.txt
done
ls -l /tmp/sar-$(hostname)-multiple.txt
=================================================================7) About async IO, please check on aio.max-nr and fs.aio-nr
=================================================================
What value should kernel parameter AIO-MAX-NR be set to ? (Doc ID 2229798.1)
The minimum value suggested as per http://docs.oracle.com/database/121/LTDQI/toc.htm#BABFIFCGfor aio-max-nr is 1048576 .
the DOCS provide the minimum required for a good install and basic usage.
This is how we calculate aio-max-nr parameter value -
show parameter processes
aio-max-nr =no of process per DB * no of databases * 4096
After deriving the value using this formula , you should still monitor the /proc/sys/fs/aio-nr value and also monitor I/O
loading.
Points to know -
1) Number  of processes per database  is not background server process number, it's the number of background server
process + user process.
2) Parameter aio-max-nr is a limitation of aio-nr, no impact was reported if you set it too high. But we should still
monitor /proc/sys/fs/aio-nr value and tune this value.
cat /proc/sys/fs/aio-nr
=================================================================8) Revise Kernel parameters with best practices/recommendations
=================================================================https://docs.oracle.com/database/122/LADBI/app_manual.htm#LADBI7865
https://docs.oracle.com/en/database/oracle/oracle-database/19/cwlin/changing-kernel-parameter-values.html#GUID-FB0CC366-61C9-4AA2-9BE7-233EB6810A31
=================================================================
9) Set/check kernel parameters can affect SGA size
and shared pool
=================================================================
shmmax 64G 85% of 16G
kernel.shmmax 68719476736 kernel.shmmax=15032385536
need fit/affect SGA
kernel.shmall 4294967296=4G 40% 16G = 6,4G = 6871947673
need fit/affect Shared pool
=================================================================
10) Check results of cluvfy for packages, kernel limits and general check results
=================================================================
Download it
https://www.oracle.com/database/technologies/cvu-downloads.html
### All Days available on SAR
DT=$(ls /var/log/sa/sa[0-9][0-9] | tr '\n' ' ' | sed 's/\/var\/log\/sa\/sa/ /g')
>/tmp/sar-$(hostname)-multiple.txt
for i in $DT; do
LC_ALL=C sar -A -f /var/log/sa/sa$i >> /tmp/sar-$(hostname)-multiple.txt
done
ls -l /tmp/sar-$(hostname)-multiple.txt
=================================================================
General results/recommendations :
=================================================================
Kernel: Reduce shmmax to 85% of 16G
Kernel: increase shmall to 40% of 16G
Set hugepages: vm.nr_hugepages = 964
Swap is under the recommendation but sounds ok since was not using swap memory on it.
Next approach to continue to work on it: Investigate sga_target memory settings since
sga is using only 2G memory entire memory.
ex.
shmmax  64G 85% of 16G (total memoria)
kernel.shmmax 68719476736 kernel.shmmax=15032385536
need fit/affect SGA
kernel.shmall 4294967296=4G 40% 16G = 6,4G = 6871947673
need fit/affect Shared pool  (total memoria)
Oracle Exadata Database Machine Setup/Configuration Best Practices (Doc ID 1274318.1)
s.file-max = 6815744 512 x processes (for example 6815744 for 13312 processes)
kernel.shmmax = RAM times 0.5 (or higher at customer's discretion - see Note:567506.1)
Oracle Database (RDBMS) on Unix AIX,HP-UX,Linux,Mac OS X,Solaris,Tru64 Unix Operating Systems
Installation and Configuration Requirements Quick Reference (8.0.5 to 11.2) (Doc ID 169706.1)
Upon startup of Linux database get ORA-27102: out of memory Linux-X86_64 Error: 28:
No space left on device (Doc ID 301830.1)
What value should kernel parameter AIO-MAX-NR be set to ? (Doc ID 2229798.1)
TECH: Unix Semaphores and Shared Memory Explained (Doc ID 15566.1)
Exadata sysctl comments:
Maximum shared memory allowed is 85 percent of physical RAM
Maximum shared memory pages allowed are 85 percent of physical RAM divided by page size
kernel.sem = 1024 60000 1024 256
Check processes -> used to calculate aio-max-nr
show parameter memor
grep SwapTotal /proc/meminfo
=================================================================
11) Check results for max limits on database used
=================================================================
column rname	format a30 	    heading 'Resource Name';
column curu	    format 999,999,990	heading 'Current|Utilization' just c;
column maxu	    format 999,999,990	heading 'Maximum|Utilization' just c;
column inita    format a10      heading 'Initial|Allocation'  just c;
column lim      format a10      heading 'Limit'               just r;
column inst_id  format a10      heading 'Limit'               just r;
set lines 200
select inst_id, resource_name         rname
     , current_utilization   curu
     , max_utilization       maxu
     , initial_allocation    inita
     , limit_value           lim
from  gv$resource_limit  ;
select inst_id, resource_name         rname
     , current_utilization   curu
     , max_utilization       maxu
     , initial_allocation    inita
     , limit_value           lim
from  gv$resource_limit
WHERE resource_name in ('processes','sessions');
=================================================================
12) Check results for swap and compare with total of memory
=================================================================
Requirements for Installing Oracle Database 12.1 on RHEL5 or OL5 64-bit (x86-64) (Doc ID 1529433.1)
b.) Swap disk space proportional to the system's physical memory as follows:
RAM	Swap Space
Between 1 GB and 2 GB	1.5 times the size of RAM
Between 2 GB and 16 GB	Equal to the size of RAM
More than 16 GB	16 GB
NOTE: The above recommendations (from the Oracle® Database Installation Guide 12c Release 1
(12.1) for Linux) are MINIMUM recommendations for installations. Further RAM and swap space may be
required to tune/improve RDBMS performance.
grep SwapTotal /proc/meminfo
=================================================================
List of commands to be executed
=================================================================
Swap check and memory
grep SwapTotal /proc/meminfo
free -m
uname -r
cat /etc/sysctl.conf
cat /etc/security/limits.conf
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /proc/sys/fs/aio-nr
cat /proc/sys/vm/swappiness
cat /etc/default/grub file:
cat /etc/grub.conf
cat /etc/security/limits.conf
show parameter memory
show parameter sga
show parameter processes
show parameter USE_LARGE_PAGES
NAME                                |Current/Acutal          |Suggested
------------------------------------|------------------------|------------------------------
HugePage                             not set                  vm.nr_hugepages = 964
shmmax                               64G                      85% of 16G
kernel.shmmax                        68719476736              kernel.shmmax=15032385536
kernel.shmall                        4294967296=4G            40% 16G = 6,4G = 6871947673
HugePage                             not set                  vm.nr_hugepages = 964
 = 64G
kernel.shmmax = 68719476736
### Exadata 202005 - standard
# kernel.sem = 1024 60000 1024 256
kernel.sem = 1512 1200000 1512 1024