by Mark Bole
"Pull is Easier, Push Is Better" — what does that mean? Following is a practical illustration of this concept in the context of managing Oracle software installations.
Oracle provides a comprehensive "universal" installer (OUI) for its database products. This graphical tool has an identical appearance across all platforms, is fairly easy to use, and of course is the only method supported by Oracle for installing its products. It includes a so-called "silent install" method, where instead of interactively clicking through the installer screens and menus, you can provide a pre-determined set of responses to control installer actions. Further, the OUI normally runs a system prerequisite check, maintains an inventory of all the various required and optional components that have been installed, and keeps detailed log files for each installation.
So, why would one even think of not using it? It turns out there are some situations where you may be better off going “outside of the Oracle box” when it comes to repeat installations under Unix. This article describes the reasons why, and how to effectively use the OUI in conjunction with the standard Unix tool rsync for a more trouble-free, maintainable set of Oracle server installations.
Ralph Waldo Emerson said "A foolish consistency is the hobgoblin of little minds". He never defined what a "wise" consistency was, but it's guaranteed that inconsistency of almost any kind is the hobgoblin of computing systems. Despite the best efforts of Quality Assurance teams, the countless hours of regression testing, the small fortunes spent on multiple, redundant computing environments, how many times does a production outage occur because of one little change that someone neglected to replicate during the last rollout? How many times has a system hummed along nicely for nearly a year of 24x7 operation, only to choke and sputter when the next re-boot occurs, because a supposedly permanent configuration change was not made permanently after all? I have received enough 2 AM pager calls to know that the numbers are high—too high, considering that there is a straightforward solution available.
Here is a real-life problem I experienced a while back: several DBA's split the work of installing an Oracle 8.1.6.2 upgrade on a number of servers. Everything was fine, in both test and production environments. But months later, a routine upgrade of Perl software suddenly resulted in a script that mysteriously worked on some servers but not others, even though everything was supposedly identical. Eventually the problem was found to be related to the following differences in Oracle installation files:
server 1: =========================
9177672 Nov 17 2000 $ORACLE_HOME/lib/libclntsh.so.8.0
812 Oct 24 1999 $ORACLE_HOME/lib/nautab.o
server 2: =========================
9928452 Nov 17 2000 $ORACLE_HOME/lib/libclntsh.so.8.0
1272 Jun 8 2000 $ORACLE_HOME/lib/nautab.o
Can you tell what happened, or how to fix it? It turns out that RADIUS authentication support, even though unused at this site, was included in the OUI session by one DBA and not the other. An obscure bug then triggered the problem with the new Perl version. It took several calendar months to finally figure it out; meanwhile, many hours were wasted on implementing workarounds and other futile troubleshooting activities.
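One way to catch this kind of drift early is to compare file checksums between two installations. The following is a minimal sketch, not something used in the original troubleshooting: the function names manifest and compare_trees are made up for illustration, and it assumes both trees are reachable as local directories. For two separate servers, you could instead run manifest on each one, save the output, and diff the two files.

```shell
#!/bin/sh
# Sketch: list files that differ between two copies of an ORACLE_HOME tree.
# manifest and compare_trees are illustrative names, not standard tools.

# Print "path<TAB>checksum<TAB>size" for every regular file under $1.
manifest() {
    ( cd "$1" && find . -type f -exec cksum {} + ) |
        awk '{ crc = $1; size = $2; sub(/^[0-9]+ [0-9]+ /, ""); print $0 "\t" crc "\t" size }' |
        LC_ALL=C sort
}

# Print the relative paths that differ, or that exist on only one side.
compare_trees() {
    left=$(mktemp) || return 1
    right=$(mktemp) || return 1
    manifest "$1" > "$left"
    manifest "$2" > "$right"
    # diff marks changed lines with "< " or "> "; keep only the path column.
    diff "$left" "$right" | sed -n 's/^[<>] //p' | cut -f1 | LC_ALL=C sort -u
    rm -f "$left" "$right"
}

# Example (both paths are placeholders):
#   compare_trees /opt/oracle/product/8.1.6 /mnt/server2/opt/oracle/product/8.1.6
```

A difference in libclntsh.so.8.0 or nautab.o like the one shown above would appear as one line per file in the output.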
The obvious solution would be to use a common "silent install" response file everywhere. But there are other drawbacks that won't so easily be dispensed with:
[...] csh and its modern derivatives, handle job control better than others). Being a somewhat lengthy process performed infrequently, the OUI session is unusually susceptible to abnormal termination from external causes.
[...] find command in the root directory tree. And if you do copy the CD-ROMs, do you use the tar utility, which requires every byte to be copied three times (once into the tar file, once over the network, and once out of the tar file)? Or do you use scp or ftp, which have to start over from the beginning on any failure, and which each have several drawbacks (ftp does not preserve file timestamps and permissions, scp incurs high CPU overhead for encryption and decryption).
The latest version of OUI (10.1) includes brand new features to help address some of these items, such as the ability to record a response file from an OUI session, the ability to clone an ORACLE_HOME, and the ability to access the install media over an HTTP connection. But to use these new features requires a non-trivial amount of your time to review documentation, test, and maintain, once you've decided to upgrade to Oracle 10g. It's important to remember that you will always use OUI at least once, for your primary installation. What is presented here is an alternative way to duplicate and validate subsequent installations with relatively little effort and overhead.
A note about Windows: even angels fear to tread in the registry, so this article only applies to Linux and Unix systems, where it's simply a matter of getting all the files and directories in the right place, and there are no drive letters to complicate your ORACLE_HOME location.
Software developers learned decades ago that it's critical to maintain a central repository where all new and changed code is stored and versioned. No matter how many individual programmers, no matter how many individual files changed, there is one and only one place where everything officially ends up. Inconsistencies must be resolved in order to compile; the compilation then becomes a known, reproducible entity which can be tested, debugged, and eventually rolled out to production. The whole point is, once a bug is fixed or new capability implemented, it should never have to be fixed or implemented again, at least not in the same place in the code. Done properly and supported by team managers, there never is any question as to what was in any given version of the code.
Database administrators (DBA's) and their system administrator (SA) brethren must learn and adopt the same practice to avoid hobgoblins. There is a slight twist, however, in how DBA's and SA's use this technique: instead of changes coming in from many places to a single destination repository, a single source is created and maintained, and then used to replicate changes out to many targets. This source (which I will call a "push" rather than a repository) should be kept on a single administrative server, preferably dedicated to this function. This server needs very little computing power, and typically not even that much disk space, so an older system, or even a workstation-class machine, will do.
The second part of this series ("Pull is Easier, Push Is Better Part 2") discusses in more detail how such a server can handle not only generic distributions of system software, such as Oracle and Perl, but also configuration files, including those that need to be almost the same, but not quite, on each server (such as the Oracle init.ora file or the Apache httpd.conf file). But for the purpose of this topic, the requirements for this server are simple and few:
[...] oracle and apache, and groups dba and oinstall, should be present and ideally use the same numeric user id (UID) and group id (GID) across all systems.
[...] rsync utility, normally bundled with Linux and easily available for other Unix platforms if not already bundled. See http://rsync.samba.org/ for more information on rsync, as the full installation and configuration of an rsync service is not covered here.
First, decide on an ORACLE_HOME location for the version of software you will be maintaining (of course, multiple versions can be maintained as long as they are in different ORACLE_HOME's). It is critical to use the exact same ORACLE_HOME per version on every server where you run Oracle. If necessary, use symbolic links to meet this requirement.
For example, suppose you choose /opt/oracle/product/9.2.0 as the standard ORACLE_HOME. On a target server, you don't have enough room in the filesystem where /opt is, but you do have plenty of room in the filesystem mounted at /disk2. As user root on the target machine, perform these three steps to make the ORACLE_HOME directory, change its ownership, and lastly, link it symbolically to the correct top-level directory.
mkdir -p /disk2/oracle/product/9.2.0
chown -R oracle:dba /disk2/oracle
ln -s /disk2/oracle /opt
Even if you don't need to use symbolic links, make sure the ORACLE_HOME directory already exists on the target server.
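Before pushing, it can be worth confirming that the standard path really resolves on the target, whether directly or through a symbolic link. The following is a minimal sketch; check_oracle_home is a hypothetical helper name, and the path in the example is simply the one used in this article.

```shell
#!/bin/sh
# Sketch: confirm that the standard ORACLE_HOME path exists on this server
# and show where it physically lives once symbolic links are resolved.
# check_oracle_home is an illustrative name.

check_oracle_home() {
    std_home=$1
    if [ ! -d "$std_home" ]; then
        echo "missing: $std_home" >&2
        return 1
    fi
    # pwd -P prints the physical directory, with symbolic links resolved.
    real_home=$(cd "$std_home" && pwd -P) || return 1
    echo "$std_home -> $real_home"
}

# Example:
#   check_oracle_home /opt/oracle/product/9.2.0
```

With the symbolic link from the example above in place, the right-hand side of the output would show the directory under /disk2.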
Confirm that your primary, or push, server has a correctly installed (using the OUI) version of Oracle, with the components and patches you have decided to use. Also be sure that the ORACLE_HOME used for the OUI installation was indeed your standard location. You might create a fresh installation, or use an existing "known good" installation, such as production or a valid test server.
The actual push process consists mostly of running a single rsync command. I have found that rsync, like any power tool, has the capability to do a lot of harm very quickly if not used carefully. While this is not intended to be a tutorial, here are some points to keep in mind about rsync to stay out of trouble:
[...] -n (also known as --dry-run), which shows what would be done, without actually doing it.
rsync output should be captured in a log file using a Unix pipe and the tee command, for example: rsync -n [arguments] | tee rsync.out
[...] rcp and scp, when copying a source directory, the rsync command targets the parent directory of the directory you are trying to copy. In other words, leave off the trailing directory on the target; in the following example, directory c is included in the source but not the target specification.
rsync [options] /a/b/c user@target:/a/b
[...] target:/a/b/c as the destination, rsync will create and copy to target:/a/b/c/c, which is probably not what you wanted.
[...] rsync that control how hard links and symbolic links in the file system are handled. If the target directory is actually a symbolic link, use the --copy-links option of rsync.
If there is a database running on the target machine, using the ORACLE_HOME you want to push to, shut it down first. Then run this command as user root on the push server (if you are using ssh as your transport for rsync, you may have to temporarily enable connections by root):
% rsync --dry-run -avH --exclude-from=ora_home_exclude --delete \
  /opt/oracle/product/9.2.0 target:/opt/oracle/product | tee rsync.out
(This assumes you have the file ora_home_exclude in place; more on this next.) You will be prompted for the password, and then see output similar to the following, which is also stored in the rsync.out file you used with the tee command:
root@target's password:
building file list ... done
9.2.0/Apache/Apache/bin/ab
9.2.0/Apache/Apache/bin/apachectl
[many lines omitted...]
9.2.0/xdk/mesg/lsxus.msb
9.2.0/xdk/mesg/lsxus.msg
wrote 2068372 bytes read 238492 bytes 54279.15 bytes/sec
total size is 2218371456 speedup is 961.64
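Because leaving out --dry-run is the easy mistake to make, one option is to wrap the push in a small shell function that defaults to a dry run. This is only a sketch: the function name push_home, the --for-real switch, and the RSYNC and EXCLUDES variables are illustrative inventions, not rsync features. The rsync options themselves are the ones used in the command above.

```shell
#!/bin/sh
# Sketch: make the dry run the default and require an explicit switch
# for a real push. push_home and --for-real are hypothetical names.

RSYNC=${RSYNC:-rsync}                    # can be overridden, e.g. RSYNC=echo
EXCLUDES=${EXCLUDES:-ora_home_exclude}   # exclude file described in the article

# usage: push_home SOURCE_DIR TARGET_SPEC [--for-real]
push_home() {
    src=$1
    dest=$2
    mode=--dry-run
    if [ "${3:-}" = "--for-real" ]; then
        mode=
    fi
    # $mode is left unquoted on purpose so that an empty value disappears.
    "$RSYNC" $mode -avH --exclude-from="$EXCLUDES" --delete "$src" "$dest"
}

# Example (dry run, with the output logged):
#   push_home /opt/oracle/product/9.2.0 target:/opt/oracle/product | tee rsync.out
```

Setting RSYNC=echo prints the command line that would be run, which is a cheap way to check the wrapper itself before pointing it at a real server.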
Let's briefly review the purpose of the rsync options used. --dry-run has already been discussed. -v is for verbose output, -a keeps timestamps, file ownership and permissions consistent (which is why we run this as user root instead of oracle, since some of the Oracle files have special permissions and ownership), and -H preserves hard links. --delete actually removes files from the target that don't exist in the source (normally only adds and updates are performed by rsync). As the documentation states, use with caution! But in this case, since we want consistent ORACLE_HOME's, it is appropriate.
The --exclude-from option of rsync is very important. One of the most powerful features of rsync is the ability to include and exclude files and directories based on patterns. This is necessary with the ORACLE_HOME push because, despite the best intentions of Oracle's Optimal Flexible Architecture (OFA), there are still a number of server-specific files that reside within the confines of the software installation tree. In other words, ideally you should be able to completely remove the software installation and re-install without disturbing files containing server-specific configuration or logs, but such is not the case.
The main types of files we need to exclude are log files, audit files, and configuration (.ora) files. The dbs directory under ORACLE_HOME is also critical to exclude, since it contains the init.ora file(s) and the shared memory lock file(s). These are files that belong only on the machine where they were first created, not on any other machine. Here are the contents of the ora_home_exclude file used above with the --exclude-from option of rsync:
# for rsync'ing ORACLE_HOME installations
*.log
*.aud
dbs/
ClientConfig.properties
tnsnames.ora
listener.ora
ldap.ora
oidca.out
*.q
snmp_ro.ora
snmp_rw.ora
sqlnet.ora
dbsnmp.ver
services.ora
What you put in this file is determined empirically, using the rsync command itself in --dry-run mode and, optionally, the find command (some examples are given below). Following the rules in the rsync documentation, any files and directories matching these patterns will be excluded from the push. This applies both on the source and target, so even if your push does not have these files because there is no running database instance to generate them, they will still be excluded at the target.
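As a rough aid to that empirical process, the dry-run log can be filtered for the file types most likely to be server-specific. The sketch below is an assumption-laden starting point: exclude_candidates is a made-up name, and the suffix list (.log, .aud, .ora) is simply taken from the file types named above, so it will miss files such as ClientConfig.properties.

```shell
#!/bin/sh
# Sketch: pull likely server-specific files out of a dry-run log such as
# rsync.out. exclude_candidates is an illustrative name; the suffix list
# is an assumption and should be adjusted for your site.

# usage: exclude_candidates LOGFILE
exclude_candidates() {
    grep -E '\.(log|aud|ora)$' "$1" | LC_ALL=C sort
}

# Example:
#   exclude_candidates rsync.out
```

Anything this turns up still needs a human decision; the remaining lines of the log are worth reading as well.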
One of the benefits of the push, which helps qualify it as the Fast-and-Hard Way, is that you can re-run it and test it quickly, usually by just recalling a command from your Unix shell history list. For example, suppose we have proceeded with the actual push by running the following command, this time without the --dry-run option that was shown above. Since in this case we are copying to a machine with no existing ORACLE_HOME, we have nothing to lose; at worst, we can always erase the target and do it again.
% rsync -avH --exclude-from=ora_home_exclude --delete \
  /opt/oracle/product/9.2.0 target:/opt/oracle/product | tee rsync.out
Since this will typically include between one and three gigabytes of files for a first-time push, it may take a while, depending on your hardware (use du -sk $ORACLE_HOME on the push server to see approximately how big the push will be). When finished, we can now see exactly what was excluded by running the following comparison (rectify) command. This is a good way to compare any two directory trees on any local or remote Unix server, not just ORACLE_HOME directory trees!
% rsync --dry-run -avH --delete \
  /opt/oracle/product/9.2.0 target:/opt/oracle/product
root@target's password:
building file list ... done
9.2.0/Apache/Apache/htdocs/oam/common/success.log
[lines omitted...]
9.2.0/dbs/init.ora
9.2.0/dbs/initmydb.ora -> ../../../admin/mydb/pfile/initmydb.ora
9.2.0/dbs/initdw.ora
9.2.0/dbs/lkMYDB
9.2.0/rdbms/audit/ora_15820.aud
[lines omitted...]
9.2.0/rdbms/log/alert_mydb.log
9.2.0/sysman/config/ClientConfig.properties
wrote 1836163 bytes read 404 bytes 66784.25 bytes/sec
total size is 2218960063 speedup is 1208.21
This shows that the files we expected to be excluded (log and audit files) were in fact excluded, and it also lets us see whether any files that should have been included were not. If you determine for your site that any specific file should be included or excluded, then adjust the ora_home_exclude file accordingly and re-run rsync.
Now you can create the $ORACLE_HOME/dbs directory on the target machine, place the appropriate init.ora file, and create your database. Suppose you then try to run the Oracle Net Creation Assistant on the target machine, and get the following error:
% $ORACLE_HOME/bin/netca
/u01/app/oracle/product/9.2.0/bin/netca: line 138: /u01/app/oracle/product/9.2.0/JRE/bin/jre: No such file or directory
Recall the information about rsync and the --copy-links option above. The OUI installs a Java home outside of the ORACLE_HOME directory and creates a symbolic link for it, as shown here on the push installation:
/opt/oracle/product/9.2.0/JRE -> /opt/oracle/jre/1.1.8
The update is simple, after you perform the usual --dry-run test and log the output:
rsync -avH --exclude-from=ora_home_exclude --delete \
  /opt/oracle/jre target:/opt/oracle
Or, for another example, suppose you discover that there are incorrect permissions on the Intelligent Agent executable after a patch install:
-rwxr-xr-x 1 oracle dba 2522483 Dec 09 00:01 bin/dbsnmp*
-rwsr-s--- 1 root dba 2472673 Apr 26 2002 bin/dbsnmp.old*
Note that the new dbsnmp executable did not pick up the same set-uid root ownership as the previous version, after a one-off patch install to fix an installation bug under Linux. You only have to fix it once, at the push location, and then, even if you forget to update all targets at that time, the push will "remember" the fix and ensure that it is propagated whenever you do subsequently perform a push.
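To spot this kind of permission difference without reading long listings, you could list the set-uid and set-gid files in each tree and compare the results before and after a patch. A minimal sketch, with list_setid as a made-up helper name:

```shell
#!/bin/sh
# Sketch: list regular files under a directory that have the set-uid or
# set-gid bit set. list_setid is an illustrative name.

# usage: list_setid DIRECTORY
list_setid() {
    ( cd "$1" && find . -type f \( -perm -4000 -o -perm -2000 \) ) | LC_ALL=C sort
}

# Example: save the list before a patch, then compare afterwards.
#   list_setid "$ORACLE_HOME" > setid.before
#   ... apply patch ...
#   list_setid "$ORACLE_HOME" | diff setid.before -
```

In the situation shown above, bin/dbsnmp.old would be listed and the new bin/dbsnmp would not, which points directly at the missing permission.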
As time goes by, suppose you do a dry run test and find that the following two files suddenly show up as wanting to be pushed (or, deleted on the target):
building file list ... done
9.2.0/network/log/dbsnmp.nohup
9.2.0/network/tools/NetProperties
It is straightforward to investigate these files, determining their contents and timestamps, to see if they should be excluded or not. For a DBA or SA, this has the side benefit of being very valuable audit information, in the same vein as Tripwire. Many times, I have been able to head off problems or inquire as to what my fellow admins have been doing by finding such new or changed files on either the push or the target. In this case, both files should be excluded, since the first is a result of running the Intelligent Agent, and the second is the result of running Net Creation Assistant, both of which are specific to just one server.
A supplemental approach to proactively managing the ORACLE_HOME tree is to use the find command, looking for individual files that have been modified since a certain time. In the following example, we find all the files modified since the link for the init.ora file was added, which is one suitable proxy for the most recent time your database configuration was updated:
% cd $ORACLE_HOME; find . -newer dbs/initmydb.ora
./inventory/Clone/clone.xml
./inventory/Components21
./inventory/ContentsXML/comps.xml
./inventory/ContentsXML/config.xml
./inventory/ContentsXML/libs.xml
./inventory/filemap
./inventory/invDetails.properties
./network/admin
./network/admin/snmp_ro.ora
./network/admin/tnsnames.ora
./network/log
./network/log/listener.log
./network/log/dbsnmp.nohup
./network/tools/NetProperties
./rdbms/audit/ora_9318.aud
[lines omitted...]
./dbs
./dbs/lkMYDB
These files are all candidates for being excluded in the ora_home_exclude file. However, many of them may already be included or excluded by the rsync push process, so evaluate this output in the context of a push.
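If the directory entries in that listing are just noise for your purposes, the same search can be restricted to regular files. The following is a small sketch; changed_since is a hypothetical name, and the reference file is given relative to the tree being searched, as in the example above.

```shell
#!/bin/sh
# Sketch: list regular files under a tree that were modified more recently
# than a reference file. changed_since is an illustrative name.

# usage: changed_since TREE REFERENCE_FILE   (reference is relative to TREE)
changed_since() {
    ( cd "$1" && find . -type f -newer "$2" ) | LC_ALL=C sort
}

# Example:
#   changed_since "$ORACLE_HOME" dbs/initmydb.ora
```

Adding -type f drops entries such as ./network/admin and ./dbs, leaving only the files themselves as exclusion candidates.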
Five shortcomings of the OUI process were identified. Using the standard Linux and Unix rsync command, a method for reliably cloning ORACLE_HOME's was presented, which offers improvements in all five areas. Through judicious use of the --exclude-from and --dry-run options, the power of rsync can be tamed in the service of multiple Oracle software installations.