SystemTap and Oracle RDBMS: Time in Various States, VFS I/O and Block I/O

Introduction

Now that I am able to aggregate SytemTap probes by Oracle database, it’s time to create several scripts in a toolkit. The toolkit is available in this github repository.

Let’s describe 3 new members of the toolkit:

  • schedtimes_per_db.stp: To track time databases spend in various states
  • vfsio_per_db.stp: To track I/O by database through the Virtual File System (vfs) layer
  • blkio_per_db.stp: To track I/O by database through the block IO layer

schedtimes_per_db

This script tracks the time databases spend in various states. It also reports the time spend by non oracle database.

Usage:

$> stap -g ./schedtimes_per_db.stp <oracle uid> <refresh time ms>

Output example:

$> stap -g ./schedtimes_per_db.stp 54321 10000

------------------------------------------------------------------
DBNAME    :    run(us)  sleep(us) iowait(us) queued(us)  total(us)
------------------------------------------------------------------
NOT_A_DB  :     447327  200561911       1328     517522  201528088
BDT       :      42277  316189082          0      69355  316300714
VBDT      :      74426  326694570          0      77489  326846485


vfsio_per_db

This script tracks the database I/O through the VFS layer. It also reports the I/O in this layer for non oracle database.

Usage:

$> stap -g ./vfsio_per_db.stp <oracle uid> <refresh time ms>

Output example:

$> stap -g ./vfsio_per_db.stp 54321 10000

------------------------------------------------------------------------
DBNAME      : NB_READ   READ_KB   NB_WRITE  WRITE_KB  NB_TOTAL  TOTAL_KB
------------------------------------------------------------------------
BDTS        : 110       347       6         96        116       443
NOT_A_DB    : 89        11        2         0         91        11

blkio_per_db

This script tracks the database I/O through the block IO layer. It also reports the I/O in this layer for non oracle database.

Usage:

$> stap -g ./blkio_per_db.stp <oracle uid> <refresh time ms>

Output example:

$> stap -g ./blkio_per_db.stp 54321 10000

------------------------------------------------------------------------
DBNAME      : NB_READ   READ_KB   NB_WRITE  WRITE_KB  NB_TOTAL  TOTAL_KB
------------------------------------------------------------------------
BDTS        : 9690      110768    18        192       9708      110960
NOT_A_DB    : 0         0         6         560       6         560

Remarks

  • The schedtimes_per_db script is mainly inspired by this one (full credit goes to the authors).
  • Why is it  interesting to look at the vfs layer? Answers are in this awesome File System Latency series (see parts 1, 2, 3, 4 and 5) from Brendan Gregg.
  • In this post the word database stands for “all the foreground and background processes linked to an oracle database”.
  • In a consolidated environment, having a view per database can be very useful.

Conclusion

The toolkit has been created and 3 new members are part of it. Expect from it to grow a lot.

SystemTap and Oracle RDBMS: Aggregate by database

Introduction

The purpose of this post is to share a way to aggregate by Oracle database within SystemTap probes. Let’s describe a simple use case to make things clear.

Use Case

Let’s say that I want to get the number and the total size of TCP messages that have been sent and received by an Oracle database. To do so, let’s use 2 probes:

and fetch the command line of the processes that trigger the event thanks to the cmdline_str() function. In case of a process related to an oracle database, the cmdline_str() output would look like one of those 2:

  • ora_<something>_<instance_name>
  • oracle<instance_name> (LOCAL=<something>

So let’s write two embedded C functions to extract the Instance name from each of the 2 strings described above.

Functions

get_oracle_name_b:

For example, this function would extract BDT from “ora_dbw0_BDT” or any “ora_<something>_BDT” string.

The code is the following:

function get_oracle_name_b:string (mystr:string) %{
char *ptr;
int  ch = '_';
char *strargs = STAP_ARG_mystr;
ptr = strchr( strchr( strargs , ch) + 1 , ch);
snprintf(STAP_RETVALUE, MAXSTRINGLEN, "%s",ptr + 1);
%}

get_oracle_name_f:

For example, this function would extract BDT from “oracleBDT (LOCAL=NO)”, “oracleBDT (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))” or any “oracleBDT (LOCAL=<something>” string.

The code is the following:

function get_oracle_name_f:string (mystr:string) %{
char *ptr;
int    ch = ' ';
char substr_res[30];
char *strargs = STAP_ARG_mystr;
ptr = strchr( strargs, ch );
strncpy (substr_res,strargs+6, ptr - strargs - 6);
substr_res[ptr - strargs - 6]='\0';
snprintf(STAP_RETVALUE, MAXSTRINGLEN, "%s",substr_res);
%}

Having in mind that the SystemTap aggregation operator is “<<<” (as explained here) we can use those 2 functions to aggregate within the probes by Instance Name (passing as parameter the cmdline_str()) that way:

probe tcp.recvmsg
{

if ( isinstr(cmdline_str() , "ora_" ) == 1 && uid() == orauid) {
tcprecv[get_oracle_name_b(cmdline_str())] <<< size
} else if ( isinstr(cmdline_str() , "LOCAL=" ) == 1 && uid() == orauid) {
tcprecv[get_oracle_name_f(cmdline_str())] <<< size
} else {
tcprecv["NOT_A_DB"] <<< size
}
}

probe tcp.sendmsg
{

if ( isinstr(cmdline_str() , "ora_" ) == 1 && uid() == orauid) {
tcpsend[get_oracle_name_b(cmdline_str())] <<< size
} else if ( isinstr(cmdline_str() , "LOCAL=" ) == 1 && uid() == orauid) {
tcpsend[get_oracle_name_f(cmdline_str())] <<< size
} else {
tcpsend["NOT_A_DB"] <<< size
}
}

As you can see, non oracle database would be recorded and displayed as “NOT_A_DB”.

Based on this, the tcp_by_db.stp script has been created.

tcp_by_db.stp: script usage and output example

Usage:

$> stap -g ./tcp_by_db.stp <oracle uid> <refresh time ms>

Output Example:

$> stap -g ./tcp_by_db.stp 54321 10000
---------------------------------------------------------
NAME              NB_SENT   SENT_KB   NB_RECV   RECV_KB
---------------------------------------------------------
VBDT              5439      8231      10939     64154
NOT_A_DB          19        4         41        128
BDT               19        50        35        259

---------------------------------------------------------
NAME              NB_SENT   SENT_KB   NB_RECV   RECV_KB
---------------------------------------------------------
VBDT              267       109       391       2854
NOT_A_DB          102       19        116       680
BDT               26        55        47        326

---------------------------------------------------------
NAME              NB_SENT   SENT_KB   NB_RECV   RECV_KB
---------------------------------------------------------
VBDT              340       176       510       2940
NOT_A_DB          150       8         151       1165
BDT               42        77        61        423

Remarks:

  • The oracle uid on my machine is 54321
  • The refresh time has been set to 10 seconds
  • You can see the aggregation for 2 databases on my machine and also for all that is not an oracle database

Whole Script source code

The whole code is available at:

Conclusion

Thanks to the embedded C functions we have been able to aggregate by Oracle database within SystemTap probes.