Introduction
The purpose of this post is to share a way to aggregate by Oracle database within SystemTap probes. Let’s describe a simple use case to make things clear.
Use Case
Let’s say that I want to get the number and the total size of TCP messages that have been sent and received by an Oracle database. To do so, let’s use 2 probes:
- tcp.sendmsg
- tcp.recvmsg
and fetch the command line of the process that triggers the event with the cmdline_str() function. For a process related to an Oracle database, the cmdline_str() output looks like one of these 2:
- ora_<something>_<instance_name>
- oracle<instance_name> (LOCAL=<something>
So let’s write two embedded C functions to extract the Instance name from each of the 2 strings described above.
Functions
get_oracle_name_b:
For example, this function would extract BDT from “ora_dbw0_BDT” or any “ora_<something>_BDT” string.
The code is the following:
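function get_oracle_name_b:string (mystr:string) %{
    /* Assumes mystr contains at least two underscores:
       ora_<something>_<instance_name> */
    char *ptr;
    int ch = '_';
    char *strargs = STAP_ARG_mystr;

    /* Locate the second underscore; the instance name follows it */
    ptr = strchr( strchr( strargs , ch) + 1 , ch);
    snprintf(STAP_RETVALUE, MAXSTRINGLEN, "%s",ptr + 1);
%}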
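The embedded function relies on its input containing two underscores: if strchr() returned NULL, the pointer arithmetic that follows would be invalid, which matters for code running in guru mode. As a minimal sketch, here is the same extraction written as plain user-space C with that case guarded, so the logic can be tried outside SystemTap. The name extract_name_b, its return codes, and the empty-string fallback are assumptions made for illustration; they are not part of the original script.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space counterpart of get_oracle_name_b.
 * Copies the text after the second '_' of src into dst.
 * Returns 0 on success, or -1 (leaving dst empty) when src
 * has fewer than two underscores. */
static int extract_name_b(const char *src, char *dst, size_t dstlen)
{
    const char *first;
    const char *second;

    if (dstlen == 0)
        return -1;
    dst[0] = '\0';

    first = strchr(src, '_');
    if (first == NULL)
        return -1;

    second = strchr(first + 1, '_');
    if (second == NULL)
        return -1;

    /* snprintf truncates safely if the name does not fit in dst */
    snprintf(dst, dstlen, "%s", second + 1);
    return 0;
}
```

Called with "ora_dbw0_BDT", the function fills the buffer with BDT; called with a string such as "ora_pmon", it returns -1 instead of dereferencing an invalid pointer.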
get_oracle_name_f:
For example, this function would extract BDT from “oracleBDT (LOCAL=NO)”, “oracleBDT (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))” or any “oracleBDT (LOCAL=<something>” string.
The code is the following:
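function get_oracle_name_f:string (mystr:string) %{
    /* Assumes mystr looks like "oracle<instance_name> (...":
       a 6-character prefix, a name shorter than 30 characters, then a space */
    char *ptr;
    int ch = ' ';
    char substr_res[30];
    char *strargs = STAP_ARG_mystr;

    /* The instance name sits between "oracle" and the first space */
    ptr = strchr( strargs, ch );
    strncpy (substr_res,strargs+6, ptr - strargs - 6);
    substr_res[ptr - strargs - 6]='\0';
    snprintf(STAP_RETVALUE, MAXSTRINGLEN, "%s",substr_res);
%}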
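This function depends on the string containing a space and on the instance name fitting in the 30-byte buffer. The sketch below shows the same idea as plain user-space C with both conditions checked. It is an illustration only: the name extract_name_f and its return codes are hypothetical, and it also requires the string to start with "oracle", which is stricter than the original.

```c
#include <string.h>

/* Hypothetical user-space counterpart of get_oracle_name_f.
 * Copies the text between the leading "oracle" and the first
 * following space of src into dst. Returns 0 on success, or -1
 * (leaving dst empty) when src does not start with "oracle",
 * contains no space, or the name does not fit in dst. */
static int extract_name_f(const char *src, char *dst, size_t dstlen)
{
    static const char prefix[] = "oracle";
    const size_t prefix_len = sizeof(prefix) - 1;   /* 6 characters */
    const char *space;
    size_t name_len;

    if (dstlen == 0)
        return -1;
    dst[0] = '\0';

    if (strncmp(src, prefix, prefix_len) != 0)
        return -1;

    space = strchr(src + prefix_len, ' ');
    if (space == NULL)
        return -1;

    name_len = (size_t)(space - (src + prefix_len));
    if (name_len >= dstlen)
        return -1;          /* no room for the name plus the terminator */

    memcpy(dst, src + prefix_len, name_len);
    dst[name_len] = '\0';
    return 0;
}
```

Called with "oracleBDT (LOCAL=NO)", it fills the buffer with BDT; a string without a space, or a name too long for the buffer, yields -1 rather than a write past the end of the buffer.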
Keeping in mind that the SystemTap aggregation operator is “<<<” (as explained here), we can use these 2 functions to aggregate by Instance Name within the probes, passing cmdline_str() as the parameter, this way:
probe tcp.recvmsg
{
    if ( isinstr(cmdline_str() , "ora_" ) == 1 && uid() == orauid) {
        tcprecv[get_oracle_name_b(cmdline_str())] <<< size
    } else if ( isinstr(cmdline_str() , "LOCAL=" ) == 1 && uid() == orauid) {
        tcprecv[get_oracle_name_f(cmdline_str())] <<< size
    } else {
        tcprecv["NOT_A_DB"] <<< size
    }
}

probe tcp.sendmsg
{
    if ( isinstr(cmdline_str() , "ora_" ) == 1 && uid() == orauid) {
        tcpsend[get_oracle_name_b(cmdline_str())] <<< size
    } else if ( isinstr(cmdline_str() , "LOCAL=" ) == 1 && uid() == orauid) {
        tcpsend[get_oracle_name_f(cmdline_str())] <<< size
    } else {
        tcpsend["NOT_A_DB"] <<< size
    }
}
As you can see, any traffic that does not come from an Oracle database process owned by the given uid is recorded and displayed as “NOT_A_DB”.
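To make the “<<<” operator more concrete: each statement of the form tcpsend[key] <<< size adds one value to a statistics aggregate stored under that key, from which SystemTap can later extract figures such as @count and @sum. The following is a rough user-space C model of that bookkeeping, not SystemTap's actual implementation. The names struct agg and agg_add, as well as the fixed table and key sizes, are hypothetical and chosen only for illustration.

```c
#include <string.h>

#define MAX_KEYS 16
#define KEY_LEN  32

/* One slot per key, holding what could later be read back
 * as a count and a sum. */
struct agg {
    char key[KEY_LEN];
    long count;
    long sum;
};

static struct agg table[MAX_KEYS];
static int used;

/* Rough model of `table[key] <<< value`.
 * Keys are compared on their first KEY_LEN - 1 characters.
 * Returns the updated slot, or NULL when the table is full. */
static struct agg *agg_add(const char *key, long value)
{
    int i;

    /* Look for an existing slot with this key */
    for (i = 0; i < used; i++)
        if (strncmp(table[i].key, key, KEY_LEN - 1) == 0)
            break;

    /* Not found: claim a new slot if one is left */
    if (i == used) {
        if (used == MAX_KEYS)
            return NULL;
        strncpy(table[i].key, key, KEY_LEN - 1);
        table[i].key[KEY_LEN - 1] = '\0';
        table[i].count = 0;
        table[i].sum = 0;
        used++;
    }

    table[i].count++;
    table[i].sum += value;
    return &table[i];
}
```

Calling agg_add("BDT", 100) and then agg_add("BDT", 28) leaves the BDT slot with a count of 2 and a sum of 128, which is the kind of per-database pair a count column and a size column can be built from.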
Based on this, the tcp_by_db.stp script has been created.
tcp_by_db.stp: script usage and output example
Usage:
$> stap -g ./tcp_by_db.stp <oracle uid> <refresh time ms>
Output Example:
$> stap -g ./tcp_by_db.stp 54321 10000
---------------------------------------------------------
NAME          NB_SENT     SENT_KB     NB_RECV     RECV_KB
---------------------------------------------------------
VBDT             5439        8231       10939       64154
NOT_A_DB           19           4          41         128
BDT                19          50          35         259
---------------------------------------------------------
NAME          NB_SENT     SENT_KB     NB_RECV     RECV_KB
---------------------------------------------------------
VBDT              267         109         391        2854
NOT_A_DB          102          19         116         680
BDT                26          55          47         326
---------------------------------------------------------
NAME          NB_SENT     SENT_KB     NB_RECV     RECV_KB
---------------------------------------------------------
VBDT              340         176         510        2940
NOT_A_DB          150           8         151        1165
BDT                42          77          61         423
Remarks:
- The oracle uid on my machine is 54321
- The refresh time has been set to 10 seconds
- You can see the aggregation for 2 databases on my machine and also for all that is not an oracle database
Whole Script source code
The whole code is available at:
Conclusion
Thanks to the embedded C functions we have been able to aggregate by Oracle database within SystemTap probes.
Would you please say a little about what Oracle UID means?
There are so many oracle processes; how do I confirm my “oracle uid”?
It’s the User ID of the oracle user.
You can get it this way:
[oracle@oradb ~]$ id
uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),54322(dba)