ASM metrics are a gold mine. Welcome to asm_metrics.pl, a new utility to extract and to manipulate them in real time

bdrouvot | ASM, Perl Scripts, ToolKit | 04/10/2013 | 05/10/2013

First of all, my asmiostat utility has been extracted from my real_time.pl script. I did this because I just added a new feature that would have been difficult to add and maintain within real_time.pl (which contains several real-time tools).

The result of this extraction is a new script called asm_metrics.pl that can be downloaded from this repository.

Leaving the new feature aside for now, asm_metrics.pl provides the same functionality, with the same options, as my “deprecated” asmiostat. That is to say:

It displays the following metrics:

Reads/s: Number of reads per second.
KbyRead/s: Kbytes read per second.
Avg ms/Read: Average ms per read.
AvgBy/Read: Average bytes per read.
The same metrics are provided for write operations.

It works this way:

It takes a snapshot every second (the default interval) from the cumulative view gv$asm_disk_iostat (or gv$asm_disk_stat, depending on the version) and computes the delta with the previous snapshot.
It lets you display, aggregate and filter, according to your needs, on ASM instances, diskgroups, failgroups (or Exadata cells’ IPs), disks and database instances.
It lets you sort on the number of reads/s, the number of writes/s or both (IOPS).
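
The delta computation described above can be sketched in a few lines. This is only a minimal Python illustration, not the script's actual Perl code: the field names (reads, bytes_read, read_ms) are hypothetical, and I am assuming the read time is already expressed in milliseconds. The same logic would apply to writes.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Cumulative counters read at one point in time (hypothetical field names)."""
    reads: int        # total read operations so far
    bytes_read: int   # total bytes read so far
    read_ms: float    # total time spent reading so far, assumed in milliseconds


def read_metrics(prev: Snapshot, curr: Snapshot, interval_s: float) -> dict:
    """Turn two cumulative snapshots into per-interval read metrics."""
    d_reads = curr.reads - prev.reads
    d_bytes = curr.bytes_read - prev.bytes_read
    d_ms = curr.read_ms - prev.read_ms
    return {
        "reads_per_s": d_reads / interval_s,
        "kby_read_per_s": d_bytes / 1024 / interval_s,
        # Guard against division by zero when nothing was read in the interval.
        "avg_ms_per_read": d_ms / d_reads if d_reads else 0.0,
        "avg_by_per_read": d_bytes / d_reads if d_reads else 0.0,
    }


# Two made-up snapshots taken 5 seconds apart: 29 reads of 16 KB each.
before = Snapshot(reads=1000, bytes_read=16_384_000, read_ms=900.0)
after = Snapshot(reads=1029, bytes_read=16_384_000 + 29 * 16384, read_ms=926.1)
metrics = read_metrics(before, after, interval_s=5)
print({name: round(value, 1) for name, value in metrics.items()})
```

Because the view is cumulative, only the difference between two snapshots is meaningful; the absolute counter values are irrelevant.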

So, for example:

You can find out which databases are the biggest physical IO consumers (reads, writes or both).
You can find out which diskgroup is responsible for most of the physical IO (reads, writes or both).
You can see whether the ASM preferred read is in action.
You can also find out… (I’ll let you finish the sentence, as the asm_metrics.pl utility output is customizable: see this post).

Let’s see the help and I’ll explain the new feature:

./asm_metrics.pl -help

Usage: ./asm_metrics.pl [-interval] [-count] [-inst] [-dbinst] [-dg] [-fg] [-ip] [-show] [-display] [-sort_field] [-help]

 Default Interval : 1 second.
 Default Count    : Unlimited

  Parameter         Comment                                                           Default
  ---------         -------                                                           -------
  -INST=            ALL - Show all Instance(s)                                        ALL
                    CURRENT - Show Current Instance
                    INSTANCE_NAME,... - choose Instance(s) to display

  -DBINST=          Database Instance to collect (Wildcard allowed)                   ALL
  -DG=              Diskgroup to collect (comma separated list)                       ALL
  -FG=              Failgroup to collect (comma separated list)                       ALL
  -IP=              IP (Exadata Cells) to collect (Wildcard allowed)                  ALL
  -SHOW=            What to show: inst,dbinst,fg|ip,dg,dsk (comma separated list)     DG
  -DISPLAY=         What to display: snap,avg (comma separated list)                  SNAP
  -SORT_FIELD=      reads|writes|iops                                                 NONE

Example: ./asm_metrics.pl
Example: ./asm_metrics.pl  -inst=+ASM1
Example: ./asm_metrics.pl  -dg=DATA -show=dg
Example: ./asm_metrics.pl  -dg=data -show=inst,dg,fg
Example: ./asm_metrics.pl  -show=dg,dsk
Example: ./asm_metrics.pl  -show=inst,dg,fg,dsk
Example: ./asm_metrics.pl  -interval=5 -count=3 -sort_field=iops
Example: ./asm_metrics.pl  -show=dg -display=avg -sort_field=iops
Example: ./asm_metrics.pl  -show=dg -display=snap,avg -sort_field=iops

The “-display” option is the new one: it allows you to display the delta snap values (as before), the average delta values since the collection began (that is, since the script was launched), or both.
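
To make the difference between the two display modes concrete, here is a small Python sketch (again, not the script's own code). I am assuming that the average is computed against the very first snapshot of a cumulative counter; the sample counts are made up.

```python
def rate(prev_count: int, curr_count: int, elapsed_s: float) -> float:
    """Operations per second between two readings of a cumulative counter."""
    return (curr_count - prev_count) / elapsed_s


# Hypothetical cumulative read counts sampled every 5 seconds.
interval_s = 5
samples = [1000, 1029, 1039]

first = samples[0]
for n in range(1, len(samples)):
    snap = rate(samples[n - 1], samples[n], interval_s)  # last interval only
    avg = rate(first, samples[n], n * interval_s)         # since the start
    print(f"after {n * interval_s:>2}s  snap={snap:.1f}/s  avg={avg:.1f}/s")
```

After the first interval both values are necessarily identical; they only diverge from the second snapshot onwards.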

Let’s see one use case:

I want to find out the biggest physical IO consumers for the DATA diskgroup, both over each 5-second interval and since I launched the script.

We just need to launch the script this way (using the -display option, as I want to see both the snap and average sections):

./asm_metrics.pl -show=dbinst -display=snap,avg -dg=DATA -interval=5 -sort_field=iops

So that after the first 5 seconds the output looks like:

............................
Collecting 5 sec....
............................

......... SNAP TAKEN AT ...................

22:53:26                                                                             Kby      Avg       AvgBy/               Kby       Avg        AvgBy/
22:53:26   INST     DBINST        DG            FG            DSK          Reads/s   Read/s   ms/Read   Read      Writes/s   Write/s   ms/Write   Write
22:53:26   ------   -----------   -----------   -----------   ----------   -------   ------   -------   ------    ------     -------   --------   ------
22:53:26            BDTO_1                                                 5.8       92.8     0.9       16384     2.4        38.4      1.6        16384
22:53:26            BDTO_2                                                 5.0       80.0     1.6       16384     0.8        12.8      4.7        16384
22:53:26            JCAASM_2                                               3.4       54.4     0.2       16384     2.4        38.4      2.0        16384
22:53:26            JCAASM_1                                               2.4       38.4     1.5       16384     0.8        12.8      6.0        16384
22:53:26            MILASM_1                                               2.4       38.4     0.6       16384     0.8        12.8      1.7        16384
22:53:26            IATEBDTO_1                                             1.6       25.6     1.7       16384     0.4        6.4       5.0        16384
22:53:26            SMTBDTO_2                                              0.0       0.0      0.0       0         0.0        0.0       0.0        0

......... AVERAGE SINCE ...................

22:53:26                                                                             Kby      Avg       AvgBy/               Kby       Avg        AvgBy/
22:53:26   INST     DBINST        DG            FG            DSK          Reads/s   Read/s   ms/Read   Read      Writes/s   Write/s   ms/Write   Write
22:53:26   ------   -----------   -----------   -----------   ----------   -------   ------   -------   ------    ------     -------   --------   ------
22:53:26            BDTO_1                                                 5.8       92.8     0.9       16384     2.4        38.4      1.6        16384
22:53:26            BDTO_2                                                 5.0       80.0     1.6       16384     0.8        12.8      4.7        16384
22:53:26            JCAASM_2                                               3.4       54.4     0.2       16384     2.4        38.4      2.0        16384
22:53:26            JCAASM_1                                               2.4       38.4     1.5       16384     0.8        12.8      6.0        16384
22:53:26            MILASM_1                                               2.4       38.4     0.6       16384     0.8        12.8      1.7        16384
22:53:26            IATEBDTO_1                                             1.6       25.6     1.7       16384     0.4        6.4       5.0        16384
22:53:26            SMTBDTO_2                                              0.0       0.0      0.0       0         0.0        0.0       0.0        0

So there is no difference between the delta snap and the average after the first snap (which is obvious :-)), and as you can see the BDTO_1 instance is the one that generated the most IOPS.

After 10 seconds the output looks like:

............................
Collecting 5 sec....
............................

......... SNAP TAKEN AT ...................

22:53:31                                                                             Kby      Avg       AvgBy/               Kby       Avg        AvgBy/
22:53:31   INST     DBINST        DG            FG            DSK          Reads/s   Read/s   ms/Read   Read      Writes/s   Write/s   ms/Write   Write
22:53:31   ------   -----------   -----------   -----------   ----------   -------   ------   -------   ------    ------     -------   --------   ------
22:53:31            BDTO_2                                                 2.6       41.6     0.7       16384     0.8        12.8      1.5        16384
22:53:31            JCAASM_2                                               3.0       48.0     0.2       16384     0.4        6.4       1.4        16384
22:53:31            IATEBDTO_1                                             2.4       38.4     1.2       16384     0.8        12.8      2.2        16384
22:53:31            JCAASM_1                                               2.2       35.2     1.3       16384     0.8        12.8      1.3        16384
22:53:31            BDTO_1                                                 2.0       32.0     0.3       16384     0.4        6.4       1.4        16384
22:53:31            MILASM_1                                               1.8       28.8     1.8       16384     0.4        6.4       2.5        16384
22:53:31            SMTBDTO_2                                              0.0       0.0      0.0       0         0.0        0.0       0.0        0

......... AVERAGE SINCE ...................

22:53:26                                                                             Kby      Avg       AvgBy/               Kby       Avg        AvgBy/
22:53:26   INST     DBINST        DG            FG            DSK          Reads/s   Read/s   ms/Read   Read      Writes/s   Write/s   ms/Write   Write
22:53:26   ------   -----------   -----------   -----------   ----------   -------   ------   -------   ------    ------     -------   --------   ------
22:53:26            BDTO_1                                                 3.9       62.4     0.8       16384     1.4        22.4      1.5        16384
22:53:26            BDTO_2                                                 3.8       60.8     1.3       16384     0.8        12.8      3.1        16384
22:53:26            JCAASM_2                                               3.2       51.2     0.2       16384     1.4        22.4      1.9        16384
22:53:26            JCAASM_1                                               2.3       36.8     1.4       16384     0.8        12.8      3.6        16384
22:53:26            MILASM_1                                               2.1       33.6     1.1       16384     0.6        9.6       2.0        16384
22:53:26            IATEBDTO_1                                             2.0       32.0     1.4       16384     0.6        9.6       3.1        16384
22:53:26            SMTBDTO_2                                              0.0       0.0      0.0       0         0.0        0.0       0.0        0

As you can see, the delta snap section reports the time of the snap, while the average section reports the time when the collection began. You can also see that the BDTO_1 database instance is the one that generated the most IOPS since the collection began (while BDTO_2 is the one that generated the most during the second snap).
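
The ranking can be checked by hand from the Reads/s and Writes/s columns of the second output. The Python sketch below does just that; it assumes that sorting on “iops” means sorting on reads/s plus writes/s, and the rank_by_iops helper is purely illustrative. Note that in the second snap BDTO_2 and JCAASM_2 tie at 3.4 IOPS with the displayed one-decimal figures; the sketch keeps the input order for ties, which may or may not be what the script does.

```python
def rank_by_iops(rows):
    """Sort (name, reads_per_s, writes_per_s) rows by reads + writes, highest first.

    Rounding to one decimal avoids floating-point noise; ties keep input order.
    """
    return sorted(rows, key=lambda r: round(r[1] + r[2], 1), reverse=True)


# Reads/s and Writes/s copied from the second output above.
snap = [
    ("BDTO_2", 2.6, 0.8), ("JCAASM_2", 3.0, 0.4), ("IATEBDTO_1", 2.4, 0.8),
    ("JCAASM_1", 2.2, 0.8), ("BDTO_1", 2.0, 0.4), ("MILASM_1", 1.8, 0.4),
]
average = [
    ("BDTO_1", 3.9, 1.4), ("BDTO_2", 3.8, 0.8), ("JCAASM_2", 3.2, 1.4),
    ("JCAASM_1", 2.3, 0.8), ("MILASM_1", 2.1, 0.6), ("IATEBDTO_1", 2.0, 0.6),
]

for label, rows in (("snap", snap), ("average", average)):
    name, reads, writes = rank_by_iops(rows)[0]
    print(f"{label}: top consumer is {name} at {reads + writes:.1f} IOPS")
```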

Remarks:

The script does not create any objects: simply download it and use it!
If you hit this issue:

./asm_metrics.pl 
: No such file or directory

Then launch it this way:

perl ./asm_metrics.pl

It will no longer be possible to launch the “deprecated” asmiostat through real_time.pl: instead, a warning message will be displayed to reflect this change, with a link to the new asm_metrics.pl script.

Conclusion:

The asm_metrics.pl script has been extracted from the real_time.pl one.
It provides exactly the same features.
It also lets you display the delta snap values (by default), the average delta values since the collection began, or both.


23 thoughts on “ASM metrics are a gold mine. Welcome to asm_metrics.pl, a new utility to extract and to manipulate them in real time”

Stalin says:

08/11/2013 at 12:39 am

Hi Bertrand,

This is the tool I have been looking for. Thanks for sharing it. BTW, is it possible to dump these values in CSV format for graphing purposes, without having to massage the output?

Thanks,
Stalin

1. bdrouvot says:
  
  08/11/2013 at 7:37 am
  
  Hello,
  
  You are welcome!
  
  Dumping the output in CSV format is not planned, but graphing the output (most probably with R) is on my to-do list.
  
  Thx
  Bertrand.
  
Stephan says:

14/11/2013 at 9:50 am

Hi Bertrand,

Unfortunately, asm_metrics.pl produces only empty output, no matter how I call it. It doesn’t return an error either. Any idea where I should look?

Thanks,
Stephan

1. bdrouvot says:
  
  14/11/2013 at 11:48 am
  
  Hi Stephan,
  
  You mean that the script returns back to the prompt without any error messages, right ?
  
  Thx
  Bertrand
  
  1. Stephan says:
    
    14/11/2013 at 12:39 pm
    
    Hi Bertrand,
    
    ehh… no, not exactly… the script outputs empty statistics. Only the headers are printed, but no actual data. Like this:
    oracle@btierasm02]cluster$ perl asm_metrics.pl
    ............................
    Collecting 1 sec....
    ............................

    ......... SNAP TAKEN AT ...................

    12:37:35                                                                             Kby      Avg       AvgBy/               Kby       Avg        AvgBy/
    12:37:35   INST     DBINST        DG            FG            DSK          Reads/s   Read/s   ms/Read   Read      Writes/s   Write/s   ms/Write   Write
    12:37:35   ------   -----------   -----------   -----------   ----------   -------   ------   -------   ------    ------     -------   --------   ------
    ............................
    Collecting 1 sec....
    
    Just as if it can’t get any data from the ASM instance, but without generating an error, either.
    
    Cheers,
    Stephan
  2. bdrouvot says:
    
    14/11/2013 at 1:34 pm
    
    Ok, let me guess that you are using ASM >= 11 and that ASM is not servicing any databases.
    
    Does “select count(*) from gv$asm_disk_iostat where instname not like '+ASM%'” return 0 ?
    
    In that case you could edit the script and update the asm_feature_version variable from 11 to 99.
    
    Thx
    Bertrand
Stephan says:

14/11/2013 at 1:44 pm

Great – you’re right. I am using ASM as a base for my OracleVM cluster volume, so actually there’re no databases served from that one.

Thanks – this is a great tool!

Cheers,
Stephan

Anuradha says:

23/01/2014 at 9:26 pm

Hi Bertrand,

Will this script consume any memory in the ASM instance? Our ASM instance often has memory problems (i.e. ORA-4031 in the shared pool), so I wanted to check before using it.

Thanks,
– Anuradha

1. bdrouvot says:
  
  27/01/2014 at 10:29 am
  
  Hello Anuradha,
  
  The script launches the same “very simple” select at each snapshot.
  All the “work” is done outside the ASM instance (within the perl script).
  
  Thx
  Bertrand
  
  1. Anuradha says:
    
    02/02/2014 at 10:10 am
    
    Hi Bertrand,
    
    Thanks for your reply. I have put your script in place to monitor the I/O of our ASM diskgroups.
    I would like to collect statistics for individual disks by using -show=dsk, but the “DSK” column cannot hold more than 10 characters, so longer values push the following columns out of alignment (e.g. the “Reads/s” values no longer line up under their header).
    Is there any way to modify the script so that the “DSK” column holds 25 characters?
    
    Thanks,
    Anuradha
  2. bdrouvot says:
    
    06/02/2014 at 10:12 pm
    
    Hello Anuradha,
    
    Do you need the whole path for the disks ? Or the “basename” would be enough ?
    
    Thx
    Bertrand
Anuradha says:

07/02/2014 at 11:00 am

Hi Bertrand,

Yes, I am looking for the whole path of the disks.

Thanks,
Anuradha

Bertrand says:

25/06/2014 at 9:48 am

Hi Bertrand,

Thanks for sharing this. This should allow me to diagnose some IO perfs.

One question please: if I know which DGs are the main culprits, is there a way to get stats only for these DGs? I can get them for all, or for one, but haven’t found a way to get results for a subset of diskgroups.

Thx,

—
Bertrand

1. bdrouvot says:
  
  25/06/2014 at 1:23 pm
  
  Hello,
  
  You can do more than “one or all” by using wildcards.
  
  For example -DG=DATA% will show DATA1, DATA2 diskgroups (if any).
  
  But there is no way to provide a list to filter the diskgroups.
  
  Thx
  Bertrand
  
  1. Bertrand says:
    
    25/06/2014 at 2:04 pm
    
    Hello,
    
    Thanks for your quick reply.
    
    The DGs concerned are called DGDATA01 and DGREDO03 … 🙂
    Anyway, I can live with it; the most important thing is to get the metrics.
    
    Regards,
    
    —
    Bertrand
pattu64 says:

20/11/2014 at 5:18 pm

Hi BDT, I am unable to download your scripts. Can you please email me the latest version of all your scripts? Thanks, Srini Krovvidi

1. bdrouvot says:
  
  21/11/2014 at 7:18 am
  
  Hello,
  
  Sure, I will.
  
  Bertrand
  
Dan says:

11/03/2015 at 8:22 pm

We’ve been slow to move to ASM storage, but now that we are there, asm_metrics.pl would be a wonderful tool. I know this is far afield from the asm_metrics.pl tool itself, but we have only the generic Perl delivered with Solaris. I’d like to build a new Perl with all the extensions needed for asm_metrics.pl. To use asm_metrics.pl, would we need anything else in the Perl build other than DBI and DBD?

1. bdrouvot says:
  
  12/03/2015 at 8:04 am
  
  Happy to hear that you want to use the asm_metrics.pl utility.
  You don’t have to worry about the generic Perl installed on the machine, because asm_metrics.pl uses the one under $ORACLE_HOME/perl/bin/perl, which has everything it needs installed (including DBI and DBD).
  
  Bertrand
  
johnfak75@gmail.com says:

27/05/2015 at 7:55 pm

This is very very very cool. Many thanks

Neil B. says:

01/02/2016 at 9:09 pm

Yes, thank you for sharing these scripts.

Deepak says:

12/01/2020 at 8:59 am

Why don’t the first three rows show any DSK details?
