ASM metrics are a gold mine. Welcome to asm_metrics.pl, a new utility to extract and to manipulate them in real time

First of all, I would like to mention that my asmiostat utility has been extracted from my real_time.pl script. I have just added a new feature that would have been difficult to add and maintain within real_time.pl (because it contains several real-time tools).

The result of this extraction is a new script called asm_metrics.pl that can be downloaded from this repository.

Leaving the new feature aside for now, asm_metrics.pl provides the same functionality, with the same options, as my “deprecated” asmiostat utility. That is to say:

It displays the following metrics:

  • Reads/s: Number of reads per second.
  • KbyRead/s: Kbytes read per second.
  • Avg ms/Read: Average ms per read.
  • AvgBy/Read: Average bytes per read.
  • The same metrics are provided for write operations.
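To make the definitions above concrete, here is a minimal sketch of how such per-second metrics can be derived from two snapshots of cumulative counters. This is not the script's actual code: the `ReadCounters` fields, the `read_metrics` function and the millisecond time unit are assumptions made for illustration. The sample counter values are hypothetical too, chosen so that the result matches the BDTO_1 row of the first snap shown later in this post.

```python
from dataclasses import dataclass


@dataclass
class ReadCounters:
    """Hypothetical cumulative read counters at one point in time."""
    reads: int            # total number of reads so far
    bytes_read: int       # total bytes read so far
    read_time_ms: float   # total time spent reading, assumed to be in ms


def read_metrics(prev, curr, interval_s):
    """Turn two cumulative snapshots into the four read metrics."""
    d_reads = curr.reads - prev.reads
    d_bytes = curr.bytes_read - prev.bytes_read
    d_ms = curr.read_time_ms - prev.read_time_ms
    return {
        "Reads/s": d_reads / interval_s,
        "KbyRead/s": d_bytes / 1024 / interval_s,
        # Report zeros when nothing was read, to avoid dividing by zero.
        "Avg ms/Read": d_ms / d_reads if d_reads else 0.0,
        "AvgBy/Read": d_bytes / d_reads if d_reads else 0,
    }


# Hypothetical counters: 29 reads of 16384 bytes each during a 5 second interval.
before = ReadCounters(reads=0, bytes_read=0, read_time_ms=0.0)
after = ReadCounters(reads=29, bytes_read=29 * 16384, read_time_ms=26.1)

for name, value in read_metrics(before, after, interval_s=5).items():
    print(f"{name:<12} {value:.1f}")
```

The write metrics would follow the same pattern with the corresponding write counters.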

Let’s see the help and I’ll explain the new feature:

./asm_metrics.pl -help

Usage: ./asm_metrics.pl [-interval] [-count] [-inst] [-dbinst] [-dg] [-fg] [-ip] [-show] [-display] [-sort_field] [-help]

 Default Interval : 1 second.
 Default Count    : Unlimited

  Parameter         Comment                                                           Default
  ---------         -------                                                           -------
  -INST=            ALL - Show all Instance(s)                                        ALL
                    CURRENT - Show Current Instance
                    INSTANCE_NAME,... - choose Instance(s) to display

  -DBINST=          Database Instance to collect (Wildcard allowed)                   ALL
  -DG=              Diskgroup to collect (comma separated list)                       ALL
  -FG=              Failgroup to collect (comma separated list)                       ALL
  -IP=              IP (Exadata Cells) to collect (Wildcard allowed)                  ALL
  -SHOW=            What to show: inst,dbinst,fg|ip,dg,dsk (comma separated list)     DG
  -DISPLAY=         What to display: snap,avg (comma separated list)                  SNAP
  -SORT_FIELD=      reads|writes|iops                                                 NONE

Example: ./asm_metrics.pl
Example: ./asm_metrics.pl  -inst=+ASM1
Example: ./asm_metrics.pl  -dg=DATA -show=dg
Example: ./asm_metrics.pl  -dg=data -show=inst,dg,fg
Example: ./asm_metrics.pl  -show=dg,dsk
Example: ./asm_metrics.pl  -show=inst,dg,fg,dsk
Example: ./asm_metrics.pl  -interval=5 -count=3 -sort_field=iops
Example: ./asm_metrics.pl  -show=dg -display=avg -sort_field=iops
Example: ./asm_metrics.pl  -show=dg -display=snap,avg -sort_field=iops

The “-display” option is the new one: It allows you to display the delta snap values (as previously), the average delta values since the collection began (that is to say since the script has been launched) or both.

Let’s see one use case:

I want to find the top physical I/O consumers on the DATA diskgroup, both for each 5-second interval and since I launched the script.

We just need to launch the script this way (using the -display option, as I want to see both the snap and the average sections):

./asm_metrics.pl -show=dbinst -display=snap,avg -dg=DATA -interval=5 -sort_field=iops

After the first 5 seconds, the output looks like:

............................
Collecting 5 sec....
............................

......... SNAP TAKEN AT ...................

22:53:26                                                                             Kby      Avg       AvgBy/               Kby       Avg        AvgBy/
22:53:26   INST     DBINST        DG            FG            DSK          Reads/s   Read/s   ms/Read   Read      Writes/s   Write/s   ms/Write   Write
22:53:26   ------   -----------   -----------   -----------   ----------   -------   ------   -------   ------    ------     -------   --------   ------
22:53:26            BDTO_1                                                 5.8       92.8     0.9       16384     2.4        38.4      1.6        16384
22:53:26            BDTO_2                                                 5.0       80.0     1.6       16384     0.8        12.8      4.7        16384
22:53:26            JCAASM_2                                               3.4       54.4     0.2       16384     2.4        38.4      2.0        16384
22:53:26            JCAASM_1                                               2.4       38.4     1.5       16384     0.8        12.8      6.0        16384
22:53:26            MILASM_1                                               2.4       38.4     0.6       16384     0.8        12.8      1.7        16384
22:53:26            IATEBDTO_1                                             1.6       25.6     1.7       16384     0.4        6.4       5.0        16384
22:53:26            SMTBDTO_2                                              0.0       0.0      0.0       0         0.0        0.0       0.0        0

......... AVERAGE SINCE ...................

22:53:26                                                                             Kby      Avg       AvgBy/               Kby       Avg        AvgBy/
22:53:26   INST     DBINST        DG            FG            DSK          Reads/s   Read/s   ms/Read   Read      Writes/s   Write/s   ms/Write   Write
22:53:26   ------   -----------   -----------   -----------   ----------   -------   ------   -------   ------    ------     -------   --------   ------
22:53:26            BDTO_1                                                 5.8       92.8     0.9       16384     2.4        38.4      1.6        16384
22:53:26            BDTO_2                                                 5.0       80.0     1.6       16384     0.8        12.8      4.7        16384
22:53:26            JCAASM_2                                               3.4       54.4     0.2       16384     2.4        38.4      2.0        16384
22:53:26            JCAASM_1                                               2.4       38.4     1.5       16384     0.8        12.8      6.0        16384
22:53:26            MILASM_1                                               2.4       38.4     0.6       16384     0.8        12.8      1.7        16384
22:53:26            IATEBDTO_1                                             1.6       25.6     1.7       16384     0.4        6.4       5.0        16384
22:53:26            SMTBDTO_2                                              0.0       0.0      0.0       0         0.0        0.0       0.0        0

So, there is no difference between the delta snap and the average after the first snap (which is obvious :-)), and as you can see the BDTO_1 instance is the one that generated the most IOPS.
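As an illustration of that ranking, here is a small sketch that sorts the rows of the first snap by IOPS. It assumes that IOPS means Reads/s plus Writes/s, which is consistent with the order shown in the output but is not taken from the script itself. The `iops` and `sort_by_iops` names are hypothetical.

```python
# (DBINST, Reads/s, Writes/s) copied from the first SNAP section above,
# deliberately listed out of order.
rows = [
    ("MILASM_1", 2.4, 0.8),
    ("BDTO_2", 5.0, 0.8),
    ("SMTBDTO_2", 0.0, 0.0),
    ("BDTO_1", 5.8, 2.4),
    ("IATEBDTO_1", 1.6, 0.4),
    ("JCAASM_2", 3.4, 2.4),
    ("JCAASM_1", 2.4, 0.8),
]


def iops(row):
    """Assumed definition: reads per second plus writes per second."""
    _, reads_per_s, writes_per_s = row
    return reads_per_s + writes_per_s


def sort_by_iops(rows):
    """Return the rows ordered from the busiest to the quietest."""
    return sorted(rows, key=iops, reverse=True)


for row in sort_by_iops(rows):
    print(f"{row[0]:<12} {iops(row):5.1f}")
```

With these values BDTO_1 comes first, at 8.2 IOPS (5.8 reads/s plus 2.4 writes/s).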

After 10 seconds the output looks like:

............................
Collecting 5 sec....
............................

......... SNAP TAKEN AT ...................

22:53:31                                                                             Kby      Avg       AvgBy/               Kby       Avg        AvgBy/
22:53:31   INST     DBINST        DG            FG            DSK          Reads/s   Read/s   ms/Read   Read      Writes/s   Write/s   ms/Write   Write
22:53:31   ------   -----------   -----------   -----------   ----------   -------   ------   -------   ------    ------     -------   --------   ------
22:53:31            BDTO_2                                                 2.6       41.6     0.7       16384     0.8        12.8      1.5        16384
22:53:31            JCAASM_2                                               3.0       48.0     0.2       16384     0.4        6.4       1.4        16384
22:53:31            IATEBDTO_1                                             2.4       38.4     1.2       16384     0.8        12.8      2.2        16384
22:53:31            JCAASM_1                                               2.2       35.2     1.3       16384     0.8        12.8      1.3        16384
22:53:31            BDTO_1                                                 2.0       32.0     0.3       16384     0.4        6.4       1.4        16384
22:53:31            MILASM_1                                               1.8       28.8     1.8       16384     0.4        6.4       2.5        16384
22:53:31            SMTBDTO_2                                              0.0       0.0      0.0       0         0.0        0.0       0.0        0

......... AVERAGE SINCE ...................

22:53:26                                                                             Kby      Avg       AvgBy/               Kby       Avg        AvgBy/
22:53:26   INST     DBINST        DG            FG            DSK          Reads/s   Read/s   ms/Read   Read      Writes/s   Write/s   ms/Write   Write
22:53:26   ------   -----------   -----------   -----------   ----------   -------   ------   -------   ------    ------     -------   --------   ------
22:53:26            BDTO_1                                                 3.9       62.4     0.8       16384     1.4        22.4      1.5        16384
22:53:26            BDTO_2                                                 3.8       60.8     1.3       16384     0.8        12.8      3.1        16384
22:53:26            JCAASM_2                                               3.2       51.2     0.2       16384     1.4        22.4      1.9        16384
22:53:26            JCAASM_1                                               2.3       36.8     1.4       16384     0.8        12.8      3.6        16384
22:53:26            MILASM_1                                               2.1       33.6     1.1       16384     0.6        9.6       2.0        16384
22:53:26            IATEBDTO_1                                             2.0       32.0     1.4       16384     0.6        9.6       3.1        16384
22:53:26            SMTBDTO_2                                              0.0       0.0      0.0       0         0.0        0.0       0.0        0

As you can see, the delta snap section reports the time of the snap, while the average section reports the time when the collection began. You can also see that the BDTO_1 database instance is the one that generated the most IOPS since the collection began (while BDTO_2 generated the most during the second snap).
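The relationship between the two sections can be sketched as follows. This is only an illustration, assuming that the snap value is the delta between two consecutive samples and that the average is the delta since the very first sample, both divided by the elapsed time. The `snap_and_avg` function and the cumulative counter values are hypothetical, chosen to reproduce the Reads/s values of BDTO_1 above (5.8, then 2.0, for an average of 3.9).

```python
def snap_and_avg(samples):
    """For each sample after the first one, return (elapsed_s, snap_rate, avg_rate).

    samples is a list of (seconds_since_start, cumulative_count) tuples.
    snap_rate uses the previous sample, avg_rate uses the first sample.
    """
    t0, c0 = samples[0]
    results = []
    for (t_prev, c_prev), (t_curr, c_curr) in zip(samples, samples[1:]):
        snap_rate = (c_curr - c_prev) / (t_curr - t_prev)
        avg_rate = (c_curr - c0) / (t_curr - t0)
        results.append((t_curr, snap_rate, avg_rate))
    return results


# Hypothetical cumulative read counts, sampled every 5 seconds.
samples = [(0, 1000), (5, 1029), (10, 1039)]

for elapsed, snap_rate, avg_rate in snap_and_avg(samples):
    print(f"after {elapsed:>2}s  snap={snap_rate:.1f} reads/s  avg={avg_rate:.1f} reads/s")
```

After the first interval both values are identical, which is why the two sections of the first output show the same numbers.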

Remarks:

  • The script does not create any objects: simply download it and use it!
  • If you hit this issue:
./asm_metrics.pl 
: No such file or directory
  • Then launch it this way:
perl ./asm_metrics.pl
  • It will not be possible to launch the “deprecated” asmiostat through the real_time.pl anymore: Instead a warning message will be displayed to reflect this change and a link to the new asm_metrics.pl script will be provided.

Conclusion:

  • The asm_metrics.pl script has been extracted from the real_time.pl one.
  • It provides exactly the same features.
  • It can also display the delta snap values (by default), the average delta values since the collection began, or both.

23 thoughts on “ASM metrics are a gold mine. Welcome to asm_metrics.pl, a new utility to extract and to manipulate them in real time”

  1. Hi Bertrand,

    This is the tool I have been looking for. Thanks for sharing it. BTW, is it possible to dump these values in CSV format for graphing purposes, without having to massage the output?

    Thanks,
    Stalin

    1. Hello,

      You are welcome!

      Dumping the output in CSV format is not planned, but graphing the output (most probably with R) is on my to-do list.

      Thx
      Bertrand.

  2. Hi Bertrand,

    Unfortunately, asm_metrics.pl produces only an empty output, no matter how I call it. It doesn’t return an error either. Any idea where I should look?

    Thanks,
    Stephan

      1. Hi Bertrand,

        ehh… no, not exactly… the script outputs empty statistics. Only the headers are printed, but no actual data. Like this:
        oracle@btierasm02]cluster$ perl asm_metrics.pl
        ............................
        Collecting 1 sec....
        ............................

        ......... SNAP TAKEN AT ...................

        12:37:35                                                                             Kby      Avg       AvgBy/               Kby       Avg        AvgBy/
        12:37:35   INST     DBINST        DG            FG            DSK          Reads/s   Read/s   ms/Read   Read      Writes/s   Write/s   ms/Write   Write
        12:37:35   ------   -----------   -----------   -----------   ----------   -------   ------   -------   ------    ------     -------   --------   ------
        ............................
        Collecting 1 sec....

        Just as if it can’t get any data from the ASM instance, but without generating an error either.

        Cheers,
        Stephan

      2. Ok, let me guess that you are using ASM >= 11 and that ASM is not servicing any databases.

        Does “select count(*) from gv$asm_disk_iostat where instname not like '+ASM%'” return 0?

        In that case you could edit the script and update the asm_feature_version variable from 11 to 99.

        Thx
        Bertrand

  3. Great – you’re right. I am using ASM as a base for my OracleVM cluster volume, so actually there’re no databases served from that one.

    Thanks – this is a great tool!

    Cheers,
    Stephan

  4. Hi Bertrand,

    Will this script consume any memory in the ASM instance? Our ASM instance often has memory issues (i.e. ORA-4031 in the shared pool), so I wanted to check before using it.

    Thanks,
    – Anuradha

    1. Hello Anuradha,

      The script launches the same “very simple” select for each snapshot.
      All the “work” is done outside the ASM instance (within the perl script).

      Thx
      Bertrand

      1. Hi Bertrand,

        Thanks for your reply. I have put your script in place to monitor the I/O of our ASM diskgroups.
        I would like to collect statistics for individual disks by using -show=dsk, but the “DSK” column cannot accommodate more than 10 characters, which shifts the values of the other columns (e.g. the values of “Reads/s” are no longer under their header).
        Is there any way to modify the script to accommodate 25 characters for the “DSK” header?

        Thanks,
        Anuradha

      2. Hello Anuradha,

        Do you need the whole path for the disks? Or would the “basename” be enough?

        Thx
        Bertrand

  5. Hi Bertrand,

    Thanks for sharing this. This should allow me to diagnose some IO perfs.

    One question please: if I know which DGs are the main culprits, is there a way to get stats only for these DGs? I can get them for all, or for one, but haven’t found a way to get results for a subset of diskgroups.

    Thx,


    Bertrand

    1. Hello,

      You can do more than “one or all” by using wildcards.

      For example -DG=DATA% will show DATA1, DATA2 diskgroups (if any).

      But there is no way to provide a list to filter the diskgroups.

      Thx
      Bertrand

      1. Hello,

        Thanks for your quick reply.

        The DGs concerned are called DGDATA01 and DGREDO03 … 🙂
        Anyway, I can live with it; the most important thing is to get the metrics.

        Regards,


        Bertrand

  6. Hi BDT, I am unable to download your scripts.. can you please email me all your scripts of latest version ? Thanks Srini Krovvidi

  7. We’ve been slow to move to ASM storage, but now that we are there, asm_metrics.pl would be a wonderful tool. I know this is far upstream of the asm_metrics.pl tool, but we have only the generic Perl delivered with Solaris. I’d like to build a new Perl with all the extensions needed for asm_metrics.pl. To use asm_metrics.pl, would we need anything else in the Perl build other than DBI and DBD?

    1. Happy to hear that you want to use the asm_metrics.pl utility.
      You don’t have to worry about the generic Perl installed on the machine, because asm_metrics.pl uses the one under $ORACLE_HOME/perl/bin/perl, which has everything it needs installed (including DBI and DBD).

      Bertrand
