First of all, I would like to mention that my asmiostat utility has been extracted from my real_time.pl script. The reason is that I just added a new feature that would have been difficult to add and maintain within real_time.pl (because it contains several real-time tools).
The result of this extraction is a new script called asm_metrics.pl that can be downloaded from this repository.
Leaving the new feature aside for now, asm_metrics.pl provides the same functionality, with the same options, as my “deprecated” asmiostat utility. That is to say:
It displays the following metrics:
- Reads/s: Number of reads per second.
- KbyRead/s: Kbytes read per second.
- Avg ms/Read: Average ms per read.
- AvgBy/Read: Average bytes per read.
- The same metrics are provided for write operations.
In such a way:
- It takes a snapshot every second (default interval) from the gv$asm_disk_iostat (or gv$asm_disk_stat, depending on the version) cumulative view and computes the delta with the previous snapshot.
- It lets you display, aggregate and filter, according to your needs, on ASM instances, diskgroups, failgroups (or Exadata cells’ IPs), disks and database instances.
- It lets you sort on the number of reads/s, the number of writes/s or both (IOPS).
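The snapshot/delta mechanism described above can be sketched as follows. This is a minimal illustration in Python, not the actual Perl code of asm_metrics.pl: the Snapshot field names are hypothetical, and the cumulative read time is assumed to be already expressed in milliseconds (the unit exposed by the view may differ).

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Cumulative counters for one row at one point in time (hypothetical field names)."""
    reads: int            # cumulative number of reads
    bytes_read: int       # cumulative bytes read
    read_time_ms: float   # cumulative read time, ASSUMED to be in milliseconds


def read_metrics(prev: Snapshot, curr: Snapshot, interval_s: float) -> dict:
    """Compute the read metrics from the delta between two cumulative snapshots."""
    d_reads = curr.reads - prev.reads
    d_bytes = curr.bytes_read - prev.bytes_read
    d_time = curr.read_time_ms - prev.read_time_ms
    return {
        "Reads/s": d_reads / interval_s,
        "KbyRead/s": d_bytes / 1024 / interval_s,
        # Guard against a division by zero when no read happened in the interval.
        "Avg ms/Read": d_time / d_reads if d_reads else 0.0,
        "AvgBy/Read": d_bytes / d_reads if d_reads else 0.0,
    }


if __name__ == "__main__":
    # 29 reads of 16384 bytes each during a 5 second interval.
    prev = Snapshot(reads=1000, bytes_read=16_384_000, read_time_ms=900.0)
    curr = Snapshot(reads=1029, bytes_read=16_384_000 + 29 * 16384, read_time_ms=926.1)
    print(read_metrics(prev, curr, interval_s=5))
```

The write metrics would be derived the same way from the corresponding write counters.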
So that, for example:
- You can find out which databases are the biggest physical IO consumers (reads, writes or both).
- You can find out which diskgroup is responsible for most of the physical IO (reads, writes or both).
- You can see whether the ASM preferred read is in action or not.
- You can also find out… (I’ll let you finish the sentence, as the asm_metrics.pl utility output is customizable: see this post).
Let’s see the help and I’ll explain the new feature:
./asm_metrics.pl -help

Usage: ./asm_metrics.pl [-interval] [-count] [-inst] [-dbinst] [-dg] [-fg] [-ip] [-show] [-display] [-sort_field] [-help]

 Default Interval : 1 second.
 Default Count    : Unlimited

  Parameter      Comment                                                          Default
  ---------      -------                                                          -------
  -INST=         ALL - Show all Instance(s)                                       ALL
                 CURRENT - Show Current Instance
                 INSTANCE_NAME,... - choose Instance(s) to display
  -DBINST=       Database Instance to collect (Wildcard allowed)                  ALL
  -DG=           Diskgroup to collect (comma separated list)                      ALL
  -FG=           Failgroup to collect (comma separated list)                      ALL
  -IP=           IP (Exadata Cells) to collect (Wildcard allowed)                 ALL
  -SHOW=         What to show: inst,dbinst,fg|ip,dg,dsk (comma separated list)    DG
  -DISPLAY=      What to display: snap,avg (comma separated list)                 SNAP
  -SORT_FIELD=   reads|writes|iops                                                NONE

Example: ./asm_metrics.pl
Example: ./asm_metrics.pl -inst=+ASM1
Example: ./asm_metrics.pl -dg=DATA -show=dg
Example: ./asm_metrics.pl -dg=data -show=inst,dg,fg
Example: ./asm_metrics.pl -show=dg,dsk
Example: ./asm_metrics.pl -show=inst,dg,fg,dsk
Example: ./asm_metrics.pl -interval=5 -count=3 -sort_field=iops
Example: ./asm_metrics.pl -show=dg -display=avg -sort_field=iops
Example: ./asm_metrics.pl -show=dg -display=snap,avg -sort_field=iops
The “-display” option is the new one: It allows you to display the delta snap values (as previously), the average delta values since the collection began (that is to say since the script has been launched) or both.
Let’s see one use case:
I want to find out the biggest physical IO consumers for the DATA diskgroup, both every 5 seconds and since I launched the script.
We just need to launch the script this way (using the -display option, as I want to see both the snap and average sections):
./asm_metrics.pl -show=dbinst -display=snap,avg -dg=DATA -interval=5 -sort_field=iops
So that after the first 5 seconds the output looks like:
............................
Collecting 5 sec....
............................

......... SNAP TAKEN AT ...................

22:53:26                                                                      Kby     Avg      AvgBy/            Kby      Avg       AvgBy/
22:53:26  INST    DBINST       DG           FG           DSK         Reads/s  Read/s  ms/Read  Read    Writes/s  Write/s  ms/Write  Write
22:53:26  ------  -----------  -----------  -----------  ----------  -------  ------  -------  ------  ------    -------  --------  ------
22:53:26          BDTO_1                                             5.8      92.8    0.9      16384   2.4       38.4     1.6       16384
22:53:26          BDTO_2                                             5.0      80.0    1.6      16384   0.8       12.8     4.7       16384
22:53:26          JCAASM_2                                           3.4      54.4    0.2      16384   2.4       38.4     2.0       16384
22:53:26          JCAASM_1                                           2.4      38.4    1.5      16384   0.8       12.8     6.0       16384
22:53:26          MILASM_1                                           2.4      38.4    0.6      16384   0.8       12.8     1.7       16384
22:53:26          IATEBDTO_1                                         1.6      25.6    1.7      16384   0.4       6.4      5.0       16384
22:53:26          SMTBDTO_2                                          0.0      0.0     0.0      0       0.0       0.0      0.0       0

......... AVERAGE SINCE ...................

22:53:26                                                                      Kby     Avg      AvgBy/            Kby      Avg       AvgBy/
22:53:26  INST    DBINST       DG           FG           DSK         Reads/s  Read/s  ms/Read  Read    Writes/s  Write/s  ms/Write  Write
22:53:26  ------  -----------  -----------  -----------  ----------  -------  ------  -------  ------  ------    -------  --------  ------
22:53:26          BDTO_1                                             5.8      92.8    0.9      16384   2.4       38.4     1.6       16384
22:53:26          BDTO_2                                             5.0      80.0    1.6      16384   0.8       12.8     4.7       16384
22:53:26          JCAASM_2                                           3.4      54.4    0.2      16384   2.4       38.4     2.0       16384
22:53:26          JCAASM_1                                           2.4      38.4    1.5      16384   0.8       12.8     6.0       16384
22:53:26          MILASM_1                                           2.4      38.4    0.6      16384   0.8       12.8     1.7       16384
22:53:26          IATEBDTO_1                                         1.6      25.6    1.7      16384   0.4       6.4      5.0       16384
22:53:26          SMTBDTO_2                                          0.0      0.0     0.0      0       0.0       0.0      0.0       0
So there are no differences between the delta snap and the average after the first snap (which is obvious :-)) and, as you can see, the BDTO_1 instance is the one that generated the most IOPS.
After 10 seconds the output looks like:
............................
Collecting 5 sec....
............................

......... SNAP TAKEN AT ...................

22:53:31                                                                      Kby     Avg      AvgBy/            Kby      Avg       AvgBy/
22:53:31  INST    DBINST       DG           FG           DSK         Reads/s  Read/s  ms/Read  Read    Writes/s  Write/s  ms/Write  Write
22:53:31  ------  -----------  -----------  -----------  ----------  -------  ------  -------  ------  ------    -------  --------  ------
22:53:31          BDTO_2                                             2.6      41.6    0.7      16384   0.8       12.8     1.5       16384
22:53:31          JCAASM_2                                           3.0      48.0    0.2      16384   0.4       6.4      1.4       16384
22:53:31          IATEBDTO_1                                         2.4      38.4    1.2      16384   0.8       12.8     2.2       16384
22:53:31          JCAASM_1                                           2.2      35.2    1.3      16384   0.8       12.8     1.3       16384
22:53:31          BDTO_1                                             2.0      32.0    0.3      16384   0.4       6.4      1.4       16384
22:53:31          MILASM_1                                           1.8      28.8    1.8      16384   0.4       6.4      2.5       16384
22:53:31          SMTBDTO_2                                          0.0      0.0     0.0      0       0.0       0.0      0.0       0

......... AVERAGE SINCE ...................

22:53:26                                                                      Kby     Avg      AvgBy/            Kby      Avg       AvgBy/
22:53:26  INST    DBINST       DG           FG           DSK         Reads/s  Read/s  ms/Read  Read    Writes/s  Write/s  ms/Write  Write
22:53:26  ------  -----------  -----------  -----------  ----------  -------  ------  -------  ------  ------    -------  --------  ------
22:53:26          BDTO_1                                             3.9      62.4    0.8      16384   1.4       22.4     1.5       16384
22:53:26          BDTO_2                                             3.8      60.8    1.3      16384   0.8       12.8     3.1       16384
22:53:26          JCAASM_2                                           3.2      51.2    0.2      16384   1.4       22.4     1.9       16384
22:53:26          JCAASM_1                                           2.3      36.8    1.4      16384   0.8       12.8     3.6       16384
22:53:26          MILASM_1                                           2.1      33.6    1.1      16384   0.6       9.6      2.0       16384
22:53:26          IATEBDTO_1                                         2.0      32.0    1.4      16384   0.6       9.6      3.1       16384
22:53:26          SMTBDTO_2                                          0.0      0.0     0.0      0       0.0       0.0      0.0       0
As you can see, the delta snap section reports the time of the snap, while the average section reports the time when the collection began. By the way, you can also see that the BDTO_1 database instance is the one that generated the most IOPS since the collection began (while BDTO_2 is the one that generated the most during the second snap).
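One way to obtain such an average is to compute the delta against the very first snapshot instead of the previous one, and then to sort on reads/s + writes/s. The Python sketch below is for illustration only: the function names and the snapshot layout are hypothetical and are not taken from the script itself. The cumulative counters are reconstructed from the rates displayed above for BDTO_1 and BDTO_2 (rate multiplied by the 5 second interval).

```python
def rates(first: dict, last: dict, elapsed_s: float) -> dict:
    """Per-second read and write rates between two cumulative snapshots.

    Each snapshot maps a database instance name to a (reads, writes) tuple
    of cumulative counters (hypothetical layout).
    """
    out = {}
    for name, (reads, writes) in last.items():
        reads0, writes0 = first.get(name, (0, 0))
        out[name] = ((reads - reads0) / elapsed_s, (writes - writes0) / elapsed_s)
    return out


def sort_by_iops(rows: dict) -> list:
    """Return (name, reads/s, writes/s) tuples, highest reads/s + writes/s first."""
    return sorted(
        ((name, r, w) for name, (r, w) in rows.items()),
        key=lambda row: row[1] + row[2],
        reverse=True,
    )


# Cumulative counters at launch, after 5 seconds and after 10 seconds.
t0 = {"BDTO_1": (0, 0), "BDTO_2": (0, 0)}
t5 = {"BDTO_1": (29, 12), "BDTO_2": (25, 4)}
t10 = {"BDTO_1": (39, 14), "BDTO_2": (38, 8)}

snap = sort_by_iops(rates(t5, t10, 5))      # delta with the previous snapshot
average = sort_by_iops(rates(t0, t10, 10))  # delta with the first snapshot
print(snap[0][0], average[0][0])
```

With these counters, BDTO_2 comes first in the last snap while BDTO_1 comes first in the average, which matches the output shown above.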
Remarks:
- The script does not create any objects: simply download it and use it!
- If you hit this issue:
./asm_metrics.pl : No such file or directory
- Then launch it this way:
perl ./asm_metrics.pl
- It will no longer be possible to launch the “deprecated” asmiostat through real_time.pl: instead, a warning message will be displayed to reflect this change, along with a link to the new asm_metrics.pl script.
Conclusion:
- The asm_metrics.pl script has been extracted from the real_time.pl one.
- It provides exactly the same features.
- It can also display the delta snap values (the default), the average delta values since the collection began, or both.
Hi Bertrand,
This is the tool I have been looking for. Thanks for sharing it. BTW, is it possible to dump these values in CSV format for graphing purposes, without having to massage the output?
Thanks,
Stalin
Hello,
You are welcome!
Dumping the output in CSV format is not foreseen, but graphing the output (most probably with R) is on my to-do list.
Thx
Bertrand.
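In the meantime, the captured output can be post-processed outside the script. The Python sketch below is an illustration only and is not part of asm_metrics.pl: it assumes that each data row is made of a timestamp, one or more labels and the eight numeric metrics, and the CSV column names are hypothetical.

```python
import csv
import io
import re

# Hypothetical CSV names for the eight metrics, in the order they appear in the output.
METRICS = ["reads_s", "kby_read_s", "ms_read", "by_read",
           "writes_s", "kby_write_s", "ms_write", "by_write"]

# Every table line starts with HH:MM:SS; headers and separators are filtered below.
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2}$")


def to_csv(text: str) -> str:
    """Convert the data rows of a captured output to CSV.

    Assumes each data row is: timestamp, one or more labels, eight numbers.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["time", "labels"] + METRICS)
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 10 or not TIMESTAMP.match(tokens[0]):
            continue
        try:
            values = [float(t) for t in tokens[-8:]]
        except ValueError:
            continue  # header or separator line
        writer.writerow([tokens[0], "/".join(tokens[1:-8])] + values)
    return out.getvalue()
```

Note that when -display=snap,avg is used, the rows of both sections end up in the same CSV: this sketch does not distinguish them.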
Hi Bertrand,
Unfortunately, asm_metrics.pl produces only an empty output, no matter how I call it. It doesn’t return an error either. Any idea where I should look?
Thanks,
Stephan
Hi Stephan,
You mean that the script returns to the prompt without any error message, right?
Thx
Bertrand
Hi Bertrand,
ehh… no, not exactly… the script outputs empty statistics. Only the headers are printed, but no actual data. Like this:
oracle@btierasm02]cluster$ perl asm_metrics.pl
............................
Collecting 1 sec....
............................
......... SNAP TAKEN AT ...................
12:37:35                                                                      Kby     Avg      AvgBy/            Kby      Avg       AvgBy/
12:37:35  INST    DBINST       DG           FG           DSK         Reads/s  Read/s  ms/Read  Read    Writes/s  Write/s  ms/Write  Write
12:37:35  ------  -----------  -----------  -----------  ----------  -------  ------  -------  ------  ------    -------  --------  ------
............................
Collecting 1 sec....
Just as if it can’t get any data from the ASM instance, but without generating an error either.
Cheers,
Stephan
Ok, let me guess that you are using ASM >= 11 and that ASM is not servicing any databases.
Does “select count(*) from gv$asm_disk_iostat where instname not like '+ASM%'” return 0?
In that case you could edit the script and update the asm_feature_version variable from 11 to 99.
Thx
Bertrand
Great – you’re right. I am using ASM as a base for my OracleVM cluster volume, so actually there’re no databases served from that one.
Thanks – this is a great tool!
Cheers,
Stephan
Hi Bertrand,
Will this script consume any memory in the ASM instance? The ASM instance often has memory issues (i.e. ORA-4031 in the shared pool), so I wanted to check before using it.
Thanks,
– Anuradha
Hello Anuradha,
The script launches the same “very simple” select for each snapshot.
All the “work” is done outside the ASM instance (within the perl script).
Thx
Bertrand
Hi Bertrand,
Thanks for your reply. I have set up your script to monitor the ASM diskgroups’ I/O.
I would like to collect statistics for individual disks by using -show=dsk, but the “DSK” column cannot accommodate more than 10 characters, which shifts the values of the other columns (e.g. the “Reads/s” values are no longer under their header).
Is there any way to modify the script so that the “DSK” column accommodates 25 characters?
Thanks,
Anuradha
Hello Anuradha,
Do you need the whole path for the disks? Or would the “basename” be enough?
Thx
Bertrand
Hi Bertrand,
Yes, I am looking for the whole path of the disks.
Thanks,
Anuradha
Hi Bertrand,
Thanks for sharing this. This should allow me to diagnose some IO performance issues.
One question please: if I know which DGs are the main culprits, is there a way to get stats only for these DGs? I can get them for all, or for one, but haven’t found a way to get results for a subset of diskgroups.
Thx,
--
Bertrand
Hello,
You can do more than “one or all” by using wildcards.
For example -DG=DATA% will show DATA1, DATA2 diskgroups (if any).
But there is no way to provide a list to filter the diskgroups.
Thx
Bertrand
Hello,
Thanks for your quick reply.
The concerned DG are called DGDATA01 and DGREDO03 … 🙂
Anyway, I can live with it: the most important thing is to get the metrics.
Regards,
--
Bertrand
Hi BDT, I am unable to download your scripts. Can you please email me the latest version of all your scripts? Thanks, Srini Krovvidi
Hello,
Sure, I’ll do it.
Bertrand
We’ve been slow to move to ASM storage, but now that we are there, asm_metrics.pl would be a wonderful tool. I know this is far upstream from the asm_metrics.pl tool, but we have only the generic Perl delivered with Solaris. I’d like to build a new Perl with all the extensions needed for asm_metrics.pl. To use asm_metrics.pl, would we need anything else in the Perl build other than DBI and DBD?
Happy to hear that you want to use the asm_metrics.pl utility.
You don’t have to worry about the generic Perl installed on the machine, because asm_metrics.pl uses the one under $ORACLE_HOME/perl/bin/perl, which has everything it needs installed (including DBI and DBD).
Bertrand
This is very very very cool. Many thanks
Yes thank you for sharing these scripts.
Why don’t the first three rows show any DSK details?