In-Memory Instance distribution with RAC and Multitenant Environment

In a previous post (see In-Memory Instance distribution with RAC databases: I want at least one instance that contains all the data) I provided a way to get rid of the default In-Memory Instance distribution on a RAC (non CDB) database.

To summarize:

  • Thanks to the hidden parameter “_inmemory_auto_distribute” we are able to change the way the In-Memory Column Store are populated by default on an oracle RAC database.
  • By default all objects populated into memory will be distributed across all of the IM column stores in the cluster.
  • With “_inmemory_auto_distribute” set to false, then we are able to populate one instance (or all the instances) with all the data.

Now, let’s have a look at the In-Memory distribution on a Multitenant RAC database and let’s check if we can influence (get rid of the default behaviour) how the In-Memory is distributed across all the Instances per PDB.

By default the IMCS distribution for a PDB is the following:

First case: The PDB is open on one Instance only:

SYS@CDB$ROOT> alter pluggable database BDTPDB open instances=('BDT12c3_1');

Pluggable database altered.

-- Now connect to the BDTPDB PDB as bdt/bdt
SYS@CDB$ROOT> @connect_user_pdb.sql
Enter value for user: bdt
Enter value for pwd: bdt
Enter value for pdb: BDTPDB
Connected.

-- Let's enable the inmemory attribute on the BDT table
BDT@BDTPDB> alter table bdt inmemory ;

Table altered.

-- Trigger the IMCS population
BDT@BDTPDB> select count(*) from bdt;

  COUNT(*)
----------
  44591000

-- check the distribution
BDT@BDTPDB> SELECT s.inst_id, c.name,s.populate_status, s.inmemory_size, s.bytes, s.bytes_not_populated
FROM gv$im_user_segments s, v$pdbs c
WHERE s.segment_name = 'BDT'
and c.con_id=s.con_id; 

   INST_ID NAME                           POPULATE_ INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED
---------- ------------------------------ --------- ------------- ---------- -------------------
         1 BDTPDB                         COMPLETED     695074816 1879048192                   0

As you can see the Instance 1 contains all the data for this PDB (as BYTES_NOT_POPULATED=0)

Second case: The PDB is open on all the Instances:

-- Open the BDTPDB PDB on all instances
SYS@CDB$ROOT> alter pluggable database BDTPDB open instances=all;

Pluggable database altered.

-- Now connect to the BDTPDB PDB as bdt/bdt
SYS@CDB$ROOT> @connect_user_pdb.sql
Enter value for user: bdt
Enter value for pwd: bdt
Enter value for pdb: BDTPDB
Connected.  

-- Let's enable the inmemory attribute on the BDT table
BDT@BDTPDB> alter table bdt inmemory ;

Table altered.

-- Trigger the IMCS population
BDT@BDTPDB> select count(*) from bdt;

  COUNT(*)  
----------  
  44591000

-- check the distribution
BDT@BDTPDB> SELECT s.inst_id, c.name,s.populate_status, s.inmemory_size, s.bytes, s.bytes_not_populated
FROM gv$im_user_segments s, v$pdbs c
WHERE s.segment_name = 'BDT'
and c.con_id=s.con_id;


   INST_ID NAME                           POPULATE_ INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED
---------- ------------------------------ --------- ------------- ---------- -------------------
         1 BDTPDB                         COMPLETED     376307712 1879048192           802889728
         2 BDTPDB                         COMPLETED     298909696 1879048192          1006559232

We can see that (for this PDB), by default, no instance contains all data as the BYTES_NOT_POPULATED column is  greater than 0.

Now let’s set the hidden parameter _inmemory_auto_distribute” to false (at the PDB level):

BDT@BDTPDB> alter system set "_inmemory_auto_distribute"=false;
alter system set "_inmemory_auto_distribute"=false
*
ERROR at line 1:
ORA-65040: operation not allowed from within a pluggable database

So, we can’t set this hidden parameter at the PDB level.

Ok, let’s set it at the CDB level:

SYS@CDB$ROOT> alter system set "_inmemory_auto_distribute"=false;

System altered.

then re-trigger the population on instance 1 (for this PDB) and check the distribution:

SYS@CDB$ROOT> @connect_user_pdb.sql
Enter value for user: bdt
Enter value for pwd: bdt
Enter value for pdb: BDTPDB
Connected.  
BDT@BDTPDB> select count(*) from bdt;

  COUNT(*)  
----------  
  44591000

BDT@BDTPDB> SELECT s.inst_id, c.name,s.populate_status, s.inmemory_size, s.bytes, s.bytes_not_populated
FROM gv$im_user_segments s, v$pdbs c
WHERE s.segment_name = 'BDT'
and c.con_id=s.con_id;

   INST_ID NAME                           POPULATE_ INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED
---------- ------------------------------ --------- ------------- ---------- -------------------
         2 BDTPDB                         COMPLETED     297861120 1879048192          1006559232
         1 BDTPDB                         COMPLETED     675151872 1879048192                   0

BINGO! Look at the Instance 1, it now contains all the data in its IMCS (as BYTES_NOT_POPULATED=0) for this PDB.

If I connect on the Instance 2 (for this PDB) and launch the query to trigger the population, I’ll get:

   INST_ID NAME                           POPULATE_ INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED
---------- ------------------------------ --------- ------------- ---------- -------------------
         1 BDTPDB                         COMPLETED     675151872 1879048192                   0
         2 BDTPDB                         COMPLETED     672006144 1879048192                   0

So, both instances now contain all the data for this PDB.

Remarks:

  • As I said into the previous post: I used an hidden parameter, so you should get oracle support approval to use it.
  • I guess (because I can’t test) that in a Multitenant and Engineered Systems context it is also possible to have all the data appear in the IM column store on every node for a PDB (thanks to the duplicate attribute): As Christian Antognini show us into this blog post for non CDB.

Conclusion:

The conclusion is the same than in my previous post (with non CDB):

  • Thanks to the hidden parameter “_inmemory_auto_distribute” we are able to change the way the In-Memory Column Store are populated by default on an oracle RAC database.
  • By default all objects populated into memory will be distributed across all of the IM column stores in the cluster.
  • With “_inmemory_auto_distribute” set to false, then we are able to populate one instance (or all the instances) with all the data.

And we can add that:

  • If the PDB is open on one instance only then this instance contains all the data.
  • If the PDB is open on all the instances then we have to set “_inmemory_auto_distribute” to false to have at least one instance that contains all the data.
  • Each PDB population will behave the same (distributed or not) and inherit from the CDB setting (As we can not set the hidden parameter at the PDB level).

 

Advertisement

In-Memory Instance distribution with RAC databases: I want at least one instance that contains all the data

As you may know, with a RAC database, by default all objects populated into
In-memory will be distributed across all of the IM column stores in the cluster. It is also possible to have all the data appear in the IM column store on every node (Engineered Systems only). See this white paper for more details.

There is 2 very interesting blog post around this subject:

  1. Kerry Osborne show us how we can distribute the data (using the distribute INMEMORY attribute) across the nodes into this blog post.
  2. Christian Antognini show us that having the data not fully populated on each instance could lead to bad performance into this blog post.

But wait: If my RAC service is an active/passive one (I mean the service is started on only one instance) then I would like to have all the data fully populated into the In-Memory column store of the “active” instance  (and not distributed across the instances), right?

Let’s try to achieve this (all data fully populated into the In-Memory column store of the active instance):

So, by default, the IMCS distribution is the following:

SQL> alter table bdt inmemory ;

Table altered.

SQL> select count(*) from bdt;

  COUNT(*)
----------
  38220000

SQL> SELECT inst_id, populate_status, inmemory_size, bytes, bytes_not_populated
FROM gv$im_user_segments
WHERE segment_name = 'BDT';

   INST_ID POPULATE_ INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED
---------- --------- ------------- ---------- -------------------
         1 COMPLETED     313720832 1610612736           732684288
         2 COMPLETED     274530304 1610612736           826236928

As you can see no instance contains all data as the BYTES_NOT_POPULATED column is  greater than 0.

Now let’s set the hidden parameter _inmemory_auto_distribute” to false and re-trigger the population on instance 1:

SQL> alter system set "_inmemory_auto_distribute"=false;

System altered.

SQL> select count(*) from bdt;

  COUNT(*)
----------
  38220000

SQL> SELECT inst_id, populate_status, inmemory_size, bytes, bytes_not_populated
FROM gv$im_user_segments
WHERE segment_name = 'BDT';

   INST_ID POPULATE_ INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED
---------- --------- ------------- ---------- -------------------
         1 COMPLETED     587137024 1610612736                   0
         2 COMPLETED     274530304 1610612736           826236928

BINGO! Look at the Instance 1, it now contains all the data in its IMCS (as BYTES_NOT_POPULATED=0).

If I connect to the Instance 2 and launch the query to trigger the population I’ll get:

SQL> SELECT inst_id, populate_status, inmemory_size, bytes, bytes_not_populated
FROM gv$im_user_segments
WHERE segment_name = 'BDT';
  2    3  
   INST_ID POPULATE_ INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED
---------- --------- ------------- ---------- -------------------
         1 COMPLETED     587137024 1610612736                   0
         2 COMPLETED     589234176 1610612736                   0

So, both instances now contain all the data (like the duplicate attribute would have done).

Remarks:

  1. In case of active/passive service (means preferred/available service) then after a service failover the second instance will also contain all the data (once populated).
  2. In case of active/active service then each database instance will contain all the data but not necessary at the same time (so the behaviour is a little bit different of the duplicate attribute).
  3. I used an hidden parameter, so you should get oracle support approval to use it.
  4. I’ll check the behaviour with PDB into another post.

Conclusion:

  • Thanks to the hidden parameter “_inmemory_auto_distribute” we are able to change the way the In-Memory Column Store are populated by default on an oracle RAC database.
  • By default all objects populated into memory will be distributed across all of the IM column stores in the cluster.
  • With “_inmemory_auto_distribute” set to false, then we are able to populate one instance (or all the instances) with all the data.

Update 2015/03/27:

  • This “workaround” proposed to get rid of the default behaviour on non Engineered Systems (“Every row of the test table is stored in the IMCS of either one instance or the other”) works for any type of service.
  • In case of preferred/available service, there is no need to set the hidden parameter to get the IM fully populated on one node. The secret sauce is linked to the parallel_instance_group parameter. See this blog post for more details.