Wrong Cell Server name on X6-2 Elastic Rack - Bug 25317550

    Jan 2, 2018 6:55:00 AM by Syed Jaffar Hussain

    Two X6-2 Elastic full-capacity Exadata systems were deployed recently. Because of the following bug, the cell names were not updated with the client-provided names after applyElasticConfig.sh was executed.

    Bug 25317550 : OEDA FAILS TO SET CELL NAME RESULTING IN GRID DISK NAMES NOT HAVING RIGHT SUFFIX

    Although this doesn't impact operations, it will certainly create confusion when multiple Exadata systems are deployed in the same data center, because they end up with identical cell, cell disk, and grid disk names.

    Note: It's highly recommended to validate the cell names after executing applyElasticConfig.sh and before running onecommand. If you encounter a similar problem, simply change the cell name with alter cell name=[correctname] and proceed with the onecommand execution to avoid the bug.

    The default names look like this:

    # dcli -g cell_group -l root 'cellcli -e list cell attributes name'
                            celadm1: ru02
                            celadm2: ru04
                            celadm3: ru06
                            celadm4: ru08
                            celadm5: ru10
                            celadm6: ru12
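The validation recommended in the note above can be scripted. Here is a minimal sketch, assuming the dcli output has the `host: name` shape shown above and that each cell's name is expected to equal the host label dcli prints. `find_misnamed_cells` is a hypothetical helper, not part of dcli or OEDA.

```python
def find_misnamed_cells(dcli_output):
    """Return {host: reported_name} for cells whose name differs from the host label."""
    mismatches = {}
    for line in dcli_output.splitlines():
        if ":" not in line:
            continue  # skip blank or unexpected lines
        # Assumption: each line looks like "<host>: <cell name>"
        host, name = (part.strip() for part in line.split(":", 1))
        if host != name:
            mismatches[host] = name
    return mismatches


# Hypothetical sample: one cell still has its default name, one is correct.
sample = "celadm5: ru10\nceladm6: celadm6\n"
print(find_misnamed_cells(sample))
```

An empty result means every cell already carries the expected name and onecommand can proceed.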


    To change the cell name and have the cell disk and grid disk names reflect it, follow the procedure below.

    The procedure must be performed on each cell separately and sequentially (to avoid a full outage):

    1) Change the cell name:

        cellcli> alter cell name=celadm5

    2) Confirm that the grid disks can be taken offline:

        cellcli> list griddisk attributes name, ASMDeactivationOutcome, ASMModeStatus
                ASMDeactivationOutcome should be Yes for all grid disks.
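The check in step 2 can be automated before inactivating anything. This is a minimal sketch, assuming the cellcli output is whitespace-separated with the columns in the order requested above (name, ASMDeactivationOutcome, ASMModeStatus). `disks_not_safe_to_offline` and the sample rows are hypothetical.

```python
def disks_not_safe_to_offline(cellcli_output):
    """Return grid disk names whose ASMDeactivationOutcome is anything other than Yes."""
    unsafe = []
    for line in cellcli_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        # Assumption: column 1 is the name, column 2 starts the deactivation outcome
        name, outcome = fields[0], fields[1]
        if outcome.lower() != "yes":
            unsafe.append(name)
    return unsafe


# Illustrative rows only, not captured from a real cell.
sample = (
    "DATAC1_CD_00_ru10   Yes      ONLINE\n"
    "RECOC1_CD_00_ru10   Cannot de-activate   ONLINE\n"
)
print(disks_not_safe_to_offline(sample))
```

Proceed to step 3 only when the returned list is empty.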

    3) Inactivate the grid disks on that cell:

        cellcli> alter griddisk all inactive

                Observation - If any voting disks reside on this storage server, they are relocated to the surviving storage servers.

    4) Change cell disk name

                alter celldisk CD_00_ru10 name=CD_00_celadm5;     
                alter celldisk CD_01_ru10 name=CD_01_celadm5;
                alter celldisk CD_02_ru10 name=CD_02_celadm5;
                alter celldisk CD_03_ru10 name=CD_03_celadm5;
                alter celldisk CD_04_ru10 name=CD_04_celadm5;
                alter celldisk CD_05_ru10 name=CD_05_celadm5;
                alter celldisk CD_06_ru10 name=CD_06_celadm5;
                alter celldisk CD_07_ru10 name=CD_07_celadm5;
                alter celldisk CD_08_ru10 name=CD_08_celadm5;
                alter celldisk CD_09_ru10 name=CD_09_celadm5;
                alter celldisk CD_10_ru10 name=CD_10_celadm5;
                alter celldisk CD_11_ru10 name=CD_11_celadm5;
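Typing twelve near-identical commands per cell is error-prone, so the list above can be generated instead. This is a minimal sketch that covers only the CD_00 to CD_11 names shown above; `celldisk_rename_commands` is a hypothetical helper, and any other cell disks on the cell are outside its scope.

```python
def celldisk_rename_commands(old_suffix, new_suffix, disk_count=12):
    """Build one 'alter celldisk' command per disk, CD_00 .. CD_<disk_count - 1>."""
    return [
        f"alter celldisk CD_{i:02d}_{old_suffix} name=CD_{i:02d}_{new_suffix}"
        for i in range(disk_count)
    ]


# Print the commands for the cell used in this post (ru10 -> celadm5).
for command in celldisk_rename_commands("ru10", "celadm5"):
    print(command)
```

Review the generated commands before pasting them into cellcli.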

    5) Change the grid disk names as in the examples below (do this for all grid disks: DATAC1, DBFS_DG and RECOC1):

                alter GRIDDISK DATAC1_CD_00_ru10  name=DATAC1_CD_00_celadm5;
                alter GRIDDISK DBFS_DG_CD_02_ru10 name=DBFS_DG_CD_02_celadm5;
                alter GRIDDISK RECOC1_CD_11_ru10  name=RECOC1_CD_11_celadm5;
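The same approach works for the grid disks. Not every disk group necessarily has a grid disk on every cell disk, so this sketch rewrites the suffix of names you pass in (for example, taken from `list griddisk attributes name`) rather than guessing them. `griddisk_rename_commands` is a hypothetical helper.

```python
def griddisk_rename_commands(griddisk_names, old_suffix, new_suffix):
    """Build 'alter griddisk' commands for names ending in _<old_suffix>."""
    commands = []
    tail = "_" + old_suffix
    for name in griddisk_names:
        if not name.endswith(tail):
            continue  # already renamed, or not following the default pattern
        new_name = name[: -len(old_suffix)] + new_suffix
        commands.append(f"alter griddisk {name} name={new_name}")
    return commands


names = ["DATAC1_CD_00_ru10", "DBFS_DG_CD_02_ru10", "RECOC1_CD_11_ru10"]
for command in griddisk_rename_commands(names, "ru10", "celadm5"):
    print(command)
```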

    6) Activate griddisk on that cell

                cellcli> ALTER GRIDDISK ALL ACTIVE;

            A few important points to note after activating the grid disks:

          a) ASM disk path and name
           * The grid disk name change is automatically reflected in the ASM disk path.
           * The ASM logical disk name still refers to the old name.
          b) Failgroup
           * The failgroup name is not changed and still uses the old name.
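To see which ASM disks are affected, the path can be compared with the logical name. This is a minimal sketch, assuming you have fetched (path, name, failgroup) rows from v$asm_disk and that the last path component is the grid disk name. `stale_asm_disks` is a hypothetical helper and the IP address in the sample is made up.

```python
def stale_asm_disks(rows):
    """Return (asm_name, griddisk_name, failgroup) for disks whose name lags the path."""
    stale = []
    for path, name, failgroup in rows:
        # Assumption: the path ends in "/<grid disk name>"
        griddisk = path.rsplit("/", 1)[-1].upper()
        if griddisk != name.upper():
            stale.append((name, griddisk, failgroup))
    return stale


# Hypothetical rows: the path already shows the new grid disk name.
rows = [("o/192.168.10.9/DATAC1_CD_00_celadm5", "DATAC1_CD_00_RU10", "RU10")]
print(stale_asm_disks(rows))
```

Every disk it returns still needs the drop and add-back described in step 7.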

    7) Change the ASM logical disk name and failgroup name.

        * This can be achieved by dropping the ASM disks and adding them back with the correct names. We observed that the failgroup name changes automatically when the
          ASM disks are added back with the correct names.
        * ASMCA is the best tool for dropping and adding back the ASM disks; use a rebalance power limit of 250 or higher.
       
        a) Drop the ASM disks, and observations.
            * Make sure the ASM disks can be dropped:
                 cellcli> list griddisk attributes name, ASMDeactivationOutcome, ASMModeStatus
                    ASMDeactivationOutcome should be Yes for all grid disks.
            * Drop the ASM disks using asmca or alter diskgroup.

            The ASM disk state changes to DROPPING and a rebalance operation starts.
       
            * ASM rebalance operation.
                The ongoing ASM rebalance operation can be monitored with the commands below, and the power can be raised to finish it faster.

                    sqlplus / as sysasm

                    sql> select * from v$asm_operation;
                    sql> alter diskgroup DATAC1 rebalance power 256;


            * Once the rebalance operation completes, the ASM disk state changes to NORMAL, the name becomes empty, and the failgroup also changes to the correct name.
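If you prefer alter diskgroup over asmca for the drop, the statement can be generated from the old ASM disk names. This is a minimal sketch; `drop_disk_statement` is a hypothetical helper, and the default power of 256 simply mirrors the value used above.

```python
def drop_disk_statement(diskgroup, asm_disk_names, power=256):
    """Build a single ALTER DISKGROUP ... DROP DISK statement for the given disks."""
    if not asm_disk_names:
        raise ValueError("at least one ASM disk name is required")
    disks = ", ".join(asm_disk_names)
    return f"ALTER DISKGROUP {diskgroup} DROP DISK {disks} REBALANCE POWER {power};"


# ASM disk names still carry the old suffix at this point.
print(drop_disk_statement("DATAC1", ["DATAC1_CD_00_RU10", "DATAC1_CD_01_RU10"]))
```

Run the generated statement from a SYSASM session and wait for the rebalance to finish before adding the disks back.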
           
        b) Add back the ASM disks, and observations.

            The ASM disks can also be added back using asmca or alter diskgroup commands.
            Make sure each disk is added back with the correct name; in this case, DATAC1_CD_00_RU10 should be added back as DATAC1_CD_00_arb02celadm19, where the suffix is the correct cell name.

            The ongoing ASM rebalance operation can be monitored with the commands below, and the power can be raised to finish it faster.

                    sqlplus / as sysasm
                    sql> select * from v$asm_operation;
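The add-back can be generated the same way. This is a minimal sketch, assuming the ASM disk path has the form `o/<cell IP>/<grid disk name>`. `add_disk_statement` is a hypothetical helper and the IP address is made up. No FAILGROUP clause is emitted, in line with the observation above that the failgroup name is set automatically when the disks are added back.

```python
def add_disk_statement(diskgroup, cell_ip, griddisk_names, power=256):
    """Build an ALTER DISKGROUP ... ADD DISK statement naming each disk explicitly."""
    if not griddisk_names:
        raise ValueError("at least one grid disk name is required")
    # Assumption: path is o/<cell_ip>/<grid disk>; ASM name is the upper-cased grid disk name
    clauses = [f"'o/{cell_ip}/{name}' NAME {name.upper()}" for name in griddisk_names]
    return (
        f"ALTER DISKGROUP {diskgroup} ADD DISK "
        + ", ".join(clauses)
        + f" REBALANCE POWER {power};"
    )


print(add_disk_statement("DATAC1", "192.168.10.9", ["DATAC1_CD_00_celadm5"]))
```

Check the path format against v$asm_disk on your own system before running the generated statement.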

    8) Remaining cells

    Repeat the same procedure on the remaining cells; the entire operation can be completed without any downtime at the database level.
    Once all cells are done, the voting disks will also appear relocated or renamed with the new names.

    References:

    Bug 25317550 : OEDA FAILS TO SET CELL NAME RESULTING IN GRID DISK NAMES NOT HAVING RIGHT SUFFIX

    I appreciate and thank my team member Khalid Kizhakkethil for doing this wonderful job and preparing the documentation.
