Solving Communication problems between DB and ASM instances

    Nov 7, 2017 5:29:00 PM by Deiby Gomez

    Introduction

    Most of the time I write how-to articles or I am introducing a new feature of Oracle Database. Those articles contain new information that’s good to know and help people fix issues or to use a function/feature, but in this time I am writing about situations I have had. It’s good for readers and beginners to know those little details around how an issue was fixed, what daily work is like for another DBA, or just to read a funny story. In this article I will tell a story regarding a problem a customer had a long time ago; the root cause is not frequent (I hope!) but if we don’t understand the relevant concepts we could spend several hours trying to find out a root cause that could be easy to identify when the our concepts are solid.

    Infrequent, but it can happen

    A long time ago I received a call from a customer saying that there were some errors in the database instance. Well, interestingly the databases were executing DMLs properly without any issue. I asked the customer if these errors appeared only with one specific operation like an Insert, or like a CREATE <something>, etc.; and he said that he was running a script received from the application team to create several tablespaces with its datafiles.  When he was running the script he was receiving the following errors:

    ORA-01119: error in creating database file '+DATA'
    ORA-17502: ksfdcre:4 Failed to create file +DATA
    ORA-27300: OS system dependent operation:open failed with status: 2
    ORA-27301: OS failure message: No such file or directory
    ORA-27302: failure occurred at: sskgmsmr_7

    First, you can see that the set of errors says that there is a directory or file that doesn’t exist in the OS; on the other hand, it points to the ASM disk group, which in this case is “+DATA”. So this is confusing, because either the file that the database is looking for is in ASM or it is in the OS.  I did a quick check of the database instance and it was OK. There were no errors in the alert log, all the disks were healthy. On the database side, however, there seemed to be some issues, specifically with the sentences “CREATE TABLESPACE” which the customer had in the script provided by the application team.

    So, the clues were:

    • No issues with the ASM Instance
    • DMLs were being executed successfully in the database instance.
    • CREATE TABLESPACE statements fail in the database instance.
    • ASM and OS are both involved in a “file” or “directory” that doesn’t exist. 

    With these four clues to go by, you should be on the right track if your concepts are solid. The root cause you would be thinking about would involve the file that the database instance uses to communicate with the ASM instance This file is named "ab_<ASM SID>.dat" and it is located in the $ORACLE_HOME/dbs. You need to know about the existence of this file and what its function is.This file rarely has issues, or rarely causes problems…but sometimes it happens,

    Let’s define this file:

    What is the "ab_<ASM SID>.dat" file? This file is used by the database instance to message an ASM instance. When the database instance needs to send a message to the ASM instance, the database instance reads this file in order to find out the information required for getting connected to the ASM instance. This file is in $ORACLE_HOME/dbs. If this file doesn't exist the database will not be able to connect to the ASM instance and you will receive an error. This file is important because it is involved in the database instance work.

    Some time ago I wrote an article with several tests of where this file is required to execute some sentences in the database and in which sentences the file is not required. You can read the details here.

    The conclusion of that earlier article indicates:

    • Tablespace creation – required
    • Datafile creation – required
    • Table creation – not required
    • DML operations – not required
    • Drop tablespace – not required
    • Delete datafile – not required
    • Startup database instance – required
    • Shutdown database instance – not required

    Well, taking that into account, to solve this customer’s issue, I listed all the files in $ORACLE_HOME/dbs and the root cause was confirmed. The file "ab_<ASM SID>.dat" did not exist in the directory. I asked the customer if he had moved the file somewhere else or if he’d deleted it and he said that the day before the junior DBA was “cleaning” logs and traces that were using space and that could be deleted. I think that one of those files that “could be deleted” was "ab_<ASM SID>.dat". As I said before, this situation happens rarely. Solving the problem is not a big deal; what we have to do is reboot the ASM instance, but in order to do that we have to reboot the database instance as well.  After rebooting the ASM instance the file was recreated and the database was able to use it. The script that the customer had was executed successfully and all the CREATE TABLESPACE operations were success.

     

    Conclusion

    Sometimes there are issues whose root cause is very rare, and in order to determine it quickly we have to have all our concepts solid; otherwise, we might spend several hours trying to figure out what’s going on, reading notes and so on.

    In this case, it was very important to identify the clues. We had four clues here which pointed us to the right root cause.  Sometimes the customer is stressed and under pressure and wants us to fix the problem fast, but DBAs have to stay calm, we have to extract the clues (symptoms), to think about the root cause,  to create an hypothesis and work to prove it. To shorten diagnostic time make sure you’re on solid ground conceptually, which you can do by practicing various scenarios while you are getting prepared for a certification.

    Tags: Oracle

    Deiby Gomez

    Written by Deiby Gomez

    Deiby Gómez is the youngest Oracle ACE and Oracle ACE Director in the world and the first Guatemalan with these awards. Deiby is the youngest Latin American with the highest certifications “Oracle Certified Master 11g” and “Oracle Certified Master 12c”. He received In Vegas, United States the "SELECT Journal Editor’s Choice Award 2016", he became the first Guatemalan with that award. He is a frequent speaker in several Oracle Events around the world like “Technology Network Latin American Tour 2013, 2014, 2015 and 2016 in several countries like Guatemala, Costa Rica, Nicaragua, El Salvador, Uruguay, Argentina, Mexico, Brazil, Ecuador, Colombia, Peru; Collaborate in Vegas, USA; Latin American Oracle Open World in Brazil (2015 and 2016) and Oracle Open World in San Francisco, USA (2015 and 2016). He is the first Guatemalan who was accepted by Oracle Corporation as “Beta Tester” for the version “12cR2” in 2015. He is the official Technical Reviewer of the Book “Oracle Database 12c Release 2 Multitenant (1st Edition, McGraw-Hill)" and Co-Author of the book "Oracle Database 12c Release 2 Testing Tools and Techniques for Performance and Scalability", both can be found in Amazon. He is the first Guatemala who appeared as an outstanding expert in the official magazine of Oracle Corporation called “Oracle Magazine” in the Edition November/December in 2014, this magazine is delivered around the world in several countries. He has published several articles in the Official Website of Oracle (www.oracle.com) in Portuguese, Spanish and English. Currently he is President of Guatemalan Oracle Users Group (GOUG). Director of Support Quality en Latin American Oracle Users Group Community (LAOUC) for 2016-2017, founder of a very well known group in Oracle Community called “Oraworld Team” that has members from India, Guatemala, Brazil, France and Switzerland having in total 4 Oracle ACE Directors, 3 Oracle ACE, 5 Oracle Certified Masters (OCM) and 1 PhD. Deiby was part of Oracle ACE Hackaton in Amsterdam, Netherlands in April 2016 where he was building several solutions using Oracle Cloud Products. Currently Deiby Gómez is CEO in Nuvola Consulting Group, a company that provides excellence on Support and Consulting services with Oracle Technology like Databases, Middleware, Cloud and Engineered systems. Deiby is well known in the community because of his resilience, entrepreneurship and his availability to help and share his knowledge. Deiby loves to travel, to play chess and to enjoy a good cup of coffee with friends. Oracle ACE (at the age of 23) Oracle Certified Master 11g (at the age of 24) Oracle ACE Director (at the age of 25) Oracle Certified Master 12c (at the age 26) SELECT Journal Editor’s Choice Award 2016 Speaker in several Universities, OTN Tour 2013,2014,2015. Collaborate15, LA Oracle Open World. Technical Reviewer of the Book "Oracle Database 12c Release 2 Multitenant (Oracle Press) 1st Edition" Blogger. Oracle Certifications: Oracle Linux Certified Implementation Specialist. Oracle Database 11g Administrator Certified Professional. Oracle Database 11g Administrator Certified Master (OCM 11g) Oracle Database 12c Administrator Certified Master (OCM 12c) Oracle Database 12c Maximum Availability Architecture Certified Master (MAA OCM 12c) Oracle Database 12c Maximum Availability Architecture Certified Expert (MAA OCE 12c) Oracle Database 12c Administrator Certified Professional. Oracle Service Oriented Architecture Infrastructure Implementation Certified Expert. Oracle Exadata Database Machine Administrator. Oracle RAC 11g and Grid Infraestructure Administrator Oracle RAC 12c and Grid Infraestructure Administrator Oracle Real Application Clusters 12c Certified Implementation Specialist Oracle Database 12c: Data Guard Administrator