"recovery". Every DBMS handles this differently. Some are completely automated, some require manual intervention. Vertica handles this automatically. If you've handled k-safety properly then whenever a node comes back online (for whatever reason) Vertica will automatically recover it by copying data from other nodes. This really eliminates the need for restoring data due to hardware or software failure...you only really need to restore a db when you need to create a new, copied environment or when you've accidentally committed bad data.
Reviewing the Logs. Any good DBA should be reviewing the available DBMS logs periodically. You can see partial logs in the Message Center in MC. However, if you read the logs directly you can begin to think up methods to automate log management. The two most important logs are vertica.log and dbLog. To view these in Linux you use the tail command. tail -f <catalog_path>/vertica.log
and tail -f <cdb_path>/dbLog
. vertica.log will log almost any activity that the database manager performs. dbLog will list just those messages specific to the database. The former is almost unusably verbose (automation is your friend)...so the latter is what you really should be checking frequently. Authentication and Authorization. Vertica follows the DCL syntax of the ANSI standards and most DBMS vendors. You can always use MC to handle your ANA needs. Auditing. Auditing is very specific to every organization. Some have strict requirements, others don't care. Good DBAs, even if auditing is not a functional requirement, audit various events if for no other reason than trending. Vertica has rich auditing facilities...I simply know very little about them. Sorry. Performance Management. I'll cover this in the next blog post [[Basic Vertica Performance Management Tasks]]. Resolving connectivity issues. Vertica is similar to SQL Server in this regard. You must know the servername (clustername), database name, port, and user/password (if not using OS authentication). Vertica clients connect to a built-in load balancer via a VIP (virtual IP) that is shared across Vertica cluster nodes. This ensures that if a node fails clients can still work. This is similar to a SQL Server cluster. The IP Virtual Server (IPVS) is the load balancer in Vertica and it even allows adding and removing nodes on a running cluster without interrupting users. This type of load balancer is akin to "round robin" which means that any single node can be handling inordinately more activity just through dumb luck. Vertica also has "native connection load balancing" that is disabled by default but can actually ensure that CPU and memory overhead is honored across nodes too. The caveat is that you have to use Vertica's client libraries and it takes a bit more work to set up. License Management. HP licenses Vertica by the size of the raw (uncompressed) data in the tables. This doesn't include projection data, temp tables, the WOS (we'll cover that in a future post), or data not yet purged by TM (covered in the future too). If your db's size nears the licensed limit you'll see warnings in admintools and vsql. You need to then either upgrade your license or purge some data. The beauty is that Vertica never just shuts down. The most visible problem is that your queries will return a status code of SUCCESS_WITH_INFO instead of just SUCCESS. If you code for that scenario then you can save yourself some grief. You can run SELECT get_compliance_status();
to determine your licensing compliance. There are certainly other DBA tasks that could be added to this list, but this list comprises the absolute basics to be an entry-level Vertica DBA.
You have just read "[[Basic Vertica DBA Tasks]]" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.