How To Use Regex Negative Lookahead To Exclude Strings

Posted on Friday, May 31st, 2019 at 1:20:12pm

I have a task in Perl to list specific files based on a pattern match: those with and those without the string “_from_”.

There are two files in the directory to filter:

static-east.properties
static-east_from_west.properties

Capturing the files with the _from_ string was easy:
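For example, a one-liner against the two files above (a minimal sketch using a Perl filter from the shell):

shell> ls *.properties | perl -ne 'print if /_from_/'
static-east_from_west.properties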

Capturing the files WITHOUT the _from_ string was not quite so easy, until I learned about Negative Lookahead:
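A sketch of the inverse listing using a negative lookahead, again against the same two files:

shell> ls *.properties | perl -ne 'print if /^(?!.*_from_).*\.properties$/'
static-east.properties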

This is a great interactive regex testing site that I found (note: I have zero affiliation with them, I just think it is a great resource):
https://www.regextester.com/15

How To Force csshX To Tile In Columns

Posted on Tuesday, May 28th, 2019 at 12:30:12pm

I use csshX on a daily basis. The default layout on my screen for three nodes is one column with three rows, one per node.

This layout is excellent for visual positioning, but does make it harder to read long output.

For this purpose, I start csshX with the column quantity specifier -x, which tiles the windows in columns instead.
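For example, with three nodes and the -x column specifier mentioned above (host names are hypothetical):

shell> csshX -x 3 db1 db2 db3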

For more information, please consult the built-in man page.

The Important Role of a Tungsten Rollback Error

Posted on Friday, May 24th, 2019 at 5:01:34pm

The Question

Recently, a customer asked us:

What is the meaning of this error message found in trepsvc.log?

2019/05/14 01:48:04.973 | mysql02.prod.example.com | [east - binlog-to-q-0] INFO pipeline.SingleThreadStageTask Performing rollback of possible partial transaction: seqno=(unavailable)


Simple Overview

The Skinny

This message is an indication that we are dropping any uncommitted or incomplete data read from the MySQL binary logs due to a pending error.


The Answer

Safety First

This error is often seen before another error and is an indication that we are rolling back anything uncommitted, for safety. On a master this is normally very little and would likely be internal transactions in the trep_commit_seqno table, for example.

As you may know, the replicator always extracts complete transactions, so this particular message is specific to the reading of the MySQL binlog into the internal memory queue (binlog-to-q).

This queue then feeds into the q-to-thl pipeline. We only want to write a complete transaction into the THL, so anything not completed when a failure like this happens gets rolled back.
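To see these stages on a running extractor, a hedged example (the service name east comes from the log excerpt above):

shell> trepctl -service east status -name stages
# on a master, the output includes the binlog-to-q and q-to-thl stages discussed above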


Summary

The Wrap-Up

In this blog post we discussed the Tungsten Replicator rollback error.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about Tungsten clusters, please visit https://docs.continuent.com.

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

Understanding Cross-Site Replication in a Tungsten Composite Multi-Master Cluster for MySQL, MariaDB and Percona Server

Posted on Wednesday, May 22nd, 2019 at 2:42:10pm

Overview

The Skinny

In this blog post we will discuss how the managed cross-site replication streams work in a Composite Multi-Master Tungsten Cluster for MySQL, MariaDB and Percona Server.


Agenda

What’s Here?
  • Briefly explore how managed cross-site replication works in a Tungsten Composite Multi-Master Cluster
  • Describe the reasons why the default design was chosen
  • Explain the pros and cons of changing the configuration
  • Examine how to change the configuration of the managed cross-site replicators

Cross-Site Replication

A Very Brief Summary

In a standard Composite Multi-Master (CMM) deployment, the managed cross-site replicators pull Transaction History Logs (THL) from every remote cluster’s current master node.

The CMM functionality was introduced in Tungsten Clustering software version 6.0.0.


Cross-Site Replication: In-Depth

How Does It All Work?

Managed cross-site replicators run in addition to the base replication service for the local cluster. The additional replication services (one per remote site) run on all nodes, each with a relay and slaves. The relay runs on the cluster's current master node, which keeps the relay role aligned with the master role.

In a Composite Multi-Master cluster, each local cluster must pull data from the other remote sites. There is an additional replication service for each remote site, called a sub-service.

Each sub-service is named to match the remote site.

For example, assume a composite cluster with four sites: east, west, north and south.

On the east cluster, there would be the following three additional replication streams:
east_from_west
east_from_north
east_from_south

As you can see, each sub-service is named in a way that makes it easy to understand.

Reading sub-service names is also simple, and so for example “east_from_west” is stated as “I am in cluster east and pulling THL from cluster west”.
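On any node in the east cluster, a hedged way to see this at a glance:

shell> trepctl services
# lists the local east service plus the east_from_west, east_from_north and east_from_south sub-services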

For example, consider just two clusters within the “usa” composite service – east and west: the east cluster runs an east_from_west sub-service that pulls THL from west's current master, while west runs a west_from_east sub-service that pulls from east's current master.


Cross-Site Replication Architecture

Pros and Cons

The default architecture is designed so that the relay node for each cluster service gets the most recent information directly from the remote master, reducing the risk of data latency (staleness).

As of Tungsten Clustering software version 6.0.4, the cross-site replicators within a CMM deployment can be configured to point to slave nodes, and to prefer slave nodes over master nodes during operation.

This configuration lets the slave nodes absorb the load that the remote cross-site relays would otherwise place upon the master nodes. This becomes a concern when there are many sites, all of which are pulling THL from the same remote master nodes.

Of course, when using this option, one must accept that the downstream data may be delayed by the additional hop, and that the data replicated to the remote sites could (and probably would) be older than it would be using the standard topology.


Tuning Cross-Site Replication

How To Configure the Replication Streams to Meet Your Needs

To configure the cluster to prefer slave nodes over master nodes, use the --policy-relay-from-slave=true option to tpm.

Both master and slave nodes remain in the list of possible hosts, so if no slave nodes are available during a switch or failover event, then a master will be used.
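A hedged sketch of how this could look in an INI-based installation (the file location and placing the option under [defaults] are assumptions; adjust for your deployment), followed by a tpm update:

/etc/tungsten/tungsten.ini:
[defaults]
policy-relay-from-slave=true

shell> tpm update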


Summary

The Wrap-Up

In this blog post we discussed Tungsten Composite Multi-Master Cluster cross-site replication configuration.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about Tungsten clusters, please visit https://docs.continuent.com.

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

Performance Tuning Tungsten Replication to MySQL

Posted on Tuesday, May 21st, 2019 at 11:57:54am

The Question

Recently, a customer asked us:

Why would Tungsten Replicator be slow to apply to MySQL?


The Answer

Performance Tuning 101

When you run trepctl status on a slave and see a high appliedLatency like this:
appliedLatency : 7332.394
it is almost always because the target database is unable to keep up with the applier.

This means that we often need to look first to the database layer for the solution.

Here are some of the things to think about when dealing with this issue:

Architecture and Environment
  • Are you on bare metal?
  • Using the cloud?
  • Dev or Prod?
  • Network speed and latency?
  • Distance the data needs to travel?
  • Network round trip times? Is the replicator applying to a database installed on the same server, or is it applying over the network to a remote server?

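One quick, hedged way to measure the database round trip from the replicator host (the remote host name below is illustrative; assumes the mysql client is installed):

shell> time mysql -h db-remote.example.com -e "select 1"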

Observe the value for real – if it is 15ms or more chances are you will see slow apply rates.

MySQL Binary Logging
  • What binary logging format are you using?
mysql> select @@global.binlog_format;

  • For non-Multi-Master deployments, use MIXED
  • For Multi-Master topologies, use ROW
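For example, to change the format on a running server (a hedged sketch; SET GLOBAL only affects new sessions, and the setting should also be placed in the [mysqld] section of my.cnf to survive a restart):

shell> mysql -e "SET GLOBAL binlog_format='ROW'"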

MySQL Tables
  • Verify that all tables are InnoDB (see the query sketch after this list)
  • Also make sure all tables have a Primary Key (also covered in the sketch below)
  • Do the tables have proper indexes?
  • Use the slow query log to identify whether any tungsten-owned queries are taking a long time
  • The MySQL EXPLAIN command is very useful in understanding slow queries:
https://dev.mysql.com/doc/refman/5.7/en/using-explain.html
https://dev.mysql.com/doc/refman/5.7/en/explain-output.html
https://dev.mysql.com/doc/refman/5.7/en/explain-extended.html
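A hedged sketch of the storage-engine and primary-key checks mentioned above, using information_schema (the excluded schema list is illustrative):

mysql> SELECT table_schema, table_name, engine FROM information_schema.tables WHERE table_type = 'BASE TABLE' AND engine <> 'InnoDB' AND table_schema NOT IN ('mysql','information_schema','performance_schema','sys');
mysql> SELECT t.table_schema, t.table_name FROM information_schema.tables t LEFT JOIN information_schema.table_constraints c ON c.table_schema = t.table_schema AND c.table_name = t.table_name AND c.constraint_type = 'PRIMARY KEY' WHERE t.table_type = 'BASE TABLE' AND c.constraint_type IS NULL AND t.table_schema NOT IN ('mysql','information_schema','performance_schema','sys');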

MySQL Locks
  • MySQL locks can prevent queries from completing in a timely manner. Check for queries that are holding locks open:

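A couple of standard MySQL commands for spotting lock waits and the sessions holding them (a minimal sketch; requires the PROCESS privilege):

mysql> SHOW FULL PROCESSLIST;
mysql> SHOW ENGINE INNODB STATUS\G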

OS Memory
  • Is the database configured to use enough memory?
  • Check for lack of server memory

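A hedged sketch of quick checks at the OS and MySQL level (the InnoDB buffer pool is usually the biggest single consumer):

shell> free -m
mysql> SELECT @@innodb_buffer_pool_size/1024/1024 AS buffer_pool_mb;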

Physical Disk
  • Check for disk i/o contention – this is often the real issue, especially with remote disk:
shell> iostat -xpne 2
  • Add SSD storage into your production systems
  • Split filesystems up
  • Implement multi-volume striping for improved i/o speed
  • Make sure there are enough IOPS if using cloud instances


Summary

The Wrap-Up

In this blog post we discussed Tungsten Replicator applier performance tuning.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about Tungsten clusters, please visit https://docs.continuent.com.

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

How To Fix rsync Packet Corrupt Errors

Posted on Friday, May 10th, 2019 at 10:10:42am

I was getting rsync “packet corrupt” errors from one source host.

I finally tracked it down to the application “Little Snitch”.

I needed to fully uninstall it and reboot to solve the problem.

SSH Differences Between Staging and INI Configuration Methods

Posted on Tuesday, May 7th, 2019 at 3:54:45pm

The Question

Recently, a customer asked us:

If we move to using the INI configuration method instead of staging, would password-less SSH still be required?


The Answer

The answer is both “Yes” and “No”.

No, for installation and updates/upgrades specifically. Since INI-based configurations force the tpm command to act upon the local host only for installs and updates/upgrades, password-less SSH is not required.

Yes, because there are certain commands that do rely upon password-less SSH to function. These are:

  • tungsten_provision_slave
  • prov-sl.sh
  • multi_trepctl
  • tpm diag (pre-6.0.5)
  • tpm diag --hosts (>= 6.0.5)
  • Any tpm-based backup and restore operations that involve a remote node
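A quick, hedged way to confirm password-less SSH between nodes before relying on these tools (db2 is a hypothetical host name; BatchMode makes ssh fail rather than prompt for a password):

shell> ssh -o BatchMode=yes db2 hostname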

Summary

The Wrap-Up

In this blog post we discussed the SSH differences between the Staging and INI configuration methods.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about Tungsten clusters, please visit https://docs.continuent.com.

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

How to Integrate Tungsten Clustering Monitoring Tools with PagerDuty Alerts

Posted on Tuesday, April 23rd, 2019 at 4:07:42pm

Overview

The Skinny

In this blog post we will discuss how to best integrate various Continuent-bundled cluster monitoring solutions with PagerDuty (pagerduty.com), a popular alerting service.

Agenda

What’s Here?
  • Briefly explore the bundled cluster monitoring tools
  • Describe the procedure for establishing alerting via PagerDuty
  • Examine some of the multiple monitoring tools included with the Continuent Tungsten Clustering software, and provide examples of how to send an email to PagerDuty from each of the tools.

Exploring the Bundled Cluster Monitoring Tools

A Brief Summary

Continuent provides multiple methods out of the box to monitor the cluster health. The most popular is the suite of Nagios/NRPE scripts (i.e. cluster-home/bin/check_tungsten_*). We also have Zabbix scripts (i.e. cluster-home/bin/zabbix_tungsten_*). Additionally, there is a standalone script available, tungsten_monitor, based upon the shared Ruby-based tpm libraries. We also include a very old shell script called check_tungsten.sh, but it is obsolete.


Implementing a Simple PagerDuty Alert

How To Add a PagerDuty Email Endpoint for Alerting
  • Create a new user to get the alerts:
    Configuration -> Users -> Click on the [+ Add Users] button
    • Enter the desired email address and invite. Be sure to respond to the invitation before proceeding.
  • Create a new escalation policy:
    Configuration -> Escalation Policies -> Click on the [+ New Escalation Policy] button
    • Enter the policy name at the top, e.g. Continuent Alert Escalation Policy
    • “Notify the following users or schedules” – click in the box and select the new user created in the first step
    • “escalates after” – set to 1 minute, or your desired value
    • “If no one acknowledges, repeat this policy X times” – set to 1 time, or your desired value
    • Finally, click on the green [Save] button at the bottom
  • Create a new service:
    Configuration -> Services -> Click on the [+ New Service] button
    • General Settings: Name – Enter the service name, i.e. Continuent Alert Emails from Monitoring (what you type in this box will automatically populate the
    • Integration Settings: Integration Type – Click on the second radio choice “Integrate via email”
    • Integration Settings: Integration Name – Email (automatically set for you, no action needed here)
    • Integration Settings: Integration Email – Adjust this email address, e.g. alerts, then copy this email address into a notepad for use later
    • Incident Settings: Escalation Policy – Select the Escalation Policy you created in the third step, i.e. “Continuent Alert Escalation Policy”
    • Incident Settings: Incident Timeouts – Check the box in front of Auto-resolution
    • Finally, click on the green [Add Service] button at the bottom

At this point, you should have an email address like “alerts@yourCompany.pagerduty.com” available for testing.

Go ahead and send a test email to that email address to make sure the alerting is working.

If the test works, you have successfully setup a PagerDuty email endpoint to use for alerting, congratulations!


How to Send Alerts to PagerDuty using the tungsten_monitor Script

Invoking the Bundled Script via cron

The tungsten_monitor script provides a mechanism for monitoring the cluster state when monitoring tools like Nagios aren’t available.

Each time the tungsten_monitor runs, it will execute a standard set of checks:

  • Check that all Tungsten services for this host are running
  • Check that all replication services and datasources are ONLINE
  • Check that replication latency does not exceed a specified amount
  • Check that the local connector is responsive
  • Check disk usage

Additional checks may be enabled using various command line options.

The tungsten_monitor is able to send you an email when problems are found.

It is suggested that you run the script as root so it is able to use the mail program without warnings.

Alerts are cached to prevent them from being sent multiple times and flooding your inbox. You may pass --reset to clear out the cache or --lock-timeout to adjust the amount of time this cache is kept. The default is 3 hours.

An example root crontab entry to run tungsten_monitor every five minutes:

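A hedged example, assuming the software is installed under /opt/continuent (adjust the path for your installation):

*/5 * * * * /opt/continuent/tungsten/cluster-home/bin/tungsten_monitor >/dev/null 2>&1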

An alternate example root crontab entry to run tungsten_monitor every five minutes in case your version of cron does not support the new syntax:

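Again assuming an /opt/continuent install, the same schedule written out long-hand:

0,5,10,15,20,25,30,35,40,45,50,55 * * * * /opt/continuent/tungsten/cluster-home/bin/tungsten_monitor >/dev/null 2>&1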

All messages will be sent to /opt/continuent/share/tungsten_monitor/lastrun.log

The online documentation is here:
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-tungsten_monitor.html


Big Brother is Watching You!

The Power of Nagios and the check_tungsten_* scripts

We have two very descriptive blog posts about how to implement the Nagios-based cluster monitoring solution:

Global Multimaster Cluster Monitoring Using Nagios and NRPE

Essential Cluster Monitoring Using Nagios and NRPE

We also have Nagios-specific documentation to assist with configuration:
http://docs.continuent.com/tungsten-clustering-6.0/ecosystem-nagios.html

In the event you are unable to get Nagios working with Tungsten Clustering, please open a support case via our ZenDesk-based support portal https://continuent.zendesk.com/
For more information about getting support, visit https://docs.continuent.com/support-process/troubleshooting-support.html

There are many available NRPE-based check scripts, and the online documentation for each is listed below:
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-tungsten_health_check.html
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-check_tungsten_services.html
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-check_tungsten_progress.html
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-check_tungsten_policy.html
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-check_tungsten_online.html
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-check_tungsten_latency.html

Big Brother Tells You

Tell the Nagios server how to contact PagerDuty

The key is to have a contact defined for the PagerDuty-specific email address, which is done in the Nagios configuration file /opt/local/etc/nagios/objects/contacts.cfg:

objects/contacts.cfg

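A hedged sketch of such a contact definition using the PagerDuty integration email created earlier (the contact name, time periods and notification commands are assumptions based on a stock Nagios install):

define contact {
    contact_name                    pagerduty
    alias                           PagerDuty Email Endpoint
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,r
    service_notification_commands   notify-service-by-email
    host_notification_commands      notify-host-by-email
    email                           alerts@yourCompany.pagerduty.com
}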

Teach the Targets

Tell NRPE on the Database Nodes What To Do

The NRPE commands are defined in the /etc/nagios/nrpe.cfg file on each monitored database node:

/etc/nagios/nrpe.cfg

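A hedged sketch of the command definitions (paths, thresholds, option flags and the sudo usage are assumptions based on a default /opt/continuent install and the wildcard sudo rule described below):

command[check_tungsten_online]=/usr/bin/sudo -u tungsten /opt/continuent/tungsten/cluster-home/bin/check_tungsten_online
command[check_tungsten_latency]=/usr/bin/sudo -u tungsten /opt/continuent/tungsten/cluster-home/bin/check_tungsten_latency -w 2.5 -c 4.0
command[check_tungsten_progress]=/usr/bin/sudo -u tungsten /opt/continuent/tungsten/cluster-home/bin/check_tungsten_progress -t 5
command[check_tungsten_services]=/usr/bin/sudo -u tungsten /opt/continuent/tungsten/cluster-home/bin/check_tungsten_services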

Note that sudo is used to give the nrpe user access to the tungsten-owned check scripts as the tungsten user, via a sudo wildcard rule.
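A hedged sketch of such a wildcard rule, e.g. placed in /etc/sudoers.d/nrpe (the user names and path are assumptions):

nrpe ALL=(tungsten) NOPASSWD: /opt/continuent/tungsten/cluster-home/bin/check_tungsten_*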

Additionally, there is no harm in defining commands that may not be called, which allows for simple administration – keep the master copy in one place, push updates to all nodes as needed, then restart NRPE.

Big Brother Sees You

Tell the Nagios server to begin watching

Here are the service check definitions for the /opt/local/etc/nagios/objects/services.cfg file:

objects/services.cfg

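A hedged sketch of one such service definition (the host names, the generic-service template and the check_nrpe command are assumptions based on a stock Nagios/NRPE setup):

define service {
    use                     generic-service
    host_name               db1,db2,db3
    service_description     Tungsten Online
    check_command           check_nrpe!check_tungsten_online
    contacts                pagerduty
}

Repeat the definition, changing service_description and the check_nrpe argument, for each of the check_tungsten_* commands defined in nrpe.cfg.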


Summary

The Wrap-Up

In this blog post we discussed how to best integrate various cluster monitoring solutions with PagerDuty (pagerduty.com), a popular alerting service.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about monitoring Tungsten clusters, please visit https://docs.continuent.com/tungsten-clustering-6.0/ecosystem-nagios.html.

Below is a list of the Nagios NRPE plugin scripts provided by Tungsten Clustering. Click on each to be taken to the associated documentation page.

  • check_tungsten_latency – reports warning or critical status based on the replication latency levels provided.
  • check_tungsten_online – checks whether all the hosts in a given service are online and running. This command only needs to be run on one node within the service; the command returns the status for all nodes. The service name may be specified by using the -s SVCNAME option.
  • check_tungsten_policy – checks whether the policy is in AUTOMATIC mode and returns a CRITICAL if not.
  • check_tungsten_progress – executes a heartbeat operation and validates that the sequence number has incremented within a specific time period. The default is one (1) second, and may be changed using the -t SECS option.
  • check_tungsten_services – confirms that the services and processes are running; their state is not confirmed. To check state with a similar interface, use the check_tungsten_online command.

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

How To Fix YUM Update Errors with Percona GPG Keys

Posted on Tuesday, April 16th, 2019 at 9:12:13am

Problem: Running yum -y update as root aborts with a Percona GPG key error.

The solution, found on the Percona website, is this:

sudo yum update percona-release
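Once the key package has been refreshed, re-run the original update:

sudo yum -y update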

Source URL: https://www.percona.com/blog/2019/02/05/new-percona-package-signing-key-requires-update-on-rhel-and-centos/

Spring Snow Today

Posted on Friday, March 22nd, 2019 at 8:51:05am