Performance Tuning Tungsten Replication to MySQL

Posted on Tuesday, May 21st, 2019 at 11:57:54am

The Question

Recently, a customer asked us:

Why would Tungsten Replicator be slow to apply to MySQL?


The Answer

Performance Tuning 101

When you run trepctl status on a slave and see a large value like:
appliedLatency : 7332.394
it is almost always due to the inability of the target database to keep up with the applier.
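For scripting or alerting, that latency figure can be pulled out of trepctl output with standard shell tools. The snippet below is a sketch that runs against captured sample output; the 60-second threshold is an arbitrary illustrative value, not a Tungsten default:

```shell
# Pull the appliedLatency value out of captured `trepctl status` output and
# warn when it exceeds a threshold (60 seconds here is an arbitrary example).
trepctl_output='appliedLastSeqno : 4212
appliedLatency  : 7332.394
state           : ONLINE'

latency=$(printf '%s\n' "$trepctl_output" | awk -F' *: *' '/appliedLatency/ {print $2}')
if awk -v l="$latency" 'BEGIN { exit !(l > 60) }'; then
  echo "WARNING: applier is ${latency}s behind"   # → WARNING: applier is 7332.394s behind
else
  echo "OK: latency ${latency}s"
fi
```

In a live deployment you would replace the captured sample with `trepctl status` itself.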

This means that we often need to look first to the database layer for the solution.

Here are some of the things to think about when dealing with this issue:

Architecture and Environment
 Are you on bare metal?
 Using the cloud?
 Dev or Prod?
 Network speed and latency?
 Distance the data needs to travel?
 Network round trip times? Is the replicator applying to a database installed on the same server or is it applying over the network to a remote server?

shell> time mysql -e "select 1"
...
real    0m0.004s
user    0m0.003s
sys     0m0.000s

Observe the value for real: if it is 15ms or more, chances are you will see slow apply rates.

MySQL Binary Logging
 What binary logging format are you using?
mysql> select @@global.binlog_format;

  • For non-Multi-Master deployments, use MIXED
  • For Multi-Master topologies, use ROW
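In the MySQL configuration file this corresponds to a fragment like the one below (a sketch; the setting can also be changed at runtime with SET GLOBAL binlog_format, and a server restart applies the config-file value):

```ini
# /etc/my.cnf (illustrative fragment)
[mysqld]
binlog_format = ROW   # use MIXED for non-Multi-Master deployments
```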

MySQL Tables
 Verify that all tables are InnoDB.
 Also make sure all tables have a Primary Key.
 Do the tables have proper indexes?
 Use the slow query log to identify whether any Tungsten-owned queries are taking a long time.
 The MySQL EXPLAIN command is very useful in understanding slow queries:
https://dev.mysql.com/doc/refman/5.7/en/using-explain.html
https://dev.mysql.com/doc/refman/5.7/en/explain-output.html
https://dev.mysql.com/doc/refman/5.7/en/explain-extended.html
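The InnoDB and primary-key checks above can be automated with a single query against information_schema. This is a sketch that excludes the system schemas; run it on the target server:

```sql
-- Tables that are not InnoDB or lack a primary key: both can slow row-based apply.
SELECT t.table_schema, t.table_name, t.engine
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
  AND (t.engine <> 'InnoDB' OR c.constraint_name IS NULL);
```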

MySQL Locks
 MySQL locks can prevent queries from completing in a timely manner. Check for queries that are holding locks open:

mysql> show full processlist;
mysql> show open tables where in_use <> 0;
mysql> show engine innodb status;

OS Memory
 Is the database configured to use enough memory?
 Check for lack of server memory

shell> free -m
shell> top

Physical Disk
 Check for disk i/o contention – this is often the real issue, especially with remote disk
shell> iostat -xpne 2
 Add SSD storage into your production systems
 Split filesystems up
 Implement multi-volume striping for improved i/o speed
 Make sure there are enough IOPS if using cloud instances
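To spot a saturated device at a glance, the await column of iostat output can be filtered with awk. The snippet below runs against captured sample output; note that column positions vary between sysstat versions, so the field numbers ($1=device, $10=await) and the 20ms threshold are assumptions to adjust for your system:

```shell
# Flag devices whose await exceeds 20ms in captured `iostat -x` output.
# Field positions ($1=device, $10=await here) vary across sysstat versions.
iostat_sample='Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm await aqu-sz
sda 10.0 5.0 120.0 80.0 0.0 0.0 0.0 0.0 45.3 0.9
nvme0n1 200.0 90.0 900.0 400.0 0.0 0.0 0.0 0.0 1.2 0.1'

printf '%s\n' "$iostat_sample" | awk 'NR > 1 && $10 > 20 {
  printf "WARNING: %s await=%sms\n", $1, $10
}'
# → WARNING: sda await=45.3ms
```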


Summary

The Wrap-Up

In this blog post we discussed Tungsten Replicator applier performance tuning.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about monitoring Tungsten clusters, please visit https://docs.continuent.com.

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

How To Fix rsync Packet Corrupt Errors

Posted on Friday, May 10th, 2019 at 10:10:42am

I was getting rsync errors from one source host:

Corrupted MAC on input.
Disconnecting: Packet corrupt

Finally tracked it down to the application “Little Snitch”.

Needed to fully uninstall it and reboot to get the problem solved.

SSH Differences Between Staging and INI Configuration Methods

Posted on Tuesday, May 7th, 2019 at 3:54:45pm

The Question

Recently, a customer asked us:

If we move to using the INI configuration method instead of staging, would password-less SSH still be required?


The Answer

The answer is both “Yes” and “No”.

No, for installation and updates/upgrades specifically. Since INI-based configurations force the tpm command to act upon the local host only for installs and updates/upgrades, password-less SSH is not required.

Yes, because there are certain commands that do rely upon password-less SSH to function. These are:

  • tungsten_provision_slave
  • prov-sl.sh
  • multi_trepctl
  • tpm diag (pre-6.0.5)
  • tpm diag --hosts (>= 6.0.5)
  • Any tpm-based backup and restore operations that involve a remote node

Summary

The Wrap-Up

In this blog post we discussed the SSH differences between the Staging and INI configuration methods.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about monitoring Tungsten clusters, please visit https://docs.continuent.com.


How to Integrate Tungsten Clustering Monitoring Tools with PagerDuty Alerts

Posted on Tuesday, April 23rd, 2019 at 4:07:42pm

Overview

The Skinny

In this blog post we will discuss how to best integrate various Continuent-bundled cluster monitoring solutions with PagerDuty (pagerduty.com), a popular alerting service.

Agenda

What’s Here?
  • Briefly explore the bundled cluster monitoring tools
  • Describe the procedure for establishing alerting via PagerDuty
  • Examine some of the multiple monitoring tools included with the Continuent Tungsten Clustering software, and provide examples of how to send an email to PagerDuty from each of the tools.

Exploring the Bundled Cluster Monitoring Tools

A Brief Summary

Continuent provides multiple methods out of the box to monitor the cluster health. The most popular is the suite of Nagios/NRPE scripts (i.e. cluster-home/bin/check_tungsten_*). We also have Zabbix scripts (i.e. cluster-home/bin/zabbix_tungsten_*). Additionally, there is a standalone script available, tungsten_monitor, based upon the shared Ruby-based tpm libraries. We also include a very old shell script called check_tungsten.sh, but it is obsolete.


Implementing a Simple PagerDuty Alert

How To Add a PagerDuty Email Endpoint for Alerting
  • Create a new user to get the alerts:
    Configuration -> Users -> Click on the [+ Add Users] button
    • Enter the desired email address and invite. Be sure to respond to the invitation before proceeding.
  • Create a new escalation policy:
    Configuration -> Escalation Policies -> Click on the [+ New Escalation Policy] button
    • Enter the policy name at the top, e.g. Continuent Alert Escalation Policy
    • “Notify the following users or schedules” – click in the box and select the new user created in the first step
    • “escalates after” Set to 1 minute, or your desired value
    • “If no one acknowledges, repeat this policy X times” – set to 1 time, or your desired value
    • Finally, click on the green [Save] button at the bottom
  • Create a new service:
    Configuration -> Services -> Click on the [+ New Service] button
    • General Settings: Name – Enter the service name, e.g. Continuent Alert Emails from Monitoring (what you type in this box will automatically populate the integration email address)
    • Integration Settings: Integration Type – Click on the second radio choice “Integrate via email”
    • Integration Settings: Integration Name – Email (automatically set for you, no action needed here)
    • Integration Settings: Integration Email – Adjust the local part of this email address, e.g. alerts, then copy the full address into a notepad for use later
    • Incident Settings: Escalation Policy – Select the Escalation Policy you created in the third step, i.e. “Continuent Alert Escalation Policy”
    • Incident Settings: Incident Timeouts – Check the box in front of Auto-resolution
    • Finally, click on the green [Add Service] button at the bottom

At this point, you should have an email address like “alerts@yourCompany.pagerduty.com” available for testing.

Go ahead and send a test email to that email address to make sure the alerting is working.

If the test works, you have successfully set up a PagerDuty email endpoint to use for alerting. Congratulations!


How to Send Alerts to PagerDuty using the tungsten_monitor Script

Invoking the Bundled Script via cron

The tungsten_monitor script provides a mechanism for monitoring the cluster state when monitoring tools like Nagios aren’t available.

Each time the tungsten_monitor runs, it will execute a standard set of checks:

  • Check that all Tungsten services for this host are running
  • Check that all replication services and datasources are ONLINE
  • Check that replication latency does not exceed a specified amount
  • Check that the local connector is responsive
  • Check disk usage

Additional checks may be enabled using various command line options.

The tungsten_monitor is able to send you an email when problems are found.

It is suggested that you run the script as root so it is able to use the mail program without warnings.

Alerts are cached to prevent them from being sent multiple times and flooding your inbox. You may pass --reset to clear out the cache or --lock-timeout to adjust how long the cache is kept. The default is 3 hours.

An example root crontab entry to run tungsten_monitor every five minutes:

*/5 * * * * /opt/continuent/tungsten/cluster-home/bin/tungsten_monitor --from=you@yourCompany.com --to=alerts@yourCompany.pagerduty.com >/dev/null 2>/dev/null

An alternate example root crontab entry to run tungsten_monitor every five minutes in case your version of cron does not support the new syntax:

0,5,10,15,20,25,30,35,40,45,50,55 * * * * /opt/continuent/tungsten/cluster-home/bin/tungsten_monitor --from=you@yourCompany.com --to=alerts@yourCompany.pagerduty.com >/dev/null 2>/dev/null

All messages will be sent to /opt/continuent/share/tungsten_monitor/lastrun.log

The online documentation is here:
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-tungsten_monitor.html


Big Brother is Watching You!

The Power of Nagios and the check_tungsten_* scripts

We have two very descriptive blog posts about how to implement the Nagios-based cluster monitoring solution:

Global Multimaster Cluster Monitoring Using Nagios and NRPE

Essential Cluster Monitoring Using Nagios and NRPE

We also have Nagios-specific documentation to assist with configuration:
http://docs.continuent.com/tungsten-clustering-6.0/ecosystem-nagios.html

In the event you are unable to get Nagios working with Tungsten Clustering, please open a support case via our ZenDesk-based support portal https://continuent.zendesk.com/
For more information about getting support, visit https://docs.continuent.com/support-process/troubleshooting-support.html

There are many available NRPE-based check scripts, and the online documentation for each is listed below:
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-tungsten_health_check.html
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-check_tungsten_services.html
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-check_tungsten_progress.html
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-check_tungsten_policy.html
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-check_tungsten_online.html
http://docs.continuent.com/tungsten-clustering-6.0/cmdline-tools-check_tungsten_latency.html

Big Brother Tells You

Tell the Nagios server how to contact PagerDuty

The key is to have a contact defined for the PagerDuty-specific email address, which is handled in the Nagios configuration file /opt/local/etc/nagios/objects/contacts.cfg:

objects/contacts.cfg

define contact{
        use                      generic-contact
        contact_name             pagerduty
        alias                    PagerDuty Alerting Service Endpoint
        email                    alerts@yourCompany.pagerduty.com
}

define contactgroup{
        contactgroup_name       admin
        alias                   PagerDuty Alerts
        members                 pagerduty,anotherContactIfDesired,etc
}

Teach the Targets

Tell NRPE on the Database Nodes What To Do

The NRPE commands are defined in the /etc/nagios/nrpe.cfg file on each monitored database node:

/etc/nagios/nrpe.cfg

command[check_tungsten_online]=/usr/bin/sudo -u tungsten /opt/continuent/tungsten/cluster-home/bin/check_tungsten_online
command[check_tungsten_latency]=/usr/bin/sudo -u tungsten /opt/continuent/tungsten/cluster-home/bin/check_tungsten_latency -w 2.5 -c 4.0 
command[check_tungsten_progress_alpha]=/usr/bin/sudo -u tungsten /opt/continuent/tungsten/cluster-home/bin/check_tungsten_progress  -t 5 -s alpha
command[check_tungsten_progress_beta]=/usr/bin/sudo -u tungsten /opt/continuent/tungsten/cluster-home/bin/check_tungsten_progress  -t 5 -s beta
command[check_tungsten_progress_gamma]=/usr/bin/sudo -u tungsten /opt/continuent/tungsten/cluster-home/bin/check_tungsten_progress  -t 5 -s gamma

Note that sudo is used to give the nrpe user access to the tungsten-owned check scripts as the tungsten user, via a sudo wildcard configuration.

Additionally, there is no harm in defining commands that may never be called. This allows for simple administration: keep the master copy in one place, push updates to all nodes as needed, then restart nrpe.
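That push-and-restart step can be scripted from the Nagios host. The loop below is a sketch that assumes password-less SSH, the db1..db9 hostnames used in the service definitions, and a systemd-managed nrpe service; adjust the paths and restart command to your platform:

```shell
# Sketch: push the master nrpe.cfg to every monitored node, then restart NRPE.
# Hostnames, paths, and the service name are site-specific assumptions.
for host in db1 db2 db3 db4 db5 db6 db7 db8 db9; do
  scp /etc/nagios/nrpe.cfg "${host}:/tmp/nrpe.cfg"
  ssh "$host" 'sudo mv /tmp/nrpe.cfg /etc/nagios/nrpe.cfg && sudo systemctl restart nrpe'
done
```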

Big Brother Sees You

Tell the Nagios server to begin watching

Here are the service check definitions for the /opt/local/etc/nagios/objects/services.cfg file:

objects/services.cfg

# Service definition
define service{
    service_description         check_tungsten_online for all cluster nodes
    host_name                   db1,db2,db3,db4,db5,db6,db7,db8,db9
    check_command               check_nrpe!check_tungsten_online
    contact_groups     admin
    use                         generic-service
    }


# Service definition
define service{
    service_description         check_tungsten_latency for all cluster nodes
    host_name                   db1,db2,db3,db4,db5,db6,db7,db8,db9
    check_command               check_nrpe!check_tungsten_latency
    contact_groups     admin
    use                         generic-service
    }


# Service definition
define service{
    service_description         check_tungsten_progress for alpha
    host_name                   db1,db2,db3
    check_command               check_nrpe!check_tungsten_progress_alpha
    contact_groups             admin
    use                         generic-service
    }

# Service definition
define service{
    service_description         check_tungsten_progress for beta
    host_name                   db4,db5,db6
    check_command               check_nrpe!check_tungsten_progress_beta
    contact_groups             admin
    use                         generic-service
    }

# Service definition
define service{
    service_description         check_tungsten_progress for gamma
    host_name                   db7,db8,db9
    check_command               check_nrpe!check_tungsten_progress_gamma
    contact_groups             admin
    use                         generic-service
    }


Summary

The Wrap-Up

In this blog post we discussed how to best integrate various cluster monitoring solutions with PagerDuty (pagerduty.com), a popular alerting service.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about monitoring Tungsten clusters, please visit https://docs.continuent.com/tungsten-clustering-6.0/ecosystem-nagios.html.

Below is a list of the Nagios NRPE plugin scripts provided by Tungsten Clustering. Click each to be taken to the associated documentation page.

  • check_tungsten_latency – reports warning or critical status based on the replication latency levels provided.
  • check_tungsten_online – checks whether all the hosts in a given service are online and running. This command only needs to be run on one node within the service; the command returns the status for all nodes. The service name may be specified by using the -s SVCNAME option.
  • check_tungsten_policy – checks whether the policy is in AUTOMATIC mode and returns a CRITICAL if not.
  • check_tungsten_progress – executes a heartbeat operation and validates that the sequence number has incremented within a specific time period. The default is one (1) second, and may be changed using the -t SECS option.
  • check_tungsten_services – confirms that the services and processes are running; their state is not confirmed. To check state with a similar interface, use the check_tungsten_online command.


How To Fix YUM Update Errors with Percona GPG Keys

Posted on Tuesday, April 16th, 2019 at 9:12:13am

Problem: Trying to run a yum -y update as root aborts with the following error:

The GPG keys listed for the "Percona-Release YUM repository - x86_64" repository are already installed but they are not correct for this package.
Check that the correct key URLs are configured for this repository.

 Failing package is: sysbench-1.0.17-2.el6.x86_64
 GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-Percona

The solution, found on the Percona website, is this:

sudo yum update percona-release

Source URL: https://www.percona.com/blog/2019/02/05/new-percona-package-signing-key-requires-update-on-rhel-and-centos/

Spring Snow Today

Posted on Friday, March 22nd, 2019 at 8:51:05am

How To Revert a Single File to a Specific Commit Using git

Posted on Monday, March 18th, 2019 at 6:25:56pm

# Find the hash of the commit that has the version you want:
git log {path_and_file}

# Check out just that file from the chosen commit:
git checkout {commit_hash} -- {path_and_file}

# For example:
git checkout 8c7eae3f518bb7fd98eb6e8344270f02065d83ee -- myFile.txt
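As a self-contained walk-through, here is the same revert exercised in a throwaway repository (temporary directory; the file name and commit messages are illustrative):

```shell
# Demo: commit a file twice, then restore the first version by hash.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo "version 1" > myFile.txt
git add myFile.txt
git commit -qm "first version"
first=$(git rev-parse HEAD)           # hash you would find via `git log myFile.txt`
echo "version 2" > myFile.txt
git commit -qam "second version"
git checkout "$first" -- myFile.txt   # restore only this file to the first commit
cat myFile.txt                        # → version 1
```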

How To Enable Opening a Terminal Window from the Current Finder Location in MacOS

Posted on Saturday, February 16th, 2019 at 12:50:20pm

As of Mac OS X Lion 10.7, Terminal provides Services for opening a new terminal window or tab at the selected folder in Finder. They also work with absolute pathnames selected in text (in any application). You can enable these services with System Preferences > Keyboard > Keyboard Shortcuts > Services. Look for “New Terminal at Folder” and “New Terminal Tab at Folder”. You can also assign them shortcut keys.

In addition, you can now drag folders (and pathnames) onto the Terminal application icon to open a new terminal window, or onto a tab bar in a terminal window to create a new tab in that window. If you drag onto a tab (rather than into the terminal view) it will execute a complete cd command to switch to that directory without any additional typing.

As of OS X Mountain Lion 10.8, Command-Dragging into a terminal will also execute a complete cd command.

R.I.P. Albert Finney

Posted on Friday, February 8th, 2019 at 1:34:16pm

Born: May 9, 1936, Charlestown, Salford, United Kingdom
Died: February 7, 2019

How To Obtain a Public Key from an AWS .pem Private Key on Linux and Mac

Posted on Monday, February 4th, 2019 at 3:28:27pm

Use the ssh-keygen command on a computer to which you’ve downloaded your private key .pem file; for example:

First, ensure permissions will allow ssh-keygen to work:
chmod 600 /path/to/the/file/your-key-pair.pem

Then generate an RSA public key:
ssh-keygen -y -f /path/to/the/file/your-key-pair.pem > your-key-pair.pub
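The same workflow can be rehearsed end-to-end with a throwaway key standing in for the downloaded AWS .pem file (the -m PEM flag, which writes the private key in PEM format like AWS does, requires OpenSSH 7.8 or later):

```shell
# Demo: create a throwaway PEM-format RSA key, then derive its public half.
tmpdir=$(mktemp -d)
ssh-keygen -t rsa -b 2048 -m PEM -N "" -f "$tmpdir/your-key-pair.pem" -q
chmod 600 "$tmpdir/your-key-pair.pem"
ssh-keygen -y -f "$tmpdir/your-key-pair.pem" > "$tmpdir/your-key-pair.pub"
head -c 7 "$tmpdir/your-key-pair.pub"   # → ssh-rsa
rm -rf "$tmpdir"
```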