What on Earth is a Split-Brain Scenario in a MySQL Database Cluster?

Posted on Monday, June 24th, 2019 at 10:14:17am

Overview

The Skinny

In this blog post we will define what a split-brain scenario means in a MySQL database cluster, and then explore how a Tungsten MySQL database cluster reacts to a split-brain situation.


Agenda

What’s Here?
  • Define the term “split-brain”
  • Briefly explore how the Tungsten Manager works to monitor the cluster health and prevent data corruption in the event of a network partition
  • Explore how the Tungsten Connector works to route writes
  • Describe how a Tungsten MySQL database cluster reacts to a split-brain situation
  • Illustrate various testing and recovery procedures

Split-Brain: Definition and Impact

Sounds scary, and it is!

A split-brain occurs when a MySQL database cluster, which normally has a single writable master, ends up with two writable masters.

This means that some writes which should go to the “real” master are sent to a different node which was promoted to write master by mistake.

Once that happens, some writes exist on one master and not the other, creating two broken masters. Merging the two data sets is impossible, leading to a full restore, which is clearly NOT desirable.

Put simply, a split-brain scenario must be strongly avoided.

A situation like this is most often encountered when there is a network partition of some sort, especially with the nodes spread over multiple availability zones in a single region of a cloud deployment.

This would potentially result in all nodes being isolated, without a clear majority within the voting quorum.

A poorly-designed cluster could elect more than one master under these conditions, leading to the split-brain scenario.


Tungsten Manager: A Primer

A Very Brief Summary

The Tungsten Manager health-checks the cluster nodes and the MySQL databases.

The Manager is responsible for initiating various failure states and helping to automate recovery efforts.

Each Manager communicates with the others via a group chat built on the Java JGroups library.

The Connectors also get status information from a chosen Manager.


Tungsten Connector: A Primer

Another Brief Summary

The Tungsten Connector is an intelligent MySQL database proxy located between the clients and the database servers, providing a single connection point, while routing queries to the database servers.

Simply put, the Connector is responsible for routing MySQL queries to the correct node in the cluster.

In the event of a failure, the Tungsten Connector can automatically route queries away from the failed server and towards servers that are still operating.

When the cluster Managers detect a failed master (e.g. because the MySQL server port is no longer reachable), the Connectors are signaled and client traffic is re-routed to the newly-elected Master node.

Each Connector makes a TCP connection to any available Manager, then all command-and-control traffic uses that channel. The Manager never initiates a connection to the Connector.

When there is a state change (e.g. shun, welcome, failover), the Manager will communicate to the Connector over the existing channel.

The Connector will re-establish a channel to an available Manager if the Manager it is connected to is stopped or lost.

For more detailed information about how the Tungsten Connector works, please read our blog post, “Experience the Power of the Tungsten Connector”.


Failsafe-Shun: Safety by Design

Protect the data first and foremost!

Since a network partition would potentially result in all nodes being isolated without a clear majority within the voting quorum, the default action of a Tungsten Cluster is to SHUN all of the nodes.

Shunning ALL of the nodes means that no client traffic is processed by any node; both reads and writes are blocked.

When this happens, it is up to a human administrator to select the proper master and recover the cluster.

The main protection against split-brain in our clustering is that the Connector is always in one of two states, either:

  1. connected to a Manager that is a member of a quorum, or
  2. connected to a Manager that has all resources shunned.

In the first case, a single master is guaranteed. In the second case, the Connector cannot route to anything until the Manager it is connected to rejoins a quorum.

Example Failure Testing Procedures

Use this failure test procedure ONLY in a dev/test environment.
Use this procedure AT YOUR OWN RISK!

A failsafe-shun scenario can be forced.

Given a 3-node cluster east, with master db1 and slaves db2/db3, simply stop the manager process on both slaves and wait about 60 seconds:

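For example, run the following on both db2 and db3 (a sketch; it assumes the Tungsten manager wrapper script is in the PATH, as in a default installation):

shell> manager stop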

The Manager on the master node db1 will restart itself after an appropriate timeout and the entire cluster will then be in FAILSAFE-SHUN status.

Once you have verified that the cluster is in FAILSAFE-SHUN status, start the Managers on both slaves before proceeding with recovery:

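Again on both db2 and db3, with the same assumption about the manager script being in the PATH:

shell> manager start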

Example Recovery Procedures

First, examine the state of the dataservice and choose the datasource that is the most up-to-date or canonical. In the following example, each datasource has the same sequence number, so any of them could potentially be used as the master:

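For example, from within cctrl (a sketch; compare the sequence numbers reported for each datasource in the actual ls output):

shell> cctrl
[LOGICAL] /east > ls

The replication position may also be checked directly on each node:

shell> trepctl status | grep appliedLastSeqno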

Recover Master Using

Once you have selected the correct host, use cctrl to issue the recover master using command, specifying the full service name and hostname of the chosen datasource:

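For example, to promote db1 in our sample east cluster (a sketch; substitute your own service and host names):

shell> cctrl
[LOGICAL] /east > recover master using east/db1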

You will be prompted to confirm that you wish to use the selected host as the new master. cctrl then proceeds to set the new master and recover the remaining slaves.

If this operation fails, you can try the more manual process described next.

Welcome and Recover

A simple welcome attempt will fail:

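For example, in the sample cluster (a sketch; expect the command to be rejected while force mode is disabled):

shell> cctrl
[LOGICAL] /east > datasource db1 welcome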

To use the welcome command, the force mode must be enabled first:

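A sketch, continuing within the same cctrl session on the sample cluster:

[LOGICAL] /east > set force true
[LOGICAL] /east > datasource db1 welcome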

Note the OFFLINE state in the result. In AUTOMATIC mode, the datasource will be set to ONLINE before you have time to run the ls command and look at the cluster state:

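For example:

[LOGICAL] /east > ls

In MAINTENANCE policy mode, db1 would still show as OFFLINE here; in AUTOMATIC mode it may already have been set to ONLINE.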

Finally, recover the remaining slaves:

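For example, the single cctrl recover command attempts to bring back all remaining failed datasources:

[LOGICAL] /east > recover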

Summary

The Wrap-Up

In this blog post we defined what a split-brain scenario means, and explored how a Tungsten MySQL database cluster reacts to a split-brain situation.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about Tungsten Cluster recovery procedures, please visit https://docs.continuent.com/tungsten-clustering-6.0/operations-recovery-master.html

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

Configuring the Tungsten Connector for PCI Compliance

Posted on Tuesday, June 18th, 2019 at 5:57:37pm

The Question

Recently, a customer asked us:

We were wondering if the Connectors would be able to bind to localhost/127.0.0.1 instead of 0.0.0.0:3306? Since the Connector is installed on the application servers, all of the connections are coming from localhost. We would like to limit this exposure so that the 3306 port is not exposed externally. We ask because we are failing PCI checks that are able to access the database port externally.


The Answer

YES!

You may set the IP address for the Connector to listen on by using the tpm option property=server.listen.address={IP_ADDRESS}

To force the Connector to listen only on localhost, use the following example:

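For example, for an INI-based installation, add the property to /etc/tungsten/tungsten.ini and then run tpm update (a sketch; adapt the placement to your own configuration):

property=server.listen.address=127.0.0.1

shell> tpm update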

Use the IP address instead of the hostname to prevent DNS or /etc/hosts lookups.

For more information about configuring the Connector IP address, please visit the docs page at https://docs.continuent.com/tungsten-clustering-6.0/connector-advanced-listen-address.html


Summary

The Wrap-Up

In this blog post we discussed one way to configure the Tungsten Connector for PCI Compliance.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about configuring the Connector IP address, please visit the docs page at https://docs.continuent.com/tungsten-clustering-6.0/connector-advanced-listen-address.html

For more information about Tungsten clusters, please visit https://docs.continuent.com.

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

Why is My Java Application Freezing Under Heavy I/O Load?

Posted on Monday, June 17th, 2019 at 1:09:44pm

The Question

Recently, a customer asked us:

Why would heavy disk IO cause the Tungsten Manager and not MySQL to be starved of resources?

For example, we saw evidence of this in the Manager log file, tmsvc.log.

The Answer

Why a Java application might be slow or freezing

The answer is that when a filesystem is busy being written to by another process, the background I/O can block the Java JVM's garbage collection (GC) logging, causing the JVM to pause.

This problem is not specific to Continuent Tungsten products.

The following article from LinkedIn engineering explains the issue very well (and far better than I could – well done, and thank you):

https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic

Below is a quote from the above article (without permission, thank you):

Latency-sensitive Java applications require small JVM GC pauses. However, the JVM can be blocked for substantial time periods when disk IO is heavy. These are the factors involved:

  1. JVM GC needs to log GC activities by issuing write() system calls;
  2. Such write() calls can be blocked due to background disk IO;
  3. GC logging is on the JVM pausing path, hence the time taken by write() calls contribute to JVM STW pauses.

The Solution

So what may be done to alleviate the problem?

You have options like:

  • Tune the GC log location to use a separate disk to cut down on I/O conflicts, as per the article above (see the sketch after this list)
  • Move the backups or NFS-intensive jobs to another node.
  • Unmount any NFS volumes and use rsync to an admin host responsible for NFS writes (i.e. move the mount to an external host)
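For example, the first option above is often a one-flag change. A sketch, assuming a Java 8-style JVM, where /gclogs is a hypothetical mount point for the dedicated disk and YourMainClass stands in for the application:

shell> java -Xloggc:/gclogs/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps YourMainClass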

Again, I quote from the LinkedIn engineering article above (without permission, thank you again):

One solution is to put GC log files on tmpfs (i.e., -Xloggc:/tmpfs/gc.log). Since tmpfs does not have disk file backup, writing to tmpfs files does not incur disk activities, hence is not blocked by disk IO. There are two problems with this approach: (1) the GC log file will be lost after system crashes; and (2) it consumes physical memory. A remedy to this is to periodically backup the log file to persistent storage to reduce the amount of the loss.

Another approach is to put GC log files on SSD (Solid-State Drives), which typically has much better IO performance. Depending on the IO load, SSD can be adopted as a dedicated drive for GC logging, or shared with other IO loads. However, the cost of SSD needs to be taken into consideration.

Cost-wise, rather than using SSD, a more cost-effective approach is to put GC log file on a dedicated HDD. With only the IO activity being the GC logging, the dedicated HDD likely can meet the low-pause JVM performance goal.

Summary

The Wrap-Up

In this blog post we discussed why Java applications freeze or are slow under heavy I/O load and what may be done about it.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions

The Library

Please read the docs!

For more information about Tungsten clusters, please visit https://docs.continuent.com

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us

Using Keep-Alives To Ensure Long-Running MySQL & MariaDB Sessions Stay Connected

Posted on Friday, June 14th, 2019 at 6:29:40am

Overview

The Skinny

In this blog post we will discuss how to use the Tungsten Connector keep-alive feature to ensure long-running MySQL & MariaDB/Percona Server client sessions stay connected in a Tungsten Cluster.


Agenda

What’s Here?
  • Briefly explore how the Tungsten Connector works
  • Describe the Connector keep-alives – what are they and why do we use them?
  • Discuss why the keep-alive feature is not available in Bridge mode
  • Examine how to tune the keep-alive feature in the Tungsten Connector

Tungsten Connector: A Primer

A Very Brief Summary

The Tungsten Connector is an intelligent MySQL database proxy located between the clients and the database servers, providing a single connection point, while routing queries to the database servers.

√ High-Availability

The most important function of the Connector is failover handling.

In the event of a failure, the Tungsten Connector can automatically route queries away from the failed server and towards servers that are still operating.

When the cluster detects a failed master (e.g. because the MySQL server port is no longer reachable), the Connectors are signaled and traffic is re-routed to the newly-elected Master node.

√ Read-Scaling

Next is the ability to provide read-scaling and route MySQL queries based on various factors.

During the routing process, Tungsten Connector communicates with the Tungsten Manager to determine which datasources are the most up to date, and their current role so that the packets can be routed properly.

In the default Bridge mode, traffic is routed at the TCP layer, and read-only queries must be directed to a different port (normally 3306 for writes and 3307 for reads).
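For example, an application using a Bridge-mode Connector splits its traffic by port. In this sketch, connector1 and app_user are hypothetical placeholders:

shell> mysql -h connector1 -P 3306 -u app_user -p   # read-write traffic, routed to the master
shell> mysql -h connector1 -P 3307 -u app_user -p   # read-only traffic, routed to the slaves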

There are additional modes, Proxy/Direct and Proxy/SmartScale. In both cases, queries are intercepted and inspected by the Connector. The decisions made are tunable based on configuration parameters.

For more detailed information about how the Tungsten Connector works, please read our blog post, “Experience the Power of the Tungsten Connector”.

For a comparison of Routing methods, please see the documentation page: http://docs.continuent.com/tungsten-clustering-6.0/connector-routing-types.html


Tungsten Connector: Keep-Alives

What are they and why do we use them?

Connections to MySQL servers can automatically time out according to the wait_timeout variable configured within the MySQL server.

To prevent these connections from being automatically closed, the Connector can be configured to keep the connection alive by periodically submitting a simple SELECT statement (actually SELECT 'KEEP_ALIVE';), ensuring that the MySQL timeout is not reached and the connection is not closed.

The keep-alive feature was designed with the Proxy modes in mind (Proxy/Direct and Proxy/SmartScale). When using either Proxy mode, every single client connection gets two MySQL server-side connections: one for reads and one for writes.

If your application is read-intensive, the server-side read-only connection sees traffic often and is kept alive by MySQL. Under those conditions, however, the write connection may sit unused long enough for the MySQL server's wait_timeout to expire, so the next write on the client-side connection would get an error.

In response to the above scenario, the keep-alive feature was implemented.

Keep-alives are enabled by default and set to autodetect, which computes suitable values based on the MySQL server's wait_timeout in order to be totally transparent to the application. This design produces exactly the same behavior as if the application were connected directly to the database server.
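Tuning is then a matter of overriding those computed defaults at configuration time. The sketch below is illustrative only; the property keys shown are placeholders rather than the real setting names. See the keep-alive documentation page linked in The Library section below for the exact keys and accepted values:

# Hypothetical property names, for illustration only:
property=connector.keepalive.interval=30   # how often the keep-alive SELECT would be sent (seconds)
property=connector.keepalive.timeout=3600  # longest a connection would be kept alive (seconds)

shell> tpm update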


Keep-Alives and Bridge Mode

Why They Do Not Work Together

The Connector Keep-alive feature is NOT compatible with Bridge mode.

In Bridge mode, the client session is connected directly to the MySQL server at the TCP level, with the Connector literally forwarding the client's packets to the server. This means that closing connections is the responsibility of the MySQL server, based on its configured wait_timeout value, not of the Connector.


Summary

The Wrap-Up

In this blog post we discussed the basics of the Tungsten Connector, the Keep-alive feature and how to tune the values that control it.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions


The Library

Please read the docs!

For more information about Tungsten Connector Keep-alives, please visit http://docs.continuent.com/tungsten-clustering-6.0/connector-states-keepalive.html

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

How To Install/Update CPAN on MacOS Mojave with fatal error: ‘EXTERN.h’ file not found

Posted on Wednesday, June 12th, 2019 at 9:49:18am

The Problem

Tried to install the latest version of cpan and got stuck. Tracked it down to the dependency module Mac::SystemDirectory, which was failing to build with fatal error: 'EXTERN.h' file not found.

The Solution

Use the MacOS installer command to deploy the needed files:
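For example (this is the commonly-documented fix for Mojave; verify that the package path below exists on your system before running it):

shell> sudo installer -pkg /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg -target /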

Then, rerunning the cpan install works:
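For example (assuming the goal was to update the cpan client itself):

shell> sudo cpan install CPAN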

How To Solve the LOAD DATA INFILE Error from –secure-file-priv

Posted on Wednesday, June 12th, 2019 at 9:43:30am

The Problem

Tried to load a tab-delimited text file into MySQL 5.6.43 and got the following error:

mysql> LOAD DATA INFILE 'sample.txt' INTO TABLE test_table;
ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement

Solution One

First, find the directory that the server permits for import and export operations:

mysql> SHOW VARIABLES LIKE "secure_file_priv";
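On many installations the result looks like the following (your path may differ):

+------------------+-----------------------+
| Variable_name    | Value                 |
+------------------+-----------------------+
| secure_file_priv | /var/lib/mysql-files/ |
+------------------+-----------------------+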

Copy the file into the returned directory from above and call the full path in the load data command:

mysql> LOAD DATA INFILE '/var/lib/mysql-files/sample.txt' INTO TABLE test_table;

Solution Two

Add the LOCAL keyword:

mysql> LOAD DATA LOCAL INFILE 'sample.txt' INTO TABLE test_table;

Note that this requires local_infile to be enabled on both the client and the server.

References:

10 MySQL Load Data Infile Examples to Upload Text File Data to Tables


https://stackoverflow.com/questions/32737478/how-should-i-tackle-secure-file-priv-in-mysql

How To Get Sonos Working on MacOS Mojave

Posted on Thursday, June 6th, 2019 at 3:58:12pm

Sonos was running just fine on my macOS Mojave iMac, then I had to force-quit it due to a system freeze.

The Sonos app REFUSED to start, no matter what I tried, including uninstall/reinstall, a full scrub of all related files, uninstaller software, reboots, and cache cleaning and rebooting with Onyx – NOTHING worked, and I was getting frustrated.

By sheer luck, the Terminal app asked for access via the Security & Privacy » Privacy » Accessibility permissions panel in System Preferences, and I saw that the Sonos app was listed there AND THE BOX WAS NOT CHECKED!

As soon as the box was checked and the Sonos app restarted, it worked. Go figure!

The Continuent Docker Support Policy Explained

Posted on Wednesday, June 5th, 2019 at 12:57:19pm

Overview

Continuent has traditionally had a relaxed policy about Linux platform support for customers using our products.

While it is possible to install and run Continuent Tungsten products (i.e. Cluster/Replicator/etc.) inside Docker containers, there are many reasons why this is not a good idea.

Background

As background, every database node in a Tungsten Cluster runs at least three (3) layers or services:

  • MySQL Server (i.e. MySQL Community or Enterprise, MariaDB or Percona Server)
  • Tungsten Manager, which handles health-checking, signaling and failover decisions (Java-based)
  • Tungsten Replicator, which handles the movement of events from the MySQL master server binary logs to the slave database nodes (Java-based)

Optionally, a fourth service, the Tungsten Connector (Java-based), may be installed as well, and often is.

The Current State of Affairs

This means that the Docker container would also need to support these 3 or 4 layers and all the resources needed to run them.

This is not what containers were designed to do. In a proper containerized architecture, each container would contain one single layer of the operation, so there would be 3-4 containers per “node”. This sort of architecture is best managed by some underlying technology like Swarm, Kubernetes, or Mesos.

More reasons to avoid using Docker containers with Continuent Tungsten solutions:

  • Our product is designed to run on a full Linux OS. By design, Docker does not have a full init system like systemd, SysV init or Upstart. This means that if we run a process (Replicator, Manager, Connector, etc.), that process runs as PID 1, and if it dies, the container dies with it. There are solutions that let a Docker container have a ‘full init’ system so the container can start more processes (ssh, replicator, manager, and so on) all at once, but this amounts to heavyweight VM-style behavior, which Docker was not designed for.
  • Requires a mutable container – to use Tungsten Clustering inside a Docker container, the Docker container must be launched as a mutable Linux instance, which is not the classic, nor proper way to use containers.
  • Our services are not designed as “serverless”. Serverless containers are totally stateless. Tungsten Clustering does not support this type of operation.
  • Until we make the necessary changes to our software, using Docker as a cluster node results in a minimum 1.2GB docker image.
  • Once Tungsten Clustering has been refactored using a microservices-based architecture, it will be much easier to scale our solution using containers.
  • A Docker container would need to allow for updates in order for the Tungsten Cluster software to be re-configured as needed. Otherwise, a new Docker container would need to be launched every time a config change was required.
  • There are known I/O and resource constraints for Docker containers, so containers must be carefully deployed to avoid those pitfalls.
  • We test on CentOS-derived Linux platforms.

What to Expect in the Future

Continuent does NOT have Docker containerization on the product roadmap at this time. That being said, we do intend to provide containerization support at some point in the future. Customer demand will contribute to the timing of the effort.

Summary

In closing, Continuent’s position on container support is as follows:

  • Unsupported at this time for all products (i.e. Cluster/Replicator/etc.)
  • Use at your own risk

The Library

Please read the docs!

For the documentation page for this policy, please visit https://docs.continuent.com/tungsten-clustering-6.0/deployment-requirements.html#deployment-requirements-docker-policy

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

How To Show Filenames With git log

Posted on Monday, June 3rd, 2019 at 1:36:42pm

To display the filenames included in each commit, just add the --name-only argument to git log:
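For example:

shell> git log --name-only

To also see whether each file was added, modified or deleted, the related --name-status argument works the same way:

shell> git log --name-status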

How To Remove A File From The git Staging Pre-Commit List

Posted on Friday, May 31st, 2019 at 4:07:10pm

Quickref: How to Remove a file from the staging area

The Story

Recently, I accidentally added some files to git’s staging (pre-commit) area, like so:
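A sketch of the accident:

shell> git add .
shell> git status

The unwanted files then show up under “Changes to be committed”.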

For example, here is how to handle the above situation:
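A sketch, using a hypothetical file name:

shell> git reset HEAD unwanted-file.txt

This unstages the file without touching its contents, moving it back out of the pre-commit list.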

To better understand, here are the phases/states/stages that git uses:

  • Untracked – when a file is first created, git sees it and knows it has not yet been told to handle this file
  • Staged/Indexed – git add will signal git to start tracking the file, and so git places the file into the staging state, ready to commit
  • Committed – the git commit command can be called upon a single file or use -a to commit all staged files
  • Modified – a file has already been committed at least once before, and new changes now exist in the local file(s) that are not yet staged or committed