How To Block All Traffic From China Using iptables and ipset on Amazon Linux

Posted on Thursday, October 24th, 2019 at 8:02:41am

All credit to Matt Wilcox for his excellent article, on which this post is based – thank you, Matt!
https://mattwilcox.net/web-development/unexpected-ddos-blocking-china-with-ipset-and-iptables/

All commands run as root!

Do this once only:
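A minimal sketch of the one-time setup, following Matt's approach (the set name china, the hash sizes and the rule placement are assumptions):

    shell> ipset create china hash:net hashsize 4096 maxelem 65536
    shell> iptables -I INPUT -m set --match-set china src -j DROP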

Then add blockchina to the root cron:
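A sketch of what blockchina itself might look like, assuming the aggregated China CIDR list from ipdeny.com as the source (the path, URL and schedule are assumptions):

    #!/bin/bash
    # /usr/local/bin/blockchina (hypothetical path)
    # Refresh the "china" ipset from the latest aggregated CIDR list.
    ZONEFILE=/tmp/cn-aggregated.zone
    wget -qO "$ZONEFILE" http://www.ipdeny.com/ipblocks/data/aggregated/cn-aggregated.zone || exit 1
    ipset flush china
    while read -r cidr; do
        ipset add china "$cidr"
    done < "$ZONEFILE"

And a matching crontab entry to refresh the set weekly:

    0 4 * * 0 /usr/local/bin/blockchina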

How To Read Multiple Lines of User Input from the Command Line in Perl

Posted on Wednesday, October 9th, 2019 at 11:59:50am

Use Case: Export Apple Reminders from the macOS application to cleaned plain text, i.e. with unwanted spaces, blank lines and square brackets removed

PROCEDURE

  1. Click on a single item in the macOS Reminders app list
  2. Select All (either Command-a or use Edit->Select All)
  3. Copy (either Command-c or use Edit->Copy)
  4. Open Terminal
  5. Run cleanme
  6. Paste the copied reminders into the Terminal window, then press Control-D
  7. The script will output the cleaned list as plaintext

EXAMPLE

Here is the Perl code I used to do it:
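(A minimal sketch of the cleanme script; the exact cleanup regexes are assumptions based on the use case described above.)

    #!/usr/bin/perl
    # cleanme - read pasted lines from STDIN until Control-D, print a cleaned list
    use strict;
    use warnings;

    while ( my $line = <STDIN> ) {
        chomp $line;
        $line =~ s/[\[\]]//g;       # remove square brackets
        $line =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
        next if $line eq '';        # skip blank lines
        print "$line\n";
    }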

How To Sync Box to S3 Using rclone

Posted on Thursday, October 3rd, 2019 at 10:26:21am

To sync various cloud resources, use the excellent CLI tool rclone from https://rclone.org/docs/

For this use case, the need was to sync from Box to an AWS S3 bucket.

Install rclone:
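For example, using the official install script from rclone.org:

    shell> curl https://rclone.org/install.sh | sudo bash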

Configure both S3 and Box – for remote name labels I just used “S3” and “Box”:
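The interactive wizard handles both remotes; for each one choose n (new remote), enter the label, then select the matching storage type (s3 or box) and follow the prompts:

    shell> rclone config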

Validate Access and Functionality:
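For example, listing the top-level directories on each remote confirms that credentials and connectivity work:

    shell> rclone lsd S3:
    shell> rclone lsd Box: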

Perform the Sync:
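A basic one-way sync makes the destination match the source (the bucket name and path here are illustrative):

    shell> rclone sync Box:/ S3:my-bucket/box-backup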

A more advanced example:
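Something like the following adds a dry-run preview, checksum comparison, more parallel transfers and a log file (flag values are illustrative):

    shell> rclone sync Box:/ S3:my-bucket/box-backup --dry-run
    shell> rclone sync Box:/ S3:my-bucket/box-backup --checksum --transfers 8 --checkers 16 --log-file /var/log/rclone-box-to-s3.log --log-level INFO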

Make It Faster: Improving MySQL Write Performance for Tungsten Cluster Slaves

Posted on Thursday, September 26th, 2019 at 10:59:10am

Overview

The Skinny

In this blog post we explore various options for tuning the MySQL server for better slave replication performance.

A Tungsten Cluster relies upon the Tungsten Replicator to move events from the master node to the slaves. Once the event has been transferred to the slave as THL on disk, the slave applier will then attempt to write it to the database. The Replicator can only apply events as fast as MySQL allows. If the MySQL server is somehow slow or blocking, then the Replicator will be as well.

A properly-tuned database server, combined with infrastructure and SysAdmin best practices, will go a long way towards high-performance slave apply.


The Question

Recently, a customer asked us:

During one of our load tests, we had a peak of 60k writes/min, averaging around 40k w/m. During this period, we saw an applied latency of 100-160 seconds. Is it possible to improve the replication latency on the slaves?


The Summary

Where To Look and What For?

When performance tuning, we normally examine the configuration and logs for the following layers:

  • Tungsten Replicator
  • MySQL server
  • Storage
  • Network/Firewall
  • Hardware & OS

The Rules

What are Performance Tuning Best Practices?

Before we dive into the various layers, let’s review the performance tuning basics that I have learned over the years:

  • Performance tuning is a slow, iterative process.
  • Change one thing at a time and test carefully.
  • Document your changes and test results.
  • Go for the low-hanging fruit early on. Sometimes the smallest changes have the largest results, e.g. adding a proper index to a table.
  • As soon as you remove one bottleneck, the next one will appear. Iterate!
  • Divide and Conquer – pick a spot and ensure everything is working properly on both sides. This often helps in isolating issues.
  • Start at the top and work your way down the layers. Test, test, test!
  • Start at the bottom and work your way up the layers. Test, test, test!
  • When in doubt, ask, then document.
  • Read the documentation. Then read it again.
  • Ensure consistency amongst all nodes at every layer.
  • Most of all, do not assume.

Tungsten Replicator

The Short Version

We always start off by checking the replicator applier stages, of which there are three.

On a slave, if the appliedLatency of the remote-to-thl stage is low, but the appliedLatency of both the thl-to-q and q-to-dbms stages are high, then the issue is almost always with the MySQL server somehow.

For example, on a healthy cluster:

On a Master Node:
shell> trepctl status -name tasks | egrep 'applied|stage'

On a Slave Node:
shell> trepctl status -name tasks | egrep 'applied|stage'

For more information about stages, please visit the following blog post: Mastering Tungsten Replicator Series: Understanding Pipelines and Stages

Quick Tech “Tool Tip”
Here is a handy command to describe the summary progress of each stage on that node, refreshing every two seconds:
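One plausible form of that command pairs watch with trepctl, using the same filter as above:

    shell> watch -n 2 "trepctl status -name tasks | egrep 'applied|stage'"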

MySQL Server

What a Lovely Tune
  • The first thing to do would be to enable the slow query logs just to validate that there is nothing blocking faster queries behind it. Since the applier is single-threaded, just one repeated slow query can create a huge bottleneck.
  • Next, run SHOW FULL PROCESSLIST on the slaves while the latency is occurring to help identify what may be locking or blocking.
  • Use the MySQL EXPLAIN command to better understand why queries are slow.
  • Make sure there are no redundant indexes.
  • Does every table have a Primary Key?
  • What type of replication is enabled in MySQL – STATEMENT, MIXED or ROW? For active/passive clusters, we recommend MIXED, and for active/active clusters, use ROW to ensure data integrity across masters. ROW will increase latency due to the sheer volume of data transmitted as compared to STATEMENT or MIXED.
  • Are the tables InnoDB? If so, do you have lots of memory? Can you load the tables into RAM and then sync to disk? This could help to avoid the serial physical disk I/O that can create bottlenecks, even if the I/O channel itself is not full.
  • Check for the use of Unique indexes, which disable the InnoDB change buffering performance boost. InnoDB has change buffering (previously called the insert buffer), which is a feature to delay building secondary indexes that are not unique, and merge writes. It can boost insert performance by quite a lot, and it’s enabled by default.
  • Finally, how large are the affected tables and schemas in terms of row size and byte size? Can the tables be split? Can the databases be split?

Key MySQL my.cnf Configuration Parameters to Check:
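The parameters below are common suspects, with illustrative values only – tune to your own hardware and durability requirements:

    [mysqld]
    innodb_buffer_pool_size        = 12G       # ~70-80% of RAM on a dedicated database node
    innodb_log_file_size           = 1G        # larger redo logs absorb write bursts
    innodb_flush_method            = O_DIRECT  # avoid double-buffering through the OS cache
    innodb_flush_log_at_trx_commit = 2         # relaxed durability; weigh against crash-safety needs
    sync_binlog                    = 0         # as above; often acceptable on slaves
    max_connections                = 500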

For this specific customer, we identified four (4) “hot” tables in the slow query logs, each of which turned out to have 2 million or more rows, forcing the database to perform I/O-intensive operations.
Once the issue was clearly localized to the four tables, the scope narrowed somewhat, implying an indexing, query or disk-I/O-based bottleneck.


Storage

Disk Inside and Out
  • Disk I/O is normally the slowest part of any compute workflow because disk is often much slower than physical memory. No matter how fast the underlying SSD is, there are physical limitations to overcome, especially with SAN and Cloud-based storage.
  • For example, when using AWS EC2, consider using enhanced EBS I/O with provisioned IOPS if you are not already doing so, which would provide faster performance at an increased cost.
  • Not having separated volumes for different purposes will always degrade performance because of disk contention. For example, ensure separate volumes for (at least) root, mysql data, mysql binlogs and THL files.
  • Then stripe each filesystem for MySQL over multiple volumes using RAID of some sort (i.e. RAID-0), so that there are multiple I/O channels in use at the same time for the same filesystem.
  • This layered disk architecture provides for a large number of parallel disk I/O channels, giving a much higher throughput at a much lower latency.
  • What is the filesystem type (i.e. xfs, ext4, etc.)? Journaling filesystems can be slow, so consider using a non-journaling file system or disabling journaling. We suggest using xfs.
  • Mount the filesystem using the noatime option (e.g. mount -o noatime)

Network/Router/Firewall

The Network Really IS the Computer
  • Ensure sufficient bandwidth, and a very low error rate.
  • Confirm the various hops have sufficient CPU, RAM and bandwidth.
  • Is the firewall able to keep up?

Operating System

SysAdmins to the Rescue, As Usual
  • Is there enough CPU?
  • Is there enough RAM?
  • Check the nproc and open-files (nofile) ulimit values

Advanced Tungsten Replicator

The Parallel (Apply) Universe

After making sure that all reasonable efforts have been made to properly evaluate the above solutions, there is an advanced feature available to try: Parallel Apply.

Tungsten Replicator by default uses a single-threaded applier, so it can apply at most roughly 10,000 updates per second, depending on round-trip time and other factors. Since the MySQL server is multi-threaded, the master is able to write faster than the slaves can apply.

To increase that limit, Tungsten Replicator offers Parallel Apply, which employs multiple replication apply threads, one per shard.

https://docs.continuent.com/tungsten-clustering-6.0/deployment-parallel.html

By default, we shard by database, with one shard created per database schema. If there is a cross-database query, all other threads block until that one completes, slowing performance down dramatically.

This means Parallel Apply is best suited for environments where writes are equally busy across every database. Having many databases, but only one or two hot ones, defeats the design and purpose of Parallel Apply.

Again, any cross-shard query will force the Replicator back into single-threaded mode to ensure data integrity, resulting in no performance gain, or even degradation, as the Replicator now has to keep switching modes.

Tungsten Replicator can also shard by table, with the same caveats, but this time cross-TABLE queries will block other threads, making this somewhat less useful than desired based on most query designs.

http://docs.continuent.com/tungsten-replicator-6.1/filters-reference-shardbytable.html

Important questions before implementing Parallel Apply:

  • Do you have just one database schema or multiple?
  • If single, do the bulk of your queries cross tables or use single tables?
  • If multiple databases, are they de-coupled or do you do a bunch of cross-database queries?
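If the answers favor Parallel Apply, enabling it is a small configuration change. A sketch for an INI-based installation (the channel count is illustrative; see the documentation links above):

    [defaults]
    svc-parallelization-type=disk
    channels=10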

The Answer

What WAS the Solution After All?

For this customer, the following changes improved performance to acceptable levels:

  • The THL was being written to the same location as the MySQL binary logs, creating contention. Moving the THL directory to a different location (/logs) improved performance.
  • Because the nodes are AWS EC2 instances, converting volumes from GP2 to Provisioned IOPS allowed the slave to keep up with replication much better.
  • Several MySQL configuration values were modified, which also decreased the applied latency on the slaves.


The Library

Please read the docs!

For more information about how to use the Tungsten Replicator, please see Mastering Tungsten Replicator Series: Command Line Superpowers

For more in-depth discussion about MySQL database tuning, here are some excellent blog posts by Morgan Tocker:
http://www.tocker.ca/2013/09/17/what-to-tune-in-mysql-56-after-installation.html
http://www.tocker.ca/2013/05/06/when-does-mysql-perform-io.html

For more technical information about Tungsten clusters, please visit https://docs.continuent.com


Summary

The Wrap-Up

Clearly, there are many, many things to think about when it comes to MySQL performance tuning – this blog post barely scratches the surface of the subject. Remember, performance tuning is all about iterative effort over time!

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

How To Dual-Sort Query Results In WordPress Using get_posts

Posted on Thursday, September 12th, 2019 at 11:03:25am

I needed to build a feature for a Bookstore tool: an optional sequence number for the Custom Post Type “books” that would allow items with a seq number to float to the top of the list, sorted by seq #, with the rest of the books shown underneath, sorted alphabetically.
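Here is a sketch of the approach using two get_posts() calls merged together (the seq meta key name is an assumption):

    // Books that have a sequence number, sorted by that number.
    // Using meta_key with orderby=meta_value_num implicitly limits
    // the query to posts that have the key.
    $sequenced = get_posts( array(
        'post_type'      => 'books',
        'posts_per_page' => -1,
        'meta_key'       => 'seq',            // hypothetical meta key
        'orderby'        => 'meta_value_num',
        'order'          => 'ASC',
    ) );

    // Books without a sequence number, sorted alphabetically by title.
    $unsequenced = get_posts( array(
        'post_type'      => 'books',
        'posts_per_page' => -1,
        'orderby'        => 'title',
        'order'          => 'ASC',
        'meta_query'     => array( array(
            'key'     => 'seq',
            'compare' => 'NOT EXISTS',
        ) ),
    ) );

    // Sequenced books float to the top; the rest follow alphabetically.
    $books = array_merge( $sequenced, $unsequenced );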

10 Reasons Why Tungsten Clustering Beats the DIY Approach for Geo-Distributed MySQL Deployments

Posted on Thursday, August 29th, 2019 at 4:51:13pm

Why does the DIY approach fail to deliver vs. the Tungsten Clustering solution for geo-distributed MySQL multimaster deployments?

Before we dive into the 10 reasons, note why commercially-supported enterprise software is less risky and in fact less costly:

  • The labor time a human spends building and maintaining a DIY solution costs more than a supported solution that just works.
  • No matter what changes your team goes through, there are documentation, training and support available, so your mission-critical process is never dependent upon an irreplaceable individual.
  1. Tungsten Clustering is a complete solution, comprised of the Replicator, Manager and Connector components
    • With DIY, you must first decide the architecture, then select the individual tools to handle each layer of the topology. Each part must be installed, configured, maintained, monitored and fixed separately.
    • With DIY, you must craft scripts to connect everything together. In Tungsten Clustering, the three core components, Manager, Connector and Replicator, handle all of the messaging and control in a seamlessly-orchestrated fashion.
    • Tungsten Clustering has more than ten (10) years of development maturity. No DIY solution can match the depth of enterprise experience we bring to the table.
    • Tungsten Clustering is designed from the ground up to provide 24×7 data access
  2. Development of Database High-Availability Solutions
    • DIY requires significant in-house investment of time, money and human resources to build a full solution, and even then it could not come close to matching what Continuent offers based on ten years of development efforts.
    • Creating a simple solution is relatively fast, but managing and correctly automating all the possible corner cases that a geo-distributed multi-master solution exposes is very difficult to master. It is often the corner cases that lead to downtime.
    • DIY also requires extensive institutional knowledge to maintain the multiple portions of the chosen architecture. If the staff are lost, often a DIY solution becomes impossible to manage and maintain.
    • Tungsten Clustering is a complete, proven and supported solution with significant resources including extensive documentation, release notes, white papers and instructional training videos & webinars.
    • Tungsten Clustering is under continuous development with new features and bug fixes. It covers all the changes in the environment (MySQL version, OS version, etc.)
  3. Database Administration (DBA)
    • DIY requires that your DBA do a lot of manual work. Properly automating local failover is an enormous task, especially when trying to avoid split-brain situations.
    • With DIY, scripting a cohesive solution for a geo-distributed, active-active, highly-available topology is definitely a non-trivial task. Tungsten Clustering provides a global mesh right out of the box.
    • With DIY, recovering failures is time consuming and laborious. Tungsten Clustering automates most normal operations.
    • With DIY, almost all updates and upgrades (MySQL version, schema changes, etc.) will require down-time. Tungsten Clustering allows continuous operation during all changes.
    • With Tungsten Clustering, you can add new sites/clusters by following standard, easy-to-follow instructions
  4. Monitoring
    • All solutions must be monitored to be fully effective, so Tungsten Clustering includes various monitoring scripts as part of the core product.
    • We offer a cron-based watcher script that alerts via email if you do not already have a monitoring system.
    • The included shell scripts are very easy to read, understand and modify to suit your needs.
    • We include Nagios and Zabbix support as part of the base product.
  5. Backup/Restore/Provisioning
    • Re-provisioning a slave can be handled using a single command.
    • All enterprise databases need to be backed up on a regular basis regardless of the number of slave replicas to prevent data vulnerability from replicated errors.
    • Tungsten Clustering fully supports and is tightly integrated with both MySQL’s mysqldump command and the excellent, free XtraBackup by Percona.
    • A configurable cron-based backup script is also provided as part of the core distribution, powerful enough to select an appropriate node and perform automated backups.
    • Backup and Restore may also be handled via the Manager’s cctrl command-line interface.
  6. Multi-master Replication (strength of Tungsten Replicator)
    • Tungsten Replicator handles complex topologies with ease.
    • Tungsten Replicator can switch roles very easily (i.e. master becomes a slave, or a slave becomes a master).
    • Tungsten Replicator is cluster-aware and can automatically switch to another THL source if the current one becomes unavailable.
  7. Management (zero downtime maintenance, etc.)
    • We have already engineered the rules, communications and tools needed to control the entire clustering process end-to-end.
    • With Tungsten Clustering, there is a unified view of all the cluster resources. With DIY, there is no such view.
    • With Tungsten Clustering, it is very easy to change the role of a node; with DIY, there are many different moving parts to control individually.
    • With DIY, failures may not be handled correctly because one layer does not know what another layer is doing.
    • With Tungsten Clustering, a node may be taken down for maintenance without impacting operations, then easily returned to the cluster and re-synced.
  8. Connectivity/Intelligent Proxy and Router (multisite aware proxy, etc.)
    • Connector is able to intelligently pause and resume client traffic during switches and failovers.
    • Connector is able to load-balance in multiple ways, including across sites using intelligent proxying and query redirection
    • Connector gets status updates and signals from the manager, and is therefore integrated into the management process.
    • Connector is able to reconnect to another manager if it loses connection with the current one.
    • Tungsten Connector has a full view of all nodes.
    • Connector Bridge mode provides full-speed performance with very low latency
  9. Support (can’t beat our 24/7 support)
    • All layers are supported by the same team end-to-end, with rapid response for urgent cases. Our SLA is 1 hour max; for urgent cases, we average 6 minutes.
    • For a DIY solution, there is no single place to go for help. Many of the tools/solutions have no available support, or require support contracts from multiple vendors.
    • Staff spread out over the globe for follow-the-sun support around the clock.
    • 24x7x365 enterprise-level support is included in all term licenses.
    • You are supported by mature, knowledgeable staff with decades of experience.
    • Built-in diagnostics-gathering and submission tools covering all layers including the OS and database.
  10. Tungsten Replicator has features that are not available with native MySQL replication
    • Easy management – no need to log into MySQL to manage replication.
    • Global Transaction ID (GTID) for ANY version of MySQL.
    • Off-the-shelf MySQL support (MySQL Community/Enterprise, Percona Server and MariaDB)
    • Inspect every event in detail, including the event source, timestamp and time zone using simple commands.
    • Easy to query to get performance data, latency, status, and errors.
    • High tolerance for network outages, automatically picks up where it left off.
    • Able to use either a master or another slave as a THL source. Use a slave to reduce load on the master!
    • Can automatically switch to another THL source if the current one becomes unavailable.
    • Dedicated replicator log files to help quickly diagnose issues.
    • Ability to skip transactions as needed with a single intuitive command.
    • Ability to replicate to other targets in addition to MySQL:  Hadoop, Oracle, AWS Redshift, Kafka, HPE Vertica, Cassandra, Elasticsearch and others, with the same single extraction from the MySQL source(s).
    • Over 40 filters available!  Replicate subsets of schemas, tables, columns, and even do data transformation.
    • Create time-delayed replicas so that roll-back to a known good state is easy.
    • Implement complex topologies, such as fan-in.
    • Parallel apply support, based on schema.

Zero-Downtime Cluster Maintenance: Comparing the Procedures for Upgrades versus DB/OS Maintenance

Posted on Monday, August 26th, 2019 at 1:42:28pm

Overview

The Skinny

Part of the power of Tungsten Clustering for MySQL / MariaDB is the ability to perform true zero-downtime maintenance, allowing client applications full access to the database layer, while taking out individual nodes for maintenance and upgrades. In this blog post we cover various types of maintenance scenarios, the best practices associated with each type of action, and the key steps to ensure the highest availability.

Important Questions

Understand the Environment as a Whole First

There are a number of questions to ask when planning cluster maintenance that are critical to understand before starting.

For example:

  1. What is the cluster topology?
    • Standalone (connectors write to single cluster master)
      Single cluster: Locate the service name and the node list
    • Composite Master-Slave (CMS) (active/passive, connectors write to single cluster master)
      Multiple clusters: Locate the list of service names and the nodes per service
    • Composite Master-Master (CMM) (active/active, connectors write to multiple cluster masters)
      Multiple clusters: Locate the list of service names and the nodes per service
  2. Where are the connectors installed?

    Since the Connectors are where the calling client application meets the database layer, it is critical to understand where the Connectors are running and how they are being used by the various database clients to best plan the appropriate procedures for your deployment.

    • On the application servers?

      There is an architectural expectation that there is a load-balancer above multiple application servers, which would remove the application node in the event of a Connector failure, and therefore remove any single point of failure at the application layer.

    • On a Connector farm fronted by a load balancing solution?
    • On the database nodes?
  3. What load-balancing solution is being used in front of the Connectors, if any?

    Tungsten Connectors are not inherently highly-available – they provide HA for the database layer, so some form of HA solution is required for the Connector layer.

    • How are client connections drained from an individual Connector?
    • Is this procedure fully documented and practiced?
  4. Are there any downstream consumers running the Tungsten Replicator?
  5. Is it possible to take a maintenance window?
  6. What are the times of least traffic during the day and across the week?

Maintenance Scenario: Database or OS Restart

Rolling Thunder

This scenario covers cases where the database itself will be down, either because of an OS reboot or the restart of the MySQL daemon itself.

The best practice for operations involving an OS reboot or a database restart is a procedure called “Rolling Maintenance”, where each database node is serviced one at a time.

The reason for this is that attention must be given to multiple layers in order to ensure uptime – look at both connections per Connector and connections per database (both available via cctrl> ls):

  • Database layer – the database will go down, so we need to prevent all client connections through the Connector to that database node prior to the reboot or restart, both existing and new.
    This is accomplished via the shun command within cctrl.
  • Connector layer – if the Connector is running on the database node and that node gets rebooted, then all connections through that Connector will be abruptly and ungracefully severed when the restart occurs.
    This needs to be handled by draining connections from that Connector in the load balancer layer above.  Note that if restarts are happening during a maintenance window and the application is offline, we do not have to drain the connector, further simplifying the process.

A summary of the Rolling Maintenance Procedure is as follows:

  1. Perform the maintenance operation on all of the current slaves, one at a time
  2. Move the master role to a completed slave using cctrl> switch
  3. Perform the maintenance operation on the old master (now a slave)
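In cctrl terms, each pass of steps 1-3 looks something like this (hostnames are illustrative):

    shell> cctrl
    cctrl> datasource db2 shun
    ... perform the OS reboot or database restart on db2 ...
    cctrl> datasource db2 recover
    cctrl> switch    # after all slaves are done, promote a completed slave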

Here is a link to the online documentation for performing rolling maintenance:
https://docs.continuent.com/tungsten-clustering-6.1/operations-maintenance-dataservice.html

Maintenance Scenario: Upgrades

The Special Case

This scenario covers cases where the database will NOT be down, so the focus becomes the Connector only. This means that we can modify the above rolling maintenance procedure somewhat to make it much easier to accomplish.

In the upgrade scenario, all three layers of the Tungsten Clustering software need to be stopped and restarted using the new version of software in a new directory.

The Connector is the only layer of the three that will impact client availability, which makes this operation capable of being performed live and in-place, with no switch needed.

In this use case, one needs to look at the connections per Connector (available via cctrl> ls), and NOT the connections per database because the databases stay up the whole time:

  • Connector layer – if the Connector is running on the database node and that node gets upgraded, then all connections through that Connector will be gracefully severed when the upgrade occurs.
    This still requires draining connections from each Connector in the load balancer layer above.
  • Database layer – the database stays up, so the Datasource stays ONLINE, no need to SHUN, and connections can happen without interruption from the database’s perspective.

There are two types of upgrade procedures, and the correct one is based upon the deployment method chosen during installation: Staging or INI.

For both types of upgrades, use the --no-connectors argument to tools/tpm update to prevent the Connectors from restarting. Once the upgrade is completed on all nodes, simply drain each Connector one-by-one, then run the tpm promote-connector command to stop and restart that Connector once all of its client connections have been drained.  As above, if this is happening during a maintenance window when applications are offline, we do not have to drain the connectors and we do not need to employ the --no-connectors argument.
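A sketch of the per-node flow (the staging path and version are illustrative):

    shell> cd /opt/continuent/software/tungsten-clustering-6.1.x
    shell> tools/tpm update --no-connectors
    ... once every node is upgraded, drain each Connector in turn, then:
    shell> tpm promote-connector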

In this manner one may upgrade an entire cluster with zero downtime!

A summary of the INI Upgrade Procedure is as follows:

  1. Upgrade all of the current slaves, one at a time
  2. Perform an in-place upgrade on the master while live, with no switch required
  3. Drain and execute tpm promote-connector on each node running a Connector (if the applications are online due to no maintenance window)

A summary of the Staging Upgrade Procedure is as follows:

  1. Perform an in-place upgrade on the database and connector nodes (if any) all at once, with no switch required
  2. Drain and execute tpm promote-connector on each node running a Connector (if the applications are online due to no maintenance window)

Click here for both INI and Staging Upgrade Procedure details already covered in a prior blog post…

Best Practices

Do’s and Don’ts
  • Ensure that replication on each node has caught up with minimal latency before recovering/welcoming back into the cluster
  • Use switch, not switch to {datasource} if at all possible.
  • For both types of upgrades, if applications are running, use the --no-connectors argument to tools/tpm update to prevent the Connectors from restarting. Once the upgrade is completed on all nodes, simply drain each Connector one-by-one and run the tpm promote-connector command to stop and restart the Connector once all client connections have been drained.
  • Consider any downstream stand-alone Tungsten Replicators in the topology.  Occasionally, there are incompatible changes in THL which will cause an older version of the replicator to go offline. Upgrade the Replicator at the same time you are upgrading the cluster to address replication issues due to THL incompatibility.

The Library

Please read the docs!

For more information on performing Database or OS Maintenance, please visit the docs page at https://docs.continuent.com/tungsten-clustering-6.1/operations-maintenance.html

For more information about the specific procedures, please visit the online documentation:

INI Upgrades:
https://docs.continuent.com/tungsten-clustering-6.1/cmdline-tools-tpm-ini-upgrades.html
Staging Upgrades:
https://docs.continuent.com/tungsten-clustering-6.1/cmdline-tools-tpm-cmdline-upgrade.html
The tpm update command:
https://docs.continuent.com/tungsten-clustering-6.1/cmdline-tools-tpm-commands-update.html

For more information about Tungsten clusters, please visit https://docs.continuent.com


Summary

The Wrap-Up

In this blog post we discussed the power of Tungsten Clustering for MySQL / MariaDB to perform true zero-downtime maintenance, allowing client applications full access to the database layer, while taking out individual nodes for maintenance and upgrades. We covered various types of maintenance scenarios, the best practices associated with each type of action, and the key steps to ensure the highest availability.

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

How To Upgrade PHP on AWS Linux

Posted on Thursday, August 22nd, 2019 at 10:26:04am

As root:
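On Amazon Linux 2, one plausible path uses the amazon-linux-extras mechanism (the php7.3 topic is illustrative – pick the version you need):

    shell> amazon-linux-extras enable php7.3
    shell> yum clean metadata
    shell> yum install -y php php-cli php-mysqlnd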

~or~
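On the original Amazon Linux AMI, the versioned packages work instead (removing the old packages first avoids conflicts):

    shell> yum remove -y php*
    shell> yum install -y php73 php73-cli php73-mysqlnd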

THEN:
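Verify that the CLI reports the expected version:

    shell> php -v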

Be sure to restart your web server!!

How to use Round-Robin Load Balancing with the Tungsten Connector

Posted on Thursday, August 8th, 2019 at 9:36:40am

Overview

The Skinny

Part of the power of Tungsten Clustering for MySQL / MariaDB is its intelligent MySQL Proxy, known as the Tungsten Connector. The Tungsten Connector has built-in read-write splitting capabilities, and it is also possible to configure different algorithms which select the appropriate slave (i.e. Round-Robin or Lowest-Latency).


The Question

Recently, a customer asked us:

How do we best share the load between read-only slaves? Currently, there appears to be an imbalance, with most of the read-only queries reaching just one slave. What may we do to improve this situation?

This customer noticed that a couple of long running, slow queries were running against one particular slave node.

Despite the increased CPU utilization caused by the expected long-running queries, the “quick” Read-Only requests would still be routed to this same slave node, causing what the customer referred to as “piling up” – otherwise known as a bottleneck.


The Background

What IS the Default Load Balancer Model?
  • The default Load Balancer Model setting for the Tungsten Connector is MostAdvancedSlaveLoadBalancer.
  • This means that Read-Only operations are always reading from the most up-to-date slave in the cluster.
  • The MostAdvancedSlaveLoadBalancer Load Balancer Model setting selects the slave data source that has the lowest replication lag, or highWater in the ls -l output within cctrl.
  • If no slave data source is eligible, the master data source will be selected.
  • In a cluster experiencing rapid changes in latency, the 3-second (default) polling interval cannot keep up, so routing decisions go stale and reads pile onto a single slave. This results in higher CPU utilization and slower-running queries, which means that the MostAdvancedSlaveLoadBalancer can potentially create a bottleneck.

The Answer

Change the Load Balancer Model

It is possible to change the Tungsten Connector’s Load Balancer Model.

To more evenly distribute the workload (therefore resolving the bottleneck), Continuent recommends changing the Load Balancer Model from MostAdvancedSlaveLoadBalancer to RoundRobinSlaveLoadBalancer.

Why choose round-robin instead?

  • The RoundRobinSlaveLoadBalancer Load Balancer Model setting uses all available slave nodes, each in turn.
  • When the end of the slave node list is reached, the load balancer starts from the beginning of the list and repeats the loop.

Why NOT choose round-robin?

  • Consideration should be given to the fact that with the RoundRobinSlaveLoadBalancer, it is possible that a Read-Only request may read older/stale data from a slave that is not as caught-up as another slave.
  • It is also possible for the calling client to see slower response times due to slave CPU load.

To change the Connector Load Balancer model, simply specify the property in the [defaults] stanza of your tungsten.ini file:
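For example (confirm the exact property name against the documentation page below):

    [defaults]
    property=dataSourceLoadBalancer_RO_RELAXED=com.continuent.tungsten.router.resource.loadbalancer.RoundRobinSlaveLoadBalancer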


The Library

Please read the docs!

For more information on the Load Balancer Model setting for the Tungsten Connector, please visit the docs page at https://docs.continuent.com/tungsten-clustering-6.0/connector-routing-loadbalancers.html

For more information about Tungsten clusters, please visit https://docs.continuent.com


Summary

The Wrap-Up

In this blog post we discussed changing the Load Balancer Model setting for the Tungsten Connector to a Round-Robin model, as well as the behavioral changes that should be considered before making this change.

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

Enabling Autorecovery for the Tungsten Replicator

Posted on Wednesday, August 7th, 2019 at 10:58:04am

The Replicator is a critical piece of the Tungsten Clustering solution for MySQL / MariaDB, as well as its own stand-alone data replication product. Automatic recovery is a feature that enables the Replicator to go back online in the event of a transient failure. In this blog we discuss how to enable Automatic Recovery. For more information about Auto-Recovery, please click here to visit the online documentation page.

The Question

Recently, a customer asked us:

We see that the replicators receive a transaction which has a deadlock error in it.

If one brings the service back online manually, it comes back without issue and continues to replicate. Is there a setting that will allow the cluster (if in automatic mode) to try to bring the service online on its own (maybe try 2 or 3 times, then stop trying if it continuously errors)?


The Answer

Yes! It’s called Automatic Recovery.

The following tpm options can be added to the [defaults] stanza of your tungsten.ini file:
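For example – the option names come from the Auto-Recovery documentation linked below, and the values shown are illustrative:

    [defaults]
    auto-recovery-max-attempts=3
    auto-recovery-delay-interval=30s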

Together, these settings enable autorecovery, set the maximum number of attempts to bring the replicator back online, and define the delay between the replicator detecting that autorecovery is needed and the recovery being attempted.

The Library

Please read the docs!

For more information about Auto-Recovery, please visit the documentation page at https://docs.continuent.com/tungsten-replicator-6.1/operations-autorecovery.html

For more information about Tungsten Replicator, please visit our full online documentation at https://docs.continuent.com


Summary

The Wrap-Up

In this blog post we discussed enabling Automatic Recovery for the Tungsten Replicator.

To learn more about Tungsten Replicator, check out https://www.continuent.com/solutions/tungsten-replicator-all-you-need-to-know/

Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

To learn about Continuent Clustering solutions, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us