MEMSQL

MemSQL is a distributed, in-memory SQL database management system. It is a relational database management system (RDBMS) that compiles Structured Query Language (SQL) into machine code through a process called code generation.

Core Technology

MemSQL combines lock-free data structures and just-in-time (JIT) compilation to handle highly volatile workloads. More specifically, MemSQL implements lock-free hash tables and lock-free skiplists in memory for fast random access to data. SQL requests sent to the MemSQL server are converted to bytecode and compiled via LLVM into machine code. Each query is stripped of its parameters, and the resulting query template is stored as a shared object; later queries entering the system are matched against the stored templates. Executing a precompiled query plan eliminates interpretation along hot code paths, which minimizes the number of central processing unit (CPU) instructions required to process SQL statements.
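The template matching described above can be illustrated with a minimal Python sketch. It is not MemSQL's implementation: the regular expression, the `parameterize` function, and the `PlanCache` class are hypothetical names chosen for illustration, and the "compiled plan" is an ordinary Python callable standing in for generated machine code.

```python
import re

# Hypothetical pattern: a single-quoted string literal or a numeric literal.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Split a SQL string into a parameter-free template and its literals."""
    params = []

    def _swap(match):
        params.append(match.group(0))
        return "?"

    template = _LITERAL.sub(_swap, sql)
    return template, params


class PlanCache:
    """Maps query templates to 'compiled' plans (plain callables here)."""

    def __init__(self):
        self._plans = {}
        self.compilations = 0

    def _compile(self, template):
        # Stand-in for the expensive code generation step.
        self.compilations += 1
        return lambda params: (template, tuple(params))

    def execute(self, sql):
        template, params = parameterize(sql)
        plan = self._plans.get(template)
        if plan is None:
            # First time this query shape is seen: compile and remember it.
            plan = self._compile(template)
            self._plans[template] = plan
        return plan(params)


cache = PlanCache()
first = cache.execute("SELECT name FROM users WHERE id = 42")
second = cache.execute("SELECT name FROM users WHERE id = 7")
```

Both calls share one template, so the sketch compiles only once and the second call reuses the cached plan with a different parameter.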

MemSQL is compatible with MySQL. This means that applications can connect to MemSQL using MySQL clients and drivers, as well as the Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC) connectors. In addition to MySQL syntax and functionality, MemSQL can store columns in JSON format and supports geospatial data types and operations.

Row and Column Format

MemSQL can store database tables as either rowstores or columnstores. The format is chosen by the user at DDL time (i.e., when the table is created). Data for all rowstore tables is held entirely in memory, with snapshots and transaction logs persisted to disk. Data for all columnstore tables is stored on disk, with a rowstore-like structure handling inserts before they reach the columnstore.

Rowstores and columnstores differ in more than just the storage medium used. Rowstores, as the name suggests, store information in row format, which is the traditional data format used by RDBMSs. Rowstores are optimized for singleton or small insert, update, or delete queries and are most closely associated with OLTP (transactional) use cases. Columnstores are optimized for complex select queries, typically associated with OLAP (analytics) use cases. For example, large clinical data sets for data analysis are best stored in a columnar format, because the queries executed against them are typically ad hoc queries in which aggregates are computed over a large number of similar data items.
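To make the difference between the two layouts concrete, here is a small Python sketch that holds the same records in both forms. The sample data and the function names (`to_columns`, `lookup_row`, `average`) are invented for illustration and say nothing about MemSQL's internal formats.

```python
# The same three records, first in row layout (hypothetical sample data).
rows = [
    {"patient_id": 1, "age": 34, "heart_rate": 72},
    {"patient_id": 2, "age": 51, "heart_rate": 88},
    {"patient_id": 3, "age": 47, "heart_rate": 65},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)


def lookup_row(rows, patient_id):
    """Point lookup: the row layout hands back the whole record at once."""
    for row in rows:
        if row["patient_id"] == patient_id:
            return row
    return None


def average(columns, name):
    """Aggregate: the column layout touches only the one column it needs."""
    values = columns[name]
    return sum(values) / len(values)


record = lookup_row(rows, 2)
mean_heart_rate = average(columns, "heart_rate")
```

The point lookup suits the row layout because all fields of a record sit together, while the aggregate suits the column layout because the values of one field sit together.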

Distributed Architecture

A MemSQL database is distributed across two kinds of nodes: aggregators and leaves. The MemSQL binaries used for aggregators and leaf nodes are essentially the same; the only difference is the role the user assigns to the node. An aggregator is responsible for receiving SQL requests, splitting them across leaf nodes, and combining the results to return to the client. Leaf nodes store MemSQL data and process queries from aggregators. All communication between aggregators and leaf nodes takes place over the network using SQL syntax. MemSQL uses hash partitioning to distribute data uniformly across the leaf nodes.
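The aggregator and leaf roles, together with hash partitioning, can be sketched in a few lines of Python. This is a simplified in-process model under stated assumptions: the `Leaf` and `Aggregator` classes are hypothetical, `zlib.crc32` stands in for whatever hash function MemSQL actually uses, and method calls replace the SQL-over-network communication described above.

```python
import zlib


class Leaf:
    """Stores a share of the rows and answers partial queries."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def partial_sum(self, column):
        return sum(row[column] for row in self.rows)


class Aggregator:
    """Routes inserts by hash and combines partial results from leaves."""

    def __init__(self, leaf_count):
        self.leaves = [Leaf() for _ in range(leaf_count)]

    def _leaf_for(self, shard_key):
        # crc32 is an assumed stand-in hash; it is deterministic across runs.
        index = zlib.crc32(str(shard_key).encode("utf-8")) % len(self.leaves)
        return self.leaves[index]

    def insert(self, row, shard_key_column="id"):
        self._leaf_for(row[shard_key_column]).insert(row)

    def total(self, column):
        # Scatter the request to every leaf, then gather and combine.
        return sum(leaf.partial_sum(column) for leaf in self.leaves)


agg = Aggregator(leaf_count=4)
for i in range(100):
    agg.insert({"id": i, "amount": i * 2})

grand_total = agg.total("amount")
```

Each leaf computes a sum over only its own rows, and the aggregator adds the partial sums, so the client sees a single result regardless of how the rows were spread.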

Durability

MemSQL’s durability differs slightly between in-memory rowstores and on-disk columnstores. Durability for in-memory rowstores is implemented with write-ahead logs and snapshots, similar to checkpoints. By default, once a transaction is acknowledged in memory, the database asynchronously writes it to disk as fast as the disk allows.
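The combination of a write-ahead log and snapshots can be illustrated with a toy key-value store in Python. This is a sketch, not MemSQL's mechanism: the `DurableStore` class and the file names are hypothetical, and for simplicity the log is written synchronously before the in-memory update, whereas the text above describes asynchronous writes by default.

```python
import json
import os
import tempfile


class DurableStore:
    """In-memory dict backed by a log file and a snapshot file."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "wal.log")
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.data = {}
        self._recover()

    def _recover(self):
        # Load the latest snapshot, then replay log records written after it.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    record = json.loads(line)
                    self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # Append the change to the log, then apply it in memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
        self.data[key] = value

    def snapshot(self):
        # Persist the full state, then truncate the log it supersedes.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.data, f)
        os.replace(tmp_path, self.snapshot_path)
        open(self.log_path, "w").close()


with tempfile.TemporaryDirectory() as directory:
    store = DurableStore(directory)
    store.put("a", 1)
    store.snapshot()
    store.put("b", 2)
    # A fresh instance simulates a restart: snapshot first, then log replay.
    recovered = DurableStore(directory).data
```

After the simulated restart, the key written before the snapshot comes from the snapshot file and the key written after it comes from replaying the log.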

The on-disk columnstore is fronted by an in-memory rowstore-like structure (a skiplist). This structure has the same durability guarantees as the MemSQL rowstore. In addition, columnstores are durable because their data is stored on disk.
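One way to picture a columnstore fronted by a row-oriented buffer is the following Python sketch. Everything in it is an assumption made for illustration: the `ColumnstoreTable` class, the `flush_threshold` parameter, and the flush policy are hypothetical, a plain list stands in for the skiplist, and in-memory dictionaries stand in for on-disk column segments.

```python
class ColumnstoreTable:
    """Row-oriented insert buffer in front of column-oriented segments."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # assumed policy, not MemSQL's
        self.buffer = []    # row-oriented; a list in place of a skiplist
        self.segments = []  # stand-in for on-disk column segments

    def insert(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Pivot buffered rows into one column segment and clear the buffer."""
        if not self.buffer:
            return
        segment = {
            name: [row[name] for row in self.buffer]
            for name in self.buffer[0]
        }
        self.segments.append(segment)
        self.buffer = []

    def scan(self, column):
        """Read one column from flushed segments, then from buffered rows."""
        for segment in self.segments:
            yield from segment[column]
        for row in self.buffer:
            yield row[column]


table = ColumnstoreTable(flush_threshold=3)
for i in range(5):
    table.insert({"id": i, "value": i * 10})

values = list(table.scan("value"))
```

Inserts stay cheap because they only append to the buffer, while scans still see every row by reading both the flushed segments and the rows that have not been flushed yet.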

Replication

MemSQL clusters can be configured in “High Availability” mode, where each data partition is automatically created with master and slave versions on two separate leaf nodes. In High Availability mode, the aggregator sends transactions to the master partition, which then sends logs to the slave partition. In the event of an unexpected master failure, the slave partition is promoted to master in a fully online operation.
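The master and slave arrangement can be sketched as follows. The `Partition` and `ReplicatedPartition` classes and the leaf names are hypothetical, and the log record is shipped synchronously inside `write` purely to keep the example short; the sketch makes no claim about how MemSQL schedules replication.

```python
class Partition:
    """One copy of a data partition, kept up to date by applying log records."""

    def __init__(self, leaf_name):
        self.leaf_name = leaf_name
        self.log = []
        self.data = {}

    def apply(self, record):
        key, value = record
        self.log.append(record)
        self.data[key] = value


class ReplicatedPartition:
    """A master copy and a slave copy placed on two separate leaves."""

    def __init__(self):
        self.master = Partition("leaf-1")
        self.slave = Partition("leaf-2")

    def write(self, key, value):
        record = (key, value)
        self.master.apply(record)
        # Ship the log record to the slave, if one is still available.
        if self.slave is not None:
            self.slave.apply(record)

    def fail_master(self):
        # Promote the slave; the failed master copy is dropped.
        self.master, self.slave = self.slave, None

    def read(self, key):
        return self.master.data.get(key)


partition = ReplicatedPartition()
partition.write("order-1", "paid")
partition.fail_master()
after_failover = partition.read("order-1")
partition.write("order-2", "pending")
```

Because the slave has already applied every log record, reads and writes continue against the promoted copy without reloading data.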

MemSQL Ops

MemSQL ships with an installation, management, and monitoring tool called MemSQL Ops. During installation, Ops can be used to set up a MemSQL database distributed across machines, and it provides metrics about the running system. MemSQL Ops has a web user interface and a command line interface.

Integration With Apache Spark

Starting with MemSQL 4.1, launched in September 2015, MemSQL gives users the ability to install Apache Spark as part of a MemSQL cluster and to use Spark as an ETL tool to import data into MemSQL. Apache Spark is installed and managed interactively using MemSQL Ops. Ops users can then define the extract, transform, and load phases of their data pipeline to import data into MemSQL. Running data pipelines can be managed and monitored in the Ops UI.
