Opinion | Where will the next generation of cloud database technology head in the cloud native era?

The era of full cloudification has come. In the face of a series of new technologies and challenges, what kind of changes will the database market face? As a cloud service provider, how can we help more enterprise-level users grasp the "cloud" trend and provide the most efficient and valuable database solutions?

Recently, at the database session of the Alibaba Cloud Summit in Beijing, Li Feifei, vice president of Alibaba Group, chief database scientist at DAMO Academy, and head of the database products division of the Alibaba Cloud Intelligence business group, gave a talk on next-generation cloud-native database technology and its challenges.

Database development and technological evolution

According to DB-Engines' database market trend analysis from January 2019, relational databases still hold the core market share. At the same time, the database market keeps segmenting, and categories such as graph databases, document databases, and other NoSQL databases are emerging.

Another major trend is that the market share of traditional commercial databases represented by the three giants of Oracle, DB2 and Microsoft SQL Server continues to decline, while the open source and third-party database markets continue to grow.

Database technology has been developing for more than 40 years since its birth, and it is still evolving vigorously. The major cloud computing vendors have now reached a consensus: the database is the key link between IaaS and intelligent applications on the cloud. Cloud vendors therefore need to improve their capabilities at every stage of data generation, storage, and consumption in order to meet users' needs for connecting IaaS with intelligent applications.

As database technology developed, OLTP systems appeared to handle transaction processing and record transaction data in real time, and OLAP systems appeared to analyze massive data in real time. Various database services and management tools are also needed to support the core OLTP and OLAP systems. On top of this, NoSQL database solutions emerged to handle semi-structured and unstructured data.

Relational databases, the SQL query language, and OLTP transaction processing systems were born between the late 1970s and the early 1980s. With the explosive growth of data and the emergence of complex analysis requirements, data warehouses, OLAP (online analytical processing) systems, and ETL data processing technologies followed.

Today, the volume of multi-source heterogeneous data such as graph, document, spatio-temporal, and time-series data keeps growing. NoSQL and NewSQL database systems have appeared alongside relational databases in response.

What kind of database do we need in the cloud native era?

Traditional databases often use a single-node architecture, whereas cloud-native databases usually adopt a shared storage architecture. Alibaba Cloud PolarDB is one example: it builds shared storage over a high-speed network, separates storage from compute on top of it, and can scale out to multiple compute nodes quickly and elastically.

PolarDB can also scale out and in quickly along both the storage and compute dimensions according to users' specific needs. Users can adopt the shared-storage-based PolarDB database without modifying any business logic, which makes non-intrusive migration possible.

Beyond cloud-native shared storage, the challenge of high concurrency and massive data access also has to be solved with a distributed architecture. For example, Alibaba itself had to explore distributed architectures in order to cope with the annual Double 11 promotion.

In addition, to meet multi-model data requirements, Alibaba Cloud hopes to offer different query interfaces, such as SQL, on the user side. On the storage side, it hopes to let users keep data in different places and still query different data types through a single, SQL-like interface. The data lake service that Alibaba Cloud currently provides is a cloud-native technology that evolved for these scenarios.

As with the OLTP and OLAP systems mentioned earlier, traditional solutions isolate read-write conflicts by letting OLTP handle transaction processing and OLAP handle massive data analysis. In the cloud-native era, Alibaba Cloud will use the technical dividends of new hardware to minimize the cost of data migration, integrate transaction processing and data analysis in the same engine, and solve both kinds of problems for users seamlessly with a single system.

Alibaba Cloud currently serves a large number of enterprise customers, who use the cloud resource pool that Alibaba Cloud provides through virtualization and the separation of storage and compute. We therefore need to monitor and schedule all resources on the cloud intelligently so that we can respond quickly and give users the highest quality of service. This intelligence relies on machine learning and artificial intelligence to achieve self-awareness, self-decision-making, self-recovery, and self-optimization across dimensions such as data migration, data protection, and elastic scheduling.

Another important technology in the cloud-native era is the integrated design of software and hardware. New hardware keeps bringing technical dividends that database systems can exploit, such as RDMA networks, SSD, NVM, GPU, and FPGA acceleration. PolarDB's shared storage uses an RDMA network, so it can access remote database nodes as quickly as local ones.

Many customers on the cloud also have financial-grade high-availability requirements. Using a high-availability protocol, Alibaba Cloud databases can adopt a three-replica architecture that allows seamless, real-time switchover between databases in the same location. To meet different users' needs for remote disaster recovery, Binlog technology is used to synchronize data to remote sites, achieving financial-grade high availability. Users on the cloud also attach great importance to data security, so Alibaba Cloud's database service encrypts data from the moment it is written to disk, ensuring data security across the entire link.
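The article does not say which high-availability protocol is used, so the following is only a generic sketch of the majority rule that three-replica designs commonly rely on. The function names `majority` and `is_committed` are hypothetical and chosen for illustration.

```python
def majority(replica_count: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replica_count // 2 + 1


def is_committed(acks: list[bool]) -> bool:
    """A write counts as committed once a majority of replicas acknowledge it."""
    return sum(acks) >= majority(len(acks))


# With three replicas, one may be down or slow and the write still commits.
print(is_committed([True, True, False]))   # True
print(is_committed([True, False, False]))  # False
```

Under this rule, losing one of three replicas does not block writes, which is what makes a switchover possible without losing acknowledged data.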

Alibaba Cloud database service: independent and controllable global layout with innovation and commercial value

The tool products provided by Alibaba Cloud Database include data backup, data migration, data management, hybrid cloud data management, and intelligent diagnosis and optimization systems. They help customers move to the cloud quickly and build hybrid cloud solutions.

The core engine products we provide include self-developed products that we control independently, as well as third-party and open source products. We hope to give users a wealth of choices through commercial databases and open source products. We also hope to build the technical dividends of cloud computing into our self-developed database products, so that they go deeper into applications and truly help customers solve the pain points and problems that third-party or open source database products cannot.

In the OLTP direction, the core products provided by Alibaba Cloud Database are PolarDB and its distributed version, PolarDB X. Alibaba Cloud also provides services for mainstream engines such as MySQL, PostgreSQL, and SQL Server, as well as the Oracle-compatible PPAS.

In the OLAP direction, our core products are AnalyticDB, the data lake service DLA for multi-source heterogeneous data, and the time-series and spatio-temporal database TSDB for IoT scenarios. In the NoSQL field, Alibaba Cloud offers customers a wealth of database products to choose from, including third-party products such as HBase, Redis, and MongoDB, as well as Alibaba Cloud's self-developed graph database GDB.

The management and control platform and the full-link monitoring service of Alibaba Cloud Database provide intelligent full-link detection and analysis, ensuring that Alibaba Cloud databases can offer users the highest service level.

Alibaba Cloud helps customers build an end-to-end path for hybrid cloud data storage, on and off the cloud. When customers start migrating to the cloud, they can use the Alibaba Cloud DTS service to upload and synchronize data in real time. Once the data is on the cloud, they can store it in cloud-native database products such as PolarDB, or analyze it with DLA or AnalyticDB.

For data analysis in specific scenarios, customers can choose solutions such as document databases, graph databases, or time-series and spatio-temporal databases. DTS can synchronize and back up data between cloud and on-premises environments, and HDM can be used for hybrid cloud database management. Alibaba Cloud Database Service also provides a database management suite that supports users in managing and developing databases more efficiently.

Here we mainly introduce two cloud-native database products: PolarDB and AnalyticDB.


PolarDB uses an RDMA network to implement efficient shared distributed storage. With shared storage, multiple compute nodes can work in a "one write, multiple read" configuration: read-only nodes can be spun up quickly according to the customer's workload to meet compute demand at peak times, and storage nodes can also be scaled quickly.
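The "one write, multiple read" arrangement can be pictured as a small routing layer in front of the compute nodes. The sketch below is a simplified illustration, not PolarDB's actual proxy: the class `ReadWriteRouter`, the node names, and the rule that treats any statement starting with `SELECT` as a read are all assumptions.

```python
class ReadWriteRouter:
    """Routes statements to one writer node or to read-only nodes (round-robin)."""

    def __init__(self, writer: str, readers: list[str]) -> None:
        self.writer = writer
        self.readers = list(readers)
        self._next = 0  # counter used to pick the next read-only node

    def add_reader(self, name: str) -> None:
        """Simulates spinning up one more read-only node at peak time."""
        self.readers.append(name)

    def route(self, sql: str) -> str:
        # Simplifying assumption: only statements starting with SELECT are reads.
        is_read = sql.lstrip().lower().startswith("select")
        if not is_read or not self.readers:
            return self.writer
        node = self.readers[self._next % len(self.readers)]
        self._next += 1
        return node


router = ReadWriteRouter("rw-node", ["ro-node-1", "ro-node-2"])
print(router.route("SELECT * FROM orders"))         # ro-node-1
print(router.route("UPDATE orders SET paid = 1"))   # rw-node
router.add_reader("ro-node-3")
print(router.route("SELECT count(*) FROM orders"))  # ro-node-2
```

Because every node reads from the same shared storage, adding a read-only node in this model only adds a routing target. No data has to be copied to the new node first.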

Because customers' workloads fluctuate between business peaks and valleys, PolarDB can be used on demand and billed by usage, which greatly improves the efficiency of database usage and reduces cost. In short, PolarDB is a "super MySQL", and subsequent versions of PolarDB will be compatible with PostgreSQL and Oracle.

In some scenarios, users face high concurrency and massive data access and need to break through the capacity limit of shared storage. The distributed architecture of PolarDB X uses a sharding (partitioning) scheme to scale storage capacity horizontally without limit. The distributed version, PolarDB X, will also enter testing later, and everyone is welcome to try it out.
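As a rough illustration of the sharding idea, the sketch below spreads keys over several independent stores using a stable hash. It is a minimal example under stated assumptions: the names `shard_for` and `ShardedStore` are hypothetical, and the article does not describe how PolarDB X actually partitions data.

```python
import hashlib
from typing import Optional


def shard_for(key: str, num_shards: int) -> int:
    """Maps a key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Key-value store split across several in-memory shards."""

    def __init__(self, num_shards: int) -> None:
        self.shards: list[dict[str, str]] = [{} for _ in range(num_shards)]

    def put(self, key: str, value: str) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> Optional[str]:
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
for i in range(1000):
    store.put(f"user:{i}", f"profile-{i}")

print(store.get("user:42"))              # profile-42
print([len(s) for s in store.shards])    # number of keys held by each shard
```

Each shard holds only part of the data, so total capacity grows with the number of shards. Simple modulo placement like this moves many keys when the shard count changes, which is one reason production systems use more elaborate partitioning schemes.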

AnalyticDB

When massive amounts of data are being analyzed, reads and writes inevitably conflict to some degree. Reading a large amount of data and analyzing it under these conditions is extremely complicated, so we recommend Alibaba Cloud's real-time interactive analysis database system, AnalyticDB.

The core features of AnalyticDB are support for high-throughput writes and a storage engine developed for hybrid row and column storage, which together make real-time interactive analysis possible. In scenarios with massive data and high concurrency, AnalyticDB delivers very good response times. AnalyticDB is compatible with the MySQL ecosystem, so data in MySQL can be imported into it directly, and it can serve millisecond-level queries over tens of billions of records and writes at millions of TPS.
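To see why an analytical engine cares about both row and column layouts, consider the toy example below. It is not AnalyticDB's storage format, only an illustration; the sample table and the helper `to_columns` are made up for this sketch.

```python
# Row layout: each record is stored together, which suits point lookups and writes.
rows = [
    {"order_id": 1, "region": "north", "amount": 120},
    {"order_id": 2, "region": "south", "amount": 80},
    {"order_id": 3, "region": "north", "amount": 200},
]


def to_columns(records: list[dict]) -> dict[str, list]:
    """Pivots row-oriented records into one list per column.

    Assumes every record has the same keys.
    """
    columns: dict[str, list] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Point lookup: the row layout returns the whole record at once.
order = next(r for r in rows if r["order_id"] == 2)

# Aggregate: the column layout scans only the "amount" values.
total = sum(columns["amount"])

print(order["region"], total)  # south 400
```

A hybrid engine tries to keep the write and lookup behavior of the first layout while giving scans and aggregates the locality of the second.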

Data transmission cloud service DTS

In addition to the core cloud-native database products, we also have a variety of database tool products, such as the data transmission cloud service DTS. DTS addresses two pain points: transferring data when customers migrate to the cloud, and, after migration, synchronizing data in real time between cloud databases or from TP systems to AP systems.

With DTS, users can synchronize incremental data quickly and efficiently, keeping data consistent in real time. DTS also provides data subscription, which lets users connect more kinds of data sources through different protocols and interfaces.
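Incremental synchronization is usually built on an ordered change log that the target replays. The sketch below shows that general idea only. It does not use or model the DTS API, and `Change`, `Replica`, and the position numbering are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Change:
    position: int  # monotonically increasing log position
    op: str        # "upsert" or "delete"
    key: str
    value: Optional[Any] = None


class Replica:
    """Target side of the sync: replays changes it has not applied yet."""

    def __init__(self) -> None:
        self.data: dict[str, Any] = {}
        self.applied_position = 0

    def apply(self, changes: list[Change]) -> None:
        for change in sorted(changes, key=lambda c: c.position):
            if change.position <= self.applied_position:
                continue  # already applied, so replaying a batch is harmless
            if change.op == "upsert":
                self.data[change.key] = change.value
            elif change.op == "delete":
                self.data.pop(change.key, None)
            else:
                raise ValueError(f"unknown operation: {change.op}")
            self.applied_position = change.position


log = [
    Change(1, "upsert", "a", 1),
    Change(2, "upsert", "b", 2),
    Change(3, "delete", "a"),
]

replica = Replica()
replica.apply(log[:2])  # first batch
replica.apply(log)      # overlapping batch: only position 3 is new
print(replica.data, replica.applied_position)  # {'b': 2} 3
```

Tracking the last applied position is what lets the target resume after an interruption and transfer only the changes it has not seen.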

The database family ushered in a new member: graph database GDB

Here is an introduction to a new Alibaba Cloud database product, the graph database GDB, which is currently in public testing on the official Alibaba Cloud website. GDB is a real-time, reliable online database that supports the property graph model for storing and querying highly connected data. It makes extensive use of cloud-native technologies, such as the separation of storage and compute. GDB supports a standard graph query language and is compatible with Gremlin syntax, consistent with the mainstream graph databases on the market.

Another core feature of GDB is support for real-time updates with OLTP-level data consistency, which helps ensure consistency when storing and analyzing massive property graphs. GDB also has the characteristics of a cloud-native database product, such as high service availability and easy maintenance. Typical application scenarios include social networks, financial fraud detection, and real-time recommendation. GDB also supports scenarios such as knowledge graphs and neural networks.
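To make the property graph model and the recommendation scenario concrete, here is a small in-memory sketch. It does not connect to GDB or run Gremlin. `PropertyGraph`, `friends_of_friends`, and the sample data are hypothetical, and the Gremlin traversal mentioned in the comment is only a rough analogue.

```python
from collections import defaultdict


class PropertyGraph:
    """Vertices carry properties; edges are directed and labeled."""

    def __init__(self) -> None:
        self.vertices: dict[str, dict] = {}
        self.edges: defaultdict = defaultdict(list)  # (source, label) -> targets

    def add_vertex(self, vertex_id: str, **properties) -> None:
        self.vertices[vertex_id] = properties

    def add_edge(self, source: str, label: str, target: str) -> None:
        self.edges[(source, label)].append(target)

    def out(self, vertex_id: str, label: str) -> list[str]:
        return list(self.edges.get((vertex_id, label), []))


def friends_of_friends(graph: PropertyGraph, vertex_id: str, label: str = "follows") -> set[str]:
    """Two-hop neighbors that are not already followed: a basic recommendation.

    Roughly what a Gremlin traversal such as
    g.V('alice').out('follows').out('follows') walks, before filtering.
    """
    direct = set(graph.out(vertex_id, label))
    candidates = set()
    for friend in direct:
        for candidate in graph.out(friend, label):
            if candidate != vertex_id and candidate not in direct:
                candidates.add(candidate)
    return candidates


g = PropertyGraph()
for name in ("alice", "bob", "carol"):
    g.add_vertex(name, kind="user")
g.add_edge("alice", "follows", "bob")
g.add_edge("bob", "follows", "carol")
g.add_edge("bob", "follows", "alice")

print(friends_of_friends(g, "alice"))  # {'carol'}
```

Queries like this follow edges rather than joining tables, which is why highly connected data is a natural fit for a graph database.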

To empower customers: reduce costs and improve efficiency, worry-free development

The goal of the Alibaba Cloud database team is to provide enterprise-level cloud-native database services. Drawing on its global footprint and its independent, controllable technology, the team offers enterprise customers services such as fast data migration to the cloud, unified data management on and off the cloud, and data security.

For example, Alibaba Cloud database services currently support complex applications such as the City Brain in cities such as Hangzhou. These applications need to store both structured and unstructured data, which poses huge challenges for OLTP, OLAP, and tool products. Cloud-native products such as Alibaba Cloud AnalyticDB, PolarDB, and DTS can seamlessly support complex application scenarios like these.

Take PolarDB as an example. The product entered testing in August 2018 and was commercialized at the end of 2018, and it has grown rapidly on the public cloud platform since then. The core reason behind this growth is that Alibaba Cloud really helps customers solve their pain points: rather than taking new technology as a hammer and going looking for nails, it sees the nail first and then builds the hammer.

The core features of PolarDB are: cloud native; minute-level elastic scaling of storage and compute; cost-effectiveness and flexible billing; high concurrency with rapid expansion of multiple read-only nodes; large-capacity support; and shared distributed storage that delivers an experience similar to a stand-alone database, is non-intrusive to the user's business logic, and is highly compatible with MySQL.

AnalyticDB is a real-time interactive analysis system. Whether the data is generated by the business itself or obtained from a big data system, it can be migrated to an AnalyticDB cluster with DTS for in-depth business analysis, visualization, and interactive queries.

AnalyticDB supports join queries across hundreds of tables and provides customers with millisecond-level query services. Today, a great many users run cloud-native databases such as PolarDB and AnalyticDB on Alibaba Cloud. Cloud-native databases are truly resolving the pain points that customers encounter in their applications and bringing them more business value.

In summary, a series of new technologies and new challenges have emerged in the cloud-native era. Meeting these challenges requires combining core database products, management and control platforms, and database tools into an integrated whole in order to give customers the most efficient and valuable solutions. Alibaba Cloud sincerely invites everyone to try its products and technologies and hopes to work with more customers to solve their problems. It also hopes that more developers and ecosystem partners will build in-depth solutions for specific industries and fields on top of Alibaba Cloud's database services and products, making the database market in the cloud-native era more prosperous.

This article is the original content of Yunqi Community and may not be reproduced without permission.