Solution--Shandong Ruiju Software Co., Ltd.

High-performance Data Cloud Platform

Application background of high-performance data cloud platform

• High performance: existing databases hold large volumes of data, and query and analysis speed is slow;

• Flexibility: the existing database architecture is inflexible and must be shut down for maintenance and capacity expansion;

• Low budget: compared with the traditional "power + storage" database architecture, the initial investment is much smaller;

• The user plans to build a data warehouse and data marts.

Key points of solution

The performance ZC data platform (ICCS-DW) is built around the Greenplum engine and adopts a shared-nothing/MPP architecture with a column-store database.

It provides in-database compression, MapReduce, capacity expansion without shutdown, and multilevel fault tolerance. It is an OLAP product that specializes in fast, complex search and analysis over massive databases. It is:

• A data platform that integrates software and hardware;

• Certified through EMC testing;

• Loaded with Greenplum, a parallel computing engine;

• Equipped with standard database functions and customizable ETL capability;

• Capable of high-speed parallel cloud data processing and loading;

• Equipped with a unique online capacity expansion and fault-handling mechanism.

It helps clients build a virtualized computing environment for the data warehouse, create autonomous virtualized data warehouses for different data calculation models and tasks, and centrally manage structured and unstructured data of all volumes. The parallel architecture of the product also gives the virtualized data warehouse an extremely high processing speed, which greatly improves the processing efficiency and analysis quality of each analysis model and task in the virtual database.

Characteristics of high-performance cloud data platform (ICCS-DW):

• Shared-nothing/MPP core architecture

The data engine distributes all data evenly across all node servers of the system. Each node stores a portion of the rows of every table or table partition, all data loads and queries run automatically in parallel on every node server, and the architecture supports expansion to tens of thousands of nodes.
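The distribution scheme described above can be illustrated with a minimal Python sketch. It is not the engine's actual implementation: the CRC32 hash, the function names, and the sample `order_id`/`amount` rows are all assumptions chosen for illustration. It shows rows being assigned to nodes by hashing a distribution key, and an aggregate being computed as per-node partial results that are then gathered.

```python
import zlib
from collections import defaultdict


def segment_for(key, num_segments):
    """Map a distribution key to a node with a stable hash.

    CRC32 is a stand-in here; the real engine's hash function is not
    described in this document.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_segments


def distribute(rows, key, num_segments):
    """Scatter rows across nodes so each node holds part of the table."""
    segments = defaultdict(list)
    for row in rows:
        segments[segment_for(row[key], num_segments)].append(row)
    return segments


def gathered_sum(segments, column):
    """Compute one partial sum per node, then combine the partials.

    The per-node sums are independent of each other, so on a real
    cluster each could run on its own node server at the same time.
    """
    partials = [sum(r[column] for r in node_rows) for node_rows in segments.values()]
    return sum(partials)


# Hypothetical sample data: 1000 orders spread over 4 nodes.
rows = [{"order_id": i, "amount": i * 10} for i in range(1000)]
segments = distribute(rows, "order_id", 4)
total = gathered_sum(segments, "amount")
```

Because every row lands on exactly one node, the gathered total equals the sum over the undistributed table, whatever the number of nodes.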

• Mixed row and column storage and processing

Mixed row-oriented and column-oriented data storage is supported. The administrator can designate the storage and compression method of each table or table partition according to application requirements, so any table or table partition can be stored and processed by row or by column.
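As a rough illustration of per-table storage choices, the Python sketch below builds Greenplum-style `CREATE TABLE` statements for a row-oriented and a column-oriented table. The helper `create_table_sql` and the table and column names are hypothetical. The storage options (`appendonly`, `orientation`, `compresstype`) and the `DISTRIBUTED BY` clause follow Greenplum's table syntax as commonly documented, but they should be checked against the version actually deployed.

```python
def create_table_sql(name, columns, distributed_by, column_oriented=False, compresstype=None):
    """Build a CREATE TABLE statement with optional column storage and compression.

    Hypothetical helper for illustration only; it does no identifier quoting.
    """
    cols = ", ".join(f"{col} {typ}" for col, typ in columns)
    options = []
    if column_oriented or compresstype:
        # Column orientation and compression apply to append-only tables.
        options.append("appendonly=true")
    if column_oriented:
        options.append("orientation=column")
    if compresstype:
        options.append(f"compresstype={compresstype}")
    with_clause = f" WITH ({', '.join(options)})" if options else ""
    return f"CREATE TABLE {name} ({cols}){with_clause} DISTRIBUTED BY ({distributed_by});"


columns = [("order_id", "bigint"), ("amount", "numeric")]

# Row-oriented table, e.g. for frequent single-row lookups.
row_sql = create_table_sql("orders", columns, "order_id")

# Column-oriented, compressed table, e.g. for scans over a few columns.
col_sql = create_table_sql(
    "orders_history", columns, "order_id", column_oriented=True, compresstype="zlib"
)
```

The same choice can be made per partition rather than per table; this sketch covers only the table-level case.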

• PB-level loading capacity

High-performance parallel loading is based on MPP Scatter/Gather streaming technology. Loading speed increases linearly with the number of nodes and exceeds 4 TB/hour in practice.
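The linear-scaling claim can be turned into a simple estimate. The sketch below assumes ideal linear scaling and a hypothetical per-node rate of 0.25 TB/hour. That per-node figure is not from the platform's specifications; it was chosen only so that 16 nodes reach the 4 TB/hour aggregate mentioned above.

```python
def load_rate_tb_per_hour(nodes, per_node_tb_per_hour):
    """Aggregate load rate under ideal linear scaling: rate grows with node count."""
    return nodes * per_node_tb_per_hour


def hours_to_load(data_tb, nodes, per_node_tb_per_hour):
    """Estimated time to load data_tb terabytes at the aggregate rate."""
    return data_tb / load_rate_tb_per_hour(nodes, per_node_tb_per_hour)


PER_NODE_RATE = 0.25  # TB/hour per node; hypothetical value for illustration

rate_16 = load_rate_tb_per_hour(16, PER_NODE_RATE)   # aggregate rate with 16 nodes
hours_16 = hours_to_load(100, 16, PER_NODE_RATE)     # time to load 100 TB on 16 nodes
hours_32 = hours_to_load(100, 32, PER_NODE_RATE)     # doubling nodes halves the time
```

Real deployments rarely scale perfectly, so an estimate like this is best read as an upper bound on throughput.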

• Index function

Database indexes are supported on both row-oriented and column-oriented tables.

• Client access and third-party tool support

• Multi-level fault tolerance

• Online system capacity expansion (no shutdown)

• Workload management

• Flexible external data access

• Full compliance with the latest SQL standard

A logistics enterprise builds smart BI with ICCS-DW

Client background

The client is mainly engaged in logistics for auto parts and components, auto-related domestic freight forwarding, whole-vehicle storage, international freight forwarding, and logistics technology consulting, planning, management, training and other services.

As the business grew, the client wanted to integrate all of its systems, establish a data warehouse, reduce operating costs through analysis and mining, and provide decision-making support.

Scheme design

Data from a variety of databases is stored centrally in Greenplum, which is used to build a data warehouse for data analysis and mining. A front-end application server presents the results to end users.

User benefits

The data warehouse helps users in:

• Scientifically managing and rationally developing internal and external information resources;

• Significantly reducing logistics and transportation costs through data analysis and mining;

• Establishing a unified data analysis platform based on a theme-oriented data model.

Greenplum improves the database performance of a portal website

Client background

• Oracle RAC was in use;

• Loading was unacceptably slow, and technicians complained every day;

• A detailed, complex click query for clients took half a day to a full day and sometimes returned no result, which wasted a great deal of time and went far beyond what the business managers would tolerate;

• The system could not support analysis and applications over the massive historical data.

Scheme design

The high-performance cloud data platform replaces the original Oracle RAC. Greenplum handles the massive data loading and complex queries, and the processed results are delivered to Oracle.

User benefits

• Loading speed increased 50-fold.

• Query response time was reduced to 1/700 of the original.