Call Us: +386 1 5374052

CHRYSOLITE BDA

TASK-SPECIFIC DATA STORAGE TOOL FOR CALL DETAIL RECORDS

 

CHRYSOLITE BDA is a specialized data storage tool optimized for the accumulation and fast retrieval of call detail record (CDR) information about subscriber activities in telephone networks (PSTN, GSM/3G).

Brochure CHRYSOLITE BDA (pdf)

The “traditional DBMS” approaches to building data storage that are in use today took shape over the last ten years. They rely on high-end storage equipment running a general-purpose DBMS (Oracle, Microsoft SQL Server) or a specialized one (Sybase IQ, Greenwich, etc.). As a result, CDR storage costs several million dollars (not including software licenses), and the storage itself accounts for up to 90% of the equipment cost in such projects. All “traditional” solutions share a common architecture: a dedicated powerful server with many CPUs, FC/SAN infrastructure, expensive storage controllers, and a large number of disk bays connected to each controller, often cascaded to one another. The resulting price of one terabyte is several tens of thousands of dollars.

The price of this architectural simplicity is extremely low performance: up to 8 MB/sec (in rare cases) when loading new call detail records into systems based on MS SQL/Oracle, and up to 120 thousand CDR/sec (in rare cases) on some specialized DBMSs, even though the hardware itself could sustain ten times that rate.

In both cases, the yearly growth in subscriber activities (and with it the data loading requirements) and the need for continual storage capacity growth run into an insurmountable technical barrier to modernization.

At some stage, the cost of expanding the storage by each additional terabyte starts to exceed the costs of the previous stages several times over, while the speed of queries over the retained data falls sharply below the minimum acceptable level.

Specialized hardware and software appliances and columnar DBMSs that are widely used for analytical reporting in telecom and banking (Teradata, Vertica) are designed primarily for queries that extract a small amount of data, apply complex calculation logic, and return aggregated summaries.

Retrieving the original records from such storage takes significant computational resources, because the records have to be reassembled from the columns into which they were split during loading (the reassembly may take significantly longer than the search itself). Search activities, by contrast, aim to obtain all the information about each subscriber activity for any request, with no restriction on the number of records in the query result.


[Figure: Query execution]

Another alternative is a system based on HBase/Hadoop. With this approach the amount of required hardware (electrical power and number of racks) grows geometrically, each system needs engineers constantly on hand for maintenance, and each system is unique.

As a result, the target customer had to accept that there were no cost-effective ways to build high-performance CDR data retention storage that satisfies real operational needs. With the CHRYSOLITE Big Data Appliance (BDA), SNTT presents a new, efficient approach to implementing high-performance systems for the specialized storage and retrieval of CDR information.

This is a truly revolutionary product that radically changes the idea of CDR data retention and retrieval:

 

  • The storage system costs 4 to 10 times less than the “traditional” approach and other appliances and storage products on the market;
  • High-performance search in the stored information, up to ten times faster than comparable products;
  • Significant disk space savings compared to comparable products.

The number of subscriber activities in telephone networks grows by up to 30% annually, which requires continuous improvement of CDR information retrieval systems.

This is happening because the storage technologies and DBMSs that proved effective over the past 5-10 years could not withstand the growing volume of data and the query performance requirements, and have lost the position they had achieved. At the same time, one approach has proven its effectiveness: long-term accumulation of CDR data gathered from different operators in one information system, with simultaneous processing of the retained CDR data.

SNTT offers a modern approach to the practical implementation of high-performance CDR storage and retrieval: the CHRYSOLITE BDA.

Massively Parallel Processing (MPP) system architecture


The MPP architecture is a linear list of nodes that are filled with data sequentially. While the system is loading data, records are written simultaneously to N nodes (intelligent data distribution); once these nodes are full, this “recording window” moves on to the next nodes.

Basic storage models:
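
The recording window described above can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the appliance's actual algorithm: the function name `assign_nodes`, the round-robin choice inside the window, and the use of a fixed per-node byte capacity are all hypothetical.

```python
def assign_nodes(record_sizes, num_nodes, window, node_capacity):
    """Spread records over a sliding window of `window` nodes.

    Hypothetical model: records go round-robin to the nodes inside the
    current window; when no node in the window can take a record, the
    window moves on to the next `window` nodes in the linear list.
    Returns a dict mapping node index -> bytes stored.
    """
    used = {}
    start = 0   # index of the first node in the current window
    turn = 0    # round-robin position inside the window
    for size in record_sizes:
        while True:
            if start >= num_nodes:
                raise RuntimeError("all nodes are full")
            nodes = range(start, min(start + window, num_nodes))
            # Try every node of the window once, starting at the round-robin position.
            for step in range(len(nodes)):
                node = nodes[(turn + step) % len(nodes)]
                if used.get(node, 0) + size <= node_capacity:
                    used[node] = used.get(node, 0) + size
                    turn = (turn + step + 1) % len(nodes)
                    break
            else:
                # No node in the window has room: slide the window forward.
                start += window
                turn = 0
                continue
            break
    return used


# Six 120-byte records, four nodes, window of two, 240 bytes per node.
layout = assign_nodes([120] * 6, num_nodes=4, window=2, node_capacity=240)
print(layout)
```

In this toy run the first two nodes fill up before any data reaches nodes 2 and 3, which is the sequential filling the paragraph describes.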

 

| Model* | Technical characteristics | User data** | Nodes | Data loading*** | Query performance on MSISDN/IMSI/IMEI/Cell-ID per 24 hours |
|---|---|---|---|---|---|
| Minimum (6U) | 2 kW / 100 kg | 36 TB | 2 | up to 1.4 million CDR/sec, up to 4 billion CDR/24 hours | No more than 1 sec**** |
| Stationary (24U) | 8 kW / 400 kg | 180 TB / 480 TB | 10 / 4 | up to 3 million CDR/sec, up to 5 billion CDR/24 hours | No more than 1 sec**** |
| Stationary (42U) | 14 kW / 800 kg | 320 TB / 840 TB | 18 / 7 | up to 4 million CDR/sec, up to 6 billion CDR/24 hours | No more than 1 sec**** |
| Stationary (2x42U) | 28 kW / 1900 kg | 640 TB / 1.68 PB | 36 / 14 | up to 4 million CDR/sec, up to 7 billion CDR/24 hours | No more than 1 sec**** |

* – excluding UPS;
** – taking into account hot spares and RAID organization; the node capacity depends on the HP ProLiant (DL/SL) server models used;
*** – peak load of 120 bytes per CDR;
**** – in accordance with the daily amount of information depending on the model (4 … 7 billion CDRs/24 hours).
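
To relate the CDR rates in the table to byte throughput, the short sketch below multiplies them by the 120-byte peak record size from the footnote. The helper names are hypothetical, and decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes) are an assumption; the source does not say which units it uses.

```python
BYTES_PER_CDR = 120  # peak-load record size from the table footnote


def load_rate_mb_per_sec(cdr_per_sec, bytes_per_cdr=BYTES_PER_CDR):
    """Byte throughput implied by a CDR loading rate (decimal megabytes)."""
    return cdr_per_sec * bytes_per_cdr / 1_000_000


def daily_volume_gb(cdr_per_day, bytes_per_cdr=BYTES_PER_CDR):
    """Raw daily volume implied by a daily CDR count (decimal gigabytes)."""
    return cdr_per_day * bytes_per_cdr / 1_000_000_000


# (model, peak CDR/sec, CDR per 24 hours) taken from the table of basic models
models = [
    ("Minimum (6U)", 1_400_000, 4_000_000_000),
    ("Stationary (24U)", 3_000_000, 5_000_000_000),
    ("Stationary (42U)", 4_000_000, 6_000_000_000),
    ("Stationary (2x42U)", 4_000_000, 7_000_000_000),
]
for name, per_sec, per_day in models:
    print(f"{name}: {load_rate_mb_per_sec(per_sec):.0f} MB/sec peak, "
          f"{daily_volume_gb(per_day):.0f} GB/day")
```

These figures cover raw record bytes only; any index or RAID overhead is outside what the table states.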

Technology for building nonstandard storage (1-20 PB)


By deploying an appliance, the customer gains:

 

    • Loading from 700 thousand CDR/sec to 4 million CDR/sec, depending on the model (see the table of basic models);
    • No fewer than one million queries by subscriber ID (MSISDN/IMSI/IMEI) per day over the entire volume of retained data, sharply increasing the number of search activities performed in 24 hours;
    • Scaling up to 20 PB, with the cost of each added terabyte of usable capacity staying the same;
    • Highly skilled consulting support and assistance from the developer when integrating the system into the customer's own infrastructure;
    • Fast return on the purchase: system delivery time is 90 days;
    • The lowest market price per 1 TB of capacity.

Key appliance features:

 

      • A specialized SQL dialect and connection to the storage through a standard ODBC interface;
      • Proprietary recording and loading technology designed specifically for CDR processing, with high-speed algorithms that index incoming data in real time;
      • The appliance handles data for previous days that arrives with a significant delay;
      • No administration or additional configuration required; convenient unattended operation (“plug and forget”);
      • Integrated health monitoring and self-diagnostics for hardware and software;
      • Retrieval of extremely large query results (up to several billion records in one query) without degrading system speed;
      • Querying by MSISDN (calling/called party), IMSI, or IMEI, with up to 10 000 criteria in a single query;
      • Querying by subscriber location (LAC/Cell-ID), with up to 1000 criteria in a single query;
      • Query processing by MSC MSISDNs and SMS-center MSISDNs, retrieving all CDR activities for a specified period of time;
      • Processing of custom composite SQL queries with specialized logic; support for wildcard search using “*” and “?”;
      • Retention of CDR information from different telecom operators in one system, with the ability to search the data of selected operators or the entire data volume without affecting query execution time;
      • Special data conversion (transfer mode) from “legacy” systems: loading of up to 30 terabytes of CDR text data per day (depending on the model), so that the storage can be quickly filled with data from previous periods when the appliance is installed at the customer's site;
      • Storage and query performance remain the same if one or several nodes fail;
      • Search speed does not depend on the period specified in the request: it is the same for a query over the last week as for a query over a week a year ago;
      • Daily partitioning of incoming data on each node by the activity datetime in the CDR, so that a search scans only the partitions for the requested period rather than the full range of information;
      • Automatic deletion of the oldest accumulated CDRs and loop recording of new incoming information (round-robin).
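
Two of the features above, daily partitioning by activity datetime and round-robin deletion of the oldest periods, can be illustrated with a small in-memory Python model. This is a sketch under stated assumptions, not the appliance's implementation: the class `PartitionedStore`, its methods, and the record fields are hypothetical, and the field names `call_date`, `calling`, and `called` are borrowed from the pseudo-SQL later in this document.

```python
from datetime import date, timedelta


class PartitionedStore:
    """Toy store: one partition per activity day, oldest days dropped first."""

    def __init__(self, max_days):
        self.max_days = max_days   # retention depth in daily partitions
        self.partitions = {}       # date -> list of CDR dicts

    def load(self, record):
        # A record is filed under its own activity date, so data that
        # arrives late still lands in the partition of the day it describes.
        self.partitions.setdefault(record["call_date"], []).append(record)
        # Round-robin retention: drop the oldest partitions beyond the limit.
        while len(self.partitions) > self.max_days:
            del self.partitions[min(self.partitions)]

    def search(self, start, end, msisdn):
        """Return (matching records, number of partitions scanned)."""
        hits, scanned = [], 0
        day = start
        while day <= end:
            part = self.partitions.get(day)
            if part is not None:   # only partitions in the requested period
                scanned += 1
                hits.extend(r for r in part
                            if msisdn in (r["calling"], r["called"]))
            day += timedelta(days=1)
        return hits, scanned


store = PartitionedStore(max_days=3)
first_day = date(2014, 4, 20)
for offset in range(4):
    store.load({"call_date": first_day + timedelta(days=offset),
                "calling": "111", "called": "222"})

hits, scanned = store.search(first_day + timedelta(days=2),
                             first_day + timedelta(days=3), "222")
print(len(hits), "records found,", scanned, "partitions scanned")
```

The search touches only the partitions inside the requested period, which is why, in this model, the cost of a one-week query does not depend on how far back the week lies.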

Key specifications:

 

      • Up to 100 concurrent queries on a single system, in parallel with the loading of new data;
      • Processing of no fewer than one million queries by subscriber ID per day, with results of up to several billion records per query;
      • Query result fetching at 1000 to 100 000 records per second (depending on how the records are spread among disks);
      • Storage capacity = number of nodes * capacity of one node (TB);
      • Base models contain up to 36 nodes; storage capacity can be expanded up to 20 petabytes by increasing the number of nodes;
      • Capacity expansion only by adding standard storage nodes;
      • Hewlett-Packard ProLiant DL/SL series (2U/4U) hardware;
      • OS Red Hat Enterprise Linux.
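
The capacity formula in the list above is simple enough to check against the table of basic models. In the sketch below the function names are hypothetical, and the per-node capacities of 18 TB and 120 TB are not stated in the source; they are inferred from the 24U row (180 TB on 10 nodes, 480 TB on 4 nodes).

```python
import math


def storage_capacity_tb(nodes, node_capacity_tb):
    """Storage capacity = number of nodes * capacity of one node (TB)."""
    return nodes * node_capacity_tb


def nodes_needed(target_tb, node_capacity_tb):
    """Smallest number of identical nodes that reaches the target capacity."""
    return math.ceil(target_tb / node_capacity_tb)


print(storage_capacity_tb(10, 18))   # 24U model with the smaller nodes
print(storage_capacity_tb(4, 120))   # 24U model with the larger nodes
print(nodes_needed(20_000, 120))     # nodes for 20 PB, assuming 120 TB nodes
```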

How to connect to the appliance:

 

      • ODBC interface (libraries for Linux/Windows x64) with a specialized SQL dialect (includes an SDK with demo examples of working with a repository);
      • NoSQL interface (includes an SDK with query examples and results for specific types of queries), which returns results in real time by eliminating the overhead of the SQL interface;
      • As input, the system receives text files containing CDR records (CSV/fixed-length);
      • Loading and indexing of CDR data at 100-400 MB/sec (roughly 0.8 Gbit/sec to 3 Gbit/sec), depending on the model.
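
The two input formats mentioned above, CSV and fixed-length text, differ only in how a line is cut into fields. The Python sketch below shows both for the same record. The field order, the field widths, and the sample values are hypothetical, since the source does not publish the input layout; only the field names are taken from the pseudo-SQL later in this document.

```python
import csv
import io

# Hypothetical layout: the names follow the document's pseudo-SQL,
# but the order and widths are assumptions made for illustration.
FIELDS = ["call_date", "calling", "called", "imsi", "imei", "base_station"]
WIDTHS = [19, 15, 15, 15, 15, 11]


def parse_csv(text):
    """Parse comma-separated CDR lines into dicts keyed by FIELDS."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text), fieldnames=FIELDS)]


def parse_fixed(text):
    """Parse fixed-length CDR lines by slicing each line at the WIDTHS."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        record, pos = {}, 0
        for name, width in zip(FIELDS, WIDTHS):
            record[name] = line[pos:pos + width].strip()
            pos += width
        records.append(record)
    return records


sample = ["2014-04-23 23:36:14", "38640000001", "38640000002",
          "293400000000001", "350000000000001", "00010-00042"]
csv_text = ",".join(sample) + "\n"
fixed_text = "".join(value.ljust(width) for value, width in zip(sample, WIDTHS)) + "\n"

print(parse_csv(csv_text) == parse_fixed(fixed_text))
```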

SNTT helps its customers develop storage solutions that are not covered by the basic models. The maximum available CDR storage size of 20 PB corresponds to 20 system cabinets of 42U each.

Subscriber base size of a telecom operator and the approximate daily volume of CDR information

 

Maximum storage term (months) without overwriting the old information, by model:

| Number of subscribers | Approximate daily number of CDRs* | Minimum | 24U | 42U | 2x42U |
|---|---|---|---|---|---|
| 500 000 | 150 million | 37 | 183 | 326 | 652 |
| 1 000 000 | 300 million | 18 | 92 | 163 | 326 |
| 2 000 000 | 600 million | 9 | 46 | 81 | 163 |
| 3 000 000 | 900 million | 6 | 31 | 54 | 109 |
| 5 000 000 | 1.5 billion | 4 | 18 | 33 | 65 |
| 10 000 000 | 3 billion | 2 | 9 | 16 | 33 |
| 20 000 000 | 6 billion | - | - | 8 | 16 |

* – an estimate of the maximum amount of information per day.
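
The retention terms in the table scale with capacity and inversely with the daily CDR volume, which the following Python sketch makes explicit. Two inputs are assumptions and are not stated in the source: the stored footprint of 218 bytes per CDR is back-computed so that the 500 000-subscriber row is roughly reproduced, and a month is taken as 30 days with decimal terabytes. The figure of 300 CDRs per subscriber per day follows from the table itself (150 million CDRs for 500 000 subscribers).

```python
CDRS_PER_SUBSCRIBER_PER_DAY = 300    # 150 million CDRs / 500 000 subscribers (from the table)
ASSUMED_BYTES_PER_STORED_CDR = 218   # assumption: back-computed, not given in the source
DAYS_PER_MONTH = 30                  # assumption


def retention_months(capacity_tb, subscribers,
                     bytes_per_cdr=ASSUMED_BYTES_PER_STORED_CDR):
    """Months of CDRs that fit before the oldest data must be overwritten."""
    daily_bytes = subscribers * CDRS_PER_SUBSCRIBER_PER_DAY * bytes_per_cdr
    return capacity_tb * 10**12 / daily_bytes / DAYS_PER_MONTH


# Usable capacities of the basic models (smaller-node variants from the model table)
for model, capacity_tb in [("Minimum", 36), ("24U", 180), ("42U", 320), ("2x42U", 640)]:
    print(model, round(retention_months(capacity_tb, 500_000)))
```

Doubling the subscriber base halves the retention term, which matches the pattern running down each column of the table.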

Usage:

 

      • As the major component in building systems for the long-term storage and retrieval of CDR information;
      • As the main data storage in a TSP's legal control systems for compliance;
      • As a permanent archive of selected information about subscribers' CDRs in monitoring centers.

The customer orders an appliance model suited to the objectives of a specific project and integrates it into its own hardware and software infrastructure.

SNTT selects the model, assembles it, designs the database structure for the customer's specific task, tests the system, and assists in integrating the storage into the customer's hardware and software infrastructure.

Highly qualified SNTT specialists and the unique experience gained in implementing CDR storage systems provide the best consulting and methodological support of the projects.

Approximate performance of a system with 1-3 nodes *

 

| Pseudo-SQL | Approximate result size, rows | Execution time (sec), 1 node* | Execution time (sec), 2 nodes* | Execution time (sec), 3 nodes* | Processing period | Number of processed records (CDR) |
|---|---|---|---|---|---|---|
| SELECT * FROM calls WHERE call_date > T1 AND call_date < T2 AND (calling='XXXX' OR called='XXXX') | 0 | 0.2 | 0.2 | 0.2 | 1 day | 1 billion |
| (same query) | <10 | 0.3 | 0.2 | 0.2 | 1 day | 1 billion |
| (same query) | 200 | 1 | 0.5 | 0.5 | 1 day | 1 billion |
| SELECT * FROM calls WHERE call_date > T1 AND call_date < T2 AND base_station='XXXXX-YYYYY' | 1052 (cell boundaries lost in the source) | | | | 1 day | 1 billion |
| SELECT * FROM calls WHERE call_date > T1 AND call_date < T2 AND imsi='XXXX' | <100 | 0.5 | 0.2 | 0.2 | 1 day | 1 billion |
| SELECT * FROM calls WHERE calling='XXXX' OR called='XXXX' | 1 | 1 | 1 | 1 | 31 days | 31 billion |
| SELECT * FROM calls WHERE imei='XXXX' | 1000 | 10 | 5 | 2 | 31 days | 31 billion |
| SELECT * FROM calls WHERE calling='XXXX' OR called='XXXX' | 10 000 | 15 | 7 | 3 | 31 days | 31 billion |
| SELECT count(*) FROM calls WHERE calling='XXXX' OR called='XXXX' | 10 000 | 6 | 3 | 3 | 31 days | 31 billion |

* When data is distributed across N nodes during recording (the “window”, which is set in the configuration at system installation), search speed increases linearly, by a factor of N. The query run times in the table include fetching the query result and writing it to external CSV text files, i.e. producing the full result.
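
The pseudo-SQL in the table can be tried out on a tiny data set. The Python sketch below uses SQLite purely as a stand-in: it is not the appliance's SQL dialect or ODBC interface, and the table layout and sample rows are assumptions made for illustration. It also shows why the date-range query needs parentheses around the `OR`: in standard SQL, `AND` binds more tightly, so without them rows outside the period would match on `called`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls (call_date TEXT, calling TEXT, called TEXT, "
    "imsi TEXT, imei TEXT, base_station TEXT)"
)
# Hypothetical sample records; ISO timestamps compare correctly as text.
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("2014-04-22 10:00:00", "111", "222", "i1", "e1", "00001-00010"),
        ("2014-04-23 09:30:00", "222", "333", "i2", "e2", "00001-00011"),
        ("2014-04-23 18:45:00", "333", "444", "i3", "e3", "00001-00010"),
        ("2014-04-25 08:00:00", "444", "222", "i4", "e4", "00002-00020"),
    ],
)


def calls_for_number(conn, number, t1, t2):
    """All activities of a number (as calling or called party) inside (t1, t2)."""
    sql = (
        "SELECT * FROM calls "
        "WHERE call_date > ? AND call_date < ? "
        "AND (calling = ? OR called = ?) "
        "ORDER BY call_date"
    )
    return conn.execute(sql, (t1, t2, number, number)).fetchall()


def count_for_number(conn, number):
    """Number of activities of a number over the whole retained period."""
    sql = "SELECT count(*) FROM calls WHERE calling = ? OR called = ?"
    return conn.execute(sql, (number, number)).fetchone()[0]


in_range = calls_for_number(conn, "222", "2014-04-23 00:00:00", "2014-04-24 00:00:00")
total = count_for_number(conn, "222")
print(len(in_range), "activity in the period,", total, "in total")
```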


Address

1000 Ljubljana, Republika Slovenija, Tomacevska cesta 046
Phone: +386 1 5374052
Fax: +386 1 5374054
Email: sntt@sntt.si

Disclaimer

© 2013 - 2018 Copyright by SNTT Company. All rights reserved. Using information from this site without our consent is prohibited.