Data Centric Model with Cloudera: Caser Success Story


In this post we explain how a Data Centric model is developed in practice, based on a success story: that of Caser, one of the most important multinational insurance companies in the country.

Caser is an insurance and services company that has been operating in Spain since 1942. Its offering includes health insurance, medical attention and care centers, dental clinics, maintenance and assistance services, and a network of financial agents. The group currently has a staff of more than 6,700 people.


At Caser they already had a Data Centric-oriented model, with a main core based on a Data Warehouse, but they needed to improve some aspects and provide new capabilities.

Their objective was to improve the scalability and security of massive processes. They were also interested in adding new capabilities in data governance, lineage, cataloging, access control and auditing.

The purpose was to build a self-service analytical BI platform in which users could obtain the widest variety of information as quickly and as conveniently as possible. The need was to gain agility.

Hugo González, Head of the Information Structures and Advanced Analytics area at Caser, tells us about all this in an interview he recently gave us:

“The challenge for us is to have a platform that not only holds the data and can provide it, but does so in a much faster, more agile way, so that the data is as accessible as possible throughout the company. Data is a key asset,” says Hugo González, “and even more so in a financial company like ours. Facilitating access to data and the incorporation of new information that can be exploited in any context, in the simplest way possible, is a challenge, but it is what we are looking for.”

If you are interested in the topic, we recommend the full interview:


First decision: which platform to bet on

At Caser they analyzed different technologies and assessed a multitude of options and platforms available on the market. They decided to bet on Big Data technology and, specifically, on a Big Data cluster with Cloudera.

The commitment to Big Data, as part of the insurer's digital transformation process, has strengthened the Data Centric model through a single open platform, Cloudera, which provides everything needed in a managed way.

Getting Started with Big Data Implementation with Cloudera

A Data Centric strategy such as the one carried out by Caser requires a multi-year roadmap.

In Caser's case, the first step was to design and implement a Data Lake to start storing information. They replaced the Data Warehouse Staging Area with a Data Lake in Cloudera. In practical terms, this meant replacing the relational database with the Data Lake.

In this context they used technologies such as Apache Impala, Apache Kudu and Apache Spark: Impala for query performance, Kudu because the data needed to be modified, and Spark to recode SQL processes at scale.
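To make the role of Kudu more concrete, here is a minimal sketch of what "modifying data" by key looks like, as opposed to an append-only store. It is an illustration only: it uses Python's built-in SQLite as a stand-in for a mutable table, not Kudu itself, and the table and column names (policies, policy_id, status) are hypothetical and do not come from Caser's project.

```python
import sqlite3

# In-memory SQLite stands in for a mutable, keyed table.
# Assumption: this is NOT Kudu; it only illustrates write-by-key semantics.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policies (policy_id INTEGER PRIMARY KEY, status TEXT)"
)

def upsert_policy(policy_id, status):
    """Insert the row, or replace it if a row with the same key exists."""
    conn.execute(
        "INSERT OR REPLACE INTO policies (policy_id, status) VALUES (?, ?)",
        (policy_id, status),
    )

upsert_policy(1, "active")
upsert_policy(2, "active")
upsert_policy(1, "cancelled")  # same key: the existing row is replaced

rows = conn.execute(
    "SELECT policy_id, status FROM policies ORDER BY policy_id"
).fetchall()
print(rows)
```

After the three calls the table holds two rows, not three, and policy 1 shows its latest status. A store that only supports appends would need a separate compaction or "latest record" step to get the same result.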

They also opted to keep technologies they had used for years, such as their ETL tool, to take advantage of accumulated knowledge and experience. With this decision they were able to minimize risks and meet their objectives within the planned time.

Successful results

One of Caser's objectives in this transformation process was that the change to a technology totally different from the one they had would be transparent to the business, its users and its processes. In this respect, the success has been complete.

Another objective achieved with the Data Lake is to centralize the data in a single place, where it can be used for reporting and analytics, and even integrated with production applications via REST API to obtain real-time data for fraud detection.

Collaboration between PUE and Caser

“With PUE we have formed a team in which each party has contributed its best. We know Caser's data; they bring their knowledge and experience in technology.”

So says Hugo González of Caser, referring to the collaboration with PUE.

PUE has collaborated with Caser since the initial phase of the digital transformation process, when the Data Warehouse Staging Area was replaced with a Data Lake on Cloudera. It has participated in the entire project, providing technological solutions that range from getting the ETL tool to connect quickly and efficiently to the Data Lake to designing and changing all the processes. A data architecture capable of improving the load performance of the Data Lake has been designed and implemented, allowing more efficient data analysis in less time:

  • Conversion of processes from traditional programming languages, such as PL/SQL, to Spark processes written in Scala, reducing the running time of data analysis processes from hours to minutes and allowing the business to be more agile and to have a diagnosis almost in real time.
  • Cluster configuration and security policies.
  • Integration of processes from PowerCenter to Cloudera Platform.
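The kind of gain described in the first point usually comes from turning a row-at-a-time procedure into an aggregation that can be split across partitions and merged afterwards. The following is a minimal sketch of that idea only. It is written in plain Python rather than Spark or Scala, the partitions are processed one after another instead of in parallel, and the claim records and function names are invented for illustration; none of it is taken from Caser's actual processes.

```python
from collections import defaultdict

# Hypothetical claim records: (branch, amount). Invented for illustration.
claims = [
    ("health", 120.0), ("dental", 40.0), ("health", 80.0),
    ("home", 300.0), ("dental", 60.0), ("home", 50.0),
]

def total_row_by_row(records):
    """Cursor-style loop: a single pass over every row on one worker."""
    totals = defaultdict(float)
    for branch, amount in records:
        totals[branch] += amount
    return dict(totals)

def split(records, n):
    """Deal the records into n partitions, round-robin."""
    return [records[i::n] for i in range(n)]

def total_partitioned(records, n_partitions=3):
    """Aggregate each partition independently, then merge the partial totals."""
    # Each partial result depends only on its own partition, so a cluster
    # engine could compute these on different machines at the same time.
    partials = [total_row_by_row(part) for part in split(records, n_partitions)]
    merged = defaultdict(float)
    for partial in partials:
        for branch, subtotal in partial.items():
            merged[branch] += subtotal
    return dict(merged)

print(total_partitioned(claims))
```

Both functions return the same totals per branch. The difference is that the partitioned version never needs all the rows in one place, which is the property that lets an engine such as Spark spread the work across a cluster.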

Currently, a second phase of collaboration has begun, with one of the company’s most massive processes.

“I would highlight PUE's experience in Big Data, but also its commitment and collaboration.”

Hugo González.

If your company is considering or needs to start a process of modernization and transformation, we will be happy to analyze your particular case and advise you on the most appropriate solutions and technologies for your project.
