
An Oracle White Paper
September 2013

Oracle Enterprise Transformation Solutions Series
Big Data & Analytics Reference Architecture

Executive Overview
Introduction
Reference Architecture Conceptual View
Focus Areas
Unified Information Management
Real-Time Analytics
Intelligent Processes
Information
Deployment
Architecture Principles
Reference Architecture Logical View
Information Management Components of the Logical Architecture
Real-Time Analytics Components of the Logical Architecture
Intelligent Process Components of the Logical Architecture
Oracle Product Mapping View
Information Management Product Mapping
Real-Time Analytics Product Mapping
Intelligent Processes Product Mapping
Oracle Engineered Systems
Implementation
Conclusion
Further Reading
IT Strategies from Oracle
Other References

Executive Overview

Data is often considered to be the crown jewels of an organization. It can be used in myriad ways to run the business, market to customers, forecast sales, measure performance, gain competitive advantage, and discover new business opportunities. And lately, a convergence of new technologies and market dynamics has opened a new frontier for information management and analysis.

This new wave of computing involves data with far greater volume, velocity, and variety than ever before. Big Data, as it is called, is being used in ingenious ways to predict customer buying habits, detect fraud and waste, analyze product sentiment, and react quickly to events and changes in business conditions. It is also a driving force behind new business opportunities.

Most companies already use analytics in the form of reports and dashboards to help run their business. This is largely based on well structured data from operational systems that conform to pre-determined relationships. Big Data, however, doesn’t follow this structured model. The streams are all different and it is difficult to establish common relationships. But with its diversity and abundance come opportunities to learn and to develop new ideas – ideas that can help change the business.

To run the business, you organize data to make it do something specific; to change the business, you take data as-is and determine what it can do for you. These two approaches are more powerful together than either alone. In fact, many innovative solutions are a combination of both approaches.

For instance, a major European car manufacturer is collecting data via telematics from cars they produce. This data is used to influence offers they make to their customers. It is also used to better understand the conditions that the car has experienced, which in turn helps in root-cause failure analysis as well as in future automobile design.

The architectural challenge is to bring the two paradigms together. So, rather than approach Big Data as a new technology silo, an organization should strive to create a unified information architecture – one that enables it to leverage all types of data, as situations demand, to promptly satisfy business needs. This is the approach taken by a large worldwide bank. They are using a common information architecture design to drive both their real-time trading platforms and their batch reporting systems.

The objective of this paper is to define and describe a reference architecture that promotes a unified vision for information management and analytics. The reference architecture is defined by the capabilities an organization needs and a set of architecture principles that are commonly accepted as best practices in the industry. It is described in terms of components that achieve the capabilities and satisfy the principles. Oracle products are mapped to the architecture in order to illustrate how the architecture can be implemented and deployed. Organizations can use this reference architecture as a starting point for defining their own unique and customized architecture.

Introduction

In order to approach Big Data and analytics holistically, it is important to consider what that means. The strategy used to develop this reference architecture includes three key points to set the context:

1. Any data, any source. Rather than differentiate Big Data from everything else (small data?), we want to view data in terms of its qualities. This includes its degree of structure, volume, method of acquisition, historical significance, quality, value, and relationship to other forms of data. These qualities will determine how it is managed, processed, used, and integrated.
2. Full range of analytics. There are many types of analysis that can be performed, by different types of users (or systems), using many different tools, and through a variety of channels. Some types of analysis require current information and others work mostly with historical information. Some are performed proactively and others are reactive. The architecture design must be universal and extensible to support a full range of analytics.
3. Integrated analytic applications. Intelligence must be integrated with the applications that knowledge workers use to perform their jobs. Likewise, applications must integrate with information and analysis components in a manner that produces consistent results. There must be consistency from one application to another, as well as consistency between applications, reports, and analysis tools.

This reference architecture is designed to address key aspects of these three points. Specifically, the architecture is organized into views that highlight three focus areas: unified information management, real-time analytics, and intelligent processes. They represent architecturally significant capabilities that are important to most organizations today.

Reference Architecture Conceptual View

The conceptual view for the reference architecture, shown in Figure 1, uses capabilities to provide a high-level description of the Big Data and Analytics solution.

Figure 1. Big Data & Analytics Reference Architecture Conceptual View

The top layer of the diagram illustrates support for the different channels that a company uses to perform analysis or consume intelligence information. It represents delivery over multiple channels and modes of operation: stationary and mobile, (network) connected and disconnected.

Focus Areas

This paper concentrates on three important aspects of the Big Data and analytics architecture: Unified Information Management, Real-Time Analytics, and Intelligent Processes. Each of these focus areas is further detailed below.

It should be noted that although the reference architecture is organized into these three focus areas, the solution cannot be implemented as silos of functionality. Rather, the complete solution must incorporate all aspects of the reference architecture in a cohesive manner.

Unified Information Management

Unified Information Management addresses the need to manage information holistically as opposed to maintaining independently governed silos. At a high level this includes:

• High Volume Data Acquisition – The system must be able to acquire data despite high volumes, velocity, and variety. It may not be necessary to persist and maintain all data that is received. Some may be ignored or discarded while others are kept for various amounts of time.
• Multi-Structured Data Organization and Discovery – The ability to navigate and search across different forms of data can be enhanced by the capability to organize data of different structures into a common schema. Using this form of organization, the system can relate structured data such as model numbers and specifications, semi-structured data such as product documents, and unstructured data such as installation videos. In addition, new business opportunities can be discovered by looking at different forms of data in new ways.
• Low Latency Data Processing – Data processing can occur at many stages of the architecture. In order to support the processing requirements of Big Data, the system must be fast and efficient.
• Single Version of the Truth – When two people perform the same form of analysis they should get the same result. As obvious as this seems, it isn’t necessarily a small feat, especially if the two people belong to different departments or divisions of a company. Single version of truth requires architecture consistency and governance.

Real-Time Analytics

Real-Time Analytics enables the business to leverage information and analysis as events are unfolding. At a high level this includes:

• Speed of Thought Analysis – Analysis is often a journey of discovery, where the results of one query determine the content of the next. The system must support this journey in an expeditious manner. System performance must keep pace with the users’ thought process.
• Interactive Dashboards – Dashboards provide a heads-up display of information and analysis that is most pertinent to the user. Interactive dashboards allow the user to immediately react to information being displayed, providing the ability to drill down and perform root cause analysis of situations at hand.
• Advanced Analytics – Advanced forms of analytics, including data mining, machine learning, and statistical analysis enable businesses to better understand past activities and spot trends that can carry forward into the future. Applied in real-time, advanced analytics can enhance customer interactions and buying decisions, detect fraud and waste, and enable the business to make adjustments according to current conditions.
• Event Processing – Real-time processing of events enables immediate responses to existing problems and opportunities. It filters through large quantities of streaming data, triggering predefined responses to known data patterns.
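
To make the Event Processing capability more concrete, the following minimal Python sketch filters a small stream of events and triggers a predefined response whenever an event matches a known pattern. It is an illustration only: the Event and Rule types, the rule thresholds, and the sample events are hypothetical and are not taken from any Oracle product.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Sequence


@dataclass(frozen=True)
class Event:
    """One item from a data stream (hypothetical shape)."""
    source: str
    kind: str
    value: float


@dataclass(frozen=True)
class Rule:
    """Pairs a known data pattern with its predefined response."""
    name: str
    matches: Callable[[Event], bool]
    respond: Callable[[Event], str]


def process(events: Iterable[Event], rules: Sequence[Rule]) -> Iterator[str]:
    """Yield the response of every rule whose pattern an event matches.

    Events that match no rule are simply filtered out.
    """
    for event in events:
        for rule in rules:
            if rule.matches(event):
                yield rule.respond(event)


# Hypothetical rules; the thresholds are assumptions for illustration.
RULES = [
    Rule(
        name="large-withdrawal",
        matches=lambda e: e.kind == "withdrawal" and e.value > 10_000,
        respond=lambda e: f"flag {e.source} for fraud review",
    ),
    Rule(
        name="engine-overheat",
        matches=lambda e: e.kind == "engine_temp" and e.value > 120.0,
        respond=lambda e: f"alert service desk about {e.source}",
    ),
]

# Hypothetical sample stream mixing banking and telematics events.
STREAM = [
    Event("account-17", "withdrawal", 250.0),
    Event("car-42", "engine_temp", 131.5),
    Event("account-17", "withdrawal", 25_000.0),
    Event("car-42", "engine_temp", 90.0),
]

responses = list(process(STREAM, RULES))
```

In this sketch, responses holds one alert for the overheating car and one fraud flag for the large withdrawal, while the other two events are filtered out. A production event processing engine would typically add time windows, correlation across events, and asynchronous delivery, all of which are omitted here.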

Intelligent Processes

A key objective for any Big Data and Analytics program is to execute business processes more effectively and efficiently. This means channeling the intelligence one gains from analysis directly into the processes that the business is performing. At a high level this includes:

• Application-Embedded Analysis – Many workers today can be classified as knowledge workers; they routinely make decisions that affect business performance. Embedding analysis into the applications they use helps them to make more informed decisions.
• Optimized Rules and Recommendations – Automated processes can also benefit from analysis. This form of business process executes using pre-defined business logic. With optimized rules and recommendations, insight from analysis is used to influence the decision logic as the process is being executed.
• Guided User Navigation – Some processes require users to take self-directed action in order to investigate an issue and determine a course of action. Whenever possible the system should leverage the information available in order to guide the user along the most appropriate path of investigation.
• Performance and Strategy Management – Analytics can also provide insight to guide and support the performance and strategy management processes of a business. It can help to ensure that strategy is based on sound analysis. Likewise, it can track business performance versus objectives in order to provide insight on strategy achievement.

Information

The Big Data and Analytics architecture incorporates many different types of data, including:

• Operational Data – Data residing in operational systems such as CRM, ERP, warehouse management systems, etc., is typically very well structured. This data, when gathered, cleansed, and formatted for reporting and analysis purposes, constitutes the bulk of traditional structured data warehouses, data marts, and OLAP cubes.
• COTS Data – Commercial off-the-shelf (COTS) software is frequently used to support standard business processes that do not differentiate the business from other similar businesses. COTS applications often include analytical packages that function as pre-engineered data marts. COTS analytical data, transformed from operational data, can also be incorporated into the data warehouse to support analysis across business processes.
• Content – Documents, videos, presentations, etc., are typically managed by a content management system. These forms of information can be linked to other forms of data to support navigation, search, analysis, and discovery across data types.
