DWH News
Closing the Small Business Cloud Knowledge Gap
According to AMI-Partners, worldwide small businesses (SBs) (1-99 employees) represent a significant opportunity for Cloud services, particularly Software as a Service (SaaS). However, one in five SBs currently not using SaaS report that their businesses are too small for such applications. Significant subsets of these small businesses believe that they do not have the appropriate expertise to make the migration to hosted applications.
What's Essential -- And What's Not -- In Big Data Analytics
Big Data is hot. Advanced Analytics is hot. The combination (which we’ll call Big Analytics) is blazing. For that very reason, no one seems to agree on the right way to do Big Analytics.
Some vendors -- whom we’ll call the Columnar-Haves, including vendors marketing columnar analytic databases -- claim that columnar is the structure for Big Analytics. Others, whom we’ll call the Columnar-Have-Nots, claim that traditional row-based data stores (coupled with massively parallel processing, or MPP capability) give shops considerably more flexibility than their column-based counterparts.
Talend Enhances Data Integration with MapReduce Support
Open source data integration (DI) specialist Talend recently announced support for Apache Hadoop, the increasingly ubiquitous open source implementation of Google Inc.'s MapReduce algorithm.
Talend isn't the first business intelligence (BI) vendor to embrace MapReduce. Its peers in the analytic database arena -- namely, Aster Data Systems Inc. and the former Greenplum Software Inc. (which was acquired by EMC Corp.) -- first announced MapReduce support nearly two years ago. (Greenplum also supported Hadoop before it announced its own -- i.e., native -- implementation of the MapReduce algorithm.)
THE DANGERS OF MAKING A REFERRAL
Oracle BI EE 11g – Handling Double Columns – ID/Description interoperability
The other big change in BI EE 11g, as mentioned here before, is the ability to assign ID columns to descriptive columns, more commonly known as Double Columns. This feature has two advantages:
1. In BI EE 10g, there was no automated way of filtering on IDs when end users chose description values in the prompts. The Double Column feature provides this ability in 11g.
2. In many implementations where data is captured in multiple languages, the descriptions might be stored in different languages, but the data should be filtered on IDs (which are the same across languages). The Double Column feature now provides that ability.
Let's put this feature to use with a simple example. We shall use the Sales History schema (SH) that comes by default with an Oracle database installation. The screenshot below shows 3 columns from the CHANNELS dimension. The first column is CHANNEL_DESC, which contains the channel details in English. The second column is CHANNEL_DESC_FR, which contains the channel details in French. The third column is CHANNEL_ID, which acts as the ID for both the French and English descriptions.
Our idea is to create 2 prompts, one in French and the other in English, and then use them to filter 2 separate reports. To do this, we start by assigning the CHANNEL_ID column as the descriptor ID column in the Business Model and Mapping layer for both the CHANNEL_DESC and CHANNEL_DESC_FR columns.
Once this is done, let's go to the BI EE UI and create 2 dashboard prompts. Strictly speaking, it's not necessary to use 2 prompts, as we can use the INDEXCOL function in the repository to switch between columns based on the user's preferred language, but for demonstration I will create 2 prompts.
The first dashboard prompt will point to the French description field. When we include this column in the prompt, you will notice that the prompt will automatically show the Included ID column as well.
If you look at the options section, we now have the ability to display the Descriptor ID as well.
Let's enable that option too, so that users who are more familiar with the IDs can toggle between the description and the ID.
Let's save this prompt and create another one that is similar but uses CHANNEL_DESC as the source column.
Let's now create a simple report in Answers containing the Year, Product Category and Sales columns, and apply the channel filter (on English) as well. When you create the prompt, you will notice that for static filters you can now enforce the filtering on IDs directly. For this blog entry, however, let's use the "is prompted" filter.
If you now bring the report and the prompt into a dashboard, end users will be able to filter on the description as well as the ID.
If you enable the Select by ID check box, you will notice that the drop-down now shows the ID and the description concatenated for easy selection.
The same applies to the French descriptions.
Let's first choose the French descriptions and then see how the query filter is generated in SQL:
WITH SAWITH0 AS (
  select sum(T69590.AMOUNT_SOLD) as c1, T69588.PROD_CATEGORY as c2, T69591.CALENDAR_YEAR as c3, T69591.CALENDAR_YEAR_ID as c4
  from SH.TIMES T69591, SH.PRODUCTS T69588, SH.CHANNELS T69584, SH.SALES T69590
  where ( T69584.CHANNEL_ID = T69590.CHANNEL_ID and T69588.PROD_ID = T69590.PROD_ID and T69590.TIME_ID = T69591.TIME_ID and (T69584.CHANNEL_ID in (2, 4)) and (T69590.CHANNEL_ID in (2, 4)) )
  group by T69588.PROD_CATEGORY, T69591.CALENDAR_YEAR, T69591.CALENDAR_YEAR_ID),
SAWITH1 AS (
  select distinct 0 as c1, D1.c2 as c2, D1.c3 as c3, D1.c1 as c4, D1.c4 as c5
  from SAWITH0 D1)
select D1.c1 as c1, D1.c2 as c2, D1.c3 as c3, D1.c4 as c4
from SAWITH1 D1
order by c1, c3, c2
As you can see, although we chose the descriptions in the UI, the filters are automatically pushed to the IDs. The same applies when filtering on IDs.
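To make the description-to-ID translation concrete, here is a minimal Python sketch of the idea. It is not how the BI Server implements Double Columns. The channel rows, the function names and the language codes are all assumptions made for illustration; only the column names and the resulting ID list (2, 4) mirror the query above.

```python
# Hypothetical channel rows: (CHANNEL_ID, CHANNEL_DESC, CHANNEL_DESC_FR).
# The IDs and descriptions are illustrative, not read from the SH schema.
CHANNELS = [
    (2, "Partners", "Partenaires"),
    (3, "Direct Sales", "Ventes directes"),
    (4, "Internet", "Internet"),
]


def descriptions_to_ids(selected, language="en"):
    """Translate the descriptions picked in a prompt into descriptor IDs."""
    column = {"en": 1, "fr": 2}[language]  # position of the description column
    lookup = {row[column]: row[0] for row in CHANNELS}
    return sorted(lookup[desc] for desc in selected)


def id_filter(selected, language="en"):
    """Build an ID-based predicate shaped like the one in the generated SQL."""
    ids = descriptions_to_ids(selected, language)
    return "CHANNEL_ID in ({})".format(", ".join(str(i) for i in ids))


# Both languages resolve to the same ID predicate.
print(id_filter(["Partenaires", "Internet"], language="fr"))
print(id_filter(["Partners", "Internet"], language="en"))
```

Whichever language the user picks in the prompt, the predicate that reaches the database is the same, which is the point of filtering on the shared ID.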
Dirty Data's Domino Effect
DataMentors Releases Low-Cost Data Quality Solution
DataMentors LLC, a full-service Data Quality and Data Integration Solutions Company, today announced the launch of DataPoint, a bundled data cleansing and business intelligence solution suited to companies with 5,000 to 100,000 customers.
DataPoint combines the power of DataMentors' DataFuse, a world-class data quality and householding software with PinPoint, its state-of-the-art marketing analytics software solution. However, DataPoint is a lighter version of these more robust products, offered at a fraction of the cost of typical enterprise-sized solutions.
Lafayette Federal Credit Union picks SAS® for predictive insight from operational data
Facing increased competition, credit unions struggle to work smarter while still providing superior member service. Lafayette Federal Credit Union (LFCU) wanted to predict and prepare for events and issues rather than simply react. With an influx of valuable data from multiple sources including core processing systems, LFCU needed to revamp its data infrastructure so it could consolidate and analyze data better. The company selected SAS Banking Intelligence Architecture from SAS, the leader in business analytics, to seamlessly integrate data with comprehensive reporting and predictive analysis.
Business Intelligence Fuels Growth for Small and Midsize Customers
To keep growth on track in a challenging economic climate, small and midsize companies need immediate, holistic insight across their organizations. As they grow successfully, these companies also need solutions that can easily scale to keep pace. In order to address these requirements, SAP AG delivers market-leading business intelligence (BI) solutions that empower employees with intuitive access to information, giving them the visibility necessary to make smarter decisions. Recognizing that small and midsize customers often have limited IT budgets, these BI solutions can be simply deployed and administered with minimal resources. Customers such as Anna's Linens, Central Indiana Community Foundation, GENBAND, OraSure Technologies and NC4 currently rely on BI solutions from the SAP® BusinessObjects™ portfolio to keep business strategy aligned with execution.
Oracle BI EE 11g – Lookup Tables – Sparse and Dense Lookups
A very important feature introduced in 11g is the ability to model lookup tables in the repository. If you have worked with ETL tools before, you will know that lookup tables are quite common, especially when there are many lookup operations to perform (ID-to-description mappings). In 10g, the only way to model lookup tables was to create joins (equi-joins or outer joins) to the lookup tables through the Logical Table Sources. In 11g, lookups can reference both physical tables and logical tables.
There are 2 types of lookup tables.
1. Sparse Lookups – A sparse lookup means that the main driving table does not necessarily have a corresponding value in the lookup table for every ID value. It can be considered the equivalent of a left outer join.
2. Dense Lookups – A dense lookup means that the main driving table has a matching value in the lookup table for each of its unique ID values. It can be considered the equivalent of an inner join.
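The join equivalence of the two lookup types can be sketched with Python's built-in sqlite3 module. This is only an illustration of the semantics, not the SQL that BI EE generates. The two small tables and their rows are hypothetical, and COALESCE stands in for the default value that a sparse lookup supplies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE customer_lkp (cust_id INTEGER PRIMARY KEY, cust_income_level TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Customer 3 deliberately has no row in the lookup table.
    INSERT INTO customer_lkp VALUES (1, 'High'), (2, 'Low');
""")

# Sparse lookup: keep every driving row and default the missing lookup values.
sparse_rows = conn.execute("""
    SELECT c.cust_id, COALESCE(l.cust_income_level, 'No Income Defined')
    FROM customers c
    LEFT OUTER JOIN customer_lkp l ON l.cust_id = c.cust_id
    ORDER BY c.cust_id
""").fetchall()

# Dense lookup: assumes every driving row has a match, so an inner join suffices.
dense_rows = conn.execute("""
    SELECT c.cust_id, l.cust_income_level
    FROM customers c
    INNER JOIN customer_lkp l ON l.cust_id = c.cust_id
    ORDER BY c.cust_id
""").fetchall()

print(sparse_rows)
print(dense_rows)
```

With this data the inner join silently drops customer 3, which is why a dense lookup is only safe when every ID is guaranteed to have a lookup row.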
There are 2 different ways of modeling lookup tables. Let's go through each of them in this blog entry.
1. Physical Lookups – To understand physical lookups, let's take the very simple example given below.
We have 2 tables, CUSTOMERS and CUSTOMER_LKP. The CUSTOMERS table has all the details of a customer, with CUST_ID as the unique primary key. CUSTOMER_LKP has 3 columns, CUST_ID, CUST_INCOME_LEVEL and CUST_MARITAL_STATUS, with CUST_ID as the primary key. The main difference between these 2 tables is that not all customers in the main CUSTOMERS table have a corresponding income level and marital status.
To model CUSTOMER_LKP as a lookup table, we first need to define a primary key. If more than one column acts as the primary key, ensure that the key contains all of those columns. In the physical layer, no join is needed between the CUSTOMERS and CUSTOMER_LKP tables.
Now, in the Business Model and Mapping layer, let's create a new column called Customer Income Level. After that is created, let's go to the LTS mapping and apply the following function. If you have more than one column in the primary key, the order of the columns used in the key should match the column order in the Lookup function.
Lookup( SPARSE "ORCL - SH".."SH"."CUSTOMER_LKP"."CUST_INCOME_LEVEL" , 'No Income Defined', "ORCL - SH".."SH"."CUSTOMERS"."CUST_ID")
Here we have directly referenced the lookup value column in our lookup function. Since not all customers have an income level set, SPARSE is used. The basic syntax of lookup functions is given below.
LOOKUP ( SPARSE/DENSE #Lookup value column from the lookup table#, #Default value if there is no lookup value in the lookup table# (only needed for SPARSE lookups), #Primary key columns from the main table# )
Let's now create a report as shown below.
Wherever the Customer Income Level is not defined, those customers default to the No Income Defined value. If we look at the underlying database query, we see a left outer join that is pushed into the query automatically because of this lookup function.
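The argument order of a sparse lookup (value to return, default, then key columns) can be mimicked in a few lines of Python. This is a rough sketch of the semantics only. The function name lookup_sparse, the dictionaries and their contents are hypothetical and are not part of BI EE.

```python
def lookup_sparse(lookup_table, default, *key_values):
    """Return the lookup value for the key, or the default when no row matches."""
    return lookup_table.get(tuple(key_values), default)


# Hypothetical lookup rows keyed by a one-column primary key (CUST_ID,).
CUSTOMER_LKP = {(101,): "High", (102,): "Low"}

customers = [101, 102, 103]  # customer 103 has no lookup row
report = [
    (cust_id, lookup_sparse(CUSTOMER_LKP, "No Income Defined", cust_id))
    for cust_id in customers
]
print(report)

# With a composite key, the key values must be passed in the key's column order.
RATES = {("FR", 1): 1.5}  # hypothetical key: (COUNTRY_ISO_CODE, TIME_ID)
in_order = lookup_sparse(RATES, None, "FR", 1)
out_of_order = lookup_sparse(RATES, None, 1, "FR")
print(in_order, out_of_order)
```

The second half shows why the column order matters: the same values passed in the wrong order simply fail to match and fall back to the default.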
With the Physical Lookups option, the lookup operations are pushed to the database layer (wherever possible). Let's now look at the 2nd approach, where the lookup operations are pushed to the BI Server layer.
2. BMM Lookups:
11g now supports performing lookup operations in its own memory by modeling a logical table as a lookup table. For example, let's consider the currency conversion exchange rates table below.
This rates table has a composite primary key. To use BMM lookups, let's create a new logical table in the BMM layer to hold the exchange rates table, as shown below.
To denote a table as a BMM lookup table, we need to enable the lookup option (notice that the bridge table check box from 10g is gone, and only a lookup option remains).
When we enable a logical table as a lookup table, that means that this logical table does not require any BMM joins to either the fact or the dimension. So we can now have standalone tables in the BMM layer in 11g.
After enabling this, let's create a new logical column for extracting the rates. Since we will have a rate for every country and day, we will use a DENSE lookup in this case. The function used for doing this is given below.
Lookup ( DENSE "SH - Lookups"."Rates Lookup"."RATE" , "SH - Lookups"."Customers"."COUNTRY_ISO_CODE", "SH - Lookups"."Times"."TIME_ID" )
The syntax of the lookup function remains the same as the one we used for physical lookups, but here we need to use logical table names instead of physical table names. To use this rate as part of an FX restatement, multiply the measures by the above rate lookup column. Let's now create a report and look at the query generated.
The BMM lookup is now fired as a separate query. The BI Server performs the in-memory join between the rates and the measures, and does the aggregation as well.
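Conceptually, the in-memory step resembles the following Python sketch: rates are fetched once into a structure keyed by the composite key, each measure row is multiplied by its rate, and the result is aggregated. This is an assumption-laden illustration rather than the BI Server's actual algorithm. The rates, sales rows and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical exchange rates keyed by the composite key (COUNTRY_ISO_CODE, TIME_ID).
RATES = {
    ("US", "2010-01-01"): 1.0,
    ("FR", "2010-01-01"): 1.5,
    ("FR", "2010-01-02"): 2.0,
}

# Hypothetical measure rows: (COUNTRY_ISO_CODE, TIME_ID, AMOUNT_SOLD).
SALES = [
    ("US", "2010-01-01", 100.0),
    ("FR", "2010-01-01", 10.0),
    ("FR", "2010-01-02", 20.0),
]


def dense_lookup(table, *key):
    """Dense lookup: every key is expected to exist, so a miss raises KeyError."""
    return table[key]


def restated_sales_by_country(sales, rates):
    """Join each sales row to its rate in memory, restate it, then aggregate."""
    totals = defaultdict(float)
    for country, time_id, amount in sales:
        totals[country] += amount * dense_lookup(rates, country, time_id)
    return dict(totals)


print(restated_sales_by_country(SALES, RATES))
```

Because the lookup is dense, a missing rate is treated as an error here instead of being defaulted, which matches the assumption that a rate exists for every country and day.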
Suresh Chandrasekaran, Denodo Technologies
A Problem with Success
How SaaS Has Changed The BI Landscape
Stonefield Query Eliminates Business Reporting Bottlenecks
Stonefield Software Inc. announces the latest release of its award-winning software, Stonefield Query Version 4.0. The new release of Stonefield Query helps to eliminate IT bottlenecks by creating a self-service business reporting solution for the casual user.
“Power users and casual users. Power users, typically make up less than 20% of the people in an organization, these include IT developers, super users, and analytical modelers. Casual users, make up 80% or more of the people in an organization, include executives, managers, staff, and business analysts.”
San Antonio Spurs select SAS® Analytics to capitalize on player stats
Like most NBA teams, the San Antonio Spurs spend the offseason poring over player statistics to find better ways to run their sport operations. Using SAS Visual Data Discovery from the leader in business analytics software and services, Spurs staff has found an analytics advantage. SAS Visual Data Discovery brings the Spurs advanced analytics and exploratory data analysis with interactive data visualization, leading to better analyses, faster decisions and more effective presentations of analytic results.
NCN® Selects InetSoft’s Business Intelligence Software
Piscataway, NJ – Aug 16, 2010 - InetSoft Technology, an innovator in dashboard, reporting and mashup solutions, announced that NCN® has selected InetSoft’s Style Intelligence in order to search for additional savings opportunities and provide automated reports to clients externally. NCN is the national leader in cost management for out-of-network claims and network enhancement and uses cost-based data and transparent reporting to maximize savings on healthcare claims.
