Tuesday, July 04, 2006

A Brief on Unstructured Data & Text Analytics - The Next Gen Analytics Niche

(From "Patterns for Success – Options for Analyzing Unstructured Information"; Dr. Fern Halper; Hurwitz & Associates; 6/21/06)
Text analytics is the process of extracting unstructured text and transforming it into structured information that can then be mined and analyzed in various ways. This transformed information can be combined with structured data a company already owns (e.g., sales or demographic data) and analyzed using various predictive and automated discovery techniques. Alternatively, the text can be extracted, transformed, and then analyzed interactively to determine relationships and trends, look for clusters, and so on. The actual extraction of the information is accomplished via techniques from computational linguistics, statistics, and other computer science disciplines. For example, computational linguistic algorithms can parse sentences to extract the who, what, where, when, and why in text.
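To make the unstructured-to-structured idea concrete, here is a minimal sketch of rule-based extraction. This is not how any vendor mentioned below actually implements it (real engines use full parsers and trained models); the pattern and the example sentence are invented purely for illustration.

```python
import re

# Toy rule-based extractor: turns a simple "who did what, when" sentence
# into a structured record that could be loaded alongside other data.
PATTERN = re.compile(
    r"(?P<who>[A-Z][\w ]+?) (?P<what>acquired|launched|announced) "
    r"(?P<object>[\w ]+?) (?:on|in) (?P<when>[A-Z]\w+ \d{4})"
)

def extract(sentence: str) -> dict:
    """Return a dict of named fields, or {} when no pattern matches."""
    m = PATTERN.search(sentence)
    return m.groupdict() if m else {}

record = extract("Oracle acquired Siebel Systems in September 2005.")
print(record)
```

Once text is reduced to records like this, it can be joined with a company's existing structured data and fed to the same mining tools used for sales or demographic figures.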

Text analytics differs from search, although it can be used to augment search. In basic search, end users already know what they are looking for; text analytics can surface relationships and patterns they did not know to ask about. Interestingly, search is now evolving and converging with business intelligence to provide applications that might, for example, monitor news feeds to understand what competitors are doing.

While the field is still evolving, there are a number of players out there worth noting.

  • Business intelligence powerhouses SPSS and SAS both offer solutions in this space tied to their data mining and predictive analysis products. SPSS's Predictive Text Analytics solution combines the linguistic technologies of its LexiQuest text mining products with the data mining capabilities of Clementine. SAS Text Miner is integrated with SAS's Enterprise Miner product and gives users the ability to mine structured and unstructured information. SAS also has technologies for finding relationships between documents.
  • Other companies such as Attensity, Inxight, ClearForest and nStein provide information extraction technologies that can be leveraged in various analytical activities. For example, Attensity offers a number of different extraction techniques together with a series of its own applications that let users interactively explore and analyze information found in text. Attensity also works with third-party software. Inxight provides text extraction software that can be used with its visualization technologies to determine relationships and trends in text data. It also has applications to augment the capabilities of search engines.
  • Companies such as Clarabridge Inc. deal with the preprocessing of text data to make it more useful in business intelligence packages. Its product, the Clarabridge Content Mining Platform, provides connectors to source information, transforms the information using various extraction techniques, performs data quality and staging work on the data, and provides a schema that can serve the information up to various BI packages.
  • Even the big players like IBM, Oracle, and Microsoft are making moves to offer solutions in the text analytics space. IBM has developed the Unstructured Information Management Architecture (UIMA), an open-source framework that defines a common set of interfaces for integrating different text analytic components and applications.
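The extract-transform-stage flow described above can be sketched in miniature. Everything here is hypothetical: the function names, the keyword-based "extraction," and the sample documents bear no relation to any vendor's actual product; the point is only the shape of a pipeline that turns raw text into BI-ready rows with a quality gate in the middle.

```python
from dataclasses import dataclass

@dataclass
class StagedRow:
    """A row in the staging schema a BI tool could consume."""
    doc_id: int
    company: str
    sentiment: str

def extract_entities(doc_id: int, text: str) -> StagedRow:
    # Stand-in for real linguistic extraction: crude keyword rules.
    company = "Acme Corp" if "Acme" in text else "UNKNOWN"
    sentiment = "negative" if "complaint" in text.lower() else "positive"
    return StagedRow(doc_id, company, sentiment)

def quality_check(row: StagedRow) -> bool:
    # Data-quality step: drop rows where extraction failed.
    return row.company != "UNKNOWN"

def stage(documents: list[str]) -> list[StagedRow]:
    rows = (extract_entities(i, t) for i, t in enumerate(documents))
    return [r for r in rows if quality_check(r)]

docs = ["Acme product complaint from a customer.", "Unrelated note."]
print(stage(docs))
```

In a real platform the extraction step would be a linguistic engine and the output would land in a warehouse schema, but the connector → transform → quality → stage sequence is the same.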

Monday, March 27, 2006

Trend 24: Data Integration Is the Technology Pressure Point

Trend for 2006: Data Integration Is the Technology Pressure Point: "Data quality will go from nice-to-have to need-to-have. Flexibility and openness remain nothing but talk, while standardization and virtualization remain useless, unless companies can reliably integrate their systems and ensure the quality of their data. Not only do strategic systems such as business intelligence depend on integration and accurate, up-to-date and unambiguous data, but so does compliance with Sarbanes-Oxley and other regulations.

Companies are currently attempting to introduce new approaches to integration—Web services, SOA—while wrestling with new technologies such as RFID and the mobile Web that will introduce new data streams and data quality problems. But the business stakes are high, and huge volumes of data and new technology make for a volatile mix; not all companies will be able to master the shift, and some will stumble. To achieve integration and quality, companies will focus on information governance—the question of who has control and responsibility for data."

Tuesday, April 12, 2005

"The Great Divide": Standardized Definitions in Enterprise Analytics Applications

One of the challenges in building an enterprise analytics application is the "great divide" that exists between definitions of business terms. A 'billing fee' might mean different measures to different people depending on the sub-stream they work in, and it becomes the job of the analytics architects to decipher the appropriate definitions. The divide grows even deeper when the technical team of architects and developers, often brought onto the analytics project on a temporary basis, is too aligned with the technology rather than the client's business to know the subtle differences in the definitions. A well-laid metadata layer can alleviate the divide, although it still depends on the technical and functional teams hammering out the exact definitions in the analysis and design phases. Often the confusion stems from the functional teams themselves, when two teams are in fact referring to the same entity, with the same definition, in slightly different ways. Thus three layers of analysis may be required:

  • sessions between the technical team and each functional team to identify the attributes and measures that need to be defined in the analytical application;
  • an internal assessment by the technical team to analyze possible overlap in definitions;
  • a cross-team joint effort to seek clarification and standardization of the definitions.

It may be worthwhile to dedicate substantial time to the cross-team effort at the project outset to iron out the inefficiencies caused by conflicting and overlapping definitions, rather than following a siloed approach to the development effort as far as definitions are concerned.
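The internal overlap assessment can be mechanized in a small way: collect each team's glossary and flag terms whose definitions disagree across teams. The team names and definitions below are invented for illustration; a real metadata layer would hold far richer entries, but the audit logic is the same.

```python
from collections import defaultdict

# Hypothetical per-team glossaries collected in the analysis phase.
glossaries = {
    "finance": {"billing fee": "flat charge per invoice"},
    "operations": {"billing fee": "per-transaction surcharge"},
    "marketing": {"customer": "any account with one or more orders"},
}

def find_conflicts(glossaries: dict) -> dict:
    """Return {term: {team: definition}} for terms the teams disagree on."""
    by_term = defaultdict(dict)
    for team, terms in glossaries.items():
        for term, definition in terms.items():
            by_term[term][team] = definition
    # A term conflicts when more than one distinct definition exists.
    return {term: defs for term, defs in by_term.items()
            if len(set(defs.values())) > 1}

print(find_conflicts(glossaries))
```

The flagged terms become the agenda for the cross-team standardization sessions, so that effort is spent only where definitions actually diverge.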

Tuesday, November 16, 2004

The OLAP Report: How not to buy an OLAP product

The OLAP Report: How not to buy an OLAP product: "OLAP products differ from each other much more than do, for example, relational databases, programming languages, word processors or presentation graphics packages, which greatly increases the scope for confusion when selecting an OLAP product. Just to add to the problems, neither IT professionals nor end-users are fully equipped, on their own, to make properly informed OLAP selections (whereas each of the other products listed above could be chosen by one group without help from the other). This means that, unlike most other software, OLAP evaluations must involve both users and IT. But in many organizations, IT and end-users have trouble communicating with each other and are often barely on speaking terms. Consequently, we frequently come across companies who have made strange product choices, often because they started with a bizarre shortlist consisting of the contradictory preferences of the technical and business groups."

The OLAP Report: Consolidated BI

The OLAP Report: Consolidated BI: "The BI industry has seen a wave of acquisitions since the mid 1990s, with takeovers occurring every few months. The first wave was mainly other companies who were attracted by the higher growth rates in the BI industry and preferred to buy an existing vendor rather than to develop their own product. These changes of ownership did not produce any consolidation because there was no net reduction in the number of BI vendors or products. There was also no reduction in competition as market shares were not concentrated in fewer and fewer hands. Examples of such non-consolidating acquisitions include the entry of the various database vendors into the BI market, including Oracle's purchase of the Express business from IRI Software, Informix buying STG MetaCube and Microsoft's purchase of the Panorama technology."

Thursday, October 28, 2004

BW Online | July 9, 2004 | Software Makers: Soon They'll Be Fewer

BW Online | July 9, 2004 | Software Makers: Soon They'll Be Fewer: "Business-intelligence specialists like Business Objects (BOBJ ) or Hyperion Solutions (HYSL ) would make sense paired with a broader application provider like Microsoft, SAP, or Oracle, investment bankers say.
Software companies will have to work hard to make sure deals don't create more problems than they solve. 'The track record for such deals in the IT sector is horrible,' warns Heap, the Bain merger consultant. These deals are difficult to execute because they often force the acquiring company to enter a new and unfamiliar market. Acquirers have one thing working in their favor this time around, however: The price of target companies has fallen since the '90s boom, reducing the risk of overpaying. And given the passage of time, it's easier to evaluate the strength of a company that's being acquired. Consolidation won't be a panacea for software's woes, but it's bound to help the industry adjust to the realities of a maturing market. "

Return of the "Big Four" (Did They Ever Really Leave?) - META Group

The 1990s were glory years for the "Big 8/6/5" consultancies, firms that rapidly expanded from their audit roots to build out business advisory and IT services capabilities and drove megatrends like BPR, ERP implementation, and e-business. The bust that followed the boom, and questions about conflicts of interest from selling non-audit services to audit accounts (and subsequent regulations to curtail that practice), led to the splitting out of many of the IT-intensive services (KPMG spun out KPMG Consulting, now BearingPoint; E&Y sold its practice to Capgemini, and PwC to IBM; and Andersen's staff scattered after its demise). Deloitte did not spin out its IT practice, though it tried, and now looks quite smart for having maintained those capabilities. Indeed, Deloitte's integrated service offering business (fusing IT with risk, tax, and financial domain expertise) is booming. PwC is building up its IT practice - one that never totally disappeared - E&Y has announced its intent to do the same, and KPMG's Risk Advisory Service that includes all non-tax/audit/financial services including IT is rapidly growing. What has changed is that, with the exception of Deloitte, these firms are not targeting enterprise packaged application deployment or development. Rather, the emphasis is on embedding IT advisory capabilities within other practice areas, particularly around risk and compliance, to create new hybrid offerings. While this may not prove as lucrative as work performed in the glory years, it does create compelling (anything to make compliance more viable and less costly is compelling) and differentiating service offerings that will likely prove less conflicting. (Stan Lepeak)

Thursday, October 14, 2004

My thoughts on Nicholas Carr's proposition of how "IT Doesn't Matter"

I want to write today about the information technology industry as a whole, and not just restrict myself to the BI/EA world. Nicholas Carr recently wrote a ground-breaking article, "IT Doesn't Matter" (http://www.nicholasgcarr.com/articles/matter.html), in which he argues that IT has become such a ubiquitous industry that it no longer provides competitive advantage. One of the main things that made me think differently about the strategic implications of information systems is Carr's observation that IT systems now have a ubiquitous presence in today's businesses. This concept merits some thought, and I agree with him to the extent that IT systems are so readily available and easily implemented that IT by itself cannot be a source of competitive advantage. Up until the late '90s, companies implementing IT systems automatically enjoyed a leadership position in the market; in today's rapidly evolving business world, the mere presence of IT systems has diminished as a source of competitive advantage. While I agree, though a bit more reservedly, with that facet of Carr's argument, I disagree with his claim that IT has evolved into an industry synonymous with the utility companies of the early 20th century. Though technology by itself will not provide competitive advantage in the future, the implementation methodology and strategy behind the technology will still have strong strategic influence for companies. This is in stark contrast to the utility companies that Carr describes. Systems like business intelligence, which provide insights into the gut of a business to reveal fundamental aspects of how it works (kind of like an X-ray of the business), will remain a source of competitive advantage as long as careful thought is given to the measures, dimensions, scope, and breadth of definition of the project.

Wednesday, October 13, 2004

The CFO Project: Business Intelligence: Solving the ERP Overload

The CFO Project: "Organizations realize there is a wealth of information in ERP data, but the difficulty is finding and leveraging it. To truly maximize return on investment, a business intelligence solution on top of an ERP system is required. Business intelligence is a broad category of applications, including technologies for reporting, analysis, and sharing of information that helps users make better business decisions."

Business Intelligence Market Analysis: A Quick Take

The specialized players in the business intelligence market (Business Objects, Cognos, etc.) face an increased threat of forward integration from the major players in the database world. This is not surprising, given that a majority of analytical applications rely on an RDBMS as the data source. Recent offerings such as Oracle's BI Beans and IBM's significant boost to DB2's functionality with OLAP capabilities signal the turn the industry is making. It will be interesting to see how the niche BI vendors compete with the database vendors. The other threat of new entrants into the BI market comes from vendors specializing in technologies that hitherto supported the BI function. For example, Informatica was once content with the ETL function supporting the underlying data warehouse, but it has since moved into the dashboard world via its PowerAnalyzer tool. Though not a direct threat to entrenched players like Cognos, Business Objects and MicroStrategy, I consider this move a "test the waters before the plunge" tactic. There has also been diversification away from core competencies by the niche players themselves: observe Business Objects' Data Integrator and Cognos' DecisionStream, which encroach on Informatica's turf.

What does all this mean to me as a worker bee in the analytics space? On one hand, the added competition dilutes my own "product offering," i.e., my skill set; to keep up with the evolving market, I need to evolve my own skills to at least understand the integration and differentiation of these products. On the other hand, the increased competition promotes product innovation and imposes downward pressure on prices (something that has not happened yet, but will eventually come about). Keep an eye out for the evolution of this space; I predict another round of consolidation very soon.