Enterprises Don’t Have Big Data, They Have Bad Data

 

PayPal co-founder and venture capitalist Peter Thiel commonly harps on the tech community for overusing buzzwords like “cloud” and “big data.” He’s not the only one who’s been saying this, but the message still doesn’t appear to be sinking in with most enterprises.
 
Companies often tout their terabytes and petabytes of data, and their massive teams of data scientists running huge Hadoop clusters with Apache Kafka streams, as if sheer scale were a competitive advantage in itself.

The truth is, most of them fall victim to one of the old adages in computing: garbage in, garbage out. Most of them don’t have Big Data in terms of either complexity or volume; what they have is Crappy Data, and it’s probably hurting their business. According to Experian Data Quality, inaccurate data affects the bottom line of 88 percent of organizations and impacts up to 12 percent of revenues.
Good Big Data
Some companies actually have good data and know how to use it. From mature, web-native companies like Google to engineering-based companies like Boeing, the companies listed below have successfully managed enormous amounts of data and used it to make true data-driven decisions.

Netflix: Giving Its Users What They Want. Accounting for a third of peak-time Internet traffic in the U.S., Netflix collects massive amounts of data about its users’ viewing habits, and can break it down by region, time of day, watching hours and a plethora of other data. This has put them in a unique position of being able to accurately predict what viewers want.
Case in point: Netflix has expanded well beyond a DVD and streaming service to become its own production company, with hit shows like House of Cards and Orange Is the New Black. It has also skipped the traditional pilot-episode model, confidently producing full seasons of its original series.
IBM And The Weather Company: Understanding How Weather Affects Business. IBM has teamed up with the Weather Company to combine two very large sets of data and accurately analyze how the weather impacts business. Spanning everything from retail to insurance, they’ll be able to accurately provide real-time insights into how temperature changes impact sales or how insurance companies can save dollars by advising their clients to move their cars.
Icahn School Of Medicine At Mount Sinai: Predicting Patients’ Health. The New York City-based school has tasked Jeff Hammerbacher, famously known as Facebook’s first data scientist, with leading the development of a computer that analyzes the medical information collected from the half a million patients it treats per year.
Together with the head of Mount Sinai’s Institute for Genomics and Multiscale Biology, the team aims to make predictions that could cut the cost of healthcare, from assessing a patient’s medical history and risk factors to determine how often they’ll need care, to allowing doctors to prescribe treatments based on risk models built from genomics and lab data.
Amazon: Setting A New Bar For Customer Service. Amazon has access to unprecedented insights about its users, from what books they’re reading to how often they’re restocking cotton balls. While other companies have put customer support on the back burner, Amazon has made it key to its business by emphasizing communication and direct relationships with its customers. Amazon uses its wealth of user data to give representatives relevant information about a customer the moment that customer needs support, streamlining the process and solidifying customer loyalty.
Xerox: Improving Employee Retention. Whereas past work experience has often been the basis for hiring new employees, Xerox found that success in hiring for its call centers depended on something else entirely. Using big data, the organization found that a potential employee’s personality was the real predictor of whether they would stay: creative people tended to stick it out, inquisitive people did not. Armed with this information, and a hiring survey rather than a hiring manager, it was able to cut employee turnover across all its call centers by 20 percent in six months.

Bad Big Data: Most Companies Don’t Use Data Well
Enterprises have historically spent far too little time thinking about what data they should be collecting and how they should be collecting it. Instead of spear fishing, they’ve taken to trawling the data ocean, collecting untold amounts of junk without any forethought or structure. Deferring these hard decisions has resulted in data science teams in large enterprises spending the majority of their time cleaning, processing and structuring data with manual and semi-automated methods.

DJ Patil, the recently appointed Chief Data Scientist of the White House, summarizes the data problem well, noting that “you have to start with a very basic idea: Data is super messy, and data cleanup will always be literally 80 percent of the work. In other words, data is the problem.”

But it’s not all bad news. According to the industry research firm Wikibon, 52 percent of data tool investment is going to technologies for ingesting and organizing data so that it is more readily accessible and prepared for analysis. Still, the key to tackling this properly isn’t just spending on more or better tools.
Applying Big Data To Your Business
To truly turn an enterprise into a data company, here are some guidelines and methods practiced by some of the best data companies in the world.
Know Thyself. Start by understanding the type of data you need to analyze first: is it event data, financial data, graph data or something else? This is the most important factor in determining whether you need to capture data at the most atomic level or in some other format.
Don’t Over-Delegate. Many businesses hand off setting up analysis to developers or IT without involving the actual business users. It’s critical that the people who are going to use the data understand exactly how it is being collected and aggregated, to avoid serious problems down the road.
Define The Use Cases. As a corollary to “Don’t Over-Delegate,” don’t let business users either give generic use cases (e.g. “we want to track lead sources”) or spec out irrelevant ones. Every piece of data needs to fit into an analytical framework and be part of solving a problem. Appoint either a highly technical business user or a business-savvy tech lead to own the final signoff here.
Stop At The Source. Garbage in, garbage out; make sure you understand the source and types of data. Where does your data originate? Is it accurate? If you don’t know the answers to these questions, start looking into it now.
Use The Right Tool For The Job. There are many great analytical tools out there. Once you’ve defined the key use cases for your business and end users, undertake a formal “bake-off” process, and evaluate each tool against your needs rather than against cool features you may never end up using.
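The “Stop At The Source” guideline can be made concrete with a small validation step that runs before records reach an analytics store. What follows is only a minimal sketch in Python, not a method used by any company named above. The record fields (`lead_id`, `source`, `amount`), the set of allowed sources, and the helpers `validate_record` and `validate_batch` are all hypothetical, chosen to echo the “track lead sources” use case.

```python
from dataclasses import dataclass, field

# Hypothetical: the lead sources the business has agreed to track.
ALLOWED_SOURCES = {"web", "email", "referral"}


@dataclass
class ValidationResult:
    valid: list = field(default_factory=list)
    rejected: list = field(default_factory=list)  # (record, problems) pairs


def validate_record(record: dict) -> list:
    """Return the problems found in one record (empty list if clean)."""
    problems = []
    if not record.get("lead_id"):
        problems.append("missing lead_id")
    if record.get("source") not in ALLOWED_SOURCES:
        problems.append(f"unknown source: {record.get('source')!r}")
    amount = record.get("amount")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if (
        not isinstance(amount, (int, float))
        or isinstance(amount, bool)
        or amount < 0
    ):
        problems.append("amount must be a non-negative number")
    return problems


def validate_batch(records: list) -> ValidationResult:
    """Split a batch into clean records and rejected ones with reasons."""
    result = ValidationResult()
    for record in records:
        problems = validate_record(record)
        if problems:
            result.rejected.append((record, problems))
        else:
            result.valid.append(record)
    return result


if __name__ == "__main__":
    batch = [
        {"lead_id": "a1", "source": "web", "amount": 10.0},
        {"lead_id": "", "source": "fax", "amount": -5},
    ]
    outcome = validate_batch(batch)
    print(len(outcome.valid), "valid,", len(outcome.rejected), "rejected")
    for record, problems in outcome.rejected:
        print(record, "->", problems)
```

Keeping the rejected records together with their reasons, rather than silently dropping them, gives the business owner of the data something concrete to review when asking where bad records originate.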
Big data alone is silly. Building an enterprise with smart, usable data is what every company should strive to create.
TechCrunch: http://tcrn.ch/1KuFsav

 
