Big Data for Dummies

Big data management is one of the major challenges facing business, industry, and not-for-profit organizations. Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data m...

Bibliographic Details
Main Author: Judith Hurwitz
Other Authors: Alan Nugent; Fern Halper; Marcia Kaufman
Format: Printed Book
Published: New Delhi : Wiley India, 2013
Series: For dummies
Table of Contents:
  • pt. I. Getting started with big data
  • 1. Grasping the fundamentals of big data
  • Evolution of data management
  • Understanding the waves of managing data
  • Creating manageable data structures
  • Web and content management
  • Managing big data
  • Defining big data
  • Building a successful big data management architecture
  • Capture, organize, integrate, analyze and act
  • Architectural foundation
  • Performance matters
  • Traditional and advanced analytics
  • 2. Examining big data types
  • Defining structured data
  • Exploring sources of big structured data
  • Understanding the role of relational databases in big data
  • Defining unstructured data
  • Exploring sources of unstructured data
  • Understanding the role of a CMS in big data management
  • Looking at real-time and non-real-time requirements
  • Putting big data together
  • Managing different data types
  • Integrating data types into a big data environment
  • 3. Old meets new: distributed computing
  • Brief history of distributed computing
  • Giving thanks to DARPA
  • The value of a consistent model
  • Understanding the basics of distributed computing
  • Why we need distributed computing for big data
  • The changing economics of computing
  • The problem with latency
  • Demand meets solutions
  • Getting performance right
  • pt. II. Technology foundations for big data
  • 4. Digging into the big data technology components
  • Exploring the big data stack
  • Redundant physical infrastructure
  • Physical redundant networks
  • Managing hardware : storage and servers
  • Infrastructure operations
  • Security infrastructure
  • Interfaces and feeds to and from applications and the internet
  • Operational databases
  • Organizing data services and tools
  • Analytical data warehouses
  • Big data analytics
  • Big data applications
  • 5. Virtualization and how it supports distributed computing
  • Understanding the basics of virtualization
  • The importance of virtualization to big data
  • Server virtualization
  • Application virtualization
  • Network virtualization
  • Processor and memory virtualization
  • Data and storage virtualization
  • Managing virtualization with the Hypervisor
  • Abstraction and virtualization
  • Implementing virtualization to work with big data
  • 6. Examining the cloud and big data
  • Defining the cloud in the context of big data
  • Understanding cloud deployment and delivery models
  • The cloud as an imperative for big data
  • Making use of the cloud for big data
  • Providers in the big data cloud market
  • Amazon's public Elastic Compute Cloud
  • Google big data services
  • Microsoft Azure
  • OpenStack
  • Where to be careful when using cloud services
  • pt. III. Big data management
  • 7. Operational databases
  • RDBMSs are important in a big data environment
  • PostgreSQL relational database
  • Nonrelational databases
  • Key-value pair databases
  • Riak key-value database
  • Document databases
  • MongoDB
  • CouchDB
  • Columnar databases
  • HBase columnar database
  • Graph databases
  • Neo4J graph database
  • Spatial databases
  • PostGIS/OpenGEO Suite
  • Polyglot persistence
  • 8. MapReduce fundamentals
  • Tracing the origins of MapReduce
  • Understanding the map function
  • Adding the reduce function
  • Putting map and reduce together
  • Optimizing MapReduce tasks
  • Hardware/network topology
  • Synchronization
  • File system
  • 9. Exploring the world of Hadoop
  • Explaining Hadoop
  • Understanding the Hadoop Distributed File System (HDFS)
  • NameNodes
  • Data nodes
  • Under the covers of HDFS
  • Hadoop MapReduce
  • 10. The Hadoop foundation and ecosystem
  • Building a big data foundation with the Hadoop ecosystem
  • Managing resources and applications with Hadoop YARN
  • Storing big data with HBase
  • Mining big data with Hive
  • Interacting with the Hadoop ecosystem
  • Pig and Pig Latin
  • Sqoop
  • Zookeeper
  • 11. Appliances and big data warehouses
  • Integrating big data with the traditional data warehouse
  • Optimizing the data warehouse
  • Differentiating big data structures from data warehouse data
  • Examining a hybrid process case study
  • Big data analysis and the data warehouse
  • The integration lynchpin
  • Rethinking extraction, transformation, and loading
  • Changing the role of the data warehouse
  • Changing deployment models in the big data era
  • The appliance model
  • The cloud model
  • Examining the future of data warehouses
  • pt. IV. Analytics and big data
  • 12. Defining big data analytics
  • Using big data to get results
  • Basic analytics
  • Advanced analytics
  • Operationalized analytics
  • Monetizing analytics
  • Modifying business intelligence products to handle big data
  • Data
  • Analytical algorithms
  • Infrastructure support
  • Studying big data analytics examples
  • Orbitz
  • Nokia
  • NASA
  • Big data analytics solutions
  • 13. Understanding text analytics and big data
  • Exploring unstructured data
  • Understanding text analytics
  • Difference between text analytics and search
  • Analysis and extraction techniques
  • Understanding the extracted information
  • Taxonomies
  • Putting your results together with structured data
  • Putting big data to use
  • Voice of the customer
  • Social media analytics
  • Text analytics tools for big data
  • Attensity
  • Clarabridge
  • IBM
  • OpenText
  • SAS
  • 14. Customized approaches for analysis of big data
  • pt. V. Big data implementation
  • 15. Integrating data sources
  • Identifying the data you need
  • Exploratory stage
  • Codifying stage
  • Integration and incorporation stage
  • Understanding the fundamentals of big data integration
  • Defining traditional ETL
  • Data transformation
  • Understanding ELT : extract, load, and transform
  • Prioritizing big data quality
  • Using Hadoop as ETL
  • Best practices for data integration in a big data world
  • 16. Dealing with real-time data streams and complex event processing
  • Explaining streaming data and complex event processing
  • Using streaming data
  • Data streaming
  • The need for metadata in streams
  • Using complex event processing
  • Differentiating CEP from streams
  • Understanding the impact of streaming data and CEP on business
  • 17. Operationalizing big data
  • Making big data a part of your operational process
  • Integrating big data
  • Incorporating big data into the diagnosis of diseases
  • Understanding big data workflows
  • Workload in context to the business problem
  • Ensuring the validity, veracity, and volatility of big data
  • 18. Applying big data within your organization
  • Figuring the economics of big data
  • Identification of data types and sources
  • Business process modifications or new process creation
  • The technology impact of big data workflows
  • Finding the talent to support big data projects
  • Calculating the return on investment (ROI) from big data investments
  • Enterprise data management and big data
  • Defining enterprise data management
  • Creating a big data implementation road map
  • Understanding business urgency
  • Projecting the right amount of capacity
  • Selecting the right software development methodology
  • Balancing budgets and skill sets
  • Determining your appetite for risk
  • Starting your big data road map
  • 19. Security and governance for big data environments
  • Security in context with big data
  • Assessing the risk for the business
  • Risks lurking inside big data
  • Understanding data protection options
  • The data governance challenge
  • Auditing your big data process
  • Identifying the key stakeholders
  • Putting the right organizational structure in place
  • Preparing for stewardship and management of risk
  • Setting the right governance and quality policies
  • Developing a well-governed and secure big data environment
  • pt. VI. Big data solutions in the real world
  • 20. The importance of big data to business
  • Big data as business planning tool
  • Planning with data
  • Doing the analysis
  • Checking the results
  • Acting on the plan
  • Adding new dimensions to the planning cycle
  • Monitoring in real time
  • Adjusting the impact
  • Enabling experimentation
  • Keeping data analytics in perspective
  • Getting started with the right foundation
  • Getting your big data strategy started
  • Planning for big data
  • Transforming business processes with big data
  • 21. Analyzing data in motion : a real-world view
  • Understanding companies' needs for data in motion
  • The value of streaming data
  • Streaming data with an environmental impact
  • Using sensors to provide real-time information about rivers and oceans
  • The benefits of real-time data
  • Streaming data with a public policy impact
  • Streaming data in the healthcare industry
  • Capturing the data stream
  • Streaming data in the energy industry
  • Using streaming data to increase energy efficiency
  • Using streaming data to advance the production of alternative sources of energy
  • Connecting streaming data to historical and other real-time data sources
  • 22. Improving business processes with big data analytics: a real-world view
  • Understanding companies' needs for big data analytics
  • Improving the customer experience with text analytics
  • The business value to the big data analytics implementation
  • Using big data analytics to determine next best action
  • Preventing fraud with big data analytics
  • The business benefit of integrating new sources of data
  • pt. VII. The part of tens
  • 23. Ten big data best practices
  • 24. Ten great big data resources
  • Hurwitz & Associates
  • Standards organizations
  • The Open Data Foundation
  • The Cloud Security Alliance
  • National Institute of Standards and Technology
  • Apache Software Foundation
  • OASIS
  • Vendor sites
  • Online collaborative sites
  • Big data conferences
  • 25. Ten big data do's and don'ts
  • Glossary.