Data Science and Prediction

Working paper CeDER-12-01, May 2012
http://hdl.handle.net/2451/31553

Data Science and Prediction
Vasant Dhar
Professor, Stern School of Business
Director, Center for Digital Economy Research
March 29, 2012

Abstract

The use of the term “Data Science” is becoming increasingly common along with “Big Data.” What does Data Science mean? Is there something unique about it? What skills should a “data scientist” possess to be productive in the emerging digital age characterized by a deluge of data? What are the implications for business and for scientific inquiry? In this brief monograph I address these questions from a predictive modeling perspective.

Electronic copy available at: http://ssrn.com/abstract=2086734

1. Introduction

The use of the term “Data Science” is becoming increasingly common along with “Big Data.” What does Data Science mean? Is there something unique about it? What skills should a “data scientist” possess to be productive in the emerging digital age characterized by a deluge of data? What are the implications for scientific inquiry?

The term “Science” implies knowledge gained by systematic study. According to one definition, it is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the universe.1 Data Science might therefore imply a focus on data and, by extension, on Statistics, which is the systematic study of the organization, properties, and analysis of data and their role in inference, including our confidence in such inference. Why then do we need a new term, when Statistics has been around for centuries? The fact that we now have huge amounts of data does not, in and of itself, justify a new term.

The short answer is that Data Science differs from Statistics in several ways. First, the raw material, the “data” part of Data Science, is increasingly heterogeneous and unstructured – text, images, and video, often emanating from networks with complex relationships
