GUIDED BY:| Jayshri Patel|
SUBMITTED BY:| Hardik Barfiwala|
Sr No| Title|
1| Introduction To Wine Production|
2| Objectives|
3| Introduction To Dataset|
4| Pre-Processing|
5| Statistics Used In Algorithms|
6| Algorithms Applied On Dataset|
7| Comparison Of Applied Algorithms|
8| Applying Testing Dataset|
9| Achievements|
1. INTRODUCTION TO WINE PRODUCTION
* The wine industry has grown steadily over the last decade. However, quality has become the main issue in making and selling wine.
* To meet increasing demand, the wine industry needs to assess wine quality, both to prevent tampering and to maintain quality.
* To remain competitive, the wine industry is investing in new technologies such as data mining to analyze taste and other properties of wine.
* Data mining techniques provide more than summaries. They reveal valuable information such as patterns and relationships between wine properties and human taste, which can be used to improve decision making and increase the chances of success in both marketing and selling.
* Two key elements in the wine industry are wine certification and quality assessment, which are usually conducted via physicochemical and sensory tests.
* Physicochemical tests are lab-based and are used to characterize properties of wine such as its density, alcohol content or pH value.
* Sensory tests, such as taste preference, are performed by human experts. Taste is a property that indicates wine quality, so the success of the wine industry depends heavily on satisfying consumers' taste requirements.
* Physicochemical data have also been found useful in predicting human wine taste preferences and in classifying wine based on aroma chromatograms.
2. OBJECTIVES
* Modeling complex human taste is an important focus in the wine industry.
* The main purpose of this study was to predict wine quality based on physicochemical data.
* This study was also conducted to identify outliers or anomalies in the sample wine set in order to detect spoiled wine.
3. INTRODUCTION TO DATASET
To evaluate the performance of data mining techniques, a dataset is taken into consideration. This section describes the source of the data.
* Source Of Data
Before the experimental part of the research, the data was gathered from the UCI Data Repository.
The UCI Repository of Machine Learning Databases and Domain Theories is a free Internet repository of analytical datasets from several areas. All datasets are provided as text files with a short description. These datasets are widely recognized by scientists and are regarded as a valuable source of data.
* Overview Of Dataset
INFORMATION OF DATASET|
Title:| Wine Quality|
Data Set Characteristics:| Multivariate|
Number Of Instances:| WHITE-WINE : 4898, RED-WINE : 1599|
Area:| Business|
Attribute Characteristics:| Real|
Number Of Attributes:| 11 + Output Attribute|
Missing Values:| N/A|
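The properties listed in the table above can be checked when the data is first read. The following is a minimal sketch, not the procedure used in this study. It assumes the wine file is a semicolon-separated text file with a header row and the output attribute in the last column. The helper names `load_wine` and `summarize` are hypothetical, and the two sample rows are for demonstration only.

```python
import csv
import io

# Illustrative stand-in for the first lines of a wine file (assumed layout:
# semicolon-separated, header row first). Values are for demonstration only.
SAMPLE = '''"fixed acidity";"volatile acidity";"citric acid";"residual sugar";"chlorides";"free sulfur dioxide";"total sulfur dioxide";"density";"pH";"sulphates";"alcohol";"quality"
7.4;0.7;0;1.9;0.076;11;34;0.9978;3.51;0.56;9.4;5
7.8;0.88;0;2.6;0.098;25;67;0.9968;3.2;0.68;9.8;5
'''


def load_wine(text, delimiter=";"):
    """Parse wine records into (header, rows); each row is a list of floats."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    rows = []
    for line_no, record in enumerate(reader, start=2):
        if not record:
            continue  # skip blank lines
        # The table reports no missing values, so treat any gap as an error.
        if len(record) != len(header) or any(cell.strip() == "" for cell in record):
            raise ValueError(f"missing value on line {line_no}")
        # All attributes are real-valued, so every cell must parse as a float.
        rows.append([float(cell) for cell in record])
    return header, rows


def summarize(header, rows):
    """Report the counts that the dataset overview table describes."""
    return {
        "instances": len(rows),
        "input_attributes": len(header) - 1,  # last column is the output
        "output_attribute": header[-1],
    }


header, rows = load_wine(SAMPLE)
summary = summarize(header, rows)
print(summary)
```

Run against the full files, the `instances` count would be expected to match the figures in the table (4898 for white wine, 1599 for red wine), and `input_attributes` should be 11.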
* Attribute Information
* Input variables (based on physicochemical tests)
* Fixed Acidity: Amount of tartaric acid present in wine (in g per liter).
Contributes to the taste, feel and color of wine.
* Volatile Acidity: Amount of acetic acid present in wine (in g per liter).
Its presence in wine is mainly due to yeast and bacterial metabolism.
* Citric Acid: Amount of citric acid present in wine (in g per liter).
Used to acidify wines that are too basic and as a flavor additive.
* Residual Sugar: The concentration of sugar remaining after fermentation.